ezpz test
Run the bundled test suite with ezpz test (great for first-time validation; it should take about 1 minute).
Or, if you already have torch + mpi4py, try without installing:
localhost (MacBook Pro):
(ezpz) #[12/26/25 @ 14:59:27][~/v/s/ezpz][distributed-metrics] ; ezpz test
[2025-12-26 14:59:36,627513][I][ezpz/test_dist:132:__post_init__] Outputs will be saved to /Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/2025-12-26-145936
[2025-12-26 14:59:36,629251][I][ezpz/dist:1506:setup_torch_distributed] Using fw='ddp' with torch_{device,backend}= {mps, gloo}
[2025-12-26 14:59:36,635161][I][ezpz/dist:1371:setup_torch_DDP] Caught MASTER_PORT=58309 from environment!
[2025-12-26 14:59:36,635780][I][ezpz/dist:1387:setup_torch_DDP] Using torch.distributed.init_process_group with
- master_addr='Sams-MacBook-Pro-2.local'
- master_port='58309'
- world_size=2
- rank=0
- local_rank=0
- timeout=datetime.timedelta(seconds=3600)
- backend='gloo'
[2025-12-26 14:59:36,636493][I][ezpz/dist:1019:init_process_group] Calling torch.distributed.init_process_group_with: rank=0 world_size=2 backend=gloo
[2025-12-26 14:59:36,753207][I][ezpz/dist:1732:setup_torch] Using device='mps' with backend='gloo' + 'gloo' for distributed training.
[2025-12-26 14:59:36,781940][I][ezpz/dist:1779:setup_torch] ['Sams-MacBook-Pro-2.local'][device='mps'][node=0/0][rank=1/1][local_rank=1/1]
[2025-12-26 14:59:36,806242][W][ezpz/dist:544:print_dist_setup] Using [2 / 2] available "mps" devices !!
[2025-12-26 14:59:36,806669][I][ezpz/dist:1779:setup_torch] ['Sams-MacBook-Pro-2.local'][device='mps'][node=0/0][rank=0/1][local_rank=0/1]
[2025-12-26 14:59:36,807024][I][ezpz/test_dist:678:main] Took: 0.18 seconds to setup torch
[2025-12-26 14:59:36,816995][I][ezpz/test_dist:461:train] Model size: 567434 parameters
[2025-12-26 14:59:36,817813][I][ezpz/test_dist:465:train]
=================================================================
Layer (type:depth-idx)                   Param #
=================================================================
SequentialLinearNet                      --
├─Sequential: 1-1                        567,434
=================================================================
Total params: 567,434
Trainable params: 567,434
Non-trainable params: 0
=================================================================
[2025-12-26 14:59:36,818532][I][ezpz/test_dist:473:train] Took: 0.00975050003034994 seconds to build model
[2025-12-26 14:59:36,818765][W][ezpz/test_dist:590:build_model_and_optimizer] MPS does not support torch.distributed collectives; falling back to CPU
[2025-12-26 14:59:36,819313][I][ezpz/test_dist:601:build_model_and_optimizer] model=
SequentialLinearNet(
  (layers): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=128, bias=True)
    (5): ReLU()
    (6): Linear(in_features=128, out_features=10, bias=True)
  )
)
[2025-12-26 14:59:37,383487][I][ezpz/dist:685:wrap_model] Wrapping model with: ddp
[2025-12-26 14:59:37,402510][I][ezpz/test_dist:479:train] Took: 0.58 seconds to build optimizer
[2025-12-26 14:59:37,586325][I][ezpz/history:220:__init__] Using History with distributed_history=True
[2025-12-26 14:59:37,668674][I][ezpz/dist:2044:setup_wandb] Setting up wandb from rank=0
[2025-12-26 14:59:37,669043][I][ezpz/dist:2045:setup_wandb] Using WB_PROJECT=ezpz.test_dist
wandb: Currently logged in as: foremans (aurora_gpt) to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.23.1
wandb: Run data is saved locally in /Users/samforeman/vibes/saforem2/ezpz/wandb/run-20251226_145937-rj4d7rus
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run soft-grass-6851
wandb: View project at https://wandb.ai/aurora_gpt/ezpz.test_dist
wandb: View run at https://wandb.ai/aurora_gpt/ezpz.test_dist/runs/rj4d7rus
[2025-12-26 14:59:39,090331][I][ezpz/dist:2074:setup_wandb] wandb.run=[soft-grass-6851](https://wandb.ai/aurora_gpt/ezpz.test_dist/runs/rj4d7rus)
[2025-12-26 14:59:39,218933][I][ezpz/dist:2117:setup_wandb] Running on machine='localhost'
[2025-12-26 14:59:39,479361][I][ezpz/test_dist:482:train] Took: 2.08 seconds to build trainer
[2025-12-26 14:59:39,480013][I][ezpz/test_dist:486:train] config:
{
  "acc_events": false,
  "backend": "DDP",
  "batch_size": 128,
  "cp": 1,
  "dataset": "mnist",
  "dataset_root": "/Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/datasets/mnist",
  "dtype": "bf16",
  "input_size": 784,
  "layer_sizes": [
    512,
    256,
    128
  ],
  "log_freq": 1,
  "no_distributed_history": false,
  "num_workers": 0,
  "output_size": 10,
  "pp": 1,
  "print_freq": 10,
  "profile_memory": true,
  "pyinstrument_profiler": false,
  "pytorch_profiler": false,
  "pytorch_profiler_active": 3,
  "pytorch_profiler_repeat": 5,
  "pytorch_profiler_wait": 1,
  "pytorch_profiler_warmup": 2,
  "rank_zero_only": false,
  "record_shapes": true,
  "tp": 1,
  "train_iters": 200,
  "warmup": 5,
  "with_flops": true,
  "with_modules": true,
  "with_stack": true
}
[2025-12-26 14:59:39,481620][I][ezpz/test_dist:488:train] Took: 3.65 to get here.
[2025-12-26 14:59:39,984314][I][ezpz/test_dist:369:train] Warmup complete at step 5
[2025-12-26 14:59:40,063715][I][ezpz/test_dist:325:train_step] iter=10 loss=1.188046 accuracy=0.593750 dtf=0.008557 dtb=0.001808 loss/mean=1.197080 loss/max=1.206113 loss/min=1.188046 loss/std=0.009037 accuracy/mean=0.625000 accuracy/max=0.656250 accuracy/min=0.593750 accuracy/std=0.031250 dtf/mean=0.008275 dtf/max=0.008557 dtf/min=0.007993 dtf/std=0.000282 dtb/mean=0.002003 dtb/max=0.002198 dtb/min=0.001808 dtb/std=0.000195
[2025-12-26 14:59:40,274480][I][ezpz/test_dist:325:train_step] iter=20 loss=0.650923 accuracy=0.742188 dtf=0.010504 dtb=0.008142 loss/mean=0.728713 loss/max=0.806504 loss/min=0.650923 loss/std=0.077790 accuracy/mean=0.769531 accuracy/max=0.796875 accuracy/min=0.742188 accuracy/std=0.027344 dtf/mean=0.010317 dtf/max=0.010504 dtf/min=0.010130 dtf/std=0.000187 dtb/mean=0.008175 dtb/max=0.008207 dtb/min=0.008142 dtb/std=0.000032
[2025-12-26 14:59:40,534115][I][ezpz/test_dist:325:train_step] iter=30 loss=0.642461 accuracy=0.804688 dtf=0.009502 dtb=0.001911 loss/mean=0.528537 loss/max=0.642461 loss/min=0.414612 loss/std=0.113924 accuracy/mean=0.824219 accuracy/max=0.843750 accuracy/min=0.804688 accuracy/std=0.019531 dtf/mean=0.010211 dtf/max=0.010919 dtf/min=0.009502 dtf/std=0.000708 dtb/mean=0.001896 dtb/max=0.001911 dtb/min=0.001881 dtb/std=0.000015
[2025-12-26 14:59:40,729254][I][ezpz/test_dist:325:train_step] iter=40 loss=0.349402 accuracy=0.898438 dtf=0.007339 dtb=0.004863 loss/mean=0.359106 loss/max=0.368810 loss/min=0.349402 loss/std=0.009704 accuracy/mean=0.890625 accuracy/max=0.898438 accuracy/min=0.882812 accuracy/std=0.007812 dtf/mean=0.007400 dtf/max=0.007461 dtf/min=0.007339 dtf/std=0.000061 dtb/mean=0.004861 dtb/max=0.004863 dtb/min=0.004860 dtb/std=0.000000
[2025-12-26 14:59:40,904186][I][ezpz/test_dist:325:train_step] iter=50 loss=0.345590 accuracy=0.867188 dtf=0.006774 dtb=0.001858 loss/mean=0.350946 loss/max=0.356301 loss/min=0.345590 loss/std=0.005355 accuracy/mean=0.878906 accuracy/max=0.890625 accuracy/min=0.867188 accuracy/std=0.011719 dtf/mean=0.006920 dtf/max=0.007066 dtf/min=0.006774 dtf/std=0.000146 dtb/mean=0.001857 dtb/max=0.001858 dtb/min=0.001856 dtb/std=0.000001
[2025-12-26 14:59:41,069650][I][ezpz/test_dist:325:train_step] iter=60 loss=0.376659 accuracy=0.890625 dtf=0.007758 dtb=0.001745 loss/mean=0.320235 loss/max=0.376659 loss/min=0.263812 loss/std=0.056424 accuracy/mean=0.914062 accuracy/max=0.937500 accuracy/min=0.890625 accuracy/std=0.023438 dtf/mean=0.007664 dtf/max=0.007758 dtf/min=0.007569 dtf/std=0.000095 dtb/mean=0.001749 dtb/max=0.001753 dtb/min=0.001745 dtb/std=0.000004
[2025-12-26 14:59:41,242790][I][ezpz/test_dist:325:train_step] iter=70 loss=0.575540 accuracy=0.828125 dtf=0.007760 dtb=0.001824 loss/mean=0.494479 loss/max=0.575540 loss/min=0.413418 loss/std=0.081061 accuracy/mean=0.855469 accuracy/max=0.882812 accuracy/min=0.828125 accuracy/std=0.027344 dtf/mean=0.007917 dtf/max=0.008074 dtf/min=0.007760 dtf/std=0.000157 dtb/mean=0.001858 dtb/max=0.001892 dtb/min=0.001824 dtb/std=0.000034
[2025-12-26 14:59:41,415724][I][ezpz/test_dist:325:train_step] iter=80 loss=0.196338 accuracy=0.953125 dtf=0.007632 dtb=0.003868 loss/mean=0.225939 loss/max=0.255540 loss/min=0.196338 loss/std=0.029601 accuracy/mean=0.933594 accuracy/max=0.953125 accuracy/min=0.914062 accuracy/std=0.019531 dtf/mean=0.007239 dtf/max=0.007632 dtf/min=0.006847 dtf/std=0.000393 dtb/mean=0.004381 dtb/max=0.004893 dtb/min=0.003868 dtb/std=0.000513
[2025-12-26 14:59:41,579460][I][ezpz/test_dist:325:train_step] iter=90 loss=0.331747 accuracy=0.906250 dtf=0.008618 dtb=0.004053 loss/mean=0.344878 loss/max=0.358009 loss/min=0.331747 loss/std=0.013131 accuracy/mean=0.906250 accuracy/max=0.906250 accuracy/min=0.906250 accuracy/std=0.000000 dtf/mean=0.008693 dtf/max=0.008768 dtf/min=0.008618 dtf/std=0.000075 dtb/mean=0.004049 dtb/max=0.004053 dtb/min=0.004045 dtb/std=0.000004
[2025-12-26 14:59:41,729606][I][ezpz/test_dist:325:train_step] iter=100 loss=0.188108 accuracy=0.937500 dtf=0.007073 dtb=0.001962 loss/mean=0.180938 loss/max=0.188108 loss/min=0.173769 loss/std=0.007169 accuracy/mean=0.945312 accuracy/max=0.953125 accuracy/min=0.937500 accuracy/std=0.007812 dtf/mean=0.006854 dtf/max=0.007073 dtf/min=0.006634 dtf/std=0.000219 dtb/mean=0.001962 dtb/max=0.001962 dtb/min=0.001962 dtb/std=0.000000
[2025-12-26 14:59:41,884339][I][ezpz/test_dist:325:train_step] iter=110 loss=0.267521 accuracy=0.890625 dtf=0.007719 dtb=0.002057 loss/mean=0.383564 loss/max=0.499606 loss/min=0.267521 loss/std=0.116043 accuracy/mean=0.871094 accuracy/max=0.890625 accuracy/min=0.851562 accuracy/std=0.019531 dtf/mean=0.007575 dtf/max=0.007719 dtf/min=0.007431 dtf/std=0.000144 dtb/mean=0.002060 dtb/max=0.002063 dtb/min=0.002057 dtb/std=0.000003
[2025-12-26 14:59:42,050014][I][ezpz/test_dist:325:train_step] iter=120 loss=0.210285 accuracy=0.937500 dtf=0.011066 dtb=0.001822 loss/mean=0.241504 loss/max=0.272723 loss/min=0.210285 loss/std=0.031219 accuracy/mean=0.937500 accuracy/max=0.937500 accuracy/min=0.937500 accuracy/std=0.000000 dtf/mean=0.010052 dtf/max=0.011066 dtf/min=0.009037 dtf/std=0.001015 dtb/mean=0.001869 dtb/max=0.001915 dtb/min=0.001822 dtb/std=0.000047
[2025-12-26 14:59:42,230004][I][ezpz/test_dist:325:train_step] iter=130 loss=0.139174 accuracy=0.968750 dtf=0.010818 dtb=0.001807 loss/mean=0.133106 loss/max=0.139174 loss/min=0.127037 loss/std=0.006068 accuracy/mean=0.964844 accuracy/max=0.968750 accuracy/min=0.960938 accuracy/std=0.003906 dtf/mean=0.010070 dtf/max=0.010818 dtf/min=0.009322 dtf/std=0.000748 dtb/mean=0.004232 dtb/max=0.006658 dtb/min=0.001807 dtb/std=0.002425
[2025-12-26 14:59:42,401759][I][ezpz/test_dist:325:train_step] iter=140 loss=0.217151 accuracy=0.921875 dtf=0.007524 dtb=0.001881 loss/mean=0.205181 loss/max=0.217151 loss/min=0.193212 loss/std=0.011969 accuracy/mean=0.929688 accuracy/max=0.937500 accuracy/min=0.921875 accuracy/std=0.007812 dtf/mean=0.007589 dtf/max=0.007655 dtf/min=0.007524 dtf/std=0.000065 dtb/mean=0.001849 dtb/max=0.001881 dtb/min=0.001817 dtb/std=0.000032
[2025-12-26 14:59:42,562758][I][ezpz/test_dist:325:train_step] iter=150 loss=0.388715 accuracy=0.882812 dtf=0.006638 dtb=0.001826 loss/mean=0.378151 loss/max=0.388715 loss/min=0.367587 loss/std=0.010564 accuracy/mean=0.886719 accuracy/max=0.890625 accuracy/min=0.882812 accuracy/std=0.003906 dtf/mean=0.006729 dtf/max=0.006820 dtf/min=0.006638 dtf/std=0.000091 dtb/mean=0.001828 dtb/max=0.001829 dtb/min=0.001826 dtb/std=0.000002
[2025-12-26 14:59:42,732920][I][ezpz/test_dist:325:train_step] iter=160 loss=0.197628 accuracy=0.921875 dtf=0.010449 dtb=0.002640 loss/mean=0.255450 loss/max=0.313271 loss/min=0.197628 loss/std=0.057821 accuracy/mean=0.917969 accuracy/max=0.921875 accuracy/min=0.914062 accuracy/std=0.003906 dtf/mean=0.010021 dtf/max=0.010449 dtf/min=0.009594 dtf/std=0.000428 dtb/mean=0.002552 dtb/max=0.002640 dtb/min=0.002463 dtb/std=0.000089
[2025-12-26 14:59:42,889920][I][ezpz/test_dist:325:train_step] iter=170 loss=0.325840 accuracy=0.867188 dtf=0.007486 dtb=0.002018 loss/mean=0.304081 loss/max=0.325840 loss/min=0.282321 loss/std=0.021760 accuracy/mean=0.882812 accuracy/max=0.898438 accuracy/min=0.867188 accuracy/std=0.015625 dtf/mean=0.007106 dtf/max=0.007486 dtf/min=0.006727 dtf/std=0.000380 dtb/mean=0.002002 dtb/max=0.002018 dtb/min=0.001986 dtb/std=0.000016
[2025-12-26 14:59:43,052496][I][ezpz/test_dist:325:train_step] iter=180 loss=0.146518 accuracy=0.945312 dtf=0.007811 dtb=0.001911 loss/mean=0.152537 loss/max=0.158556 loss/min=0.146518 loss/std=0.006019 accuracy/mean=0.945312 accuracy/max=0.945312 accuracy/min=0.945312 accuracy/std=0.000000 dtf/mean=0.007945 dtf/max=0.008078 dtf/min=0.007811 dtf/std=0.000133 dtb/mean=0.001863 dtb/max=0.001911 dtb/min=0.001816 dtb/std=0.000048
[2025-12-26 14:59:43,202332][I][ezpz/test_dist:325:train_step] iter=190 loss=0.141739 accuracy=0.953125 dtf=0.009768 dtb=0.002052 loss/mean=0.185415 loss/max=0.229091 loss/min=0.141739 loss/std=0.043676 accuracy/mean=0.953125 accuracy/max=0.953125 accuracy/min=0.953125 accuracy/std=0.000000 dtf/mean=0.009895 dtf/max=0.010022 dtf/min=0.009768 dtf/std=0.000127 dtb/mean=0.002053 dtb/max=0.002054 dtb/min=0.002052 dtb/std=0.000001
[2025-12-26 14:59:43,943497][I][ezpz/history:2385:finalize] Saving plots to /Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/2025-12-26-145936/plots/mplot (matplotlib) and /Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/2025-12-26-145936/plots/tplot (tplot)
[terminal plots omitted: accuracy, accuracy/min, accuracy/std, accuracy/mean, accuracy/max vs. iter]
text saved in /Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/2025-12-26-145936/plots/tplot/accuracy.txt
[terminal plot omitted: combined plot of accuracy/max, accuracy/min, accuracy/mean, accuracy]
text saved in /Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/2025-12-26-145936/plots/tplot/accuracy_summary.txt
[terminal plots omitted: accuracy/mean hist, accuracy/max hist, accuracy/min hist, accuracy/std hist]
text saved in /Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/2025-12-26-145936/plots/tplot/accuracy_hist.txt
[terminal plots omitted: dtb, dtb/min, dtb/std, dtb/mean, dtb/max vs. iter]
text saved in /Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/2025-12-26-145936/plots/tplot/dtb.txt
[terminal plot omitted: combined plot of dtb/max, dtb/min, dtb/mean, dtb]
text saved in /Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/2025-12-26-145936/plots/tplot/dtb_summary.txt
[terminal plots omitted: dtb/mean hist, dtb/max hist, dtb/min hist, dtb/std hist]
text saved in /Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/2025-12-26-145936/plots/tplot/dtb_hist.txt
[terminal plots omitted: dtf, dtf/min, dtf/std, dtf/mean, dtf/max vs. iter]
text saved in /Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/2025-12-26-145936/plots/tplot/dtf.txt
[terminal plot omitted: combined plot of dtf/max, dtf/min, dtf/mean, dtf]
text saved in /Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/2025-12-26-145936/plots/tplot/dtf_summary.txt
[terminal plots omitted: dtf/mean hist, dtf/max hist, dtf/min hist, dtf/std hist]
text saved in /Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/2025-12-26-145936/plots/tplot/dtf_hist.txt
[terminal plots omitted: loss, loss/min, loss/std, loss/mean, loss/max vs. iter]
text saved in /Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/2025-12-26-145936/plots/tplot/loss.txt
[terminal plot omitted: combined plot of loss/max, loss/min, loss/mean, loss]
text saved in /Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/2025-12-26-145936/plots/tplot/loss_summary.txt
[terminal plots omitted: loss/mean hist, loss/max hist, loss/min hist, loss/std hist]
text saved in /Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/2025-12-26-145936/plots/tplot/loss_hist.txt
[2025-12-26 14:59:47,081689][I][ezpz/history:2433:finalize] Saving history report to /Users/samforeman/vibes/saforem2/ezpz/outputs/ezpz.test_dist/2025-12-26-145936/report.md
[2025-12-26 14:59:47,085092][I][ezpz/test_dist:348:finalize] dataset=<xarray.Dataset> Size: 39kB
Dimensions:        (draw: 194)
Coordinates:
  * draw           (draw) int64 2kB 0 1 2 3 4 5 6 ... 188 189 190 191 192 193
Data variables: (12/25)
    iter           (draw) int64 2kB 6 7 8 9 10 11 12 ... 194 195 196 197 198 199
    loss           (draw) float32 776B 1.751 1.595 1.422 ... 0.2113 0.1499
    accuracy       (draw) float32 776B 0.7031 0.6484 0.6875 ... 0.9141 0.9531
    dtf            (draw) float64 2kB 0.007276 0.006534 ... 0.007024 0.007607
    dtb            (draw) float64 2kB 0.004568 0.002039 ... 0.001932 0.008309
    iter_mean      (draw) float64 2kB 6.0 7.0 8.0 9.0 ... 197.0 198.0 199.0
    ...             ...
    dtf_min        (draw) float64 2kB 0.006776 0.006534 ... 0.006853 0.00668
    dtf_std        (draw) float64 2kB 0.00025 0.0001769 ... 8.545e-05 0.0004633
    dtb_mean       (draw) float64 2kB 0.003325 0.002154 ... 0.002945 0.00807
    dtb_max        (draw) float64 2kB 0.004568 0.002269 ... 0.003958 0.008309
    dtb_min        (draw) float64 2kB 0.002083 0.00204 ... 0.001932 0.007832
    dtb_std        (draw) float64 2kB 0.001242 0.0001147 ... 0.001013 0.0002383
[2025-12-26 14:59:47,608766][I][ezpz/test_dist:500:train] Took: 8.13 seconds to finish training
[2025-12-26 14:59:47,609602][I][ezpz/test_dist:695:main] Took: 11.78 seconds
wandb:
wandb: 🚀 View run soft-grass-6851 at:
wandb: Find logs at: wandb/run-20251226_145937-rj4d7rus/logs
[2025-12-23-162222] Execution time: 19s sec
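The output above reports `Model size: 567434 parameters` for a stack of `Linear` layers sized 784, 512, 256, 128, 10 (all with `bias=True`). As a quick sanity check, that count can be reproduced by hand. The sketch below is plain Python with no ezpz or torch dependency; the helper names `linear_params` and `mlp_params` are hypothetical and exist only for this illustration.

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer (bias=True)."""
    return in_features * out_features + out_features


def mlp_params(sizes: list[int]) -> int:
    """Total parameters of Linear layers chained through the given sizes."""
    return sum(linear_params(a, b) for a, b in zip(sizes, sizes[1:]))


# Layer sizes read off the model printout in the log:
# input_size=784, layer_sizes=[512, 256, 128], output_size=10.
sizes = [784, 512, 256, 128, 10]
total = mlp_params(sizes)
print(f"{total:,}")  # prints 567,434
```

The ReLU layers contribute no parameters, which is why only the four `Linear` layers appear in the sum.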
-
{Aurora, Sunspot} @ ALCF
Output:
#[12/26/25,12:56:24][x4310c1s0b0n0][/f/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007]
$ module load frameworks \
    && TMPDIR=$(pwd) uv run \
    --python=$(which python3) \
    --with "git+https://github.com/saforem2/ezpz" \
    ezpz test
[2025-12-26 12:56:59,991844][I][ezpz/launch:396:launch] ----[ezpz.launch][started][2025-12-26-125659]----
[2025-12-26 12:57:00,950846][I][ezpz/launch:416:launch] Job ID: 8234998
[2025-12-26 12:57:00,951634][I][ezpz/launch:417:launch] nodelist: ['x4310c1s0b0n0', 'x4310c1s1b0n0']
[2025-12-26 12:57:00,952019][I][ezpz/launch:418:launch] hostfile: /var/spool/pbs/aux/8234998.aurora-pbs-0001.hostmgmt.cm.aurora.alcf.anl.gov
[2025-12-26 12:57:01,231960][I][ezpz/pbs:264:get_pbs_launch_cmd] Using [24/24] GPUs [2 hosts] x [12 GPU/host]
[2025-12-26 12:57:01,233271][I][ezpz/launch:367:build_executable] Building command to execute by piecing together:
[2025-12-26 12:57:01,233694][I][ezpz/launch:368:build_executable] (1.) launch_cmd: mpiexec --envall --np=24 --ppn=12 --hostfile=/var/spool/pbs/aux/8234998.aurora-pbs-0001.hostmgmt.cm.aurora.alcf.anl.gov --no-vni --cpu-bind=verbose,list:2-4:10-12:18-20:26-28:34-36:42-44:54-56:62-64:70-72:78-80:86-88:94-96
[2025-12-26 12:57:01,234378][I][ezpz/launch:369:build_executable] (2.) cmd_to_launch: /home/foremans/datascience/foremans/.cache/builds-v0/.tmpCcpdMz/bin/python -m ezpz.test_dist
[2025-12-26 12:57:01,235161][I][ezpz/launch:433:launch] Took: 1.24 seconds to build command.
[2025-12-26 12:57:01,235513][I][ezpz/launch:436:launch] Executing:
mpiexec --envall --np=24 --ppn=12 --hostfile=/var/spool/pbs/aux/8234998.aurora-pbs-0001.hostmgmt.cm.aurora.alcf.anl.gov --no-vni --cpu-bind=verbose,list:2-4:10-12:18-20:26-28:34-36:42-44:54-56:62-64:70-72:78-80:86-88:94-96 /home/foremans/datascience/foremans/.cache/builds-v0/.tmpCcpdMz/bin/python -m ezpz.test_dist
[2025-12-26 12:57:01,236843][I][ezpz/launch:220:get_aurora_filters] Filtering for Aurora-specific messages. To view list of filters, run with EZPZ_LOG_LEVEL=DEBUG
[2025-12-26 12:57:01,237331][I][ezpz/launch:443:launch] Execution started @ 2025-12-26-125701...
[2025-12-26 12:57:01,237728][I][ezpz/launch:138:run_command] Caught 24 filters
[2025-12-26 12:57:01,238051][I][ezpz/launch:139:run_command] Running command:
mpiexec --envall --np=24 --ppn=12 --hostfile=/var/spool/pbs/aux/8234998.aurora-pbs-0001.hostmgmt.cm.aurora.alcf.anl.gov --no-vni --cpu-bind=verbose,list:2-4:10-12:18-20:26-28:34-36:42-44:54-56:62-64:70-72:78-80:86-88:94-96 /home/foremans/datascience/foremans/.cache/builds-v0/.tmpCcpdMz/bin/python -m ezpz.test_dist
cpubind:list x4310c1s1b0n0 pid 147083 rank 12 0: mask 0x1c
cpubind:list x4310c1s1b0n0 pid 147084 rank 13 1: mask 0x1c00
cpubind:list x4310c1s1b0n0 pid 147085 rank 14 2: mask 0x1c0000
cpubind:list x4310c1s1b0n0 pid 147086 rank 15 3: mask 0x1c000000
cpubind:list x4310c1s1b0n0 pid 147087 rank 16 4: mask 0x1c00000000
cpubind:list x4310c1s1b0n0 pid 147088 rank 17 5: mask 0x1c0000000000
cpubind:list x4310c1s1b0n0 pid 147089 rank 18 6: mask 0x1c0000000000000
cpubind:list x4310c1s1b0n0 pid 147090 rank 19 7: mask 0x1c000000000000000
cpubind:list x4310c1s1b0n0 pid 147091 rank 20 8: mask 0x1c00000000000000000
cpubind:list x4310c1s1b0n0 pid 147092 rank 21 9: mask 0x1c0000000000000000000
cpubind:list x4310c1s1b0n0 pid 147093 rank 22 10: mask 0x1c000000000000000000000
cpubind:list x4310c1s1b0n0 pid 147094 rank 23 11: mask 0x1c00000000000000000000000
cpubind:list x4310c1s0b0n0 pid 114692 rank 0 0: mask 0x1c
cpubind:list x4310c1s0b0n0 pid 114693 rank 1 1: mask 0x1c00
cpubind:list x4310c1s0b0n0 pid 114694 rank 2 2: mask 0x1c0000
cpubind:list x4310c1s0b0n0 pid 114695 rank 3 3: mask 0x1c000000
cpubind:list x4310c1s0b0n0 pid 114696 rank 4 4: mask 0x1c00000000
cpubind:list x4310c1s0b0n0 pid 114697 rank 5 5: mask 0x1c0000000000
cpubind:list x4310c1s0b0n0 pid 114698 rank 6 6: mask 0x1c0000000000000
cpubind:list x4310c1s0b0n0 pid 114699 rank 7 7: mask 0x1c000000000000000
cpubind:list x4310c1s0b0n0 pid 114700 rank 8 8: mask 0x1c00000000000000000
cpubind:list x4310c1s0b0n0 pid 114701 rank 9 9: mask 0x1c0000000000000000000
cpubind:list x4310c1s0b0n0 pid 114702 rank 10 10: mask 0x1c000000000000000000000
cpubind:list x4310c1s0b0n0 pid 114703 rank 11 11: mask 0x1c00000000000000000000000
[2025-12-26 12:57:09,319444][I][ezpz/test_dist:132:__post_init__] Outputs will be saved to /lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/2025-12-26-125709
[2025-12-26 12:57:09,322179][I][ezpz/dist:1506:setup_torch_distributed] Using fw='ddp' with torch_{device,backend}= {xpu, xccl}
[2025-12-26 12:57:09,323025][I][ezpz/dist:1371:setup_torch_DDP] Caught MASTER_PORT=57733 from environment!
[2025-12-26 12:57:09,323626][I][ezpz/dist:1387:setup_torch_DDP] Using torch.distributed.init_process_group with
- master_addr='x4310c1s0b0n0.hsn.cm.aurora.alcf.anl.gov'
- master_port='57733'
- world_size=24
- rank=0
- local_rank=0
- timeout=datetime.timedelta(seconds=3600)
- backend='xccl'
[2025-12-26 12:57:09,324720][I][ezpz/dist:1019:init_process_group] Calling torch.distributed.init_process_group_with: rank=0 world_size=24 backend=xccl
[2025-12-26 12:57:11,367607][I][ezpz/dist:1732:setup_torch] Using device='xpu' with backend='xccl' + 'xccl' for distributed training.
[2025-12-26 12:57:11,369584][W][ezpz/dist:544:print_dist_setup] Using [24 / 24] available "xpu" devices !!
[2025-12-26 12:57:11,370083][I][ezpz/dist:1779:setup_torch] ['x4310c1s0b0n0'][device='xpu'][node=0/1][rank=00/23][local_rank=00/11]
[2025-12-26 12:57:11,369426][I][ezpz/dist:1779:setup_torch] ['x4310c1s0b0n0'][device='xpu'][node=0/1][rank=02/23][local_rank=02/11]
[2025-12-26 12:57:11,369554][I][ezpz/dist:1779:setup_torch] ['x4310c1s0b0n0'][device='xpu'][node=1/1][rank=05/23][local_rank=05/11]
[2025-12-26 12:57:11,369558][I][ezpz/dist:1779:setup_torch] ['x4310c1s0b0n0'][device='xpu'][node=0/1][rank=06/23][local_rank=06/11]
[2025-12-26 12:57:11,369660][I][ezpz/dist:1779:setup_torch] ['x4310c1s0b0n0'][device='xpu'][node=1/1][rank=07/23][local_rank=07/11]
[2025-12-26 12:57:11,369585][I][ezpz/dist:1779:setup_torch] ['x4310c1s0b0n0'][device='xpu'][node=0/1][rank=08/23][local_rank=08/11]
[2025-12-26 12:57:11,369597][I][ezpz/dist:1779:setup_torch] ['x4310c1s0b0n0'][device='xpu'][node=1/1][rank=09/23][local_rank=09/11]
[2025-12-26 12:57:11,369637][I][ezpz/dist:1779:setup_torch] ['x4310c1s0b0n0'][device='xpu'][node=0/1][rank=10/23][local_rank=10/11]
[2025-12-26 12:57:11,369590][I][ezpz/dist:1779:setup_torch] ['x4310c1s0b0n0'][device='xpu'][node=1/1][rank=11/23][local_rank=11/11]
[2025-12-26 12:57:11,369710][I][ezpz/dist:1779:setup_torch] ['x4310c1s0b0n0'][device='xpu'][node=1/1][rank=01/23][local_rank=01/11]
[2025-12-26 12:57:11,369715][I][ezpz/dist:1779:setup_torch] ['x4310c1s0b0n0'][device='xpu'][node=1/1][rank=03/23][local_rank=03/11]
[2025-12-26 12:57:11,369129][I][ezpz/dist:1779:setup_torch] ['x4310c1s1b0n0'][device='xpu'][node=0/1][rank=14/23][local_rank=02/11]
[2025-12-26 12:57:11,369276][I][ezpz/dist:1779:setup_torch] ['x4310c1s1b0n0'][device='xpu'][node=0/1][rank=12/23][local_rank=00/11]
[2025-12-26 12:57:11,369686][I][ezpz/dist:1779:setup_torch] ['x4310c1s0b0n0'][device='xpu'][node=0/1][rank=04/23][local_rank=04/11]
[2025-12-26 12:57:11,369570][I][ezpz/dist:1779:setup_torch] ['x4310c1s1b0n0'][device='xpu'][node=1/1][rank=13/23][local_rank=01/11]
[2025-12-26 12:57:11,369439][I][ezpz/dist:1779:setup_torch] ['x4310c1s1b0n0'][device='xpu'][node=1/1][rank=15/23][local_rank=03/11]
[2025-12-26 12:57:11,372392][I][ezpz/test_dist:678:main] Took: 2.07 seconds to setup torch
[2025-12-26 12:57:11,369272][I][ezpz/dist:1779:setup_torch] ['x4310c1s1b0n0'][device='xpu'][node=0/1][rank=16/23][local_rank=04/11]
[2025-12-26 12:57:11,369296][I][ezpz/dist:1779:setup_torch] ['x4310c1s1b0n0'][device='xpu'][node=1/1][rank=17/23][local_rank=05/11]
[2025-12-26 12:57:11,369515][I][ezpz/dist:1779:setup_torch] ['x4310c1s1b0n0'][device='xpu'][node=0/1][rank=18/23][local_rank=06/11]
[2025-12-26 12:57:11,369551][I][ezpz/dist:1779:setup_torch] ['x4310c1s1b0n0'][device='xpu'][node=1/1][rank=19/23][local_rank=07/11]
[2025-12-26 12:57:11,369556][I][ezpz/dist:1779:setup_torch] ['x4310c1s1b0n0'][device='xpu'][node=0/1][rank=20/23][local_rank=08/11]
[2025-12-26 12:57:11,369524][I][ezpz/dist:1779:setup_torch] ['x4310c1s1b0n0'][device='xpu'][node=1/1][rank=21/23][local_rank=09/11]
[2025-12-26 12:57:11,369569][I][ezpz/dist:1779:setup_torch] ['x4310c1s1b0n0'][device='xpu'][node=0/1][rank=22/23][local_rank=10/11]
[2025-12-26 12:57:11,369353][I][ezpz/dist:1779:setup_torch] ['x4310c1s1b0n0'][device='xpu'][node=1/1][rank=23/23][local_rank=11/11]
[2025-12-26 12:57:11,386631][I][ezpz/test_dist:461:train] Model size: 567434 parameters
[2025-12-26 12:57:11,388753][I][ezpz/test_dist:465:train]
=================================================================
Layer (type:depth-idx)                   Param #
=================================================================
SequentialLinearNet                      --
├─Sequential: 1-1                        567,434
=================================================================
Total params: 567,434
Trainable params: 567,434
Non-trainable params: 0
=================================================================
[2025-12-26 12:57:11,390055][I][ezpz/test_dist:473:train] Took: 0.007092675659805536 seconds to build model
[2025-12-26 12:57:11,392504][I][ezpz/test_dist:601:build_model_and_optimizer] model=
SequentialLinearNet(
  (layers): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=128, bias=True)
    (5): ReLU()
    (6): Linear(in_features=128, out_features=10, bias=True)
  )
)
[2025-12-26 12:57:11,394462][I][ezpz/dist:685:wrap_model] Wrapping model with: ddp
2025:12:26-12:57:11:(114692) |CCL_WARN| value of CCL_OP_SYNC changed to be 1 (default:0)
2025:12:26-12:57:11:(114692) |CCL_WARN| value of CCL_PROCESS_LAUNCHER changed to be pmix (default:hydra)
[2025-12-26 12:57:24,214420][I][ezpz/test_dist:479:train] Took: 12.82 seconds to build optimizer
[2025-12-26 12:57:24,257102][I][ezpz/history:220:__init__] Using History with distributed_history=True
[2025-12-26 12:57:24,262059][I][ezpz/dist:2044:setup_wandb] Setting up wandb from rank=0
[2025-12-26 12:57:24,262600][I][ezpz/dist:2045:setup_wandb] Using WB_PROJECT=ezpz.test_dist
wandb: Currently logged in as: foremans (aurora_gpt) to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.21.3
wandb: Run data is saved locally in /lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/wandb/run-20251226_125724-adhgoy9j
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run winter-salad-6843
wandb: View project at https://wandb.ai/aurora_gpt/ezpz.test_dist
wandb: View run at https://wandb.ai/aurora_gpt/ezpz.test_dist/runs/adhgoy9j
[2025-12-26 12:57:30,839972][I][ezpz/dist:2074:setup_wandb] wandb.run=[winter-salad-6843](https://wandb.ai/aurora_gpt/ezpz.test_dist/runs/adhgoy9j)
[2025-12-26 12:57:30,846065][I][ezpz/dist:2117:setup_wandb] Running on machine='Aurora'
[2025-12-26 12:57:32,361320][I][ezpz/test_dist:482:train] Took: 8.15 seconds to build trainer
[2025-12-26 12:57:32,362820][I][ezpz/test_dist:486:train] config:
{
  "acc_events": false,
  "backend": "DDP",
  "batch_size": 128,
  "cp": 1,
  "dataset": "mnist",
  "dataset_root": "/lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/datasets/mnist",
  "dtype": "bf16",
  "input_size": 784,
  "layer_sizes": [
    512,
    256,
    128
  ],
  "log_freq": 1,
  "no_distributed_history": false,
  "num_workers": 0,
  "output_size": 10,
  "pp": 1,
  "print_freq": 10,
  "profile_memory": true,
  "pyinstrument_profiler": false,
  "pytorch_profiler": false,
  "pytorch_profiler_active": 3,
  "pytorch_profiler_repeat": 5,
  "pytorch_profiler_wait": 1,
  "pytorch_profiler_warmup": 2,
  "rank_zero_only": false,
  "record_shapes": true,
  "tp": 1,
  "train_iters": 200,
  "warmup": 5,
  "with_flops": true,
  "with_modules": true,
  "with_stack": true
}
[2025-12-26 12:57:32,364988][I][ezpz/test_dist:488:train] Took: 28.84 to get here.
[2025-12-26 12:57:46,725491][I][ezpz/test_dist:369:train] Warmup complete at step 5
[2025-12-26 12:57:46,963482][I][ezpz/test_dist:325:train_step] iter=10 loss=0.994967 accuracy=0.750000 dtf=0.011305 dtb=0.001640 loss/mean=1.035069 loss/max=1.218811 loss/min=0.923871 loss/std=0.067301 accuracy/mean=0.714844 accuracy/max=0.804688 accuracy/min=0.609375 accuracy/std=0.046054 dtf/mean=0.010381 dtf/max=0.011685 dtf/min=0.009660 dtf/std=0.000662 dtb/mean=0.001692 dtb/max=0.002077 dtb/min=0.001408 dtb/std=0.000237
[2025-12-26 12:57:47,784965][I][ezpz/test_dist:325:train_step] iter=20 loss=0.843957 accuracy=0.779412 dtf=0.007382 dtb=0.232720 loss/mean=0.587017 loss/max=0.843957 loss/min=0.312610 loss/std=0.137216 accuracy/mean=0.806373 accuracy/max=0.911765 accuracy/min=0.705882 accuracy/std=0.054310 dtf/mean=0.006949 dtf/max=0.007548 dtf/min=0.006570 dtf/std=0.000303 dtb/mean=0.211198 dtb/max=0.238684 dtb/min=0.176031 dtb/std=0.020564
[2025-12-26 12:57:48,288727][I][ezpz/test_dist:325:train_step] iter=30 loss=0.465919 accuracy=0.867188 dtf=0.009977 dtb=0.001979 loss/mean=0.438402 loss/max=0.722735 loss/min=0.278721 loss/std=0.110631 accuracy/mean=0.866536 accuracy/max=0.921875 accuracy/min=0.750000 accuracy/std=0.035937 dtf/mean=0.010105 dtf/max=0.010829 dtf/min=0.009644 dtf/std=0.000391 dtb/mean=0.001774 dtb/max=0.002093 dtb/min=0.001422 dtb/std=0.000223
[2025-12-26 12:57:49,034654][I][ezpz/test_dist:325:train_step] iter=40 loss=0.458118 accuracy=0.882353 dtf=0.007307 dtb=0.002033 loss/mean=0.297673 loss/max=0.516792 loss/min=0.184366 loss/std=0.080388 accuracy/mean=0.912990 accuracy/max=0.955882 accuracy/min=0.838235 accuracy/std=0.031458 dtf/mean=0.006865 dtf/max=0.007475 dtf/min=0.006140 dtf/std=0.000433 dtb/mean=0.001488 dtb/max=0.002033 dtb/min=0.001172 dtb/std=0.000251
[2025-12-26 12:57:49,656664][I][ezpz/test_dist:325:train_step] iter=50 loss=0.364185 accuracy=0.882812 dtf=0.010035 dtb=0.002136 loss/mean=0.296386 loss/max=0.433208 loss/min=0.205008 loss/std=0.066657 accuracy/mean=0.912109 accuracy/max=0.953125 accuracy/min=0.851562 accuracy/std=0.027274 dtf/mean=0.009980 dtf/max=0.010566 dtf/min=0.009565 dtf/std=0.000270 dtb/mean=0.001785 dtb/max=0.002197 dtb/min=0.001444 dtb/std=0.000243
[2025-12-26 12:57:50,516216][I][ezpz/test_dist:325:train_step] iter=60 loss=0.303229 accuracy=0.926471 dtf=0.006841 dtb=0.001837 loss/mean=0.181245 loss/max=0.303229 loss/min=0.074041 loss/std=0.051771 accuracy/mean=0.952206 accuracy/max=1.000000 accuracy/min=0.911765 accuracy/std=0.024108 dtf/mean=0.006655 dtf/max=0.006969 dtf/min=0.006220 dtf/std=0.000242 dtb/mean=0.001543 dtb/max=0.001904 dtb/min=0.001178 dtb/std=0.000215
[2025-12-26 12:57:51,748835][I][ezpz/test_dist:325:train_step] iter=70 loss=0.287316 accuracy=0.906250 dtf=0.010923 dtb=0.002028 loss/mean=0.213261 loss/max=0.345638 loss/min=0.130070 loss/std=0.065777 accuracy/mean=0.937174 accuracy/max=0.968750 accuracy/min=0.867188 accuracy/std=0.025958 dtf/mean=0.010181 dtf/max=0.011084 dtf/min=0.009712 dtf/std=0.000379 dtb/mean=0.001803 dtb/max=0.002258 dtb/min=0.001430 dtb/std=0.000229
[2025-12-26 12:57:54,740809][I][ezpz/test_dist:325:train_step] iter=80 loss=0.206866 accuracy=0.926471 dtf=0.006063 dtb=0.001766 loss/mean=0.113710 loss/max=0.206866 loss/min=0.068099 loss/std=0.038122 accuracy/mean=0.974265 accuracy/max=1.000000 accuracy/min=0.926471 accuracy/std=0.019102 dtf/mean=0.005980 dtf/max=0.006408 dtf/min=0.005786 dtf/std=0.000135 dtb/mean=0.001514 dtb/max=0.001766 dtb/min=0.001132 dtb/std=0.000189
[2025-12-26 12:57:55,375104][I][ezpz/test_dist:325:train_step] iter=90 loss=0.220868 accuracy=0.914062 dtf=0.010806 dtb=0.001936 loss/mean=0.166121 loss/max=0.261424 loss/min=0.083375 loss/std=0.047467 accuracy/mean=0.951172 accuracy/max=0.984375 accuracy/min=0.914062 accuracy/std=0.017065 dtf/mean=0.010863 dtf/max=0.011598 dtf/min=0.010269 dtf/std=0.000426 dtb/mean=0.001793 dtb/max=0.002010 dtb/min=0.001455 dtb/std=0.000182
[2025-12-26 12:57:55,916235][I][ezpz/test_dist:325:train_step] iter=100 loss=0.101629 accuracy=0.970588 dtf=0.007392 dtb=0.001704 loss/mean=0.077895 loss/max=0.216991 loss/min=0.044901 loss/std=0.037287 accuracy/mean=0.988358 accuracy/max=1.000000 accuracy/min=0.955882 accuracy/std=0.013408 dtf/mean=0.006932 dtf/max=0.007560 dtf/min=0.006267 dtf/std=0.000455 dtb/mean=0.001566 dtb/max=0.002013 dtb/min=0.001161 dtb/std=0.000249
[2025-12-26 12:57:56,422680][I][ezpz/test_dist:325:train_step] iter=110 loss=0.174663 accuracy=0.953125 dtf=0.011343 dtb=0.001621 loss/mean=0.119567 loss/max=0.200464 loss/min=0.068889 loss/std=0.039575 accuracy/mean=0.970052 accuracy/max=0.992188 accuracy/min=0.937500 accuracy/std=0.014901 dtf/mean=0.010806 dtf/max=0.012639 dtf/min=0.010221 dtf/std=0.000509 dtb/mean=0.001810 dtb/max=0.002037 dtb/min=0.001430 dtb/std=0.000182
[2025-12-26 12:57:56,786762][I][ezpz/test_dist:325:train_step] iter=120 loss=0.074708 accuracy=0.985294 dtf=0.006787 dtb=0.001536 loss/mean=0.049546 loss/max=0.090880 loss/min=0.026799 loss/std=0.018310 accuracy/mean=0.991422 accuracy/max=1.000000 accuracy/min=0.985294 accuracy/std=0.007246 dtf/mean=0.006472 dtf/max=0.006828 dtf/min=0.005932 dtf/std=0.000261 dtb/mean=0.001562 dtb/max=0.001867 dtb/min=0.001090 dtb/std=0.000205
[2025-12-26 12:57:57,246460][I][ezpz/test_dist:325:train_step] iter=130 loss=0.137289 accuracy=0.953125 dtf=0.010142 dtb=0.001862 loss/mean=0.095899 loss/max=0.145525 loss/min=0.054574 loss/std=0.030761 accuracy/mean=0.974935 accuracy/max=1.000000 accuracy/min=0.945312 accuracy/std=0.016102 dtf/mean=0.010148 dtf/max=0.012131 dtf/min=0.009641 dtf/std=0.000639 dtb/mean=0.001848 dtb/max=0.002093 dtb/min=0.001321 dtb/std=0.000210
[2025-12-26 12:57:57,832532][I][ezpz/test_dist:325:train_step] iter=140 loss=0.038551 accuracy=0.985294 dtf=0.006596 dtb=0.001460 loss/mean=0.037799 loss/max=0.061152 loss/min=0.015614 loss/std=0.011380 accuracy/mean=0.995098 accuracy/max=1.000000 accuracy/min=0.985294 accuracy/std=0.006944 dtf/mean=0.006719 dtf/max=0.007528 dtf/min=0.006087 dtf/std=0.000449 dtb/mean=0.001491 dtb/max=0.001719 dtb/min=0.001157 dtb/std=0.000206
[2025-12-26 12:57:58,329794][I][ezpz/test_dist:325:train_step] iter=150 loss=0.084032 accuracy=0.968750 dtf=0.010424 dtb=0.001986 loss/mean=0.076138 loss/max=0.141387 loss/min=0.033583 loss/std=0.027965 accuracy/mean=0.979818 accuracy/max=1.000000 accuracy/min=0.945312 accuracy/std=0.013514 dtf/mean=0.010651 dtf/max=0.011385 dtf/min=0.009915 dtf/std=0.000520 dtb/mean=0.001795 dtb/max=0.002165 dtb/min=0.001298 dtb/std=0.000235
[2025-12-26 12:57:58,871216][I][ezpz/test_dist:325:train_step] iter=160 loss=0.030340 accuracy=1.000000 dtf=0.006370 dtb=0.001434 loss/mean=0.036724 loss/max=0.116999 loss/min=0.011584 loss/std=0.026702 accuracy/mean=0.992647 accuracy/max=1.000000 accuracy/min=0.941176 accuracy/std=0.014082 dtf/mean=0.006482 dtf/max=0.006820 dtf/min=0.005905 dtf/std=0.000327 dtb/mean=0.001546 dtb/max=0.001796 dtb/min=0.001153 dtb/std=0.000192
[2025-12-26 12:57:59,277568][I][ezpz/test_dist:325:train_step] iter=170 loss=0.060540 accuracy=0.984375 dtf=0.010029 dtb=0.001871 loss/mean=0.067327 loss/max=0.170805 loss/min=0.035560 loss/std=0.030100 accuracy/mean=0.982096 accuracy/max=1.000000 accuracy/min=0.937500 accuracy/std=0.013047 dtf/mean=0.010218 dtf/max=0.012835 dtf/min=0.009561 dtf/std=0.000796 dtb/mean=0.001831 dtb/max=0.002365 dtb/min=0.001390 dtb/std=0.000244
[2025-12-26 12:57:59,752142][I][ezpz/test_dist:325:train_step] iter=180 loss=0.039758 accuracy=0.985294 dtf=0.006253 dtb=0.001701 loss/mean=0.034456 loss/max=0.081928 loss/min=0.009000 loss/std=0.020232 accuracy/mean=0.990809 accuracy/max=1.000000 accuracy/min=0.955882 accuracy/std=0.012603 dtf/mean=0.006565 dtf/max=0.007686 dtf/min=0.005779 dtf/std=0.000649 dtb/mean=0.001519 dtb/max=0.002028 dtb/min=0.001091 dtb/std=0.000251
[2025-12-26 12:58:00,304971][I][ezpz/test_dist:325:train_step] iter=190 loss=0.086260 accuracy=0.953125 dtf=0.011277 dtb=0.001865 loss/mean=0.054108 loss/max=0.114451 loss/min=0.015817 loss/std=0.026246 accuracy/mean=0.985026 accuracy/max=1.000000 accuracy/min=0.953125 accuracy/std=0.013514 dtf/mean=0.010987 dtf/max=0.011464 dtf/min=0.010086 dtf/std=0.000501 dtb/mean=0.001754 dtb/max=0.002030 dtb/min=0.001315 dtb/std=0.000212
[2025-12-26 12:58:02,269674][I][ezpz/history:2385:finalize] Saving plots to /lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/2025-12-26-125709/plots/mplot (matplotlib) and /lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/2025-12-26-125709/plots/tplot (tplot)
[tplot: accuracy, accuracy/min, accuracy/std, accuracy/mean, accuracy/max vs. iter]
text saved in /lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/2025-12-26-125709/plots/tplot/accuracy.txt
[tplot: accuracy summary (accuracy/max, accuracy/min, accuracy/mean, accuracy) vs. iter]
text saved in /lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/2025-12-26-125709/plots/tplot/accuracy_summary.txt
[tplot: histograms of accuracy/mean, accuracy/max, accuracy/min, accuracy/std]
text saved in /lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/2025-12-26-125709/plots/tplot/accuracy_hist.txt
[tplot: dtb, dtb/min, dtb/std, dtb/mean, dtb/max vs. iter]
text saved in /lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/2025-12-26-125709/plots/tplot/dtb.txt
[tplot: dtb summary (dtb/max, dtb/min, dtb/mean, dtb) vs. iter]
text saved in /lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/2025-12-26-125709/plots/tplot/dtb_summary.txt
[tplot: histograms of dtb/mean, dtb/max, dtb/min, dtb/std]
text saved in /lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/2025-12-26-125709/plots/tplot/dtb_hist.txt
[tplot: dtf, dtf/min, dtf/std, dtf/mean, dtf/max vs. iter]
text saved in /lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/2025-12-26-125709/plots/tplot/dtf.txt
[tplot: dtf summary (dtf/max, dtf/min, dtf/mean, dtf) vs. iter]
text saved in /lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/2025-12-26-125709/plots/tplot/dtf_summary.txt
[tplot: histograms of dtf/mean, dtf/max, dtf/min, dtf/std]
text saved in /lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/2025-12-26-125709/plots/tplot/dtf_hist.txt
[tplot: loss, loss/min, loss/std, loss/mean, loss/max vs. iter]
text saved in /lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/2025-12-26-125709/plots/tplot/loss.txt
[tplot: loss summary (loss/max, loss/min, loss/mean, loss) vs. iter]
β -ββββββ + β 0.59β€ -ββββββ+++ + β β --βββββ β++++++ β β ---β βββββ ++++++ + β β ---- βΒ·ββββΒ·βββββ+++β + ++ β 0.30β€ - --β- Β· βββββββ+ββββ+β+++β + β+ β β --- -----β -βββββββββββββ+β+β+++β+++β++ + + + + + β β - -----Β·---ββΒ· Β·ββββββββββββββββ++β++βββ++β+β+++++++++++++++ β β - -----Β·--β-β--ββ-βββββββββββββββΒ·βββββββΒ·βΒ·ββββΒ·βΒ·βΒ·β 0.01β€ ---- ----β-ββ-β-β-ββββββββββΒ·ββββ ββ¬ββββββββββββββββββ¬βββββββββββββββββββ¬ββββββββββββββββββ¬ββββββββββββββββββ¬β 1.0 49.2 97.5 145.8 194.0 text saved in /lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/2025-12-26-125709/plots/tplot/loss_summary.txt loss/mean hist loss/max hist βββββββββββββββββββββββββββββββββββ βββββββββββββββββββββββββββββββββββββ 127.0β€ββββ β102β€ββββ β 105.8β€ββββ β 85β€ββββ β βββββ β βββββ β 84.7β€ββββ β 68β€ββββ β 63.5β€ββββ β 51β€ββββ β βββββ β ββββββββ β 42.3β€ββββ β 34β€βββββββ β 21.2β€βββββββ β 17β€βββββββββββ β ββββββββββββββ β βββββββββββββββββββ β 0.0β€ββββββββββββββββββββββββββ βββββ 0β€ββββββββββββββββββββββββββββββββββββ ββ¬ββββββββ¬ββββββββ¬ββββββββ¬ββββββββ¬β ββ¬βββββββββ¬ββββββββ¬βββββββββ¬ββββββββ¬β -0.04 0.41 0.87 1.32 1.78 -0.01 0.45 0.91 1.38 1.84 loss/min hist loss/std hist βββββββββββββββββββββββββββββββββββ ββββββββββββββββββββββββββββββββββββββ 145.0β€ββββ β72β€ βββ β βββββ β β βββ β 120.8β€ββββ β60β€ βββ β 96.7β€ββββ β48β€ βββ β βββββ β β βββ β 72.5β€ββββ β36β€ βββββββ β βββββ β β βββββββ β 48.3β€ββββ β24β€ ββββββββββββββ β 24.2β€βββββββ β12β€βββββββββββββββββββββββββ β βββββββββββ β ββββββββββββββββββββββββββββββ β 0.0β€ββββββββββββββββββββββββββββββββββ 0β€βββββββββββββββββββββββββββββββββββββ ββ¬ββββββββ¬ββββββββ¬ββββββββ¬ββββββββ¬β ββ¬βββββββββ¬βββββββββ¬ββββββββ¬βββββββββ¬β -0.06 0.38 0.81 1.25 1.69 0.006 0.040 0.074 0.109 0.143 text saved in 
/lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/2025-12-26-125709/plots/tplot/loss_hist.txt [2025-12-26 12:58:07,565854][I][ezpz/history:2433:finalize] Saving history report to /lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/outputs/ezpz.test_dist/2025-12-26-125709/report.md [2025-12-26 12:58:07,571828][I][ezpz/test_dist:348:finalize] dataset=<xarray.Dataset> Size: 39kB Dimensions: (draw: 194) Coordinates: * draw (draw) int64 2kB 0 1 2 3 4 5 6 ... 188 189 190 191 192 193 Data variables: (12/25) iter (draw) int64 2kB 6 7 8 9 10 11 12 ... 194 195 196 197 198 199 loss (draw) float32 776B 1.746 1.533 1.29 ... 0.03311 0.02764 accuracy (draw) float32 776B 0.625 0.6016 0.6328 ... 0.9922 0.9922 dtf (draw) float64 2kB 0.0127 0.01003 0.01162 ... 0.01053 0.01025 dtb (draw) float64 2kB 0.001811 0.001627 ... 0.001683 0.00256 iter_mean (draw) float64 2kB 6.0 7.0 8.0 9.0 ... 197.0 198.0 199.0 ... ... dtf_min (draw) float64 2kB 0.01021 0.009599 ... 0.01024 0.009542 dtf_std (draw) float64 2kB 0.0007831 0.0006131 ... 0.0004008 dtb_mean (draw) float64 2kB 0.001742 0.001728 ... 0.001774 0.001822 dtb_max (draw) float64 2kB 0.002061 0.002182 ... 0.002031 0.00256 dtb_min (draw) float64 2kB 0.001459 0.00144 ... 0.001345 0.001372 dtb_std (draw) float64 2kB 0.0002062 0.0002116 ... 
0.0002654 [2025-12-26 12:58:08,256424][I][ezpz/test_dist:500:train] Took: 35.89 seconds to finish training [2025-12-26 12:58:08,257557][I][ezpz/test_dist:695:main] Took: 64.73 seconds wandb: wandb: π View run winter-salad-6843 at: https://wandb.ai/aurora_gpt/ezpz.test_dist/runs/adhgoy9j wandb: Find logs at: ../../../../../../../../../lus/flare/projects/AuroraGPT/AuroraGPT-v1/Experiments/AuroraGPT-2B/tt/saforem2/tmp/2025-12-26-124007/wandb/run-20251226_125724-adhgoy9j/logs [2025-12-26 12:58:10,167355][I][ezpz/launch:447:launch] ----[π ezpz.launch][stop][2025-12-26-125810]---- [2025-12-26 12:58:10,168735][I][ezpz/launch:448:launch] Execution finished with 0. [2025-12-26 12:58:10,169220][I][ezpz/launch:449:launch] Executing finished in 68.93 seconds. [2025-12-26 12:58:10,169583][I][ezpz/launch:450:launch] Took 68.93 seconds to run. Exiting. took: 1m 16s
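The `train_step` lines in these outputs print metrics as space-separated `name=value` pairs (for example `loss=...`, `dtf/mean=...`). If you want to compare runs across machines, a minimal sketch like the following can pull those pairs out of a captured log line. The helper name `parse_metrics` and the regular expression are illustrative assumptions, not part of `ezpz`; the sample line is shortened from the Polaris output.

```python
import re

# Matches `name=value` pairs such as `loss/mean=1.138943` or `iter=10`.
# Hypothetical pattern for illustration; it is not taken from ezpz itself.
PAIR = re.compile(r"([A-Za-z_][\w/]*)=(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def parse_metrics(line: str) -> dict[str, float]:
    """Return every numeric name=value pair found in one log line."""
    return {key: float(value) for key, value in PAIR.findall(line)}


# A shortened copy of one train_step line from the log output.
line = (
    "[2025-12-26 13:21:56,803374][I][ezpz/test_dist:325:train_step] "
    "iter=10 loss=1.009584 accuracy=0.765625 dtf=0.016586 dtb=0.000765 "
    "loss/mean=1.138943 loss/max=1.389118"
)
metrics = parse_metrics(line)
print(metrics["iter"], metrics["loss"], metrics["loss/mean"])
```

All values are returned as floats, including `iter`, which keeps the helper simple; cast to `int` where needed.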
-
{Polaris} @ ALCF
Output:
(2025-09-25/base)
#[/eagle/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131][⏱️ 1m56s]
#[12/26/25 @ 13:20:57][x3102c0s13b0n0]
; TMPDIR=$(pwd) uv run --python=$(which python3) --with "git+https://github.com/saforem2/ezpz@distributed-metrics" ezpz test
Updated https://github.com/saforem2/ezpz (e21d0a9cdc19557ad4f4be88fc2315af0fbfa2db)
Updated https://github.com/saforem2/ambivalent (b8de07d9daad215d3db0d18b4aa99cb73107ef77)
Built ezpz @ git+https://github.com/saforem2/ezpz@e21d0a9cdc19557ad4f4be88fc2315af0fbfa2db
Built ambivalent @ git+https://github.com/saforem2/ambivalent@b8de07d9daad215d3db0d18b4aa99cb73107ef77
Built antlr4-python3-runtime==4.9.3
Installed 87 packages in 1.40s
warning: `propcache==0.4.0` is yanked (reason: "ref leak https://github.com/aio-libs/propcache/issues/159")
[2025-12-26 13:21:31,922789][I][ezpz/launch:396:launch] ----[🍋 ezpz.launch][started][2025-12-26-132131]----
[2025-12-26 13:21:32,593377][I][ezpz/launch:416:launch] Job ID: 6826897
[2025-12-26 13:21:32,594224][I][ezpz/launch:417:launch] nodelist: ['x3102c0s13b0n0', 'x3102c0s13b1n0']
[2025-12-26 13:21:32,594624][I][ezpz/launch:418:launch] hostfile: /var/spool/pbs/aux/6826897.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov
[2025-12-26 13:21:32,595323][I][ezpz/pbs:264:get_pbs_launch_cmd] ✅ Using [8/8] GPUs [2 hosts] x [4 GPU/host]
[2025-12-26 13:21:32,596845][I][ezpz/launch:367:build_executable] Building command to execute by piecing together:
[2025-12-26 13:21:32,597234][I][ezpz/launch:368:build_executable] (1.) launch_cmd: mpiexec --envall --np=8 --ppn=4 --hostfile=/var/spool/pbs/aux/6826897.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov --cpu-bind=depth --depth=8
[2025-12-26 13:21:32,597798][I][ezpz/launch:369:build_executable] (2.) cmd_to_launch: /home/foremans/.cache/uv/builds-v0/.tmpwG7Oyq/bin/python -m ezpz.test_dist
[2025-12-26 13:21:32,598339][I][ezpz/launch:433:launch] Took: 0.68 seconds to build command.
[2025-12-26 13:21:32,598684][I][ezpz/launch:436:launch] Executing: mpiexec --envall --np=8 --ppn=4 --hostfile=/var/spool/pbs/aux/6826897.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov --cpu-bind=depth --depth=8 /home/foremans/.cache/uv/builds-v0/.tmpwG7Oyq/bin/python -m ezpz.test_dist
[2025-12-26 13:21:32,600442][I][ezpz/launch:443:launch] Execution started @ 2025-12-26-132132...
[2025-12-26 13:21:32,600884][I][ezpz/launch:139:run_command] Running command: mpiexec --envall --np=8 --ppn=4 --hostfile=/var/spool/pbs/aux/6826897.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov --cpu-bind=depth --depth=8 /home/foremans/.cache/uv/builds-v0/.tmpwG7Oyq/bin/python -m ezpz.test_dist
[2025-12-26 13:21:41,009597][I][ezpz/test_dist:132:__post_init__] Outputs will be saved to /lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/2025-12-26-132141
[2025-12-26 13:21:41,011757][I][ezpz/dist:1506:setup_torch_distributed] Using fw='ddp' with torch_{device,backend}= {cuda, nccl}
[2025-12-26 13:21:41,013713][I][ezpz/dist:1371:setup_torch_DDP] Caught MASTER_PORT=49717 from environment!
[2025-12-26 13:21:41,014243][I][ezpz/dist:1387:setup_torch_DDP] Using torch.distributed.init_process_group with
- master_addr='x3102c0s13b0n0.hsn.cm.polaris.alcf.anl.gov'
- master_port='49717'
- world_size=8
- rank=0
- local_rank=0
- timeout=datetime.timedelta(seconds=3600)
- backend='nccl'
[2025-12-26 13:21:41,015130][I][ezpz/dist:1019:init_process_group] Calling torch.distributed.init_process_group_with: rank=0 world_size=8 backend=nccl
[2025-12-26 13:21:48,305115][I][ezpz/dist:1732:setup_torch] Using device='cuda' with backend='nccl' + 'nccl' for distributed training.
[2025-12-26 13:21:48,306060][W][ezpz/dist:544:print_dist_setup] Using [8 / 8] available "cuda" devices !!
[2025-12-26 13:21:48,306511][I][ezpz/dist:1779:setup_torch] ['x3102c0s13b0n0'][device='cuda'][node=0/1][rank=0/7][local_rank=0/3]
[2025-12-26 13:21:48,305536][I][ezpz/dist:1779:setup_torch] ['x3102c0s13b0n0'][device='cuda'][node=1/1][rank=3/7][local_rank=3/3]
[2025-12-26 13:21:48,305529][I][ezpz/dist:1779:setup_torch] ['x3102c0s13b0n0'][device='cuda'][node=1/1][rank=1/7][local_rank=1/3]
[2025-12-26 13:21:48,305423][I][ezpz/dist:1779:setup_torch] ['x3102c0s13b1n0'][device='cuda'][node=0/1][rank=4/7][local_rank=0/3]
[2025-12-26 13:21:48,305537][I][ezpz/dist:1779:setup_torch] ['x3102c0s13b0n0'][device='cuda'][node=0/1][rank=2/7][local_rank=2/3]
[2025-12-26 13:21:48,305414][I][ezpz/dist:1779:setup_torch] ['x3102c0s13b1n0'][device='cuda'][node=1/1][rank=5/7][local_rank=1/3]
[2025-12-26 13:21:48,305415][I][ezpz/dist:1779:setup_torch] ['x3102c0s13b1n0'][device='cuda'][node=0/1][rank=6/7][local_rank=2/3]
[2025-12-26 13:21:48,305412][I][ezpz/dist:1779:setup_torch] ['x3102c0s13b1n0'][device='cuda'][node=1/1][rank=7/7][local_rank=3/3]
[2025-12-26 13:21:48,308064][I][ezpz/test_dist:678:main] Took: 7.31 seconds to setup torch
[2025-12-26 13:21:48,321964][I][ezpz/test_dist:461:train] Model size: 567434 parameters
[2025-12-26 13:21:48,323195][I][ezpz/test_dist:465:train]
=================================================================
Layer (type:depth-idx)                   Param #
=================================================================
SequentialLinearNet                      --
├─Sequential: 1-1                        567,434
=================================================================
Total params: 567,434
Trainable params: 567,434
Non-trainable params: 0
=================================================================
[2025-12-26 13:21:48,324424][I][ezpz/test_dist:473:train] Took: 0.005884354992303997 seconds to build model
[2025-12-26 13:21:48,326217][I][ezpz/test_dist:601:build_model_and_optimizer] model=
SequentialLinearNet(
  (layers): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=128, bias=True)
    (5): ReLU()
    (6): Linear(in_features=128, out_features=10, bias=True)
  )
)
[2025-12-26 13:21:48,327959][I][ezpz/dist:685:wrap_model] Wrapping model with: ddp
[2025-12-26 13:21:48,691473][I][ezpz/test_dist:479:train] Took: 0.37 seconds to build optimizer
[2025-12-26 13:21:48,734475][I][ezpz/history:220:__init__] Using History with distributed_history=True
[2025-12-26 13:21:48,738296][I][ezpz/dist:2044:setup_wandb] Setting up wandb from rank=0
[2025-12-26 13:21:48,738722][I][ezpz/dist:2045:setup_wandb] Using WB_PROJECT=ezpz.test_dist
wandb: Currently logged in as: foremans (aurora_gpt) to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: setting up run 01zkj7vc
wandb: Tracking run with wandb version 0.22.1
wandb: Run data is saved locally in /lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/wandb/run-20251226_132148-01zkj7vc
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run smart-breeze-6848
wandb: View project at https://wandb.ai/aurora_gpt/ezpz.test_dist
wandb: View run at https://wandb.ai/aurora_gpt/ezpz.test_dist/runs/01zkj7vc
[2025-12-26 13:21:55,570075][I][ezpz/dist:2074:setup_wandb] wandb.run=[smart-breeze-6848](https://wandb.ai/aurora_gpt/ezpz.test_dist/runs/01zkj7vc)
[2025-12-26 13:21:55,577966][I][ezpz/dist:2117:setup_wandb] Running on machine='Polaris'
[2025-12-26 13:21:56,263208][I][ezpz/test_dist:482:train] Took: 7.57 seconds to build trainer
[2025-12-26 13:21:56,264200][I][ezpz/test_dist:486:train] config:
{
  "acc_events": false,
  "backend": "DDP",
  "batch_size": 128,
  "cp": 1,
  "dataset": "mnist",
  "dataset_root": "/lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/datasets/mnist",
  "dtype": "bf16",
  "input_size": 784,
  "layer_sizes": [
    512,
    256,
    128
  ],
  "log_freq": 1,
  "no_distributed_history": false,
  "num_workers": 0,
  "output_size": 10,
  "pp": 1,
  "print_freq": 10,
  "profile_memory": true,
  "pyinstrument_profiler": false,
  "pytorch_profiler": false,
  "pytorch_profiler_active": 3,
  "pytorch_profiler_repeat": 5,
  "pytorch_profiler_wait": 1,
  "pytorch_profiler_warmup": 2,
  "rank_zero_only": false,
  "record_shapes": true,
  "tp": 1,
  "train_iters": 200,
  "warmup": 5,
  "with_flops": true,
  "with_modules": true,
  "with_stack": true
}
[2025-12-26 13:21:56,266230][I][ezpz/test_dist:488:train] Took: 18.32 to get here.
[2025-12-26 13:21:56,692071][I][ezpz/test_dist:369:train] Warmup complete at step 5
[2025-12-26 13:21:56,803374][I][ezpz/test_dist:325:train_step] iter=10 loss=1.009584 accuracy=0.765625 dtf=0.016586 dtb=0.000765 loss/mean=1.138943 loss/max=1.389118 loss/min=0.988708 loss/std=0.116546 accuracy/mean=0.690430 accuracy/max=0.796875 accuracy/min=0.578125 accuracy/std=0.067085 dtf/mean=0.016839 dtf/max=0.017218 dtf/min=0.016586 dtf/std=0.000178 dtb/mean=0.000758 dtb/max=0.000768 dtb/min=0.000744 dtb/std=0.000008
[2025-12-26 13:21:57,036584][I][ezpz/test_dist:325:train_step] iter=20 loss=0.516474 accuracy=0.812500 dtf=0.016623 dtb=0.000767 loss/mean=0.621663 loss/max=0.751371 loss/min=0.513288 loss/std=0.093839 accuracy/mean=0.791992 accuracy/max=0.859375 accuracy/min=0.718750 accuracy/std=0.046209 dtf/mean=0.016998 dtf/max=0.017245 dtf/min=0.016623 dtf/std=0.000208 dtb/mean=0.000759 dtb/max=0.000767 dtb/min=0.000751 dtb/std=0.000005
[2025-12-26 13:21:57,265033][I][ezpz/test_dist:325:train_step] iter=30 loss=0.482071 accuracy=0.828125 dtf=0.016847 dtb=0.000769 loss/mean=0.436843 loss/max=0.533845 loss/min=0.284811 loss/std=0.069080 accuracy/mean=0.870117 accuracy/max=0.914062 accuracy/min=0.828125 accuracy/std=0.023089 dtf/mean=0.017028 dtf/max=0.017492 dtf/min=0.016678 dtf/std=0.000223 dtb/mean=0.000757 dtb/max=0.000769 dtb/min=0.000743 dtb/std=0.000008
[2025-12-26 13:21:57,485773][I][ezpz/test_dist:325:train_step] iter=40 loss=0.411392 accuracy=0.843750 dtf=0.016916 dtb=0.000771 loss/mean=0.455263 loss/max=0.584419 loss/min=0.397925 loss/std=0.055186 accuracy/mean=0.859375 accuracy/max=0.875000 accuracy/min=0.843750 accuracy/std=0.012956 dtf/mean=0.017048 dtf/max=0.017304 dtf/min=0.016830 dtf/std=0.000140 dtb/mean=0.000759 dtb/max=0.000771 dtb/min=0.000751 dtb/std=0.000006
[2025-12-26 13:21:57,720448][I][ezpz/test_dist:325:train_step] iter=50 loss=0.340432 accuracy=0.859375 dtf=0.017033 dtb=0.000771 loss/mean=0.400236 loss/max=0.587103 loss/min=0.278782 loss/std=0.088603 accuracy/mean=0.871094 accuracy/max=0.906250 accuracy/min=0.843750 accuracy/std=0.024080 dtf/mean=0.017107 dtf/max=0.017321 dtf/min=0.016968 dtf/std=0.000112 dtb/mean=0.000767 dtb/max=0.000785 dtb/min=0.000748 dtb/std=0.000011
[2025-12-26 13:21:57,968693][I][ezpz/test_dist:325:train_step] iter=60 loss=0.325704 accuracy=0.906250 dtf=0.018421 dtb=0.000773 loss/mean=0.347035 loss/max=0.470769 loss/min=0.274286 loss/std=0.057969 accuracy/mean=0.888672 accuracy/max=0.906250 accuracy/min=0.828125 accuracy/std=0.024316 dtf/mean=0.018716 dtf/max=0.018999 dtf/min=0.018345 dtf/std=0.000219 dtb/mean=0.000764 dtb/max=0.000776 dtb/min=0.000751 dtb/std=0.000008
[2025-12-26 13:21:58,215199][I][ezpz/test_dist:325:train_step] iter=70 loss=0.242337 accuracy=0.914062 dtf=0.016899 dtb=0.000785 loss/mean=0.260672 loss/max=0.361649 loss/min=0.186009 loss/std=0.053688 accuracy/mean=0.916016 accuracy/max=0.945312 accuracy/min=0.882812 accuracy/std=0.017794 dtf/mean=0.017151 dtf/max=0.017322 dtf/min=0.016899 dtf/std=0.000136 dtb/mean=0.000774 dtb/max=0.000789 dtb/min=0.000758 dtb/std=0.000012
[2025-12-26 13:21:58,472737][I][ezpz/test_dist:325:train_step] iter=80 loss=0.344910 accuracy=0.882812 dtf=0.016888 dtb=0.000774 loss/mean=0.274805 loss/max=0.344910 loss/min=0.163093 loss/std=0.059792 accuracy/mean=0.918945 accuracy/max=0.960938 accuracy/min=0.882812 accuracy/std=0.027046 dtf/mean=0.017064 dtf/max=0.017452 dtf/min=0.016775 dtf/std=0.000201 dtb/mean=0.000762 dtb/max=0.000774 dtb/min=0.000756 dtb/std=0.000005
[2025-12-26 13:21:58,701404][I][ezpz/test_dist:325:train_step] iter=90 loss=0.260920 accuracy=0.914062 dtf=0.016934 dtb=0.000776 loss/mean=0.221058 loss/max=0.312963 loss/min=0.097677 loss/std=0.066769 accuracy/mean=0.930664 accuracy/max=0.992188 accuracy/min=0.898438 accuracy/std=0.027466 dtf/mean=0.017072 dtf/max=0.017282 dtf/min=0.016857 dtf/std=0.000142 dtb/mean=0.000762 dtb/max=0.000776 dtb/min=0.000755 dtb/std=0.000006
[2025-12-26 13:21:58,925449][I][ezpz/test_dist:325:train_step] iter=100 loss=0.290902 accuracy=0.914062 dtf=0.017022 dtb=0.000771 loss/mean=0.219431 loss/max=0.290902 loss/min=0.158593 loss/std=0.038115 accuracy/mean=0.937500 accuracy/max=0.953125 accuracy/min=0.914062 accuracy/std=0.012353 dtf/mean=0.017146 dtf/max=0.017407 dtf/min=0.016838 dtf/std=0.000171 dtb/mean=0.000763 dtb/max=0.000771 dtb/min=0.000756 dtb/std=0.000004
[2025-12-26 13:21:59,183043][I][ezpz/test_dist:325:train_step] iter=110 loss=0.270826 accuracy=0.914062 dtf=0.016910 dtb=0.000785 loss/mean=0.220031 loss/max=0.311172 loss/min=0.142488 loss/std=0.060282 accuracy/mean=0.934570 accuracy/max=0.960938 accuracy/min=0.914062 accuracy/std=0.016544 dtf/mean=0.017096 dtf/max=0.017434 dtf/min=0.016804 dtf/std=0.000188 dtb/mean=0.000762 dtb/max=0.000785 dtb/min=0.000753 dtb/std=0.000009
[2025-12-26 13:21:59,396895][I][ezpz/test_dist:325:train_step] iter=120 loss=0.304672 accuracy=0.921875 dtf=0.017031 dtb=0.000768 loss/mean=0.231112 loss/max=0.329426 loss/min=0.110154 loss/std=0.073585 accuracy/mean=0.928711 accuracy/max=0.953125 accuracy/min=0.882812 accuracy/std=0.024531 dtf/mean=0.017054 dtf/max=0.017213 dtf/min=0.016711 dtf/std=0.000159 dtb/mean=0.000760 dtb/max=0.000769 dtb/min=0.000743 dtb/std=0.000008
[2025-12-26 13:21:59,631761][I][ezpz/test_dist:325:train_step] iter=130 loss=0.232980 accuracy=0.945312 dtf=0.017138 dtb=0.000771 loss/mean=0.235195 loss/max=0.355287 loss/min=0.102751 loss/std=0.074560 accuracy/mean=0.927734 accuracy/max=0.976562 accuracy/min=0.898438 accuracy/std=0.022693 dtf/mean=0.017109 dtf/max=0.017356 dtf/min=0.016762 dtf/std=0.000199 dtb/mean=0.000760 dtb/max=0.000777 dtb/min=0.000750 dtb/std=0.000009
[2025-12-26 13:21:59,862446][I][ezpz/test_dist:325:train_step] iter=140 loss=0.168414 accuracy=0.968750 dtf=0.016910 dtb=0.000771 loss/mean=0.210054 loss/max=0.340699 loss/min=0.129359 loss/std=0.068940 accuracy/mean=0.940430 accuracy/max=0.968750 accuracy/min=0.890625 accuracy/std=0.024686 dtf/mean=0.017123 dtf/max=0.017356 dtf/min=0.016893 dtf/std=0.000170 dtb/mean=0.000759 dtb/max=0.000771 dtb/min=0.000751 dtb/std=0.000006
[2025-12-26 13:22:00,085098][I][ezpz/test_dist:325:train_step] iter=150 loss=0.237147 accuracy=0.929688 dtf=0.016932 dtb=0.000775 loss/mean=0.167624 loss/max=0.237147 loss/min=0.122940 loss/std=0.040060 accuracy/mean=0.941406 accuracy/max=0.953125 accuracy/min=0.921875 accuracy/std=0.012353 dtf/mean=0.017041 dtf/max=0.017280 dtf/min=0.016753 dtf/std=0.000176 dtb/mean=0.000757 dtb/max=0.000775 dtb/min=0.000740 dtb/std=0.000009
[2025-12-26 13:22:00,305868][I][ezpz/test_dist:325:train_step] iter=160 loss=0.208926 accuracy=0.945312 dtf=0.016980 dtb=0.000771 loss/mean=0.186015 loss/max=0.215280 loss/min=0.128407 loss/std=0.027561 accuracy/mean=0.941406 accuracy/max=0.960938 accuracy/min=0.929688 accuracy/std=0.008735 dtf/mean=0.017058 dtf/max=0.017327 dtf/min=0.016779 dtf/std=0.000193 dtb/mean=0.000756 dtb/max=0.000771 dtb/min=0.000737 dtb/std=0.000009
[2025-12-26 13:22:00,525172][I][ezpz/test_dist:325:train_step] iter=170 loss=0.232940 accuracy=0.921875 dtf=0.017109 dtb=0.000773 loss/mean=0.198723 loss/max=0.269332 loss/min=0.122802 loss/std=0.053061 accuracy/mean=0.940430 accuracy/max=0.968750 accuracy/min=0.906250 accuracy/std=0.020647 dtf/mean=0.017133 dtf/max=0.017396 dtf/min=0.016898 dtf/std=0.000146 dtb/mean=0.000757 dtb/max=0.000773 dtb/min=0.000743 dtb/std=0.000008
[2025-12-26 13:22:00,741349][I][ezpz/test_dist:325:train_step] iter=180 loss=0.051174 accuracy=0.992188 dtf=0.016878 dtb=0.000779 loss/mean=0.142097 loss/max=0.257418 loss/min=0.051174 loss/std=0.076244 accuracy/mean=0.966797 accuracy/max=0.992188 accuracy/min=0.929688 accuracy/std=0.022011 dtf/mean=0.017102 dtf/max=0.017473 dtf/min=0.016812 dtf/std=0.000194 dtb/mean=0.000762 dtb/max=0.000779 dtb/min=0.000750 dtb/std=0.000008
[2025-12-26 13:22:00,962154][I][ezpz/test_dist:325:train_step] iter=190 loss=0.105810 accuracy=0.945312 dtf=0.016914 dtb=0.000775 loss/mean=0.152862 loss/max=0.230180 loss/min=0.094466 loss/std=0.049649 accuracy/mean=0.951172 accuracy/max=0.976562 accuracy/min=0.937500 accuracy/std=0.012807 dtf/mean=0.017123 dtf/max=0.017377 dtf/min=0.016858 dtf/std=0.000202 dtb/mean=0.000761 dtb/max=0.000775 dtb/min=0.000752 dtb/std=0.000007
[2025-12-26 13:22:04,963504][I][ezpz/history:2385:finalize] Saving plots to /lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/2025-12-26-132141/plots/mplot (matplotlib) and /lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/2025-12-26-132141/plots/tplot (tplot)
[text plot omitted: accuracy, accuracy/min, accuracy/std, accuracy/mean, accuracy/max vs. iter]
text saved in /lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/2025-12-26-132141/plots/tplot/accuracy.txt
[text plot omitted: accuracy summary (accuracy, accuracy/min, accuracy/mean, accuracy/max) vs. iter]
text saved in /lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/2025-12-26-132141/plots/tplot/accuracy_summary.txt
[text plot omitted: histograms of accuracy/mean, accuracy/max, accuracy/min, accuracy/std]
text saved in /lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/2025-12-26-132141/plots/tplot/accuracy_hist.txt
[text plot omitted: dtb, dtb/min, dtb/std, dtb/mean, dtb/max vs. iter]
text saved in /lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/2025-12-26-132141/plots/tplot/dtb.txt
[text plot omitted: dtb summary (dtb, dtb/min, dtb/mean, dtb/max) vs. iter]
text saved in /lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/2025-12-26-132141/plots/tplot/dtb_summary.txt
[text plot omitted: histograms of dtb/mean, dtb/max, dtb/min, dtb/std]
text saved in /lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/2025-12-26-132141/plots/tplot/dtb_hist.txt
[text plot omitted: dtf and dtf/min panels vs. iter]
β0.0183β€----------------------------------------------------β 0.0174β€βββββββββββββββββββββββββββββββββββββββββββββββββββββ0.0131β€ - - - β 0.0160β€ β ββ β β ββ¬βββββββββββββ¬βββββββββββββ¬ββββββββββββ¬βββββββββββββ¬β 0.0132β€ β ββ β β 1.0 49.2 97.5 145.8 194.0 0.0118β€ β ββ β βdtf/min iter 0.0104β€ β ββ β β dtf/std ββ¬βββββββββββββ¬βββββββββββββ¬ββββββββββββ¬βββββββββββββ¬β ββββββββββββββββββββββββββββββββββββββββββββββββββββ 1.0 49.2 97.5 145.8 194.0 0.000452β€ * * * * β dtf iter 0.000331β€* ********** * *********** * *** ******** ******β dtf/mean 0.000150β€************************************************* β ββββββββββββββββββββββββββββββββββββββββββββββββββββββ ββ¬ββββββββββββ¬βββββββββββββ¬ββββββββββββ¬ββββββββββββ¬β 0.0187β€ Β· Β· Β· β 1.0 49.2 97.5 145.8 194.0 0.0174β€Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·βdtf/std iter 0.0160β€ Β· Β·Β· Β· β dtf/max 0.0146β€ Β· Β·Β· Β· β ββββββββββββββββββββββββββββββββββββββββββββββββββββββ 0.0133β€ Β· Β·Β· Β· β0.0190β€++++++++++++++++++++++++++++++++++++++++++++++++++++β 0.0119β€ Β· Β·Β· Β· β0.0162β€ + ++ + β 0.0106β€ Β· Β· Β· β0.0121β€ + + + β ββ¬βββββββββββββ¬βββββββββββββ¬ββββββββββββ¬βββββββββββββ¬β ββ¬βββββββββββββ¬βββββββββββββ¬ββββββββββββ¬βββββββββββββ¬β 1.0 49.2 97.5 145.8 194.0 1.0 49.2 97.5 145.8 194.0 dtf/mean iter dtf/max iter text saved in /lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/2025-12-26-132141/plots/tplot/dtf.txt ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ 0.0190β€ ++ dtf/max + β β β -- dtf/min +β + Β· +β β β Β·Β· dtf/mean ββ ++ Β·β + Β·β + + β 0.0176β€ ββ dtf ++ + + ++ ββ+ + + + + ++ ++ ββ+ ++++Β·ββ+ + β +ββ++ + β+ + + Β·β++ + + + ++β β+++++++Β·ββΒ·β+Β·Β·Β·Β·Β·Β·βΒ·Β·βββββββββββββΒ·Β·βββββΒ·Β·Β·Β·βββββββββββΒ·βΒ·βΒ·ββΒ·ββΒ·ββΒ·ββββββββΒ·ββΒ·ββΒ·ββΒ·βΒ·βββββββββββββΒ·ββββΒ·βΒ·β ββββββββββββββββββββββββ--β-ββ-β 
-βββββββ--ββββ-ββ-ββββ---βββββββββ-ββββ--ββ-β-βββ-ββββ-ββ-ββββ----ββββββββββ-ββββ 0.0161β€ β β ββ ββ β β β ββ ββ β β β ββ ββ β 0.0147β€ β ββ ββ β β β ββ ββ β β β β ββ β β β β ββ β 0.0133β€ β β ββ β β β β ββ β β β β ββ β 0.0118β€ β β ββ β β β β ββ β β β β ββ β 0.0104β€ β β Β·β β ββ¬ββββββββββββββββββββββββββββ¬ββββββββββββββββββββββββββββ¬βββββββββββββββββββββββββββ¬ββββββββββββββββββββββββββββ¬β 1.0 49.2 97.5 145.8 194.0 text saved in /lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/2025-12-26-132141/plots/tplot/dtf_summary.txt dtf/mean hist dtf/max hist βββββββββββββββββββββββββββββββββββββββββββββββββββββββ ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ 100.0β€ ββββββ β96β€ βββββββββββ β 83.3β€ βββββββββββ β80β€ βββββββββββ β 66.7β€ βββββββββββ β64β€ βββββββββββ β 50.0β€ βββββββββββ β48β€ βββββββββββ β 33.3β€ βββββββββββ β32β€ βββββββββββ β 16.7β€ βββββββββββ β16β€ βββββββββββ β 0.0β€βββββ βββββββββββββββββ 0β€βββββ βββββββββββ ββββββ ββ¬βββββββββββββ¬βββββββββββββ¬βββββββββββββ¬βββββββββββββ¬β ββ¬ββββββββββββββ¬ββββββββββββββ¬βββββββββββββ¬ββββββββββββββ¬β 0.0102 0.0124 0.0146 0.0169 0.0191 0.0104 0.0126 0.0149 0.0171 0.0194 dtf/min hist dtf/std hist βββββββββββββββββββββββββββββββββββββββββββββββββββββββββ ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ 138β€ βββββ β73.0β€ βββββββββββ β 115β€ βββββ β60.8β€ βββββββββββ β 92β€ βββββ β48.7β€ βββββββββββ β 69β€ βββββ β36.5β€ βββββββββββ β β ββββββββββ β β βββββββββββ β 46β€ ββββββββββ β24.3β€ββββββββββββββββββββββ β 23β€ ββββββββββ β12.2β€ββββββββββββββββββββββ β 0β€βββββ ββββββββββ ββββββ 0.0β€βββββββββββββββββββββββββββ ββββββ ββββββββββββ ββ¬ββββββββββββββ¬βββββββββββββ¬ββββββββββββββ¬βββββββββββββ¬β ββ¬βββββββββββββ¬ββββββββββββββ¬βββββββββββββ¬βββββββββββββ¬β 0.0101 0.0122 0.0144 0.0165 0.0187 0.00007 0.00017 0.00027 0.00037 0.00047 text saved in 
/lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/2025-12-26-132141/plots/tplot/dtf_hist.txt loss loss/min ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ 1.76β€β β1.76β€-- β 1.48β€β β0.62β€ -----------------------------------------------------β 1.19β€ β β ββ¬βββββββββββββ¬ββββββββββββββ¬βββββββββββββ¬βββββββββββββ¬β 0.91β€ ββββ β β 1.0 49.2 97.5 145.8 194.0 0.34β€ βββββββββββββββββββ β ββββββββ β βloss/min iter 0.05β€ β ββ ββββββββββββββββββββββββββββββββββββββ loss/std ββ¬βββββββββββββ¬ββββββββββββββ¬βββββββββββββ¬βββββββββββββ¬β βββββββββββββββββββββββββββββββββββββββββββββββββββββββ 1.0 49.2 97.5 145.8 194.0 0.127β€ * ** * ** β loss iter 0.092β€*********************************** *************** *β loss/mean 0.038β€* * * * * ** * **** ****************β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ ββ¬βββββββββββββ¬βββββββββββββ¬βββββββββββββ¬βββββββββββββ¬β 1.84β€Β· β 1.0 49.2 97.5 145.8 194.0 1.55β€Β·Β· βloss/std iter 1.26β€ Β· β loss/max 0.98β€ Β·Β· β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ 0.69β€ Β·Β·Β·Β· β1.92β€++ β 0.40β€ Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β· Β·Β· Β· β1.32β€ +++++++++ ++ β 0.12β€ Β· Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·Β·β0.44β€ +++++++++++++++++++++++++++++++++++++++++++++++β ββ¬βββββββββββββ¬ββββββββββββββ¬βββββββββββββ¬βββββββββββββ¬β ββ¬βββββββββββββ¬ββββββββββββββ¬βββββββββββββ¬βββββββββββββ¬β 1.0 49.2 97.5 145.8 194.0 1.0 49.2 97.5 145.8 194.0 loss/mean iter loss/max iter text saved in /lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/2025-12-26-132141/plots/tplot/loss.txt ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ 1.92β€ ++ loss/max β β -- loss/min β β Β·Β· loss/mean β 1.61β€ ββ loss β ββΒ· β β β+ β 1.29β€ βΒ· β β βΒ· β β β++ β 0.98β€ 
βΒ·+ β β ββΒ·+ + β β ββΒ·β++ β β ββββ++ ++ β 0.67β€ βΒ·ββββ+++ +β+ + + β β -ββββΒ·βΒ·+ββ ++++++β++++ ++ + + β β -ββΒ·ββββΒ·Β·Β·ββββββΒ·Β·βββ β +β++++ ++ + + + β 0.36β€ β-- Β·-βββΒ·--ββββββββΒ·βββββββΒ·++Β·βββ++ +ββ+Β·+ ++ ββ++β+β+++++ + ++ + + ++ + β β -- - - - β-- -βββββ- ββββββββββββββββΒ·Β·Β·βββββββββββββββββ+βββΒ·βββββ++βββ+Β·++++ββ+β+βββ+++β+++++ +ββ β - ββ -β----β-- ββ ---βββ----ββ-β-ββββ -βββββββββββββββΒ·βΒ·ββββββββββββββΒ·βββββΒ·ββββββ 0.05β€ - β-- - --β- -βββ----β- --ββ-β---βββ-βββ-β ββ¬ββββββββββββββββββββββββββββ¬βββββββββββββββββββββββββββββ¬ββββββββββββββββββββββββββββ¬ββββββββββββββββββββββββββββ¬β 1.0 49.2 97.5 145.8 194.0 text saved in /lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/2025-12-26-132141/plots/tplot/loss_summary.txt loss/mean hist loss/max hist βββββββββββββββββββββββββββββββββββββββββββββββββββββββββ ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ 126β€βββββ β85.0β€βββββ β 105β€βββββ β70.8β€βββββ β 84β€βββββ β56.7β€βββββββββββ β 63β€βββββ β42.5β€βββββββββββ β 42β€βββββ βββββ β28.3β€ββββββββββββββββ β 21β€βββββ ββββββββββ β14.2β€ββββββββββββββββββββββ β 0β€βββββ ββββββββββ ββββββββββ ββββββββββ ββββββββββ ββββββ 0.0β€βββββββββββββββββββββββββββββββββββββββββββββββββββββββ ββ¬ββββββββββββββ¬βββββββββββββ¬ββββββββββββββ¬βββββββββββββ¬β ββ¬βββββββββββββ¬ββββββββββββββ¬βββββββββββββ¬βββββββββββββ¬β 0.04 0.51 0.98 1.45 1.91 0.06 0.55 1.03 1.51 2.00 loss/min hist loss/std hist βββββββββββββββββββββββββββββββββββββββββββββββββββββββ ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ 139.0β€βββββ β44.0β€ βββββ β 115.8β€βββββ β36.7β€ βββββ β 92.7β€βββββ β29.3β€ ββββββββββββββββ β 69.5β€βββββ β22.0β€ βββββββββββββββββββββββββββ β ββββββ β β βββββββββββββββββββββββββββββββββ β 46.3β€βββββββββββ β14.7β€ βββββββββββββββββββββββββββββββββ β 23.2β€βββββββββββ β 7.3β€βββββββββββββββββββββββββββββββββββββββββββ β 0.0β€ββββββββββββββββββββββββββ βββββββββββββββββββββββββββ 
text saved in /lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/2025-12-26-132141/plots/tplot/loss_hist.txt
[2025-12-26 13:22:10,673046][I][ezpz/history:2433:finalize] Saving history report to /lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/outputs/ezpz.test_dist/2025-12-26-132141/report.md
[2025-12-26 13:22:10,684947][I][ezpz/test_dist:348:finalize] dataset=<xarray.Dataset> Size: 39kB
Dimensions:    (draw: 194)
Coordinates:
  * draw       (draw) int64 2kB 0 1 2 3 4 5 6 ... 188 189 190 191 192 193
Data variables: (12/25)
    iter       (draw) int64 2kB 6 7 8 9 10 11 12 ... 194 195 196 197 198 199
    loss       (draw) float32 776B 1.761 1.571 1.454 ... 0.2359 0.1281
    accuracy   (draw) float32 776B 0.6562 0.6953 0.6172 ... 0.9141 0.9688
    dtf        (draw) float64 2kB 0.01671 0.01655 ... 0.01678 0.01688
    dtb        (draw) float64 2kB 0.0007633 0.0007603 ... 0.0007723
    iter_mean  (draw) float64 2kB 6.0 7.0 8.0 9.0 ... 197.0 198.0 199.0
    ...         ...
    dtf_min    (draw) float64 2kB 0.0166 0.01655 0.01673 ... 0.01675 0.01678
    dtf_std    (draw) float64 2kB 0.0001951 0.0001735 ... 0.0002166
    dtb_mean   (draw) float64 2kB 0.0007558 0.0007502 ... 0.0007614
    dtb_max    (draw) float64 2kB 0.0008161 0.0007603 ... 0.0007723
    dtb_min    (draw) float64 2kB 0.0007143 0.0007425 ... 0.0007473
    dtb_std    (draw) float64 2kB 2.653e-05 5.994e-06 ... 7.243e-06
[2025-12-26 13:22:11,411451][I][ezpz/test_dist:500:train] Took: 15.14 seconds to finish training
[2025-12-26 13:22:11,412326][I][ezpz/test_dist:695:main] Took: 33.47 seconds
wandb:
wandb: 🚀 View run smart-breeze-6848 at:
wandb: Find logs at: ../../../../../../../lus/eagle/projects/AuroraGPT/foremans/projects/saforem2/tmp/2025-12-26-130131/wandb/run-20251226_132148-01zkj7vc/logs
[2025-12-26 13:22:14,556135][I][ezpz/launch:447:launch] ----[🍋 ezpz.launch][stop][2025-12-26-132214]----
[2025-12-26 13:22:14,556823][I][ezpz/launch:448:launch] Execution finished with 0.
[2025-12-26 13:22:14,557231][I][ezpz/launch:449:launch] Executing finished in 41.96 seconds.
[2025-12-26 13:22:14,557601][I][ezpz/launch:450:launch] Took 41.96 seconds to run. Exiting.
took: 1m 15s