Commit fb095a9 (verified) by speedinghzl · 1 parent: 44f479b

Upload folder using huggingface_hub

clip_vit_b16_s512m_bs16k_mix0_0/checkpoints/epoch_4.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fe2a8f0f278bf87c0612f6b5d9ba36033d25df3908cf563eb548334e3197885e
+ size 1795823122
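
The three added lines form a Git LFS pointer: the repository stores only the spec version, the SHA-256 object ID, and the size in bytes, while the checkpoint itself is kept in LFS storage. A minimal sketch of reading such a pointer in Python follows; `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of `huggingface_hub` or Git LFS.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:fe2a8f0f278bf87c0612f6b5d9ba36033d25df3908cf563eb548334e3197885e
size 1795823122
"""
info = parse_lfs_pointer(pointer)
# Report the hash algorithm and the checkpoint size in gigabytes.
print(info["hash_algo"], round(info["size_bytes"] / 1e9, 2), "GB")
```

For this pointer the checkpoint is roughly 1.8 GB.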
clip_vit_b16_s512m_bs16k_mix0_0/out.log ADDED
@@ -0,0 +1,583 @@
+ 2025-05-06,11:43:27 | INFO | No latest resume checkpoint found in ./logs-lr1e-3-datacomp/clip_vit_b16_s512m_bs16k_mix0_0/checkpoints.
+ 2025-05-06,11:43:29 | INFO | Running in distributed mode with multiple processes. Device: cuda:0.Process (global: 0, local 0), total 16.
+ 2025-05-06,11:43:29 | INFO | Loaded ViT-B-16 model config.
+ 2025-05-06,11:43:30 | INFO | Model:
+ 2025-05-06,11:43:30 | INFO | CLIP(
+   (visual): VisionTransformer(
+     (conv1): Conv2d(3, 768, kernel_size=(16, 16), stride=(16, 16), bias=False)
+     (patch_dropout): Identity()
+     (ln_pre): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
+     (transformer): Transformer(
+       (resblocks): ModuleList(
+         (0-11): 12 x ResidualAttentionBlock(
+           (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
+           (attn): MultiheadAttention(
+             (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
+           )
+           (ls_1): Identity()
+           (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
+           (mlp): Sequential(
+             (c_fc): Linear(in_features=768, out_features=3072, bias=True)
+             (gelu): GELU(approximate='none')
+             (c_proj): Linear(in_features=3072, out_features=768, bias=True)
+           )
+           (ls_2): Identity()
+         )
+       )
+     )
+     (ln_post): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
+   )
+   (transformer): Transformer(
+     (resblocks): ModuleList(
+       (0-11): 12 x ResidualAttentionBlock(
+         (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
+         (attn): MultiheadAttention(
+           (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
+         )
+         (ls_1): Identity()
+         (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
+         (mlp): Sequential(
+           (c_fc): Linear(in_features=512, out_features=2048, bias=True)
+           (gelu): GELU(approximate='none')
+           (c_proj): Linear(in_features=2048, out_features=512, bias=True)
+         )
+         (ls_2): Identity()
+       )
+     )
+   )
+   (token_embedding): Embedding(49408, 512)
+   (ln_final): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
+ )
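
As a rough cross-check of the printed architecture, the parameter count of one `ResidualAttentionBlock` can be estimated from the layer shapes alone. The sketch below assumes the attention module holds a fused query/key/value projection of shape `(3*width, width)` plus a bias. The module printout does not show it, but this is how `torch.nn.MultiheadAttention` stores its input weights when all embedding dimensions match. `block_params` is a hypothetical helper; embeddings, the patch convolution, and any projection heads are left out.

```python
def block_params(width: int, mlp_ratio: int = 4) -> int:
    """Estimate the trainable parameters in one residual attention block."""
    hidden = width * mlp_ratio
    layer_norms = 2 * (2 * width)             # ln_1 and ln_2: weight + bias each
    attn_in = 3 * width * width + 3 * width   # fused q/k/v projection (assumed)
    attn_out = width * width + width          # out_proj weight + bias
    mlp = (width * hidden + hidden) + (hidden * width + width)  # c_fc + c_proj
    return layer_norms + attn_in + attn_out + mlp


vision_blocks = 12 * block_params(768)  # visual tower, width 768
text_blocks = 12 * block_params(512)    # text tower, width 512
print(vision_blocks, text_blocks, vision_blocks + text_blocks)
```

Under these assumptions the 24 transformer blocks account for about 123 million parameters.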
+ 2025-05-06,11:43:30 | INFO | Params:
+ 2025-05-06,11:43:30 | INFO | NDR_patch_size: 16
+ 2025-05-06,11:43:30 | INFO | accum_freq: 1
+ 2025-05-06,11:43:30 | INFO | aug_cfg: {}
+ 2025-05-06,11:43:30 | INFO | batch_size: 1024
+ 2025-05-06,11:43:30 | INFO | beta1: 0.9
+ 2025-05-06,11:43:30 | INFO | beta2: 0.98
+ 2025-05-06,11:43:30 | INFO | checkpoint_path: ./logs-lr1e-3-datacomp/clip_vit_b16_s512m_bs16k_mix0_0/checkpoints
+ 2025-05-06,11:43:30 | INFO | coca_caption_loss_weight: 2.0
+ 2025-05-06,11:43:30 | INFO | coca_contrastive_loss_weight: 1.0
+ 2025-05-06,11:43:30 | INFO | copy_codebase: False
+ 2025-05-06,11:43:30 | INFO | csv_caption_key: title
+ 2025-05-06,11:43:30 | INFO | csv_img_key: filepath
+ 2025-05-06,11:43:30 | INFO | csv_separator:
+ 2025-05-06,11:43:30 | INFO | dataset_resampled: False
+ 2025-05-06,11:43:30 | INFO | dataset_type: webdataset
+ 2025-05-06,11:43:30 | INFO | ddp_static_graph: True
+ 2025-05-06,11:43:30 | INFO | debug: False
+ 2025-05-06,11:43:30 | INFO | delete_prev_step_ckpt: True
+ 2025-05-06,11:43:30 | INFO | delete_previous_checkpoint: False
+ 2025-05-06,11:43:30 | INFO | device: cuda:0
+ 2025-05-06,11:43:30 | INFO | dist_backend: nccl
+ 2025-05-06,11:43:30 | INFO | dist_url: env://
+ 2025-05-06,11:43:30 | INFO | distill: False
+ 2025-05-06,11:43:30 | INFO | distill_model: None
+ 2025-05-06,11:43:30 | INFO | distill_pretrained: None
+ 2025-05-06,11:43:30 | INFO | distributed: True
+ 2025-05-06,11:43:30 | INFO | epochs: 4
+ 2025-05-06,11:43:30 | INFO | epochs_cooldown: None
+ 2025-05-06,11:43:30 | INFO | eps: 1e-06
+ 2025-05-06,11:43:30 | INFO | force_custom_text: False
+ 2025-05-06,11:43:30 | INFO | force_image_size: 224
+ 2025-05-06,11:43:30 | INFO | force_patch_dropout: None
+ 2025-05-06,11:43:30 | INFO | force_quick_gelu: False
+ 2025-05-06,11:43:30 | INFO | gather_with_grad: True
+ 2025-05-06,11:43:30 | INFO | global_batch_size: 16384
+ 2025-05-06,11:43:30 | INFO | grad_checkpointing: True
+ 2025-05-06,11:43:30 | INFO | grad_clip_norm: None
+ 2025-05-06,11:43:30 | INFO | horovod: False
+ 2025-05-06,11:43:30 | INFO | image_interpolation: None
+ 2025-05-06,11:43:30 | INFO | image_mean: None
+ 2025-05-06,11:43:30 | INFO | image_resize_mode: None
+ 2025-05-06,11:43:30 | INFO | image_std: None
+ 2025-05-06,11:43:30 | INFO | imagenet_v2: None
+ 2025-05-06,11:43:30 | INFO | imagenet_val: /mnt/bn/zilongdata-hl/dataset/imagenet/val
+ 2025-05-06,11:43:30 | INFO | is_cls_token: True
+ 2025-05-06,11:43:30 | INFO | local_loss: True
+ 2025-05-06,11:43:30 | INFO | local_rank: 0
+ 2025-05-06,11:43:30 | INFO | lock_image: False
+ 2025-05-06,11:43:30 | INFO | lock_image_freeze_bn_stats: False
+ 2025-05-06,11:43:30 | INFO | lock_image_unlocked_groups: 0
+ 2025-05-06,11:43:30 | INFO | lock_text: False
+ 2025-05-06,11:43:30 | INFO | lock_text_freeze_layer_norm: False
+ 2025-05-06,11:43:30 | INFO | lock_text_unlocked_layers: 0
+ 2025-05-06,11:43:30 | INFO | log_every_n_steps: 128
+ 2025-05-06,11:43:30 | INFO | log_level: 20
+ 2025-05-06,11:43:30 | INFO | log_local: False
+ 2025-05-06,11:43:30 | INFO | log_path: ./logs-lr1e-3-datacomp/clip_vit_b16_s512m_bs16k_mix0_0/out.log
+ 2025-05-06,11:43:30 | INFO | logs: ./logs-lr1e-3-datacomp
+ 2025-05-06,11:43:30 | INFO | lr: 0.001
+ 2025-05-06,11:43:30 | INFO | lr_cooldown_end: 0.0
+ 2025-05-06,11:43:30 | INFO | lr_cooldown_power: 1.0
+ 2025-05-06,11:43:30 | INFO | lr_scheduler: cosine
+ 2025-05-06,11:43:30 | INFO | max_seq_len: 15000
+ 2025-05-06,11:43:30 | INFO | model: ViT-B-16
+ 2025-05-06,11:43:30 | INFO | name: clip_vit_b16_s512m_bs16k_mix0_0
+ 2025-05-06,11:43:30 | INFO | native_dynamic_resolution: False
+ 2025-05-06,11:43:30 | INFO | no_set_device_rank: False
+ 2025-05-06,11:43:30 | INFO | only_packing: False
+ 2025-05-06,11:43:30 | INFO | precision: amp
+ 2025-05-06,11:43:30 | INFO | pretrained:
+ 2025-05-06,11:43:30 | INFO | pretrained_image:
+ 2025-05-06,11:43:30 | INFO | pretrained_text:
+ 2025-05-06,11:43:30 | INFO | rank: 0
+ 2025-05-06,11:43:30 | INFO | remote_sync: None
+ 2025-05-06,11:43:30 | INFO | remote_sync_frequency: 300
+ 2025-05-06,11:43:30 | INFO | remote_sync_protocol: s3
+ 2025-05-06,11:43:30 | INFO | report_to: wandb
+ 2025-05-06,11:43:30 | INFO | resume: None
+ 2025-05-06,11:43:30 | INFO | rope_attn_num_heads: 12
+ 2025-05-06,11:43:30 | INFO | rope_model_width: 768
+ 2025-05-06,11:43:30 | INFO | save_every_n_steps: 6104
+ 2025-05-06,11:43:30 | INFO | save_frequency: 1
+ 2025-05-06,11:43:30 | INFO | save_most_recent: False
+ 2025-05-06,11:43:30 | INFO | seed: 0
+ 2025-05-06,11:43:30 | INFO | siglip: False
+ 2025-05-06,11:43:30 | INFO | skip_scheduler: False
+ 2025-05-06,11:43:30 | INFO | tensorboard: False
+ 2025-05-06,11:43:30 | INFO | tensorboard_path:
+ 2025-05-06,11:43:30 | INFO | torchcompile: False
+ 2025-05-06,11:43:30 | INFO | torchscript: False
+ 2025-05-06,11:43:30 | INFO | trace: False
+ 2025-05-06,11:43:30 | INFO | train_data: /mnt/bn/zilongdata-hl/dataset/Recap-DataComp-1B-Dataset/{000000..140146}.tar
+ 2025-05-06,11:43:30 | INFO | train_data_upsampling_factors: None
+ 2025-05-06,11:43:30 | INFO | train_num_samples: 128000000
+ 2025-05-06,11:43:30 | INFO | use_bn_sync: False
+ 2025-05-06,11:43:30 | INFO | use_bnb_linear: None
+ 2025-05-06,11:43:30 | INFO | val_data: None
+ 2025-05-06,11:43:30 | INFO | val_frequency: 1
+ 2025-05-06,11:43:30 | INFO | val_num_samples: None
+ 2025-05-06,11:43:30 | INFO | val_steps: 0
+ 2025-05-06,11:43:30 | INFO | wandb: True
+ 2025-05-06,11:43:30 | INFO | wandb_notes:
+ 2025-05-06,11:43:30 | INFO | wandb_project_name: cls-clip-NDR
+ 2025-05-06,11:43:30 | INFO | warmup: 500
+ 2025-05-06,11:43:30 | INFO | wd: 0.2
+ 2025-05-06,11:43:30 | INFO | workers: 1
+ 2025-05-06,11:43:30 | INFO | world_size: 16
+ 2025-05-06,11:43:30 | INFO | zeroshot_frequency: 4
+ 2025-05-06,11:43:30 | INFO | zeroshot_steps: 0
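
Several of the logged parameters are related by simple arithmetic. The sketch below assumes that the global batch is `batch_size * world_size * accum_freq`, that each epoch is rounded up to a whole number of global batches, and that the `cosine` scheduler applies a linear warm-up over `warmup` steps followed by a half-cosine decay to zero over the remaining steps. These are assumptions about the trainer, not something the log states, and `lr_at` is a hypothetical helper.

```python
import math

# Values copied from the logged parameters.
batch_size, world_size, accum_freq = 1024, 16, 1
train_num_samples, epochs = 128_000_000, 4
base_lr, warmup = 1e-3, 500

global_batch = batch_size * world_size * accum_freq
steps_per_epoch = math.ceil(train_num_samples / global_batch)
samples_per_epoch = steps_per_epoch * global_batch
total_steps = steps_per_epoch * epochs


def lr_at(step: int) -> float:
    """Learning rate at a zero-based optimizer step (assumed schedule)."""
    if step < warmup:
        # Linear warm-up from base_lr / warmup up to base_lr.
        return base_lr * (step + 1) / warmup
    # Half-cosine decay from base_lr to zero over the remaining steps.
    progress = (step - warmup) / (total_steps - warmup)
    return 0.5 * (1.0 + math.cos(math.pi * progress)) * base_lr


print(global_batch, steps_per_epoch, samples_per_epoch)
for step in (0, 128, 512, steps_per_epoch - 1):
    print(step, f"{lr_at(step):.6f}")
```

Under these assumptions the script gives a global batch of 16384 and 7813 steps per epoch, that is 128008192 samples, which agrees with `global_batch_size` and with the per-epoch sample total in the training lines. The computed learning rates (0.000002, 0.000258, 0.001000 and 0.000867) also agree with the `LR` values logged at the corresponding points of epoch 0.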
+ 2025-05-06,11:43:47 | INFO | Start epoch 0
+ 2025-05-06,11:44:03 | INFO | Train Epoch: 0 [ 16384/128008192 (0%)] Data (t): 8.270 Batch (t): 15.504, 1056.79/s, 66.0492/s/gpu LR: 0.000002 Logit Scale: 14.286 Contrastive_loss: 9.8015 (9.8015) Loss: 9.8015 (9.8015)
+ 2025-05-06,11:46:08 | WARNING | Handling webdataset error (OSError('image file is truncated (44 bytes not processed)')). Ignoring.
+ 2025-05-06,11:49:51 | WARNING | Handling webdataset error (OSError('image file is truncated (47 bytes not processed)')). Ignoring.
+ 2025-05-06,11:56:12 | INFO | Train Epoch: 0 [ 2113536/128008192 (2%)] Data (t): 0.411 Batch (t): 5.698, 2901.41/s, 181.338/s/gpu LR: 0.000258 Logit Scale: 14.332 Contrastive_loss: 7.4634 (8.6325) Loss: 7.4634 (8.6325)
+ 2025-05-06,12:00:41 | WARNING | Handling webdataset error (OSError('image file is truncated (68 bytes not processed)')). Ignoring.
+ 2025-05-06,12:08:14 | INFO | Train Epoch: 0 [ 4210688/128008192 (3%)] Data (t): 0.375 Batch (t): 5.639, 2969.19/s, 185.574/s/gpu LR: 0.000514 Logit Scale: 14.662 Contrastive_loss: 6.9604 (8.0751) Loss: 6.9604 (8.0751)
+ 2025-05-06,12:10:44 | WARNING | Handling webdataset error (OSError('image file is truncated (25 bytes not processed)')). Ignoring.
+ 2025-05-06,12:20:13 | INFO | Train Epoch: 0 [ 6307840/128008192 (5%)] Data (t): 0.374 Batch (t): 5.621, 2822.37/s, 176.398/s/gpu LR: 0.000770 Logit Scale: 15.550 Contrastive_loss: 6.3089 (7.6336) Loss: 6.3089 (7.6336)
+ 2025-05-06,12:32:13 | INFO | Train Epoch: 0 [ 8404992/128008192 (7%)] Data (t): 0.369 Batch (t): 5.623, 2962.04/s, 185.128/s/gpu LR: 0.001000 Logit Scale: 16.973 Contrastive_loss: 6.7608 (7.4590) Loss: 6.7608 (7.4590)
+ 2025-05-06,12:44:14 | INFO | Train Epoch: 0 [ 10502144/128008192 (8%)] Data (t): 0.376 Batch (t): 5.635, 2962.09/s, 185.130/s/gpu LR: 0.001000 Logit Scale: 18.904 Contrastive_loss: 5.1964 (7.0819) Loss: 5.1964 (7.0819)
+ 2025-05-06,12:47:05 | WARNING | Handling webdataset error (OSError('image file is truncated (21 bytes not processed)')). Ignoring.
+ 2025-05-06,12:47:05 | WARNING | Handling webdataset error (OSError('image file is truncated (104 bytes not processed)')). Ignoring.
+ 2025-05-06,12:56:17 | INFO | Train Epoch: 0 [ 12599296/128008192 (10%)] Data (t): 0.350 Batch (t): 5.647, 2944.62/s, 184.039/s/gpu LR: 0.001000 Logit Scale: 21.498 Contrastive_loss: 5.8585 (6.9071) Loss: 5.8585 (6.9071)
+ 2025-05-06,13:08:18 | INFO | Train Epoch: 0 [ 14696448/128008192 (11%)] Data (t): 0.376 Batch (t): 5.634, 3003.00/s, 187.687/s/gpu LR: 0.001000 Logit Scale: 23.931 Contrastive_loss: 3.8234 (6.5217) Loss: 3.8234 (6.5217)
+ 2025-05-06,13:11:23 | WARNING | Handling webdataset error (OSError('image file is truncated (32 bytes not processed)')). Ignoring.
+ 2025-05-06,13:17:44 | WARNING | Handling webdataset error (OSError('image file is truncated (23 bytes not processed)')). Ignoring.
+ 2025-05-06,13:20:23 | INFO | Train Epoch: 0 [ 16793600/128008192 (13%)] Data (t): 0.382 Batch (t): 5.660, 2894.56/s, 180.910/s/gpu LR: 0.000999 Logit Scale: 26.936 Contrastive_loss: 3.3977 (6.1746) Loss: 3.3977 (6.1746)
+ 2025-05-06,13:32:37 | INFO | Train Epoch: 0 [ 18890752/128008192 (15%)] Data (t): 0.379 Batch (t): 5.736, 2911.12/s, 181.945/s/gpu LR: 0.000999 Logit Scale: 29.381 Contrastive_loss: 3.1158 (5.8687) Loss: 3.1158 (5.8687)
+ 2025-05-06,13:44:42 | INFO | Train Epoch: 0 [ 20987904/128008192 (16%)] Data (t): 0.375 Batch (t): 5.662, 2958.99/s, 184.937/s/gpu LR: 0.000998 Logit Scale: 32.422 Contrastive_loss: 2.9274 (5.6013) Loss: 2.9274 (5.6013)
+ 2025-05-06,13:56:43 | INFO | Train Epoch: 0 [ 23085056/128008192 (18%)] Data (t): 0.375 Batch (t): 5.632, 2901.85/s, 181.366/s/gpu LR: 0.000998 Logit Scale: 35.595 Contrastive_loss: 2.5818 (5.3497) Loss: 2.5818 (5.3497)
+ 2025-05-06,14:08:45 | INFO | Train Epoch: 0 [ 25182208/128008192 (20%)] Data (t): 0.369 Batch (t): 5.640, 2867.52/s, 179.220/s/gpu LR: 0.000997 Logit Scale: 39.100 Contrastive_loss: 2.3918 (5.1221) Loss: 2.3918 (5.1221)
+ 2025-05-06,14:20:50 | INFO | Train Epoch: 0 [ 27279360/128008192 (21%)] Data (t): 0.372 Batch (t): 5.664, 2934.50/s, 183.406/s/gpu LR: 0.000996 Logit Scale: 41.990 Contrastive_loss: 2.1578 (4.9104) Loss: 2.1578 (4.9104)
+ 2025-05-06,14:22:25 | WARNING | Handling webdataset error (OSError('image file is truncated (59 bytes not processed)')). Ignoring.
+ 2025-05-06,14:32:55 | INFO | Train Epoch: 0 [ 29376512/128008192 (23%)] Data (t): 0.372 Batch (t): 5.667, 2912.66/s, 182.041/s/gpu LR: 0.000996 Logit Scale: 44.912 Contrastive_loss: 1.9478 (4.7129) Loss: 1.9478 (4.7129)
+ 2025-05-06,14:34:25 | WARNING | Handling webdataset error (OSError('image file is truncated (1 bytes not processed)')). Ignoring.
+ 2025-05-06,14:44:59 | INFO | Train Epoch: 0 [ 31473664/128008192 (25%)] Data (t): 0.376 Batch (t): 5.661, 2697.24/s, 168.577/s/gpu LR: 0.000995 Logit Scale: 47.222 Contrastive_loss: 1.8343 (4.5330) Loss: 1.8343 (4.5330)
+ 2025-05-06,14:57:03 | INFO | Train Epoch: 0 [ 33570816/128008192 (26%)] Data (t): 0.370 Batch (t): 5.656, 2906.20/s, 181.638/s/gpu LR: 0.000994 Logit Scale: 49.182 Contrastive_loss: 1.9232 (4.3795) Loss: 1.9232 (4.3795)
+ 2025-05-06,15:01:03 | WARNING | Handling webdataset error (OSError('image file is truncated (0 bytes not processed)')). Ignoring.
+ 2025-05-06,15:09:11 | INFO | Train Epoch: 0 [ 35667968/128008192 (28%)] Data (t): 0.379 Batch (t): 5.687, 2889.25/s, 180.578/s/gpu LR: 0.000993 Logit Scale: 50.771 Contrastive_loss: 1.5132 (4.2202) Loss: 1.5132 (4.2202)
+ 2025-05-06,15:20:19 | WARNING | Handling webdataset error (OSError('image file is truncated (18 bytes not processed)')). Ignoring.
+ 2025-05-06,15:21:14 | INFO | Train Epoch: 0 [ 37765120/128008192 (30%)] Data (t): 0.375 Batch (t): 5.646, 2932.80/s, 183.300/s/gpu LR: 0.000992 Logit Scale: 52.112 Contrastive_loss: 1.4175 (4.0727) Loss: 1.4175 (4.0727)
+ 2025-05-06,15:21:40 | WARNING | Handling webdataset error (OSError('image file is truncated (59 bytes not processed)')). Ignoring.
+ 2025-05-06,15:31:28 | WARNING | Handling webdataset error (OSError('image file is truncated (84 bytes not processed)')). Ignoring.
+ 2025-05-06,15:33:16 | INFO | Train Epoch: 0 [ 39862272/128008192 (31%)] Data (t): 0.385 Batch (t): 5.644, 2959.70/s, 184.981/s/gpu LR: 0.000990 Logit Scale: 53.444 Contrastive_loss: 1.4557 (3.9419) Loss: 1.4557 (3.9419)
+ 2025-05-06,15:45:20 | INFO | Train Epoch: 0 [ 41959424/128008192 (33%)] Data (t): 0.384 Batch (t): 5.650, 2867.22/s, 179.201/s/gpu LR: 0.000989 Logit Scale: 52.602 Contrastive_loss: 1.4758 (3.8244) Loss: 1.4758 (3.8244)
+ 2025-05-06,15:57:28 | INFO | Train Epoch: 0 [ 44056576/128008192 (34%)] Data (t): 0.365 Batch (t): 5.688, 2888.61/s, 180.538/s/gpu LR: 0.000988 Logit Scale: 54.760 Contrastive_loss: 1.4764 (3.7177) Loss: 1.4764 (3.7177)
+ 2025-05-06,16:00:16 | WARNING | Handling webdataset error (OSError('image file is truncated (49 bytes not processed)')). Ignoring.
+ 2025-05-06,16:09:29 | INFO | Train Epoch: 0 [ 46153728/128008192 (36%)] Data (t): 0.347 Batch (t): 5.632, 2665.37/s, 166.586/s/gpu LR: 0.000986 Logit Scale: 55.988 Contrastive_loss: 1.4509 (3.6191) Loss: 1.4509 (3.6191)
+ 2025-05-06,16:21:29 | INFO | Train Epoch: 0 [ 48250880/128008192 (38%)] Data (t): 0.360 Batch (t): 5.626, 2882.46/s, 180.154/s/gpu LR: 0.000984 Logit Scale: 57.065 Contrastive_loss: 1.1703 (3.5171) Loss: 1.1703 (3.5171)
+ 2025-05-06,16:33:27 | INFO | Train Epoch: 0 [ 50348032/128008192 (39%)] Data (t): 0.349 Batch (t): 5.610, 2822.03/s, 176.377/s/gpu LR: 0.000983 Logit Scale: 57.927 Contrastive_loss: 1.3119 (3.4289) Loss: 1.3119 (3.4289)
+ 2025-05-06,16:45:32 | INFO | Train Epoch: 0 [ 52445184/128008192 (41%)] Data (t): 0.367 Batch (t): 5.663, 2904.27/s, 181.517/s/gpu LR: 0.000981 Logit Scale: 58.756 Contrastive_loss: 1.1724 (3.3421) Loss: 1.1724 (3.3421)
+ 2025-05-06,16:57:37 | INFO | Train Epoch: 0 [ 54542336/128008192 (43%)] Data (t): 0.367 Batch (t): 5.664, 2888.04/s, 180.503/s/gpu LR: 0.000979 Logit Scale: 59.539 Contrastive_loss: 1.2115 (3.2632) Loss: 1.2115 (3.2632)
+ 2025-05-06,17:09:40 | INFO | Train Epoch: 0 [ 56639488/128008192 (44%)] Data (t): 0.368 Batch (t): 5.652, 2952.70/s, 184.544/s/gpu LR: 0.000977 Logit Scale: 60.231 Contrastive_loss: 1.3236 (3.1939) Loss: 1.3236 (3.1939)
+ 2025-05-06,17:21:47 | INFO | Train Epoch: 0 [ 58736640/128008192 (46%)] Data (t): 0.377 Batch (t): 5.675, 2936.91/s, 183.557/s/gpu LR: 0.000975 Logit Scale: 60.682 Contrastive_loss: 1.3034 (3.1287) Loss: 1.3034 (3.1287)
+ 2025-05-06,17:22:37 | WARNING | Handling webdataset error (OSError('image file is truncated (82 bytes not processed)')). Ignoring.
+ 2025-05-06,17:22:57 | WARNING | Handling webdataset error (OSError('image file is truncated (107 bytes not processed)')). Ignoring.
+ 2025-05-06,17:23:41 | WARNING | Handling webdataset error (OSError('image file is truncated (73 bytes not processed)')). Ignoring.
+ 2025-05-06,17:33:51 | INFO | Train Epoch: 0 [ 60833792/128008192 (48%)] Data (t): 0.383 Batch (t): 5.657, 2918.07/s, 182.380/s/gpu LR: 0.000973 Logit Scale: 61.261 Contrastive_loss: 1.0694 (3.0601) Loss: 1.0694 (3.0601)
+ 2025-05-06,17:45:59 | INFO | Train Epoch: 0 [ 62930944/128008192 (49%)] Data (t): 0.377 Batch (t): 5.689, 2931.90/s, 183.244/s/gpu LR: 0.000971 Logit Scale: 61.991 Contrastive_loss: 1.3213 (3.0040) Loss: 1.3213 (3.0040)
+ 2025-05-06,17:46:49 | WARNING | Handling webdataset error (OSError('image file is truncated (2 bytes not processed)')). Ignoring.
+ 2025-05-06,17:58:00 | INFO | Train Epoch: 0 [ 65028096/128008192 (51%)] Data (t): 0.373 Batch (t): 5.635, 2866.52/s, 179.158/s/gpu LR: 0.000969 Logit Scale: 62.436 Contrastive_loss: 1.2314 (2.9486) Loss: 1.2314 (2.9486)
+ 2025-05-06,17:58:40 | WARNING | Handling webdataset error (OSError('image file is truncated (54 bytes not processed)')). Ignoring.
+ 2025-05-06,18:10:03 | INFO | Train Epoch: 0 [ 67125248/128008192 (52%)] Data (t): 0.359 Batch (t): 5.651, 2656.77/s, 166.048/s/gpu LR: 0.000967 Logit Scale: 62.934 Contrastive_loss: 1.0955 (2.8925) Loss: 1.0955 (2.8925)
+ 2025-05-06,18:18:19 | WARNING | Handling webdataset error (OSError('image file is truncated (46 bytes not processed)')). Ignoring.
+ 2025-05-06,18:20:36 | WARNING | Handling webdataset error (OSError('image file is truncated (87 bytes not processed)')). Ignoring.
+ 2025-05-06,18:22:09 | INFO | Train Epoch: 0 [ 69222400/128008192 (54%)] Data (t): 0.374 Batch (t): 5.667, 2878.64/s, 179.915/s/gpu LR: 0.000964 Logit Scale: 63.442 Contrastive_loss: 1.2061 (2.8429) Loss: 1.2061 (2.8429)
+ 2025-05-06,18:26:09 | WARNING | Handling webdataset error (OSError('image file is truncated (31 bytes not processed)')). Ignoring.
+ 2025-05-06,18:34:18 | INFO | Train Epoch: 0 [ 71319552/128008192 (56%)] Data (t): 0.371 Batch (t): 5.698, 2899.64/s, 181.228/s/gpu LR: 0.000962 Logit Scale: 63.721 Contrastive_loss: 1.1519 (2.7945) Loss: 1.1519 (2.7945)
+ 2025-05-06,18:46:26 | INFO | Train Epoch: 0 [ 73416704/128008192 (57%)] Data (t): 0.376 Batch (t): 5.683, 2701.04/s, 168.815/s/gpu LR: 0.000959 Logit Scale: 64.158 Contrastive_loss: 1.0109 (2.7450) Loss: 1.0109 (2.7450)
+ 2025-05-06,18:58:31 | INFO | Train Epoch: 0 [ 75513856/128008192 (59%)] Data (t): 0.376 Batch (t): 5.665, 2944.70/s, 184.044/s/gpu LR: 0.000957 Logit Scale: 64.621 Contrastive_loss: 1.0880 (2.7002) Loss: 1.0880 (2.7002)
+ 2025-05-06,19:05:48 | WARNING | Handling webdataset error (OSError('image file is truncated (64 bytes not processed)')). Ignoring.
+ 2025-05-06,19:10:33 | INFO | Train Epoch: 0 [ 77611008/128008192 (61%)] Data (t): 0.379 Batch (t): 5.641, 2918.67/s, 182.417/s/gpu LR: 0.000954 Logit Scale: 64.866 Contrastive_loss: 1.1069 (2.6583) Loss: 1.1069 (2.6583)
+ 2025-05-06,19:22:41 | INFO | Train Epoch: 0 [ 79708160/128008192 (62%)] Data (t): 0.376 Batch (t): 5.689, 2895.54/s, 180.971/s/gpu LR: 0.000951 Logit Scale: 65.513 Contrastive_loss: 1.1329 (2.6192) Loss: 1.1329 (2.6192)
+ 2025-05-06,19:34:46 | INFO | Train Epoch: 0 [ 81805312/128008192 (64%)] Data (t): 0.374 Batch (t): 5.666, 2891.25/s, 180.703/s/gpu LR: 0.000948 Logit Scale: 65.762 Contrastive_loss: 1.1112 (2.5815) Loss: 1.1112 (2.5815)
+ 2025-05-06,19:46:53 | INFO | Train Epoch: 0 [ 83902464/128008192 (66%)] Data (t): 0.368 Batch (t): 5.675, 2754.43/s, 172.152/s/gpu LR: 0.000945 Logit Scale: 66.061 Contrastive_loss: 1.1860 (2.5474) Loss: 1.1860 (2.5474)
+ 2025-05-06,19:59:00 | INFO | Train Epoch: 0 [ 85999616/128008192 (67%)] Data (t): 0.374 Batch (t): 5.684, 2857.25/s, 178.578/s/gpu LR: 0.000942 Logit Scale: 66.419 Contrastive_loss: 1.1136 (2.5133) Loss: 1.1136 (2.5133)
+ 2025-05-06,20:01:42 | WARNING | Handling webdataset error (OSError('image file is truncated (54 bytes not processed)')). Ignoring.
+ 2025-05-06,20:02:42 | WARNING | Handling webdataset error (OSError('image file is truncated (72 bytes not processed)')). Ignoring.
+ 2025-05-06,20:11:11 | INFO | Train Epoch: 0 [ 88096768/128008192 (69%)] Data (t): 0.376 Batch (t): 5.707, 2902.52/s, 181.407/s/gpu LR: 0.000939 Logit Scale: 66.636 Contrastive_loss: 1.1058 (2.4806) Loss: 1.1058 (2.4806)
+ 2025-05-06,20:23:16 | INFO | Train Epoch: 0 [ 90193920/128008192 (70%)] Data (t): 0.379 Batch (t): 5.668, 2926.06/s, 182.879/s/gpu LR: 0.000936 Logit Scale: 66.949 Contrastive_loss: 1.2265 (2.4521) Loss: 1.2265 (2.4521)
+ 2025-05-06,20:35:22 | INFO | Train Epoch: 0 [ 92291072/128008192 (72%)] Data (t): 0.375 Batch (t): 5.668, 2909.08/s, 181.818/s/gpu LR: 0.000933 Logit Scale: 67.200 Contrastive_loss: 1.0564 (2.4210) Loss: 1.0564 (2.4210)
+ 2025-05-06,20:47:34 | INFO | Train Epoch: 0 [ 94388224/128008192 (74%)] Data (t): 0.375 Batch (t): 5.719, 2868.77/s, 179.298/s/gpu LR: 0.000930 Logit Scale: 67.186 Contrastive_loss: 0.98671 (2.3899) Loss: 0.98671 (2.3899)
+ 2025-05-06,20:59:38 | INFO | Train Epoch: 0 [ 96485376/128008192 (75%)] Data (t): 0.375 Batch (t): 5.662, 2927.47/s, 182.967/s/gpu LR: 0.000926 Logit Scale: 67.691 Contrastive_loss: 1.0063 (2.3604) Loss: 1.0063 (2.3604)
+ 2025-05-06,21:11:46 | INFO | Train Epoch: 0 [ 98582528/128008192 (77%)] Data (t): 0.362 Batch (t): 5.683, 2949.03/s, 184.315/s/gpu LR: 0.000923 Logit Scale: 68.002 Contrastive_loss: 0.99076 (2.3319) Loss: 0.99076 (2.3319)
+ 2025-05-06,21:23:58 | INFO | Train Epoch: 0 [100679680/128008192 (79%)] Data (t): 0.421 Batch (t): 5.720, 2840.36/s, 177.523/s/gpu LR: 0.000919 Logit Scale: 68.259 Contrastive_loss: 1.0663 (2.3061) Loss: 1.0663 (2.3061)
+ 2025-05-06,21:36:05 | INFO | Train Epoch: 0 [102776832/128008192 (80%)] Data (t): 0.356 Batch (t): 5.678, 2854.76/s, 178.422/s/gpu LR: 0.000916 Logit Scale: 68.413 Contrastive_loss: 1.0643 (2.2812) Loss: 1.0643 (2.2812)
+ 2025-05-06,21:42:46 | WARNING | Handling webdataset error (OSError('image file is truncated (92 bytes not processed)')). Ignoring.
+ 2025-05-06,21:48:18 | INFO | Train Epoch: 0 [104873984/128008192 (82%)] Data (t): 0.371 Batch (t): 5.733, 2928.18/s, 183.012/s/gpu LR: 0.000912 Logit Scale: 68.810 Contrastive_loss: 1.1003 (2.2581) Loss: 1.1003 (2.2581)
+ 2025-05-06,22:00:19 | INFO | Train Epoch: 0 [106971136/128008192 (84%)] Data (t): 0.369 Batch (t): 5.630, 2740.06/s, 171.253/s/gpu LR: 0.000908 Logit Scale: 69.055 Contrastive_loss: 0.89877 (2.2319) Loss: 0.89877 (2.2319)
+ 2025-05-06,22:12:22 | INFO | Train Epoch: 0 [109068288/128008192 (85%)] Data (t): 0.374 Batch (t): 5.649, 2912.05/s, 182.003/s/gpu LR: 0.000904 Logit Scale: 69.181 Contrastive_loss: 0.94747 (2.2077) Loss: 0.94747 (2.2077)
+ 2025-05-06,22:14:28 | WARNING | Handling webdataset error (OSError('image file is truncated (59 bytes not processed)')). Ignoring.
+ 2025-05-06,22:24:23 | INFO | Train Epoch: 0 [111165440/128008192 (87%)] Data (t): 0.378 Batch (t): 5.631, 2927.67/s, 182.979/s/gpu LR: 0.000900 Logit Scale: 69.407 Contrastive_loss: 0.95738 (2.1845) Loss: 0.95738 (2.1845)
+ 2025-05-06,22:36:30 | INFO | Train Epoch: 0 [113262592/128008192 (88%)] Data (t): 0.376 Batch (t): 5.681, 2902.43/s, 181.402/s/gpu LR: 0.000897 Logit Scale: 69.640 Contrastive_loss: 0.79848 (2.1593) Loss: 0.79848 (2.1593)
+ 2025-05-06,22:45:51 | WARNING | Handling webdataset error (OSError('image file is truncated (101 bytes not processed)')). Ignoring.
+ 2025-05-06,22:48:35 | INFO | Train Epoch: 0 [115359744/128008192 (90%)] Data (t): 0.380 Batch (t): 5.662, 2887.37/s, 180.461/s/gpu LR: 0.000892 Logit Scale: 69.789 Contrastive_loss: 0.95363 (2.1378) Loss: 0.95363 (2.1378)
+ 2025-05-06,23:00:39 | INFO | Train Epoch: 0 [117456896/128008192 (92%)] Data (t): 0.374 Batch (t): 5.655, 2810.49/s, 175.656/s/gpu LR: 0.000888 Logit Scale: 70.152 Contrastive_loss: 0.79762 (2.1143) Loss: 0.79762 (2.1143)
+ 2025-05-06,23:12:44 | INFO | Train Epoch: 0 [119554048/128008192 (93%)] Data (t): 0.381 Batch (t): 5.668, 2693.00/s, 168.313/s/gpu LR: 0.000884 Logit Scale: 70.173 Contrastive_loss: 1.1472 (2.0976) Loss: 1.1472 (2.0976)
+ 2025-05-06,23:17:09 | WARNING | Handling webdataset error (OSError('image file is truncated (82 bytes not processed)')). Ignoring.
+ 2025-05-06,23:22:43 | WARNING | Handling webdataset error (OSError('image file is truncated (37 bytes not processed)')). Ignoring.
+ 2025-05-06,23:24:58 | INFO | Train Epoch: 0 [121651200/128008192 (95%)] Data (t): 0.377 Batch (t): 5.732, 2903.65/s, 181.478/s/gpu LR: 0.000880 Logit Scale: 70.383 Contrastive_loss: 1.0090 (2.0792) Loss: 1.0090 (2.0792)
+ 2025-05-06,23:27:31 | WARNING | Handling webdataset error (OSError('image file is truncated (4 bytes not processed)')). Ignoring.
+ 2025-05-06,23:32:47 | WARNING | Handling webdataset error (OSError('image file is truncated (88 bytes not processed)')). Ignoring.
+ 2025-05-06,23:37:00 | INFO | Train Epoch: 0 [123748352/128008192 (97%)] Data (t): 0.367 Batch (t): 5.645, 2879.15/s, 179.947/s/gpu LR: 0.000876 Logit Scale: 70.577 Contrastive_loss: 1.0011 (2.0612) Loss: 1.0011 (2.0612)
+ 2025-05-06,23:49:06 | INFO | Train Epoch: 0 [125845504/128008192 (98%)] Data (t): 0.378 Batch (t): 5.672, 2903.75/s, 181.485/s/gpu LR: 0.000871 Logit Scale: 70.949 Contrastive_loss: 1.0423 (2.0445) Loss: 1.0423 (2.0445)
+ 2025-05-07,00:01:17 | INFO | Train Epoch: 0 [127942656/128008192 (100%)] Data (t): 0.373 Batch (t): 5.705, 2820.89/s, 176.305/s/gpu LR: 0.000867 Logit Scale: 71.084 Contrastive_loss: 1.0349 (2.0282) Loss: 1.0349 (2.0282)
+ 2025-05-07,00:01:39 | INFO | Train Epoch: 0 [128008192/128008192 (100%)] Data (t): 0.374 Batch (t): 5.618, 2949.87/s, 184.367/s/gpu LR: 0.000867 Logit Scale: 71.106 Contrastive_loss: 0.92982 (2.0108) Loss: 0.92982 (2.0108)
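
The `Train Epoch` lines above share a fixed layout, so they can be turned into records for plotting or comparison. The sketch below is one way to do that with a regular expression; `TRAIN_RE` and `parse_train_line` are hypothetical names, and the pattern is written against the lines shown in this log only, so it may need adjusting for other logging formats.

```python
import re

# Matches the progress, timing, LR, logit-scale and loss fields of a line.
TRAIN_RE = re.compile(
    r"Train Epoch: (?P<epoch>\d+) \[\s*(?P<seen>\d+)/(?P<total>\d+) \(\d+%\)\]"
    r".*?Batch \(t\): (?P<batch_t>[\d.]+), (?P<rate>[\d.]+)/s"
    r".*?LR: (?P<lr>[\d.]+)"
    r".*?Logit Scale: (?P<logit_scale>[\d.]+)"
    r".*?Contrastive_loss: (?P<loss>[\d.]+) \((?P<loss_avg>[\d.]+)\)"
)


def parse_train_line(line: str):
    """Return a dict of numeric fields, or None for non-training lines."""
    match = TRAIN_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    int_keys = {"epoch", "seen", "total"}
    return {k: int(v) if k in int_keys else float(v) for k, v in fields.items()}


sample = (
    "2025-05-06,11:56:12 | INFO | Train Epoch: 0 [ 2113536/128008192 (2%)] "
    "Data (t): 0.411 Batch (t): 5.698, 2901.41/s, 181.338/s/gpu "
    "LR: 0.000258 Logit Scale: 14.332 "
    "Contrastive_loss: 7.4634 (8.6325) Loss: 7.4634 (8.6325)"
)
record = parse_train_line(sample)
print(record["seen"], record["lr"], record["loss"])
```

Warning lines and other messages do not match the pattern and come back as `None`, so they can be filtered out when reading the whole file.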
+ 2025-05-07,00:01:45 | INFO | Start epoch 1
+ 2025-05-07,00:01:57 | INFO | Train Epoch: 1 [ 16384/128008192 (0%)] Data (t): 7.381 Batch (t): 11.754, 1393.94/s, 87.1215/s/gpu LR: 0.000867 Logit Scale: 71.108 Contrastive_loss: 0.97268 (0.97268) Loss: 0.97268 (0.97268)
+ 2025-05-07,00:13:59 | INFO | Train Epoch: 1 [ 2113536/128008192 (2%)] Data (t): 0.625 Batch (t): 5.641, 2918.10/s, 182.381/s/gpu LR: 0.000862 Logit Scale: 71.298 Contrastive_loss: 1.0932 (1.0329) Loss: 1.0932 (1.0329)
+ 2025-05-07,00:17:43 | WARNING | Handling webdataset error (OSError('image file is truncated (50 bytes not processed)')). Ignoring.
+ 2025-05-07,00:17:43 | WARNING | Handling webdataset error (OSError('image file is truncated (53 bytes not processed)')). Ignoring.
+ 2025-05-07,00:26:05 | INFO | Train Epoch: 1 [ 4210688/128008192 (3%)] Data (t): 0.380 Batch (t): 5.677, 2935.95/s, 183.497/s/gpu LR: 0.000858 Logit Scale: 71.571 Contrastive_loss: 1.1247 (1.0635) Loss: 1.1247 (1.0635)
+ 2025-05-07,00:38:05 | INFO | Train Epoch: 1 [ 6307840/128008192 (5%)] Data (t): 0.361 Batch (t): 5.619, 2993.96/s, 187.123/s/gpu LR: 0.000853 Logit Scale: 71.592 Contrastive_loss: 1.0843 (1.0687) Loss: 1.0843 (1.0687)
+ 2025-05-07,00:50:04 | INFO | Train Epoch: 1 [ 8404992/128008192 (7%)] Data (t): 0.357 Batch (t): 5.621, 2891.63/s, 180.727/s/gpu LR: 0.000849 Logit Scale: 71.801 Contrastive_loss: 1.0255 (1.0601) Loss: 1.0255 (1.0601)
+ 2025-05-07,01:02:16 | INFO | Train Epoch: 1 [ 10502144/128008192 (8%)] Data (t): 0.373 Batch (t): 5.717, 2967.44/s, 185.465/s/gpu LR: 0.000844 Logit Scale: 71.973 Contrastive_loss: 0.90339 (1.0340) Loss: 0.90339 (1.0340)
+ 2025-05-07,01:14:24 | INFO | Train Epoch: 1 [ 12599296/128008192 (10%)] Data (t): 0.374 Batch (t): 5.686, 2914.79/s, 182.175/s/gpu LR: 0.000839 Logit Scale: 72.053 Contrastive_loss: 0.80941 (1.0019) Loss: 0.80941 (1.0019)
+ 2025-05-07,01:26:30 | INFO | Train Epoch: 1 [ 14696448/128008192 (11%)] Data (t): 0.379 Batch (t): 5.671, 2852.55/s, 178.285/s/gpu LR: 0.000834 Logit Scale: 72.197 Contrastive_loss: 0.78505 (0.97477) Loss: 0.78505 (0.97477)
+ 2025-05-07,01:38:32 | INFO | Train Epoch: 1 [ 16793600/128008192 (13%)] Data (t): 0.383 Batch (t): 5.646, 2959.60/s, 184.975/s/gpu LR: 0.000829 Logit Scale: 72.364 Contrastive_loss: 0.92437 (0.96917) Loss: 0.92437 (0.96917)
+ 2025-05-07,01:50:37 | INFO | Train Epoch: 1 [ 18890752/128008192 (15%)] Data (t): 0.378 Batch (t): 5.663, 2897.66/s, 181.104/s/gpu LR: 0.000824 Logit Scale: 72.524 Contrastive_loss: 0.81629 (0.95388) Loss: 0.81629 (0.95388)
+ 2025-05-07,01:53:30 | WARNING | Handling webdataset error (OSError('image file is truncated (6 bytes not processed)')). Ignoring.
+ 2025-05-07,02:02:38 | INFO | Train Epoch: 1 [ 20987904/128008192 (16%)] Data (t): 0.369 Batch (t): 5.634, 2953.85/s, 184.615/s/gpu LR: 0.000819 Logit Scale: 72.855 Contrastive_loss: 0.87990 (0.94716) Loss: 0.87990 (0.94716)
+ 2025-05-07,02:04:28 | WARNING | Handling webdataset error (OSError('image file is truncated (32 bytes not processed)')). Ignoring.
+ 2025-05-07,02:05:10 | WARNING | Handling webdataset error (OSError('image file is truncated (66 bytes not processed)')). Ignoring.
+ 2025-05-07,02:14:39 | INFO | Train Epoch: 1 [ 23085056/128008192 (18%)] Data (t): 0.377 Batch (t): 5.632, 2921.82/s, 182.614/s/gpu LR: 0.000814 Logit Scale: 72.876 Contrastive_loss: 0.89661 (0.94294) Loss: 0.89661 (0.94294)
+ 2025-05-07,02:25:17 | WARNING | Handling webdataset error (OSError('image file is truncated (4 bytes not processed)')). Ignoring.
+ 2025-05-07,02:26:41 | INFO | Train Epoch: 1 [ 25182208/128008192 (20%)] Data (t): 0.378 Batch (t): 5.641, 2909.34/s, 181.833/s/gpu LR: 0.000809 Logit Scale: 72.971 Contrastive_loss: 0.94616 (0.94319) Loss: 0.94616 (0.94319)
+ 2025-05-07,02:34:49 | WARNING | Handling webdataset error (OSError('image file is truncated (5 bytes not processed)')). Ignoring.
+ 2025-05-07,02:38:44 | INFO | Train Epoch: 1 [ 27279360/128008192 (21%)] Data (t): 0.379 Batch (t): 5.648, 2963.67/s, 185.229/s/gpu LR: 0.000804 Logit Scale: 73.136 Contrastive_loss: 0.79497 (0.93260) Loss: 0.79497 (0.93260)
+ 2025-05-07,02:39:02 | WARNING | Handling webdataset error (OSError('image file is truncated (5 bytes not processed)')). Ignoring.
+ 2025-05-07,02:42:47 | WARNING | Handling webdataset error (OSError('image file is truncated (131 bytes not processed)')). Ignoring.
+ 2025-05-07,02:50:41 | INFO | Train Epoch: 1 [ 29376512/128008192 (23%)] Data (t): 0.377 Batch (t): 5.602, 2958.21/s, 184.888/s/gpu LR: 0.000799 Logit Scale: 73.301 Contrastive_loss: 0.84971 (0.92708) Loss: 0.84971 (0.92708)
+ 2025-05-07,02:51:35 | WARNING | Handling webdataset error (OSError('image file is truncated (15 bytes not processed)')). Ignoring.
+ 2025-05-07,03:02:48 | INFO | Train Epoch: 1 [ 31473664/128008192 (25%)] Data (t): 0.377 Batch (t): 5.677, 2929.61/s, 183.100/s/gpu LR: 0.000794 Logit Scale: 73.498 Contrastive_loss: 1.0497 (0.93474) Loss: 1.0497 (0.93474)
+ 2025-05-07,03:14:51 | INFO | Train Epoch: 1 [ 33570816/128008192 (26%)] Data (t): 0.381 Batch (t): 5.648, 2913.65/s, 182.103/s/gpu LR: 0.000788 Logit Scale: 73.722 Contrastive_loss: 0.88784 (0.93198) Loss: 0.88784 (0.93198)
+ 2025-05-07,03:20:17 | WARNING | Handling webdataset error (OSError('image file is truncated (24 bytes not processed)')). Ignoring.
+ 2025-05-07,03:26:50 | INFO | Train Epoch: 1 [ 35667968/128008192 (28%)] Data (t): 0.359 Batch (t): 5.618, 2926.07/s, 182.879/s/gpu LR: 0.000783 Logit Scale: 73.748 Contrastive_loss: 0.91809 (0.93121) Loss: 0.91809 (0.93121)
+ 2025-05-07,03:36:11 | WARNING | Handling webdataset error (OSError('image file is truncated (186 bytes not processed)')). Ignoring.
+ 2025-05-07,03:38:54 | INFO | Train Epoch: 1 [ 37765120/128008192 (30%)] Data (t): 0.379 Batch (t): 5.653, 2937.63/s, 183.602/s/gpu LR: 0.000777 Logit Scale: 73.913 Contrastive_loss: 0.83517 (0.92616) Loss: 0.83517 (0.92616)
+ 2025-05-07,03:50:55 | INFO | Train Epoch: 1 [ 39862272/128008192 (31%)] Data (t): 0.382 Batch (t): 5.639, 2965.66/s, 185.354/s/gpu LR: 0.000772 Logit Scale: 73.996 Contrastive_loss: 0.83897 (0.92180) Loss: 0.83897 (0.92180)
+ 2025-05-07,04:03:00 | INFO | Train Epoch: 1 [ 41959424/128008192 (33%)] Data (t): 0.381 Batch (t): 5.660, 2913.28/s, 182.080/s/gpu LR: 0.000767 Logit Scale: 74.102 Contrastive_loss: 0.93905 (0.92262) Loss: 0.93905 (0.92262)
+ 2025-05-07,04:15:04 | INFO | Train Epoch: 1 [ 44056576/128008192 (34%)] Data (t): 0.383 Batch (t): 5.656, 2890.18/s, 180.636/s/gpu LR: 0.000761 Logit Scale: 74.352 Contrastive_loss: 0.79451 (0.91679) Loss: 0.79451 (0.91679)
+ 2025-05-07,04:27:06 | INFO | Train Epoch: 1 [ 46153728/128008192 (36%)] Data (t): 0.383 Batch (t): 5.640, 2911.09/s, 181.943/s/gpu LR: 0.000755 Logit Scale: 74.480 Contrastive_loss: 0.97329 (0.91925) Loss: 0.97329 (0.91925)
+ 2025-05-07,04:39:08 | INFO | Train Epoch: 1 [ 48250880/128008192 (38%)] Data (t): 0.386 Batch (t): 5.643, 2863.05/s, 178.941/s/gpu LR: 0.000750 Logit Scale: 74.752 Contrastive_loss: 0.90988 (0.91886) Loss: 0.90988 (0.91886)
+ 2025-05-07,04:51:08 | INFO | Train Epoch: 1 [ 50348032/128008192 (39%)] Data (t): 0.377 Batch (t): 5.623, 2927.28/s, 182.955/s/gpu LR: 0.000744 Logit Scale: 74.782 Contrastive_loss: 0.79242 (0.91380) Loss: 0.79242 (0.91380)
+ 2025-05-07,04:53:44 | WARNING | Handling webdataset error (OSError('image file is truncated (5 bytes not processed)')). Ignoring.
+ 2025-05-07,04:56:26 | WARNING | Handling webdataset error (OSError('image file is truncated (14 bytes not processed)')). Ignoring.
+ 2025-05-07,05:03:15 | INFO | Train Epoch: 1 [ 52445184/128008192 (41%)] Data (t): 0.375 Batch (t): 5.678, 2914.62/s, 182.164/s/gpu LR: 0.000738 Logit Scale: 75.058 Contrastive_loss: 0.86759 (0.91203) Loss: 0.86759 (0.91203)
+ 2025-05-07,05:15:30 | INFO | Train Epoch: 1 [ 54542336/128008192 (43%)] Data (t): 0.364 Batch (t): 5.747, 2884.33/s, 180.271/s/gpu LR: 0.000733 Logit Scale: 75.126 Contrastive_loss: 0.82566 (0.90883) Loss: 0.82566 (0.90883)
+ 2025-05-07,05:27:33 | INFO | Train Epoch: 1 [ 56639488/128008192 (44%)] Data (t): 0.373 Batch (t): 5.646, 3007.39/s, 187.962/s/gpu LR: 0.000727 Logit Scale: 75.031 Contrastive_loss: 0.82822 (0.90595) Loss: 0.82822 (0.90595)
+ 2025-05-07,05:30:19 | WARNING | Handling webdataset error (OSError('image file is truncated (76 bytes not processed)')). Ignoring.
+ 2025-05-07,05:31:44 | WARNING | Handling webdataset error (OSError('image file is truncated (1 bytes not processed)')). Ignoring.
+ 2025-05-07,05:39:36 | INFO | Train Epoch: 1 [ 58736640/128008192 (46%)] Data (t): 0.375 Batch (t): 5.647, 2849.74/s, 178.109/s/gpu LR: 0.000721 Logit Scale: 75.089 Contrastive_loss: 0.94766 (0.90739) Loss: 0.94766 (0.90739)
+ 2025-05-07,05:41:56 | WARNING | Handling webdataset error (OSError('image file is truncated (7 bytes not processed)')). Ignoring.
+ 2025-05-07,05:51:35 | INFO | Train Epoch: 1 [ 60833792/128008192 (48%)] Data (t): 0.370 Batch (t): 5.620, 2900.55/s, 181.284/s/gpu LR: 0.000715 Logit Scale: 75.396 Contrastive_loss: 0.78444 (0.90329) Loss: 0.78444 (0.90329)
+ 2025-05-07,06:03:43 | INFO | Train Epoch: 1 [ 62930944/128008192 (49%)] Data (t): 0.375 Batch (t): 5.686, 2938.79/s, 183.674/s/gpu LR: 0.000709 Logit Scale: 75.503 Contrastive_loss: 0.71530 (0.89722) Loss: 0.71530 (0.89722)
+ 2025-05-07,06:12:15 | WARNING | Handling webdataset error (OSError('image file is truncated (5 bytes not processed)')). Ignoring.
+ 2025-05-07,06:12:54 | WARNING | Handling webdataset error (OSError('image file is truncated (28 bytes not processed)')). Ignoring.
+ 2025-05-07,06:13:56 | WARNING | Handling webdataset error (OSError('image file is truncated (17 bytes not processed)')). Ignoring.
+ 2025-05-07,06:15:48 | INFO | Train Epoch: 1 [ 65028096/128008192 (51%)] Data (t): 0.374 Batch (t): 5.667, 2744.12/s, 171.508/s/gpu LR: 0.000703 Logit Scale: 75.473 Contrastive_loss: 0.85657 (0.89595) Loss: 0.85657 (0.89595)
+ 2025-05-07,06:21:27 | WARNING | Handling webdataset error (OSError('image file is truncated (12 bytes not processed)')). Ignoring.
+ 2025-05-07,06:25:56 | WARNING | Handling webdataset error (OSError('image file is truncated (28 bytes not processed)')). Ignoring.
+ 2025-05-07,06:27:56 | INFO | Train Epoch: 1 [ 67125248/128008192 (52%)] Data (t): 0.369 Batch (t): 5.690, 2896.23/s, 181.014/s/gpu LR: 0.000697 Logit Scale: 75.730 Contrastive_loss: 0.82678 (0.89386) Loss: 0.82678 (0.89386)
+ 2025-05-07,06:40:04 | INFO | Train Epoch: 1 [ 69222400/128008192 (54%)] Data (t): 0.393 Batch (t): 5.687, 2894.53/s, 180.908/s/gpu LR: 0.000691 Logit Scale: 75.804 Contrastive_loss: 1.0394 (0.89814) Loss: 1.0394 (0.89814)
+ 2025-05-07,06:52:20 | INFO | Train Epoch: 1 [ 71319552/128008192 (56%)] Data (t): 0.623 Batch (t): 5.743, 2919.63/s, 182.477/s/gpu LR: 0.000685 Logit Scale: 75.957 Contrastive_loss: 0.82509 (0.89605) Loss: 0.82509 (0.89605)
+ 2025-05-07,07:04:18 | INFO | Train Epoch: 1 [ 73416704/128008192 (57%)] Data (t): 0.382 Batch (t): 5.611, 2924.02/s, 182.751/s/gpu LR: 0.000679 Logit Scale: 76.294 Contrastive_loss: 0.86247 (0.89512) Loss: 0.86247 (0.89512)
+ 2025-05-07,07:12:02 | WARNING | Handling webdataset error (OSError('image file is truncated (151 bytes not processed)')). Ignoring.
+ 2025-05-07,07:16:22 | INFO | Train Epoch: 1 [ 75513856/128008192 (59%)] Data (t): 0.378 Batch (t): 5.660, 2843.96/s, 177.747/s/gpu LR: 0.000673 Logit Scale: 76.385 Contrastive_loss: 0.85842 (0.89413) Loss: 0.85842 (0.89413)
+ 2025-05-07,07:28:27 | INFO | Train Epoch: 1 [ 77611008/128008192 (61%)] Data (t): 0.377 Batch (t): 5.660, 2936.77/s, 183.548/s/gpu LR: 0.000667 Logit Scale: 76.578 Contrastive_loss: 0.87413 (0.89360) Loss: 0.87413 (0.89360)
+ 2025-05-07,07:40:29 | INFO | Train Epoch: 1 [ 79708160/128008192 (62%)] Data (t): 0.381 Batch (t): 5.644, 2876.02/s, 179.751/s/gpu LR: 0.000661 Logit Scale: 76.636 Contrastive_loss: 0.82380 (0.89181) Loss: 0.82380 (0.89181)
+ 2025-05-07,07:52:33 | INFO | Train Epoch: 1 [ 81805312/128008192 (64%)] Data (t): 0.386 Batch (t): 5.654, 2981.25/s, 186.328/s/gpu LR: 0.000654 Logit Scale: 76.520 Contrastive_loss: 0.90881 (0.89223) Loss: 0.90881 (0.89223)
+ 2025-05-07,08:04:39 | INFO | Train Epoch: 1 [ 83902464/128008192 (66%)] Data (t): 0.377 Batch (t): 5.669, 2686.70/s, 167.919/s/gpu LR: 0.000648 Logit Scale: 76.783 Contrastive_loss: 0.87141 (0.89173) Loss: 0.87141 (0.89173)
+ 2025-05-07,08:16:43 | INFO | Train Epoch: 1 [ 85999616/128008192 (67%)] Data (t): 0.377 Batch (t): 5.660, 2888.76/s, 180.547/s/gpu LR: 0.000642 Logit Scale: 76.986 Contrastive_loss: 0.87172 (0.89125) Loss: 0.87172 (0.89125)
+ 2025-05-07,08:28:44 | INFO | Train Epoch: 1 [ 88096768/128008192 (69%)] Data (t): 0.381 Batch (t): 5.630, 2893.91/s, 180.870/s/gpu LR: 0.000636 Logit Scale: 77.038 Contrastive_loss: 0.87869 (0.89096) Loss: 0.87869 (0.89096)
+ 2025-05-07,08:40:50 | INFO | Train Epoch: 1 [ 90193920/128008192 (70%)] Data (t): 0.379 Batch (t): 5.674, 2796.83/s, 174.802/s/gpu LR: 0.000629 Logit Scale: 77.018 Contrastive_loss: 0.73682 (0.88745) Loss: 0.73682 (0.88745)
+ 2025-05-07,08:52:51 | INFO | Train Epoch: 1 [ 92291072/128008192 (72%)] Data (t): 0.380 Batch (t): 5.633, 2963.44/s, 185.215/s/gpu LR: 0.000623 Logit Scale: 77.225 Contrastive_loss: 0.88852 (0.88748) Loss: 0.88852 (0.88748)
+ 2025-05-07,08:57:29 | WARNING | Handling webdataset error (OSError('image file is truncated (99 bytes not processed)')). Ignoring.
+ 2025-05-07,09:00:36 | WARNING | Handling webdataset error (OSError('image file is truncated (45 bytes not processed)')). Ignoring.
+ 2025-05-07,09:02:18 | WARNING | Handling webdataset error (OSError('image file is truncated (55 bytes not processed)')). Ignoring.
+ 2025-05-07,09:04:52 | INFO | Train Epoch: 1 [ 94388224/128008192 (74%)] Data (t): 0.379 Batch (t): 5.630, 2894.16/s, 180.885/s/gpu LR: 0.000617 Logit Scale: 77.282 Contrastive_loss: 0.95756 (0.88900) Loss: 0.95756 (0.88900)
+ 2025-05-07,09:16:57 | INFO | Train Epoch: 1 [ 96485376/128008192 (75%)] Data (t): 0.381 Batch (t): 5.667, 2920.07/s, 182.504/s/gpu LR: 0.000610 Logit Scale: 77.415 Contrastive_loss: 0.84778 (0.88812) Loss: 0.84778 (0.88812)
+ 2025-05-07,09:17:41 | WARNING | Handling webdataset error (OSError('image file is truncated (33 bytes not processed)')). Ignoring.
+ 2025-05-07,09:28:59 | INFO | Train Epoch: 1 [ 98582528/128008192 (77%)] Data (t): 0.386 Batch (t): 5.639, 2866.54/s, 179.159/s/gpu LR: 0.000604 Logit Scale: 77.405 Contrastive_loss: 0.85286 (0.88739) Loss: 0.85286 (0.88739)
+ 2025-05-07,09:41:00 | INFO | Train Epoch: 1 [100679680/128008192 (79%)] Data (t): 0.377 Batch (t): 5.632, 2918.69/s, 182.418/s/gpu LR: 0.000597 Logit Scale: 77.667 Contrastive_loss: 0.91629 (0.88798) Loss: 0.91629 (0.88798)
+ 2025-05-07,09:44:16 | WARNING | Handling webdataset error (OSError('image file is truncated (0 bytes not processed)')). Ignoring.
+ 2025-05-07,09:49:00 | WARNING | Handling webdataset error (OSError('image file is truncated (108 bytes not processed)')). Ignoring.
+ 2025-05-07,09:53:04 | INFO | Train Epoch: 1 [102776832/128008192 (80%)] Data (t): 0.392 Batch (t): 5.660, 2870.27/s, 179.392/s/gpu LR: 0.000591 Logit Scale: 77.802 Contrastive_loss: 0.90265 (0.88827) Loss: 0.90265 (0.88827)
+ 2025-05-07,09:55:15 | WARNING | Handling webdataset error (OSError('image file is truncated (85 bytes not processed)')). Ignoring.
+ 2025-05-07,10:01:51 | WARNING | Handling webdataset error (OSError('image file is truncated (25 bytes not processed)')). Ignoring.
+ 2025-05-07,10:05:07 | INFO | Train Epoch: 1 [104873984/128008192 (82%)] Data (t): 0.385 Batch (t): 5.649, 2865.12/s, 179.070/s/gpu LR: 0.000585 Logit Scale: 77.967 Contrastive_loss: 0.87233 (0.88796) Loss: 0.87233 (0.88796)
+ 2025-05-07,10:17:09 | INFO | Train Epoch: 1 [106971136/128008192 (84%)] Data (t): 0.381 Batch (t): 5.641, 2894.51/s, 180.907/s/gpu LR: 0.000578 Logit Scale: 78.056 Contrastive_loss: 0.84737 (0.88718) Loss: 0.84737 (0.88718)
+ 2025-05-07,10:26:47 | WARNING | Handling webdataset error (OSError('image file is truncated (67 bytes not processed)')). Ignoring.
+ 2025-05-07,10:29:11 | INFO | Train Epoch: 1 [109068288/128008192 (85%)] Data (t): 0.376 Batch (t): 5.637, 2888.18/s, 180.512/s/gpu LR: 0.000572 Logit Scale: 78.159 Contrastive_loss: 0.84202 (0.88633) Loss: 0.84202 (0.88633)
+ 2025-05-07,10:41:22 | INFO | Train Epoch: 1 [111165440/128008192 (87%)] Data (t): 0.377 Batch (t): 5.716, 2964.57/s, 185.286/s/gpu LR: 0.000565 Logit Scale: 78.386 Contrastive_loss: 0.78789 (0.88451) Loss: 0.78789 (0.88451)
+ 2025-05-07,10:46:27 | WARNING | Handling webdataset error (OSError('image file is truncated (17 bytes not processed)')). Ignoring.
+ 2025-05-07,10:49:54 | WARNING | Handling webdataset error (OSError('image file is truncated (21 bytes not processed)')). Ignoring.
+ 2025-05-07,10:53:28 | INFO | Train Epoch: 1 [113262592/128008192 (88%)] Data (t): 0.377 Batch (t): 5.668, 2931.98/s, 183.249/s/gpu LR: 0.000559 Logit Scale: 78.396 Contrastive_loss: 0.86814 (0.88421) Loss: 0.86814 (0.88421)
+ 2025-05-07,11:05:35 | INFO | Train Epoch: 1 [115359744/128008192 (90%)] Data (t): 0.379 Batch (t): 5.679, 2940.84/s, 183.803/s/gpu LR: 0.000552 Logit Scale: 78.460 Contrastive_loss: 0.81978 (0.88306) Loss: 0.81978 (0.88306)
+ 2025-05-07,11:14:57 | WARNING | Handling webdataset error (OSError('image file is truncated (88 bytes not processed)')). Ignoring.
+ 2025-05-07,11:17:33 | INFO | Train Epoch: 1 [117456896/128008192 (92%)] Data (t): 0.379 Batch (t): 5.611, 2910.54/s, 181.909/s/gpu LR: 0.000546 Logit Scale: 78.774 Contrastive_loss: 0.89555 (0.88328) Loss: 0.89555 (0.88328)
+ 2025-05-07,11:28:13 | WARNING | Handling webdataset error (OSError('image file is truncated (38 bytes not processed)')). Ignoring.
+ 2025-05-07,11:29:35 | INFO | Train Epoch: 1 [119554048/128008192 (93%)] Data (t): 0.377 Batch (t): 5.641, 2881.38/s, 180.086/s/gpu LR: 0.000539 Logit Scale: 78.817 Contrastive_loss: 0.83547 (0.88245) Loss: 0.83547 (0.88245)
+ 2025-05-07,11:39:22 | WARNING | Handling webdataset error (OSError('image file is truncated (31 bytes not processed)')). Ignoring.
+ 2025-05-07,11:41:49 | INFO | Train Epoch: 1 [121651200/128008192 (95%)] Data (t): 0.378 Batch (t): 5.731, 2958.70/s, 184.919/s/gpu LR: 0.000533 Logit Scale: 78.897 Contrastive_loss: 0.92679 (0.88320) Loss: 0.92679 (0.88320)
+ 2025-05-07,11:47:05 | WARNING | Handling webdataset error (OSError('image file is truncated (26 bytes not processed)')). Ignoring.
+ 2025-05-07,11:47:58 | WARNING | Handling webdataset error (OSError('image file is truncated (89 bytes not processed)')). Ignoring.
+ 2025-05-07,11:53:53 | INFO | Train Epoch: 1 [123748352/128008192 (97%)] Data (t): 0.383 Batch (t): 5.659, 2931.17/s, 183.198/s/gpu LR: 0.000526 Logit Scale: 78.989 Contrastive_loss: 0.73757 (0.88078) Loss: 0.73757 (0.88078)
+ 2025-05-07,12:05:58 | INFO | Train Epoch: 1 [125845504/128008192 (98%)] Data (t): 0.383 Batch (t): 5.664, 2902.91/s, 181.432/s/gpu LR: 0.000520 Logit Scale: 79.202 Contrastive_loss: 0.88730 (0.88088) Loss: 0.88730 (0.88088)
+ 2025-05-07,12:13:31 | WARNING | Handling webdataset error (OSError('image file is truncated (1 bytes not processed)')). Ignoring.
+ 2025-05-07,12:18:03 | INFO | Train Epoch: 1 [127942656/128008192 (100%)] Data (t): 0.382 Batch (t): 5.662, 2858.72/s, 178.670/s/gpu LR: 0.000513 Logit Scale: 79.185 Contrastive_loss: 0.78128 (0.87928) Loss: 0.78128 (0.87928)
+ 2025-05-07,12:18:07 | WARNING | Handling webdataset error (OSError('image file is truncated (46 bytes not processed)')). Ignoring.
+ 2025-05-07,12:18:25 | INFO | Train Epoch: 1 [128008192/128008192 (100%)] Data (t): 0.371 Batch (t): 5.596, 3077.60/s, 192.350/s/gpu LR: 0.000513 Logit Scale: 79.189 Contrastive_loss: 0.73011 (0.87691) Loss: 0.73011 (0.87691)
+ 2025-05-07,12:18:33 | INFO | Start epoch 2
+ 2025-05-07,12:18:45 | INFO | Train Epoch: 2 [ 16384/128008192 (0%)] Data (t): 7.473 Batch (t): 11.865, 1380.92/s, 86.3073/s/gpu LR: 0.000513 Logit Scale: 79.182 Contrastive_loss: 0.76969 (0.76969) Loss: 0.76969 (0.76969)
+ 2025-05-07,12:22:57 | WARNING | Handling webdataset error (OSError('image file is truncated (9 bytes not processed)')). Ignoring.
+ 2025-05-07,12:30:46 | INFO | Train Epoch: 2 [ 2113536/128008192 (2%)] Data (t): 0.388 Batch (t): 5.629, 2958.77/s, 184.923/s/gpu LR: 0.000506 Logit Scale: 79.508 Contrastive_loss: 0.78077 (0.77523) Loss: 0.78077 (0.77523)
+ 2025-05-07,12:37:46 | WARNING | Handling webdataset error (OSError('image file is truncated (101 bytes not processed)')). Ignoring.
+ 2025-05-07,12:42:50 | INFO | Train Epoch: 2 [ 4210688/128008192 (3%)] Data (t): 0.366 Batch (t): 5.662, 2950.73/s, 184.421/s/gpu LR: 0.000500 Logit Scale: 79.629 Contrastive_loss: 0.71149 (0.75398) Loss: 0.71149 (0.75398)
+ 2025-05-07,12:52:49 | WARNING | Handling webdataset error (OSError('image file is truncated (46 bytes not processed)')). Ignoring.
+ 2025-05-07,12:54:52 | INFO | Train Epoch: 2 [ 6307840/128008192 (5%)] Data (t): 0.377 Batch (t): 5.635, 2912.93/s, 182.058/s/gpu LR: 0.000493 Logit Scale: 79.662 Contrastive_loss: 0.79652 (0.76462) Loss: 0.79652 (0.76462)
+ 2025-05-07,13:00:48 | WARNING | Handling webdataset error (OSError('image file is truncated (9 bytes not processed)')). Ignoring.
+ 2025-05-07,13:01:12 | WARNING | Handling webdataset error (OSError('image file is truncated (3 bytes not processed)')). Ignoring.
+ 2025-05-07,13:05:35 | WARNING | Handling webdataset error (OSError('image file is truncated (61 bytes not processed)')). Ignoring.
+ 2025-05-07,13:06:51 | INFO | Train Epoch: 2 [ 8404992/128008192 (7%)] Data (t): 0.369 Batch (t): 5.618, 2894.09/s, 180.881/s/gpu LR: 0.000487 Logit Scale: 79.854 Contrastive_loss: 0.64805 (0.74130) Loss: 0.64805 (0.74130)
+ 2025-05-07,13:18:54 | INFO | Train Epoch: 2 [ 10502144/128008192 (8%)] Data (t): 0.383 Batch (t): 5.652, 2881.56/s, 180.097/s/gpu LR: 0.000480 Logit Scale: 79.917 Contrastive_loss: 0.84810 (0.75910) Loss: 0.84810 (0.75910)
+ 2025-05-07,13:30:58 | INFO | Train Epoch: 2 [ 12599296/128008192 (10%)] Data (t): 0.374 Batch (t): 5.659, 2972.49/s, 185.781/s/gpu LR: 0.000474 Logit Scale: 80.055 Contrastive_loss: 0.72484 (0.75421) Loss: 0.72484 (0.75421)
+ 2025-05-07,13:43:01 | INFO | Train Epoch: 2 [ 14696448/128008192 (11%)] Data (t): 0.391 Batch (t): 5.647, 2729.53/s, 170.596/s/gpu LR: 0.000467 Logit Scale: 80.196 Contrastive_loss: 0.75568 (0.75439) Loss: 0.75568 (0.75439)
+ 2025-05-07,13:55:07 | INFO | Train Epoch: 2 [ 16793600/128008192 (13%)] Data (t): 0.489 Batch (t): 5.669, 2884.53/s, 180.283/s/gpu LR: 0.000461 Logit Scale: 80.217 Contrastive_loss: 0.78141 (0.75739) Loss: 0.78141 (0.75739)
+ 2025-05-07,13:59:10 | WARNING | Handling webdataset error (OSError('broken data stream when reading image file')). Ignoring.
+ 2025-05-07,14:03:09 | WARNING | Handling webdataset error (OSError('image file is truncated (7 bytes not processed)')). Ignoring.
+ 2025-05-07,14:07:13 | INFO | Train Epoch: 2 [ 18890752/128008192 (15%)] Data (t): 0.383 Batch (t): 5.670, 2849.66/s, 178.104/s/gpu LR: 0.000454 Logit Scale: 80.417 Contrastive_loss: 0.74972 (0.75663) Loss: 0.74972 (0.75663)
+ 2025-05-07,14:08:33 | WARNING | Handling webdataset error (OSError('image file is truncated (88 bytes not processed)')). Ignoring.
+ 2025-05-07,14:19:23 | INFO | Train Epoch: 2 [ 20987904/128008192 (16%)] Data (t): 0.387 Batch (t): 5.704, 2916.61/s, 182.288/s/gpu LR: 0.000447 Logit Scale: 80.589 Contrastive_loss: 0.74302 (0.75539) Loss: 0.74302 (0.75539)
+ 2025-05-07,14:24:47 | WARNING | Handling webdataset error (OSError('image file is truncated (47 bytes not processed)')). Ignoring.
+ 2025-05-07,14:31:27 | INFO | Train Epoch: 2 [ 23085056/128008192 (18%)] Data (t): 0.387 Batch (t): 5.654, 2915.99/s, 182.249/s/gpu LR: 0.000441 Logit Scale: 80.690 Contrastive_loss: 0.83508 (0.76203) Loss: 0.83508 (0.76203)
+ 2025-05-07,14:33:56 | WARNING | Handling webdataset error (OSError('image file is truncated (82 bytes not processed)')). Ignoring.
+ 2025-05-07,14:43:27 | INFO | Train Epoch: 2 [ 25182208/128008192 (20%)] Data (t): 0.376 Batch (t): 5.626, 2944.09/s, 184.005/s/gpu LR: 0.000435 Logit Scale: 80.830 Contrastive_loss: 0.71333 (0.75828) Loss: 0.71333 (0.75828)
+ 2025-05-07,14:55:29 | INFO | Train Epoch: 2 [ 27279360/128008192 (21%)] Data (t): 0.376 Batch (t): 5.643, 2880.81/s, 180.050/s/gpu LR: 0.000428 Logit Scale: 81.062 Contrastive_loss: 0.77313 (0.75934) Loss: 0.77313 (0.75934)
+ 2025-05-07,15:07:34 | INFO | Train Epoch: 2 [ 29376512/128008192 (23%)] Data (t): 0.370 Batch (t): 5.660, 2867.09/s, 179.193/s/gpu LR: 0.000422 Logit Scale: 81.180 Contrastive_loss: 0.77774 (0.76057) Loss: 0.77774 (0.76057)
+ 2025-05-07,15:19:35 | INFO | Train Epoch: 2 [ 31473664/128008192 (25%)] Data (t): 0.378 Batch (t): 5.639, 2891.92/s, 180.745/s/gpu LR: 0.000415 Logit Scale: 81.368 Contrastive_loss: 0.78084 (0.76184) Loss: 0.78084 (0.76184)
+ 2025-05-07,15:31:37 | INFO | Train Epoch: 2 [ 33570816/128008192 (26%)] Data (t): 0.378 Batch (t): 5.639, 2855.44/s, 178.465/s/gpu LR: 0.000409 Logit Scale: 81.445 Contrastive_loss: 0.83527 (0.76616) Loss: 0.83527 (0.76616)
+ 2025-05-07,15:43:49 | INFO | Train Epoch: 2 [ 35667968/128008192 (28%)] Data (t): 0.377 Batch (t): 5.719, 2933.12/s, 183.320/s/gpu LR: 0.000402 Logit Scale: 81.514 Contrastive_loss: 0.81256 (0.76874) Loss: 0.81256 (0.76874)
+ 2025-05-07,15:55:54 | INFO | Train Epoch: 2 [ 37765120/128008192 (30%)] Data (t): 0.383 Batch (t): 5.662, 2961.51/s, 185.094/s/gpu LR: 0.000396 Logit Scale: 81.776 Contrastive_loss: 0.72513 (0.76644) Loss: 0.72513 (0.76644)
+ 2025-05-07,16:07:56 | INFO | Train Epoch: 2 [ 39862272/128008192 (31%)] Data (t): 0.381 Batch (t): 5.643, 3024.43/s, 189.027/s/gpu LR: 0.000389 Logit Scale: 81.930 Contrastive_loss: 0.82584 (0.76941) Loss: 0.82584 (0.76941)
+ 2025-05-07,16:11:34 | WARNING | Handling webdataset error (OSError('image file is truncated (28 bytes not processed)')). Ignoring.
+ 2025-05-07,16:12:33 | WARNING | Handling webdataset error (OSError('image file is truncated (76 bytes not processed)')). Ignoring.
+ 2025-05-07,16:13:49 | WARNING | Handling webdataset error (OSError('image file is truncated (230 bytes not processed)')). Ignoring.
+ 2025-05-07,16:19:57 | INFO | Train Epoch: 2 [ 41959424/128008192 (33%)] Data (t): 0.384 Batch (t): 5.627, 2959.35/s, 184.959/s/gpu LR: 0.000383 Logit Scale: 82.064 Contrastive_loss: 0.69728 (0.76598) Loss: 0.69728 (0.76598)
+ 2025-05-07,16:22:36 | WARNING | Handling webdataset error (OSError('image file is truncated (12 bytes not processed)')). Ignoring.
+ 2025-05-07,16:32:02 | INFO | Train Epoch: 2 [ 44056576/128008192 (34%)] Data (t): 0.422 Batch (t): 5.670, 3444.17/s, 215.261/s/gpu LR: 0.000377 Logit Scale: 82.159 Contrastive_loss: 0.71794 (0.76379) Loss: 0.71794 (0.76379)
+ 2025-05-07,16:44:03 | INFO | Train Epoch: 2 [ 46153728/128008192 (36%)] Data (t): 0.382 Batch (t): 5.627, 2848.67/s, 178.042/s/gpu LR: 0.000370 Logit Scale: 82.355 Contrastive_loss: 0.81990 (0.76623) Loss: 0.81990 (0.76623)
+ 2025-05-07,16:45:54 | WARNING | Handling webdataset error (OSError('image file is truncated (19 bytes not processed)')). Ignoring.
+ 2025-05-07,16:48:01 | WARNING | Handling webdataset error (OSError('image file is truncated (7 bytes not processed)')). Ignoring.
+ 2025-05-07,16:56:07 | INFO | Train Epoch: 2 [ 48250880/128008192 (38%)] Data (t): 0.387 Batch (t): 5.657, 2863.84/s, 178.990/s/gpu LR: 0.000364 Logit Scale: 82.513 Contrastive_loss: 0.75284 (0.76567) Loss: 0.75284 (0.76567)
+ 2025-05-07,17:08:18 | INFO | Train Epoch: 2 [ 50348032/128008192 (39%)] Data (t): 0.379 Batch (t): 5.711, 2921.83/s, 182.614/s/gpu LR: 0.000358 Logit Scale: 82.588 Contrastive_loss: 0.91406 (0.77161) Loss: 0.91406 (0.77161)
+ 2025-05-07,17:10:42 | WARNING | Handling webdataset error (OSError('image file is truncated (3 bytes not processed)')). Ignoring.
+ 2025-05-07,17:11:50 | WARNING | Handling webdataset error (OSError('image file is truncated (43 bytes not processed)')). Ignoring.
+ 2025-05-07,17:20:20 | INFO | Train Epoch: 2 [ 52445184/128008192 (41%)] Data (t): 0.373 Batch (t): 5.646, 2943.33/s, 183.958/s/gpu LR: 0.000352 Logit Scale: 82.827 Contrastive_loss: 0.76640 (0.77141) Loss: 0.76640 (0.77141)
+ 2025-05-07,17:32:29 | INFO | Train Epoch: 2 [ 54542336/128008192 (43%)] Data (t): 0.376 Batch (t): 5.690, 2904.36/s, 181.523/s/gpu LR: 0.000345 Logit Scale: 82.998 Contrastive_loss: 0.66885 (0.76761) Loss: 0.66885 (0.76761)
+ 2025-05-07,17:44:28 | INFO | Train Epoch: 2 [ 56639488/128008192 (44%)] Data (t): 0.375 Batch (t): 5.617, 2893.15/s, 180.822/s/gpu LR: 0.000339 Logit Scale: 83.149 Contrastive_loss: 0.67608 (0.76434) Loss: 0.67608 (0.76434)
+ 2025-05-07,17:56:33 | INFO | Train Epoch: 2 [ 58736640/128008192 (46%)] Data (t): 0.382 Batch (t): 5.665, 2912.60/s, 182.038/s/gpu LR: 0.000333 Logit Scale: 83.209 Contrastive_loss: 0.69079 (0.76180) Loss: 0.69079 (0.76180)
+ 2025-05-07,18:08:46 | INFO | Train Epoch: 2 [ 60833792/128008192 (48%)] Data (t): 0.372 Batch (t): 5.726, 2916.15/s, 182.260/s/gpu LR: 0.000327 Logit Scale: 83.490 Contrastive_loss: 0.73611 (0.76095) Loss: 0.73611 (0.76095)
+ 2025-05-07,18:13:51 | WARNING | Handling webdataset error (OSError('image file is truncated (59 bytes not processed)')). Ignoring.
+ 2025-05-07,18:20:48 | INFO | Train Epoch: 2 [ 62930944/128008192 (49%)] Data (t): 0.378 Batch (t): 5.642, 2919.38/s, 182.461/s/gpu LR: 0.000321 Logit Scale: 83.633 Contrastive_loss: 0.81623 (0.76273) Loss: 0.81623 (0.76273)
+ 2025-05-07,18:25:10 | WARNING | Handling webdataset error (OSError('image file is truncated (8 bytes not processed)')). Ignoring.
+ 2025-05-07,18:32:47 | INFO | Train Epoch: 2 [ 65028096/128008192 (51%)] Data (t): 0.378 Batch (t): 5.620, 2949.16/s, 184.322/s/gpu LR: 0.000315 Logit Scale: 83.780 Contrastive_loss: 0.76071 (0.76267) Loss: 0.76071 (0.76267)
+ 2025-05-07,18:36:43 | WARNING | Handling webdataset error (OSError('image file is truncated (54 bytes not processed)')). Ignoring.
+ 2025-05-07,18:39:21 | WARNING | Handling webdataset error (OSError('image file is truncated (22 bytes not processed)')). Ignoring.
+ 2025-05-07,18:44:50 | INFO | Train Epoch: 2 [ 67125248/128008192 (52%)] Data (t): 0.380 Batch (t): 5.649, 2847.70/s, 177.981/s/gpu LR: 0.000309 Logit Scale: 83.885 Contrastive_loss: 0.79383 (0.76361) Loss: 0.79383 (0.76361)
+ 2025-05-07,18:57:01 | INFO | Train Epoch: 2 [ 69222400/128008192 (54%)] Data (t): 0.378 Batch (t): 5.707, 2948.96/s, 184.310/s/gpu LR: 0.000303 Logit Scale: 84.011 Contrastive_loss: 0.65164 (0.76032) Loss: 0.65164 (0.76032)
+ 2025-05-07,19:09:08 | INFO | Train Epoch: 2 [ 71319552/128008192 (56%)] Data (t): 0.386 Batch (t): 5.683, 2769.97/s, 173.123/s/gpu LR: 0.000297 Logit Scale: 84.113 Contrastive_loss: 0.81447 (0.76187) Loss: 0.81447 (0.76187)
+ 2025-05-07,19:21:16 | INFO | Train Epoch: 2 [ 73416704/128008192 (57%)] Data (t): 0.382 Batch (t): 5.691, 2754.60/s, 172.163/s/gpu LR: 0.000291 Logit Scale: 84.358 Contrastive_loss: 0.94590 (0.76698) Loss: 0.94590 (0.76698)
+ 2025-05-07,19:27:14 | WARNING | Handling webdataset error (OSError('image file is truncated (34 bytes not processed)')). Ignoring.
+ 2025-05-07,19:33:18 | INFO | Train Epoch: 2 [ 75513856/128008192 (59%)] Data (t): 0.375 Batch (t): 5.635, 2937.63/s, 183.602/s/gpu LR: 0.000285 Logit Scale: 84.555 Contrastive_loss: 0.63614 (0.76344) Loss: 0.63614 (0.76344)
+ 2025-05-07,19:39:51 | WARNING | Handling webdataset error (OSError('image file is truncated (86 bytes not processed)')). Ignoring.
+ 2025-05-07,19:43:51 | WARNING | Handling webdataset error (OSError('image file is truncated (37 bytes not processed)')). Ignoring.
+ 2025-05-07,19:45:21 | INFO | Train Epoch: 2 [ 77611008/128008192 (61%)] Data (t): 0.379 Batch (t): 5.648, 2950.73/s, 184.421/s/gpu LR: 0.000279 Logit Scale: 84.618 Contrastive_loss: 0.66098 (0.76075) Loss: 0.66098 (0.76075)
+ 2025-05-07,19:50:35 | WARNING | Handling webdataset error (OSError('image file is truncated (80 bytes not processed)')). Ignoring.
+ 2025-05-07,19:50:35 | WARNING | Handling webdataset error (OSError('image file is truncated (152 bytes not processed)')). Ignoring.
+ 2025-05-07,19:52:35 | WARNING | Handling webdataset error (OSError('image file is truncated (16 bytes not processed)')). Ignoring.
+ 2025-05-07,19:57:25 | INFO | Train Epoch: 2 [ 79708160/128008192 (62%)] Data (t): 0.384 Batch (t): 5.659, 2950.83/s, 184.427/s/gpu LR: 0.000273 Logit Scale: 84.735 Contrastive_loss: 0.70062 (0.75920) Loss: 0.70062 (0.75920)
+ 2025-05-07,20:09:27 | INFO | Train Epoch: 2 [ 81805312/128008192 (64%)] Data (t): 0.385 Batch (t): 5.642, 2908.96/s, 181.810/s/gpu LR: 0.000267 Logit Scale: 84.819 Contrastive_loss: 0.76818 (0.75943) Loss: 0.76818 (0.75943)
+ 2025-05-07,20:10:15 | WARNING | Handling webdataset error (OSError('image file is truncated (60 bytes not processed)')). Ignoring.
+ 2025-05-07,20:21:31 | INFO | Train Epoch: 2 [ 83902464/128008192 (66%)] Data (t): 0.387 Batch (t): 5.657, 2915.49/s, 182.218/s/gpu LR: 0.000261 Logit Scale: 84.956 Contrastive_loss: 0.70566 (0.75812) Loss: 0.70566 (0.75812)
+ 2025-05-07,20:33:38 | INFO | Train Epoch: 2 [ 85999616/128008192 (67%)] Data (t): 0.384 Batch (t): 5.676, 2856.44/s, 178.528/s/gpu LR: 0.000256 Logit Scale: 85.092 Contrastive_loss: 0.75456 (0.75803) Loss: 0.75456 (0.75803)
+ 2025-05-07,20:39:42 | WARNING | Handling webdataset error (OSError('broken data stream when reading image file')). Ignoring.
+ 2025-05-07,20:45:37 | INFO | Train Epoch: 2 [ 88096768/128008192 (69%)] Data (t): 0.373 Batch (t): 5.621, 2890.43/s, 180.652/s/gpu LR: 0.000250 Logit Scale: 85.272 Contrastive_loss: 0.69051 (0.75646) Loss: 0.69051 (0.75646)
+ 2025-05-07,20:57:53 | INFO | Train Epoch: 2 [ 90193920/128008192 (70%)] Data (t): 0.374 Batch (t): 5.751, 2859.02/s, 178.689/s/gpu LR: 0.000244 Logit Scale: 85.465 Contrastive_loss: 0.73157 (0.75590) Loss: 0.73157 (0.75590)
+ 2025-05-07,21:01:47 | WARNING | Handling webdataset error (OSError('image file is truncated (33 bytes not processed)')). Ignoring.
+ 2025-05-07,21:02:38 | WARNING | Handling webdataset error (OSError('image file is truncated (16 bytes not processed)')). Ignoring.
+ 2025-05-07,21:10:01 | INFO | Train Epoch: 2 [ 92291072/128008192 (72%)] Data (t): 0.384 Batch (t): 5.681, 2943.21/s, 183.950/s/gpu LR: 0.000239 Logit Scale: 85.668 Contrastive_loss: 0.75559 (0.75589) Loss: 0.75559 (0.75589)
+ 2025-05-07,21:22:04 | INFO | Train Epoch: 2 [ 94388224/128008192 (74%)] Data (t): 0.376 Batch (t): 5.653, 2849.77/s, 178.111/s/gpu LR: 0.000233 Logit Scale: 85.762 Contrastive_loss: 0.71152 (0.75493) Loss: 0.71152 (0.75493)
+ 2025-05-07,21:34:11 | INFO | Train Epoch: 2 [ 96485376/128008192 (75%)] Data (t): 0.366 Batch (t): 5.674, 2907.34/s, 181.708/s/gpu LR: 0.000228 Logit Scale: 85.916 Contrastive_loss: 0.66774 (0.75307) Loss: 0.66774 (0.75307)
444
+ 2025-05-07,21:46:13 | INFO | Train Epoch: 2 [ 98582528/128008192 (77%)] Data (t): 0.377 Batch (t): 5.644, 2828.42/s, 176.776/s/gpu LR: 0.000222 Logit Scale: 86.106 Contrastive_loss: 0.67051 (0.75135) Loss: 0.67051 (0.75135)
445
+ 2025-05-07,21:46:16 | WARNING | Handling webdataset error (OSError('image file is truncated (26 bytes not processed)')). Ignoring.
446
+ 2025-05-07,21:58:19 | INFO | Train Epoch: 2 [100679680/128008192 (79%)] Data (t): 0.378 Batch (t): 5.669, 2712.12/s, 169.508/s/gpu LR: 0.000217 Logit Scale: 86.205 Contrastive_loss: 0.73824 (0.75108) Loss: 0.73824 (0.75108)
447
+ 2025-05-07,22:10:23 | INFO | Train Epoch: 2 [102776832/128008192 (80%)] Data (t): 0.386 Batch (t): 5.660, 2963.76/s, 185.235/s/gpu LR: 0.000211 Logit Scale: 86.296 Contrastive_loss: 0.77205 (0.75150) Loss: 0.77205 (0.75150)
448
+ 2025-05-07,22:22:26 | INFO | Train Epoch: 2 [104873984/128008192 (82%)] Data (t): 0.377 Batch (t): 5.649, 2927.49/s, 182.968/s/gpu LR: 0.000206 Logit Scale: 86.463 Contrastive_loss: 0.83267 (0.75309) Loss: 0.83267 (0.75309)
449
+ 2025-05-07,22:34:30 | INFO | Train Epoch: 2 [106971136/128008192 (84%)] Data (t): 0.383 Batch (t): 5.652, 2912.04/s, 182.002/s/gpu LR: 0.000201 Logit Scale: 86.576 Contrastive_loss: 0.67985 (0.75169) Loss: 0.67985 (0.75169)
450
+ 2025-05-07,22:37:03 | WARNING | Handling webdataset error (OSError('image file is truncated (66 bytes not processed)')). Ignoring.
451
+ 2025-05-07,22:46:31 | INFO | Train Epoch: 2 [109068288/128008192 (85%)] Data (t): 0.379 Batch (t): 5.637, 2889.04/s, 180.565/s/gpu LR: 0.000196 Logit Scale: 86.760 Contrastive_loss: 0.67902 (0.75031) Loss: 0.67902 (0.75031)
452
+ 2025-05-07,22:50:11 | WARNING | Handling webdataset error (OSError('image file is truncated (86 bytes not processed)')). Ignoring.
453
+ 2025-05-07,22:53:00 | WARNING | Handling webdataset error (OSError('image file is truncated (29 bytes not processed)')). Ignoring.
454
+ 2025-05-07,22:56:20 | WARNING | Handling webdataset error (OSError('image file is truncated (4 bytes not processed)')). Ignoring.
455
+ 2025-05-07,22:56:44 | WARNING | Handling webdataset error (OSError('image file is truncated (80 bytes not processed)')). Ignoring.
456
+ 2025-05-07,22:58:35 | INFO | Train Epoch: 2 [111165440/128008192 (87%)] Data (t): 0.384 Batch (t): 5.656, 2875.76/s, 179.735/s/gpu LR: 0.000190 Logit Scale: 86.939 Contrastive_loss: 0.84035 (0.75198) Loss: 0.84035 (0.75198)
457
+ 2025-05-07,23:05:42 | WARNING | Handling webdataset error (OSError('image file is truncated (67 bytes not processed)')). Ignoring.
458
+ 2025-05-07,23:07:30 | WARNING | Handling webdataset error (OSError('image file is truncated (19 bytes not processed)')). Ignoring.
459
+ 2025-05-07,23:08:02 | WARNING | Handling webdataset error (OSError('image file is truncated (8 bytes not processed)')). Ignoring.
460
+ 2025-05-07,23:10:42 | INFO | Train Epoch: 2 [113262592/128008192 (88%)] Data (t): 0.376 Batch (t): 5.675, 2929.20/s, 183.075/s/gpu LR: 0.000185 Logit Scale: 87.027 Contrastive_loss: 0.76918 (0.75229) Loss: 0.76918 (0.75229)
461
+ 2025-05-07,23:12:55 | WARNING | Handling webdataset error (OSError('image file is truncated (93 bytes not processed)')). Ignoring.
462
+ 2025-05-07,23:22:41 | INFO | Train Epoch: 2 [115359744/128008192 (90%)] Data (t): 0.378 Batch (t): 5.622, 2980.30/s, 186.269/s/gpu LR: 0.000180 Logit Scale: 87.202 Contrastive_loss: 0.54325 (0.74856) Loss: 0.54325 (0.74856)
463
+ 2025-05-07,23:28:18 | WARNING | Handling webdataset error (OSError('image file is truncated (8 bytes not processed)')). Ignoring.
464
+ 2025-05-07,23:34:47 | INFO | Train Epoch: 2 [117456896/128008192 (92%)] Data (t): 0.382 Batch (t): 5.671, 2941.13/s, 183.821/s/gpu LR: 0.000175 Logit Scale: 87.336 Contrastive_loss: 0.73473 (0.74832) Loss: 0.73473 (0.74832)
465
+ 2025-05-07,23:46:52 | INFO | Train Epoch: 2 [119554048/128008192 (93%)] Data (t): 0.376 Batch (t): 5.662, 2898.79/s, 181.175/s/gpu LR: 0.000170 Logit Scale: 87.507 Contrastive_loss: 0.69387 (0.74738) Loss: 0.69387 (0.74738)
466
+ 2025-05-07,23:59:00 | INFO | Train Epoch: 2 [121651200/128008192 (95%)] Data (t): 0.376 Batch (t): 5.692, 2917.66/s, 182.354/s/gpu LR: 0.000165 Logit Scale: 87.646 Contrastive_loss: 0.67988 (0.74624) Loss: 0.67988 (0.74624)
467
+ 2025-05-08,00:07:10 | WARNING | Handling webdataset error (OSError('image file is truncated (96 bytes not processed)')). Ignoring.
468
+ 2025-05-08,00:11:03 | INFO | Train Epoch: 2 [123748352/128008192 (97%)] Data (t): 0.377 Batch (t): 5.644, 2858.51/s, 178.657/s/gpu LR: 0.000161 Logit Scale: 87.821 Contrastive_loss: 0.61918 (0.74412) Loss: 0.61918 (0.74412)
469
+ 2025-05-08,00:23:06 | INFO | Train Epoch: 2 [125845504/128008192 (98%)] Data (t): 0.380 Batch (t): 5.647, 2890.91/s, 180.682/s/gpu LR: 0.000156 Logit Scale: 87.806 Contrastive_loss: 0.69250 (0.74327) Loss: 0.69250 (0.74327)
470
+ 2025-05-08,00:28:05 | WARNING | Handling webdataset error (OSError('image file is truncated (29 bytes not processed)')). Ignoring.
471
+ 2025-05-08,00:35:08 | INFO | Train Epoch: 2 [127942656/128008192 (100%)] Data (t): 0.378 Batch (t): 5.641, 2912.37/s, 182.023/s/gpu LR: 0.000151 Logit Scale: 88.015 Contrastive_loss: 0.65714 (0.74188) Loss: 0.65714 (0.74188)
472
+ 2025-05-08,00:35:30 | INFO | Train Epoch: 2 [128008192/128008192 (100%)] Data (t): 0.378 Batch (t): 5.576, 2993.51/s, 187.094/s/gpu LR: 0.000151 Logit Scale: 88.025 Contrastive_loss: 0.73382 (0.74175) Loss: 0.73382 (0.74175)
473
+ 2025-05-08,00:35:37 | INFO | Start epoch 3
474
+ 2025-05-08,00:35:50 | INFO | Train Epoch: 3 [ 16384/128008192 (0%)] Data (t): 7.765 Batch (t): 12.146, 1348.91/s, 84.3067/s/gpu LR: 0.000151 Logit Scale: 88.027 Contrastive_loss: 0.60689 (0.60689) Loss: 0.60689 (0.60689)
475
+ 2025-05-08,00:45:11 | WARNING | Handling webdataset error (OSError('image file is truncated (50 bytes not processed)')). Ignoring.
476
+ 2025-05-08,00:47:52 | INFO | Train Epoch: 3 [ 2113536/128008192 (2%)] Data (t): 0.385 Batch (t): 5.647, 2914.65/s, 182.165/s/gpu LR: 0.000146 Logit Scale: 88.367 Contrastive_loss: 0.56284 (0.58487) Loss: 0.56284 (0.58487)
477
+ 2025-05-08,00:59:57 | INFO | Train Epoch: 3 [ 4210688/128008192 (3%)] Data (t): 0.378 Batch (t): 5.657, 2875.22/s, 179.701/s/gpu LR: 0.000142 Logit Scale: 88.623 Contrastive_loss: 0.71519 (0.62831) Loss: 0.71519 (0.62831)
478
+ 2025-05-08,01:12:00 | INFO | Train Epoch: 3 [ 6307840/128008192 (5%)] Data (t): 0.391 Batch (t): 5.650, 2853.26/s, 178.329/s/gpu LR: 0.000137 Logit Scale: 88.826 Contrastive_loss: 0.68923 (0.64354) Loss: 0.68923 (0.64354)
479
+ 2025-05-08,01:23:28 | WARNING | Handling webdataset error (OSError('image file is truncated (26 bytes not processed)')). Ignoring.
480
+ 2025-05-08,01:24:01 | INFO | Train Epoch: 3 [ 8404992/128008192 (7%)] Data (t): 0.383 Batch (t): 5.638, 2851.56/s, 178.222/s/gpu LR: 0.000133 Logit Scale: 89.009 Contrastive_loss: 0.53171 (0.62117) Loss: 0.53171 (0.62117)
481
+ 2025-05-08,01:36:03 | INFO | Train Epoch: 3 [ 10502144/128008192 (8%)] Data (t): 0.374 Batch (t): 5.639, 2887.82/s, 180.489/s/gpu LR: 0.000128 Logit Scale: 89.153 Contrastive_loss: 0.81621 (0.65368) Loss: 0.81621 (0.65368)
482
+ 2025-05-08,01:42:35 | WARNING | Handling webdataset error (OSError('image file is truncated (2 bytes not processed)')). Ignoring.
483
+ 2025-05-08,01:48:09 | INFO | Train Epoch: 3 [ 12599296/128008192 (10%)] Data (t): 0.375 Batch (t): 5.675, 2915.92/s, 182.245/s/gpu LR: 0.000124 Logit Scale: 89.335 Contrastive_loss: 0.62362 (0.64938) Loss: 0.62362 (0.64938)
484
+ 2025-05-08,01:56:40 | WARNING | Handling webdataset error (OSError('broken data stream when reading image file')). Ignoring.
485
+ 2025-05-08,02:00:11 | INFO | Train Epoch: 3 [ 14696448/128008192 (11%)] Data (t): 0.373 Batch (t): 5.635, 2809.79/s, 175.612/s/gpu LR: 0.000120 Logit Scale: 89.481 Contrastive_loss: 0.68460 (0.65378) Loss: 0.68460 (0.65378)
486
+ 2025-05-08,02:12:16 | INFO | Train Epoch: 3 [ 16793600/128008192 (13%)] Data (t): 0.408 Batch (t): 5.670, 2897.46/s, 181.091/s/gpu LR: 0.000116 Logit Scale: 89.720 Contrastive_loss: 0.69582 (0.65846) Loss: 0.69582 (0.65846)
487
+ 2025-05-08,02:22:22 | WARNING | Handling webdataset error (OSError('image file is truncated (88 bytes not processed)')). Ignoring.
488
+ 2025-05-08,02:24:15 | INFO | Train Epoch: 3 [ 18890752/128008192 (15%)] Data (t): 0.378 Batch (t): 5.617, 2918.85/s, 182.428/s/gpu LR: 0.000111 Logit Scale: 89.844 Contrastive_loss: 0.65350 (0.65796) Loss: 0.65350 (0.65796)
489
+ 2025-05-08,02:36:15 | INFO | Train Epoch: 3 [ 20987904/128008192 (16%)] Data (t): 0.371 Batch (t): 5.618, 2950.82/s, 184.426/s/gpu LR: 0.000107 Logit Scale: 89.969 Contrastive_loss: 0.64335 (0.65663) Loss: 0.64335 (0.65663)
490
+ 2025-05-08,02:48:15 | INFO | Train Epoch: 3 [ 23085056/128008192 (18%)] Data (t): 0.384 Batch (t): 5.626, 2875.62/s, 179.726/s/gpu LR: 0.000103 Logit Scale: 90.122 Contrastive_loss: 0.75448 (0.66479) Loss: 0.75448 (0.66479)
491
+ 2025-05-08,03:00:18 | INFO | Train Epoch: 3 [ 25182208/128008192 (20%)] Data (t): 0.385 Batch (t): 5.648, 2968.21/s, 185.513/s/gpu LR: 0.000099 Logit Scale: 90.287 Contrastive_loss: 0.57289 (0.65772) Loss: 0.57289 (0.65772)
492
+ 2025-05-08,03:06:08 | WARNING | Handling webdataset error (OSError('image file is truncated (10 bytes not processed)')). Ignoring.
493
+ 2025-05-08,03:12:19 | INFO | Train Epoch: 3 [ 27279360/128008192 (21%)] Data (t): 0.383 Batch (t): 5.633, 2950.70/s, 184.419/s/gpu LR: 0.000095 Logit Scale: 90.426 Contrastive_loss: 0.57143 (0.65155) Loss: 0.57143 (0.65155)
494
+ 2025-05-08,03:22:43 | WARNING | Handling webdataset error (OSError('image file is truncated (12 bytes not processed)')). Ignoring.
495
+ 2025-05-08,03:24:27 | INFO | Train Epoch: 3 [ 29376512/128008192 (23%)] Data (t): 0.378 Batch (t): 5.692, 2927.93/s, 182.996/s/gpu LR: 0.000092 Logit Scale: 90.578 Contrastive_loss: 0.67905 (0.65339) Loss: 0.67905 (0.65339)
496
+ 2025-05-08,03:29:27 | WARNING | Handling webdataset error (OSError('image file is truncated (9 bytes not processed)')). Ignoring.
497
+ 2025-05-08,03:35:47 | WARNING | Handling webdataset error (OSError('image file is truncated (17 bytes not processed)')). Ignoring.
498
+ 2025-05-08,03:36:30 | INFO | Train Epoch: 3 [ 31473664/128008192 (25%)] Data (t): 0.380 Batch (t): 5.646, 2908.78/s, 181.799/s/gpu LR: 0.000088 Logit Scale: 90.743 Contrastive_loss: 0.65314 (0.65337) Loss: 0.65314 (0.65337)
499
+ 2025-05-08,03:36:50 | WARNING | Handling webdataset error (OSError('image file is truncated (4 bytes not processed)')). Ignoring.
500
+ 2025-05-08,03:48:31 | INFO | Train Epoch: 3 [ 33570816/128008192 (26%)] Data (t): 0.384 Batch (t): 5.633, 2912.60/s, 182.037/s/gpu LR: 0.000084 Logit Scale: 90.912 Contrastive_loss: 0.71821 (0.65719) Loss: 0.71821 (0.65719)
501
+ 2025-05-08,03:53:53 | WARNING | Handling webdataset error (OSError('image file is truncated (66 bytes not processed)')). Ignoring.
502
+ 2025-05-08,04:00:32 | INFO | Train Epoch: 3 [ 35667968/128008192 (28%)] Data (t): 0.381 Batch (t): 5.634, 2972.92/s, 185.808/s/gpu LR: 0.000081 Logit Scale: 91.086 Contrastive_loss: 0.66309 (0.65751) Loss: 0.66309 (0.65751)
503
+ 2025-05-08,04:12:41 | INFO | Train Epoch: 3 [ 37765120/128008192 (30%)] Data (t): 0.380 Batch (t): 5.696, 2669.44/s, 166.840/s/gpu LR: 0.000077 Logit Scale: 91.269 Contrastive_loss: 0.57715 (0.65328) Loss: 0.57715 (0.65328)
504
+ 2025-05-08,04:24:48 | INFO | Train Epoch: 3 [ 39862272/128008192 (31%)] Data (t): 0.380 Batch (t): 5.678, 2946.25/s, 184.141/s/gpu LR: 0.000074 Logit Scale: 91.378 Contrastive_loss: 0.66395 (0.65382) Loss: 0.66395 (0.65382)
505
+ 2025-05-08,04:34:27 | WARNING | Handling webdataset error (OSError('image file is truncated (20 bytes not processed)')). Ignoring.
506
+ 2025-05-08,04:37:03 | INFO | Train Epoch: 3 [ 41959424/128008192 (33%)] Data (t): 0.373 Batch (t): 5.741, 2908.82/s, 181.801/s/gpu LR: 0.000070 Logit Scale: 91.547 Contrastive_loss: 0.68520 (0.65531) Loss: 0.68520 (0.65531)
507
+ 2025-05-08,04:38:49 | WARNING | Handling webdataset error (OSError('image file is truncated (54 bytes not processed)')). Ignoring.
508
+ 2025-05-08,04:49:09 | INFO | Train Epoch: 3 [ 44056576/128008192 (34%)] Data (t): 0.384 Batch (t): 5.671, 2822.15/s, 176.384/s/gpu LR: 0.000067 Logit Scale: 91.661 Contrastive_loss: 0.70978 (0.65779) Loss: 0.70978 (0.65779)
509
+ 2025-05-08,04:59:21 | WARNING | Handling webdataset error (OSError('image file is truncated (48 bytes not processed)')). Ignoring.
510
+ 2025-05-08,05:00:18 | WARNING | Handling webdataset error (OSError('image file is truncated (18 bytes not processed)')). Ignoring.
511
+ 2025-05-08,05:01:15 | INFO | Train Epoch: 3 [ 46153728/128008192 (36%)] Data (t): 0.385 Batch (t): 5.674, 2859.03/s, 178.689/s/gpu LR: 0.000064 Logit Scale: 91.833 Contrastive_loss: 0.82095 (0.66488) Loss: 0.82095 (0.66488)
512
+ 2025-05-08,05:13:23 | INFO | Train Epoch: 3 [ 48250880/128008192 (38%)] Data (t): 0.379 Batch (t): 5.689, 2744.84/s, 171.552/s/gpu LR: 0.000061 Logit Scale: 91.998 Contrastive_loss: 0.67371 (0.66525) Loss: 0.67371 (0.66525)
513
+ 2025-05-08,05:25:02 | WARNING | Handling webdataset error (OSError('image file is truncated (152 bytes not processed)')). Ignoring.
514
+ 2025-05-08,05:25:28 | INFO | Train Epoch: 3 [ 50348032/128008192 (39%)] Data (t): 0.375 Batch (t): 5.662, 2793.82/s, 174.614/s/gpu LR: 0.000058 Logit Scale: 92.098 Contrastive_loss: 0.59232 (0.66233) Loss: 0.59232 (0.66233)
515
+ 2025-05-08,05:34:25 | WARNING | Handling webdataset error (OSError('image file is truncated (22 bytes not processed)')). Ignoring.
516
+ 2025-05-08,05:37:42 | INFO | Train Epoch: 3 [ 52445184/128008192 (41%)] Data (t): 0.373 Batch (t): 5.734, 2855.92/s, 178.495/s/gpu LR: 0.000055 Logit Scale: 92.181 Contrastive_loss: 0.67104 (0.66267) Loss: 0.67104 (0.66267)
517
+ 2025-05-08,05:49:53 | INFO | Train Epoch: 3 [ 54542336/128008192 (43%)] Data (t): 0.377 Batch (t): 5.707, 2910.75/s, 181.922/s/gpu LR: 0.000052 Logit Scale: 92.330 Contrastive_loss: 0.70475 (0.66423) Loss: 0.70475 (0.66423)
518
+ 2025-05-08,05:56:42 | WARNING | Handling webdataset error (OSError('image file is truncated (17 bytes not processed)')). Ignoring.
519
+ 2025-05-08,06:01:57 | INFO | Train Epoch: 3 [ 56639488/128008192 (44%)] Data (t): 0.379 Batch (t): 5.657, 2933.48/s, 183.343/s/gpu LR: 0.000049 Logit Scale: 92.494 Contrastive_loss: 0.79821 (0.66901) Loss: 0.79821 (0.66901)
520
+ 2025-05-08,06:13:40 | WARNING | Handling webdataset error (OSError('image file is truncated (2 bytes not processed)')). Ignoring.
521
+ 2025-05-08,06:14:02 | INFO | Train Epoch: 3 [ 58736640/128008192 (46%)] Data (t): 0.382 Batch (t): 5.668, 2906.76/s, 181.672/s/gpu LR: 0.000046 Logit Scale: 92.607 Contrastive_loss: 0.69965 (0.67007) Loss: 0.69965 (0.67007)
522
+ 2025-05-08,06:20:00 | WARNING | Handling webdataset error (OSError('image file is truncated (31 bytes not processed)')). Ignoring.
523
+ 2025-05-08,06:26:10 | INFO | Train Epoch: 3 [ 60833792/128008192 (48%)] Data (t): 0.382 Batch (t): 5.686, 2861.25/s, 178.828/s/gpu LR: 0.000043 Logit Scale: 92.704 Contrastive_loss: 0.64696 (0.66930) Loss: 0.64696 (0.66930)
524
+ 2025-05-08,06:38:13 | INFO | Train Epoch: 3 [ 62930944/128008192 (49%)] Data (t): 0.385 Batch (t): 5.650, 2904.76/s, 181.548/s/gpu LR: 0.000041 Logit Scale: 92.844 Contrastive_loss: 0.71736 (0.67085) Loss: 0.71736 (0.67085)
525
+ 2025-05-08,06:50:16 | INFO | Train Epoch: 3 [ 65028096/128008192 (51%)] Data (t): 0.380 Batch (t): 5.646, 2936.33/s, 183.521/s/gpu LR: 0.000038 Logit Scale: 92.946 Contrastive_loss: 0.68936 (0.67143) Loss: 0.68936 (0.67143)
526
+ 2025-05-08,06:55:01 | WARNING | Handling webdataset error (OSError('image file is truncated (92 bytes not processed)')). Ignoring.
527
+ 2025-05-08,07:02:18 | INFO | Train Epoch: 3 [ 67125248/128008192 (52%)] Data (t): 0.384 Batch (t): 5.641, 2933.34/s, 183.334/s/gpu LR: 0.000036 Logit Scale: 93.049 Contrastive_loss: 0.64159 (0.67052) Loss: 0.64159 (0.67052)
528
+ 2025-05-08,07:04:35 | WARNING | Handling webdataset error (OSError('image file is truncated (13 bytes not processed)')). Ignoring.
529
+ 2025-05-08,07:14:28 | INFO | Train Epoch: 3 [ 69222400/128008192 (54%)] Data (t): 0.383 Batch (t): 5.707, 2864.80/s, 179.050/s/gpu LR: 0.000033 Logit Scale: 93.148 Contrastive_loss: 0.70433 (0.67152) Loss: 0.70433 (0.67152)
530
+ 2025-05-08,07:17:52 | WARNING | Handling webdataset error (OSError('image file is truncated (123 bytes not processed)')). Ignoring.
531
+ 2025-05-08,07:26:36 | INFO | Train Epoch: 3 [ 71319552/128008192 (56%)] Data (t): 0.386 Batch (t): 5.687, 2898.52/s, 181.158/s/gpu LR: 0.000031 Logit Scale: 93.231 Contrastive_loss: 0.70642 (0.67251) Loss: 0.70642 (0.67251)
532
+ 2025-05-08,07:38:40 | INFO | Train Epoch: 3 [ 73416704/128008192 (57%)] Data (t): 0.386 Batch (t): 5.654, 2901.23/s, 181.327/s/gpu LR: 0.000029 Logit Scale: 93.298 Contrastive_loss: 0.67349 (0.67254) Loss: 0.67349 (0.67254)
533
+ 2025-05-08,07:50:47 | INFO | Train Epoch: 3 [ 75513856/128008192 (59%)] Data (t): 0.385 Batch (t): 5.679, 2907.87/s, 181.742/s/gpu LR: 0.000027 Logit Scale: 93.370 Contrastive_loss: 0.71091 (0.67358) Loss: 0.71091 (0.67358)
534
+ 2025-05-08,07:58:15 | WARNING | Handling webdataset error (OSError('image file is truncated (27 bytes not processed)')). Ignoring.
535
+ 2025-05-08,08:02:53 | INFO | Train Epoch: 3 [ 77611008/128008192 (61%)] Data (t): 0.382 Batch (t): 5.674, 2892.90/s, 180.806/s/gpu LR: 0.000025 Logit Scale: 93.455 Contrastive_loss: 0.63170 (0.67248) Loss: 0.63170 (0.67248)
536
+ 2025-05-08,08:12:04 | WARNING | Handling webdataset error (OSError('image file is truncated (74 bytes not processed)')). Ignoring.
537
+ 2025-05-08,08:14:57 | INFO | Train Epoch: 3 [ 79708160/128008192 (62%)] Data (t): 0.380 Batch (t): 5.650, 2937.15/s, 183.572/s/gpu LR: 0.000023 Logit Scale: 93.547 Contrastive_loss: 0.61521 (0.67101) Loss: 0.61521 (0.67101)
538
+ 2025-05-08,08:26:59 | INFO | Train Epoch: 3 [ 81805312/128008192 (64%)] Data (t): 0.379 Batch (t): 5.641, 2963.86/s, 185.241/s/gpu LR: 0.000021 Logit Scale: 93.624 Contrastive_loss: 0.64632 (0.67039) Loss: 0.64632 (0.67039)
539
+ 2025-05-08,08:31:05 | WARNING | Handling webdataset error (OSError('image file is truncated (2 bytes not processed)')). Ignoring.
540
+ 2025-05-08,08:37:42 | WARNING | Handling webdataset error (OSError('image file is truncated (7 bytes not processed)')). Ignoring.
541
+ 2025-05-08,08:39:08 | INFO | Train Epoch: 3 [ 83902464/128008192 (66%)] Data (t): 0.368 Batch (t): 5.701, 2905.24/s, 181.578/s/gpu LR: 0.000019 Logit Scale: 93.710 Contrastive_loss: 0.58877 (0.66840) Loss: 0.58877 (0.66840)
542
+ 2025-05-08,08:48:19 | WARNING | Handling webdataset error (OSError('image file is truncated (92 bytes not processed)')). Ignoring.
543
+ 2025-05-08,08:51:10 | INFO | Train Epoch: 3 [ 85999616/128008192 (67%)] Data (t): 0.376 Batch (t): 5.636, 2929.68/s, 183.105/s/gpu LR: 0.000017 Logit Scale: 93.773 Contrastive_loss: 0.63460 (0.66760) Loss: 0.63460 (0.66760)
544
+ 2025-05-08,09:03:13 | INFO | Train Epoch: 3 [ 88096768/128008192 (69%)] Data (t): 0.380 Batch (t): 5.654, 2937.62/s, 183.601/s/gpu LR: 0.000015 Logit Scale: 93.832 Contrastive_loss: 0.60477 (0.66613) Loss: 0.60477 (0.66613)
545
+ 2025-05-08,09:05:36 | WARNING | Handling webdataset error (OSError('image file is truncated (121 bytes not processed)')). Ignoring.
546
+ 2025-05-08,09:10:06 | WARNING | Handling webdataset error (OSError('image file is truncated (80 bytes not processed)')). Ignoring.
547
+ 2025-05-08,09:11:04 | WARNING | Handling webdataset error (OSError('image file is truncated (58 bytes not processed)')). Ignoring.
548
+ 2025-05-08,09:15:18 | INFO | Train Epoch: 3 [ 90193920/128008192 (70%)] Data (t): 0.386 Batch (t): 5.664, 2865.66/s, 179.103/s/gpu LR: 0.000014 Logit Scale: 93.877 Contrastive_loss: 0.60816 (0.66482) Loss: 0.60816 (0.66482)
549
+ 2025-05-08,09:17:14 | WARNING | Handling webdataset error (OSError('image file is truncated (18 bytes not processed)')). Ignoring.
550
+ 2025-05-08,09:27:21 | INFO | Train Epoch: 3 [ 92291072/128008192 (72%)] Data (t): 0.383 Batch (t): 5.642, 2959.03/s, 184.940/s/gpu LR: 0.000012 Logit Scale: 93.937 Contrastive_loss: 0.69596 (0.66551) Loss: 0.69596 (0.66551)
551
+ 2025-05-08,09:39:25 | INFO | Train Epoch: 3 [ 94388224/128008192 (74%)] Data (t): 0.384 Batch (t): 5.656, 2956.70/s, 184.794/s/gpu LR: 0.000011 Logit Scale: 93.967 Contrastive_loss: 0.63859 (0.66492) Loss: 0.63859 (0.66492)
552
+ 2025-05-08,09:43:51 | WARNING | Handling webdataset error (OSError('image file is truncated (24 bytes not processed)')). Ignoring.
553
+ 2025-05-08,09:51:32 | INFO | Train Epoch: 3 [ 96485376/128008192 (75%)] Data (t): 0.378 Batch (t): 5.683, 2895.45/s, 180.966/s/gpu LR: 0.000010 Logit Scale: 94.024 Contrastive_loss: 0.59691 (0.66348) Loss: 0.59691 (0.66348)
554
+ 2025-05-08,09:54:19 | WARNING | Handling webdataset error (OSError('image file is truncated (86 bytes not processed)')). Ignoring.
555
+ 2025-05-08,10:03:35 | INFO | Train Epoch: 3 [ 98582528/128008192 (77%)] Data (t): 0.384 Batch (t): 5.647, 2900.98/s, 181.311/s/gpu LR: 0.000008 Logit Scale: 94.061 Contrastive_loss: 0.50785 (0.66023) Loss: 0.50785 (0.66023)
556
+ 2025-05-08,10:15:41 | INFO | Train Epoch: 3 [100679680/128008192 (79%)] Data (t): 0.379 Batch (t): 5.669, 2921.97/s, 182.623/s/gpu LR: 0.000007 Logit Scale: 94.094 Contrastive_loss: 0.61861 (0.65938) Loss: 0.61861 (0.65938)
557
+ 2025-05-08,10:22:03 | WARNING | Handling webdataset error (OSError('image file is truncated (26 bytes not processed)')). Ignoring.
558
+ 2025-05-08,10:27:44 | INFO | Train Epoch: 3 [102776832/128008192 (80%)] Data (t): 0.378 Batch (t): 5.649, 2823.93/s, 176.496/s/gpu LR: 0.000006 Logit Scale: 94.118 Contrastive_loss: 0.74334 (0.66106) Loss: 0.74334 (0.66106)
559
+ 2025-05-08,10:29:59 | WARNING | Handling webdataset error (OSError('image file is truncated (97 bytes not processed)')). Ignoring.
560
+ 2025-05-08,10:39:46 | INFO | Train Epoch: 3 [104873984/128008192 (82%)] Data (t): 0.372 Batch (t): 5.646, 2944.25/s, 184.015/s/gpu LR: 0.000005 Logit Scale: 94.146 Contrastive_loss: 0.60558 (0.65998) Loss: 0.60558 (0.65998)
561
+ 2025-05-08,10:51:47 | INFO | Train Epoch: 3 [106971136/128008192 (84%)] Data (t): 0.382 Batch (t): 5.633, 2950.37/s, 184.398/s/gpu LR: 0.000004 Logit Scale: 94.160 Contrastive_loss: 0.69206 (0.66059) Loss: 0.69206 (0.66059)
562
+ 2025-05-08,10:52:02 | WARNING | Handling webdataset error (OSError('broken data stream when reading image file')). Ignoring.
563
+ 2025-05-08,11:03:58 | INFO | Train Epoch: 3 [109068288/128008192 (85%)] Data (t): 0.375 Batch (t): 5.705, 2904.99/s, 181.562/s/gpu LR: 0.000003 Logit Scale: 94.178 Contrastive_loss: 0.66710 (0.66072) Loss: 0.66710 (0.66072)
564
+ 2025-05-08,11:07:31 | WARNING | Handling webdataset error (OSError('image file is truncated (28 bytes not processed)')). Ignoring.
565
+ 2025-05-08,11:16:01 | INFO | Train Epoch: 3 [111165440/128008192 (87%)] Data (t): 0.383 Batch (t): 5.650, 2879.26/s, 179.954/s/gpu LR: 0.000003 Logit Scale: 94.188 Contrastive_loss: 0.62112 (0.65998) Loss: 0.62112 (0.65998)
566
+ 2025-05-08,11:19:16 | WARNING | Handling webdataset error (OSError('image file is truncated (58 bytes not processed)')). Ignoring.
567
+ 2025-05-08,11:28:05 | INFO | Train Epoch: 3 [113262592/128008192 (88%)] Data (t): 0.385 Batch (t): 5.656, 2887.56/s, 180.472/s/gpu LR: 0.000002 Logit Scale: 94.196 Contrastive_loss: 0.62819 (0.65940) Loss: 0.62819 (0.65940)
568
+ 2025-05-08,11:31:39 | WARNING | Handling webdataset error (OSError('image file is truncated (17 bytes not processed)')). Ignoring.
569
+ 2025-05-08,11:40:21 | INFO | Train Epoch: 3 [115359744/128008192 (90%)] Data (t): 0.381 Batch (t): 5.754, 2703.34/s, 168.959/s/gpu LR: 0.000002 Logit Scale: 94.203 Contrastive_loss: 0.76332 (0.66126) Loss: 0.76332 (0.66126)
570
+ 2025-05-08,11:45:55 | WARNING | Handling webdataset error (OSError('image file is truncated (6 bytes not processed)')). Ignoring.
571
+ 2025-05-08,11:52:26 | INFO | Train Epoch: 3 [117456896/128008192 (92%)] Data (t): 0.420 Batch (t): 5.664, 3005.53/s, 187.846/s/gpu LR: 0.000001 Logit Scale: 94.204 Contrastive_loss: 0.71731 (0.66224) Loss: 0.71731 (0.66224)
572
+ 2025-05-08,12:00:59 | WARNING | Handling webdataset error (OSError('image file is truncated (37 bytes not processed)')). Ignoring.
573
+ 2025-05-08,12:04:32 | INFO | Train Epoch: 3 [119554048/128008192 (93%)] Data (t): 0.382 Batch (t): 5.667, 2878.33/s, 179.896/s/gpu LR: 0.000001 Logit Scale: 94.206 Contrastive_loss: 0.72314 (0.66329) Loss: 0.72314 (0.66329)
574
+ 2025-05-08,12:05:10 | WARNING | Handling webdataset error (OSError('image file is truncated (70 bytes not processed)')). Ignoring.
575
+ 2025-05-08,12:16:37 | INFO | Train Epoch: 3 [121651200/128008192 (95%)] Data (t): 0.383 Batch (t): 5.668, 2946.39/s, 184.149/s/gpu LR: 0.000000 Logit Scale: 94.207 Contrastive_loss: 0.53584 (0.66113) Loss: 0.53584 (0.66113)
576
+ 2025-05-08,12:28:41 | INFO | Train Epoch: 3 [123748352/128008192 (97%)] Data (t): 0.405 Batch (t): 5.658, 2905.91/s, 181.619/s/gpu LR: 0.000000 Logit Scale: 94.207 Contrastive_loss: 0.70419 (0.66185) Loss: 0.70419 (0.66185)
577
+ 2025-05-08,12:30:05 | WARNING | Handling webdataset error (OSError('image file is truncated (31 bytes not processed)')). Ignoring.
578
+ 2025-05-08,12:40:42 | INFO | Train Epoch: 3 [125845504/128008192 (98%)] Data (t): 0.372 Batch (t): 5.632, 2923.01/s, 182.688/s/gpu LR: 0.000000 Logit Scale: 94.207 Contrastive_loss: 0.60281 (0.66088) Loss: 0.60281 (0.66088)
579
+ 2025-05-08,12:52:46 | INFO | Train Epoch: 3 [127942656/128008192 (100%)] Data (t): 0.372 Batch (t): 5.653, 2861.28/s, 178.830/s/gpu LR: 0.000000 Logit Scale: 94.207 Contrastive_loss: 0.66097 (0.66088) Loss: 0.66097 (0.66088)
580
+ 2025-05-08,12:53:09 | INFO | Train Epoch: 3 [128008192/128008192 (100%)] Data (t): 0.383 Batch (t): 5.697, 2998.34/s, 187.396/s/gpu LR: 0.000000 Logit Scale: 94.207 Contrastive_loss: 0.63071 (0.66041) Loss: 0.63071 (0.66041)
581
+ 2025-05-08,12:53:17 | INFO | Starting zero-shot imagenet.
582
+ 2025-05-08,12:53:17 | INFO | Building zero-shot classifier
583
+ 2025-05-08,12:53:35 | INFO | Using classifier
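
The `Train Epoch` lines in out.log above share one fixed layout, so they can be turned into records for plotting or sanity checks. The following is a minimal sketch, not part of the uploaded files: the regular expression and the `parse_train_line` helper are hypothetical names introduced here, and the pattern is inferred only from the lines shown in this log.

```python
import re

# Pattern inferred from the log lines in this commit (an assumption, not an official format).
TRAIN_LINE = re.compile(
    r"Train Epoch: (?P<epoch>\d+) \[\s*(?P<seen>\d+)/(?P<total>\d+) \(\d+%\)\]"
    r".*?LR: (?P<lr>[\d.]+)"
    r".*?Logit Scale: (?P<logit_scale>[\d.]+)"
    r".*?Loss: (?P<loss>[\d.]+) \((?P<loss_avg>[\d.]+)\)\s*$"
)


def parse_train_line(line):
    """Return a dict of metrics for a training line, or None for any other line."""
    m = TRAIN_LINE.search(line)
    if m is None:
        return None
    return {
        "epoch": int(m["epoch"]),
        "samples_seen": int(m["seen"]),
        "samples_total": int(m["total"]),
        "lr": float(m["lr"]),
        "logit_scale": float(m["logit_scale"]),
        "loss": float(m["loss"]),          # loss of the logged batch
        "loss_avg": float(m["loss_avg"]),  # running average printed in parentheses
    }


# One line copied verbatim from the log above.
sample = (
    "2025-05-07,20:21:31 | INFO | Train Epoch: 2 [ 83902464/128008192 (66%)] "
    "Data (t): 0.387 Batch (t): 5.657, 2915.49/s, 182.218/s/gpu LR: 0.000261 "
    "Logit Scale: 84.956 Contrastive_loss: 0.70566 (0.75812) Loss: 0.70566 (0.75812)"
)
record = parse_train_line(sample)
print(record)
```

WARNING lines, such as the webdataset truncation messages, do not match the pattern and come back as `None`, so they can be filtered out or counted separately.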
clip_vit_b16_s512m_bs16k_mix0_0/params.txt ADDED
@@ -0,0 +1,109 @@
1
+ NDR_patch_size: 16
2
+ accum_freq: 1
3
+ aug_cfg: {}
4
+ batch_size: 1024
5
+ beta1: 0.9
6
+ beta2: 0.98
7
+ checkpoint_path: ./logs-lr1e-3-datacomp/clip_vit_b16_s512m_bs16k_mix0_0/checkpoints
8
+ coca_caption_loss_weight: 2.0
9
+ coca_contrastive_loss_weight: 1.0
10
+ copy_codebase: False
11
+ csv_caption_key: title
12
+ csv_img_key: filepath
13
+ csv_separator:
14
+ dataset_resampled: False
15
+ dataset_type: webdataset
16
+ ddp_static_graph: True
17
+ debug: False
18
+ delete_prev_step_ckpt: True
19
+ delete_previous_checkpoint: False
20
+ device: cuda:0
21
+ dist_backend: nccl
22
+ dist_url: env://
23
+ distill: False
24
+ distill_model: None
25
+ distill_pretrained: None
26
+ distributed: True
27
+ epochs: 4
28
+ epochs_cooldown: None
29
+ eps: 1e-06
30
+ force_custom_text: False
31
+ force_image_size: 224
32
+ force_patch_dropout: None
33
+ force_quick_gelu: False
34
+ gather_with_grad: True
35
+ global_batch_size: 16384
36
+ grad_checkpointing: True
37
+ grad_clip_norm: None
38
+ horovod: False
39
+ image_interpolation: None
40
+ image_mean: None
41
+ image_resize_mode: None
42
+ image_std: None
43
+ imagenet_v2: None
44
+ imagenet_val: /mnt/bn/zilongdata-hl/dataset/imagenet/val
45
+ is_cls_token: True
46
+ local_loss: True
47
+ local_rank: 0
48
+ lock_image: False
49
+ lock_image_freeze_bn_stats: False
50
+ lock_image_unlocked_groups: 0
51
+ lock_text: False
52
+ lock_text_freeze_layer_norm: False
53
+ lock_text_unlocked_layers: 0
54
+ log_every_n_steps: 128
55
+ log_level: 20
56
+ log_local: False
57
+ log_path: ./logs-lr1e-3-datacomp/clip_vit_b16_s512m_bs16k_mix0_0/out.log
58
+ logs: ./logs-lr1e-3-datacomp
59
+ lr: 0.001
60
+ lr_cooldown_end: 0.0
61
+ lr_cooldown_power: 1.0
62
+ lr_scheduler: cosine
63
+ max_seq_len: 15000
64
+ model: ViT-B-16
65
+ name: clip_vit_b16_s512m_bs16k_mix0_0
66
+ native_dynamic_resolution: False
67
+ no_set_device_rank: False
68
+ only_packing: False
69
+ precision: amp
70
+ pretrained:
71
+ pretrained_image:
72
+ pretrained_text:
73
+ rank: 0
74
+ remote_sync: None
75
+ remote_sync_frequency: 300
76
+ remote_sync_protocol: s3
77
+ report_to: wandb
78
+ resume: None
79
+ rope_attn_num_heads: 12
80
+ rope_model_width: 768
81
+ save_every_n_steps: 6104
82
+ save_frequency: 1
83
+ save_most_recent: False
84
+ seed: 0
85
+ siglip: False
86
+ skip_scheduler: False
87
+ tensorboard: False
88
+ tensorboard_path:
89
+ torchcompile: False
90
+ torchscript: False
91
+ trace: False
92
+ train_data: /mnt/bn/zilongdata-hl/dataset/Recap-DataComp-1B-Dataset/{000000..140146}.tar
93
+ train_data_upsampling_factors: None
94
+ train_num_samples: 128000000
95
+ use_bn_sync: False
96
+ use_bnb_linear: None
97
+ val_data: None
98
+ val_frequency: 1
99
+ val_num_samples: None
100
+ val_steps: 0
101
+ wandb: True
102
+ wandb_notes:
103
+ wandb_project_name: cls-clip-NDR
104
+ warmup: 500
105
+ wd: 0.2
106
+ workers: 1
107
+ world_size: 16
108
+ zeroshot_frequency: 4
109
+ zeroshot_steps: 0
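
The parameters above can be cross-checked against out.log with a little arithmetic. The sketch below is hedged: it assumes the `cosine` scheduler is a linear warmup followed by a half-cosine decay to zero, which is consistent with the learning rates printed in the log but is not confirmed by the files in this commit. The per-epoch sample count is taken from the denominator of the log's progress field (128008192), not from `train_num_samples`, and `cosine_lr` is a hypothetical helper name.

```python
import math

# Values copied from params.txt above.
batch_size = 1024            # per-GPU batch size
world_size = 16
log_every_n_steps = 128
base_lr = 0.001
warmup = 500
epochs = 4
# Denominator of the progress field in out.log (slightly above train_num_samples).
samples_per_epoch = 128008192

global_batch_size = batch_size * world_size                  # matches global_batch_size in params.txt
samples_between_logs = global_batch_size * log_every_n_steps  # spacing of consecutive log lines
steps_per_epoch = samples_per_epoch // global_batch_size
total_steps = steps_per_epoch * epochs


def cosine_lr(step):
    """Assumed schedule: linear warmup, then half-cosine decay to zero."""
    if step < warmup:
        return base_lr * (step + 1) / warmup
    progress = (step - warmup) / (total_steps - warmup)
    return 0.5 * (1.0 + math.cos(math.pi * progress)) * base_lr


# Learning rate at the first step of epoch 3, to compare with the log line at 00:35:50.
lr_epoch3_start = cosine_lr(3 * steps_per_epoch)
print(global_batch_size, samples_between_logs, steps_per_epoch, f"{lr_epoch3_start:.6f}")
```

Under these assumptions the sample spacing equals the gap between consecutive progress counters in the log (for example 85999616 - 83902464), and the computed learning rate at the start of epoch 3 can be compared with the `LR: 0.000151` value logged there.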