nmmursit committed on
Commit b10200b · verified · 1 Parent(s): 1c659ac

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -109,10 +109,10 @@ model = convert_to_float8_training(model, config=config)
  | Llama-3.1-8B-Instruct-w16a8-8nodes-bs32 | 31476844 | 23.50 | 8 | 4 | **3.133** | **12.533** | 4 | 4 | 8 | 1024 |
  | Llama-3.1-8B-Instruct-w16a16-8nodes-bs64 | 31476914 | 22.00 | 8 | 4 | **2.933** | **11.733** | 4 | 8 | 8 | 1024 |
  | Llama-3.1-8B-Instruct-w16a8-8nodes-bs64 | 31476844 | 23.50 | 8 | 4 | **3.133** | **12.533** | 4 | 8 | 8 | 1024 |
- | Llama-3.1-8B-Instruct-w16a8-rowwise_4nodes | 33477070 | 39.75 | 4 | 4 | **2.650** | **10.600** | 4 | 4 | 8 | 512 |
- | Llama-3.1-8B-Instruct-w16a8-rowwise_with_gw_hp_4nodes | 33477179 | 37.43 | 4 | 4 | **2.495** | **9.982** | 4 | 4 | 8 | 512 |
- | Llama-3.1-8B-Instruct-w16a8-rowwise_8nodes | 33476690 | 23.50 | 8 | 4 | **3.133** | **12.533** | 4 | 4 | 8 | 1024 |
- | Llama-3.1-8B-Instruct-w16a8-rowwise_with_gw_hp_8nodes | 33476618 | 22.13 | 8 | 4 | **2.951** | **11.802** | 4 | 4 | 8 | 1024 |
+ | Llama-3.1-8B-Instruct-w16a8-rw_4nodes | 33477070 | 39.75 | 4 | 4 | **2.650** | **10.600** | 4 | 4 | 8 | 512 |
+ | Llama-3.1-8B-Instruct-w16a8-rw-8nodes | 33476690 | 23.50 | 8 | 4 | **3.133** | **12.533** | 4 | 4 | 8 | 1024 |
+ | Llama-3.1-8B-Instruct-w16a8-rw_with_gw_hp_4nodes | 33477179 | 37.43 | 4 | 4 | **2.495** | **9.982** | 4 | 4 | 8 | 512 |
+ | Llama-3.1-8B-Instruct-w16a8-rw-with-gw-hp-8nodes | 33476618 | 22.13 | 8 | 4 | **2.951** | **11.802** | 4 | 4 | 8 | 1024 |
 
  ### *Training Time Analysision*
  | Model | Training Time (mins) | Memory Allocated (avg %) | GPU Utilization (avg %) | Speed vs bf16 |