Adding model, graphs and metadata.
- README.md +12 -12
- eval/eval_metrics.json +4 -0
- eval/evaluate_timing.json +1 -0
- eval/nbest_predictions.json.tgz +0 -0
- eval/predictions.json +0 -0
- eval/sparsity_report.json +1 -0
- eval/speed_report.json +1 -0
- model_card/density_info.js +4 -4
- model_card/pruning_info.js +4 -4
- model_info.json +303 -0
- training/data_args.json +16 -0
- training/model_args.json +7 -0
- training/sparse_args.json +36 -0
- training/training_args.bin +3 -0
README.md
CHANGED
@@ -4,8 +4,8 @@ thumbnail:
 license: mit
 tags:
 - question-answering
--
--
+-
+-
 datasets:
 - squad
 metrics:
@@ -19,7 +19,7 @@ widget:

 ## BERT-base uncased model fine-tuned on SQuAD v1

-This model was created using the [nn_pruning](https://github.com/huggingface/nn_pruning) python library: the **linear layers contains 30.0%** of the original
+This model was created using the [nn_pruning](https://github.com/huggingface/nn_pruning) python library: the **linear layers contains 30.0%** of the original weights.

 This model **CANNOT be used without using nn_pruning `optimize_model`** function, as it uses NoNorms instead of LayerNorms and this is not currently supported by the Transformers library.

@@ -30,20 +30,20 @@ This does not need special handling, as it is supported by the Transformers libr

 The model contains **45.0%** of the original weights **overall** (the embeddings account for a significant part of the model, and they are not pruned by this method).

-With a simple resizing of the linear matrices it ran **2.01x as fast as
+With a simple resizing of the linear matrices it ran **2.01x as fast as bert-base-uncased** on the evaluation.
 This is possible because the pruning method lead to structured matrices: to visualize them, hover below on the plot to see the non-zero/zero parts of each matrix.

-<div class="graph"><script src="/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/density_info.js" id="
+<div class="graph"><script src="/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/density_info.js" id="c3b978cc-6d18-4fd0-a24b-e4369569d64d"></script></div>

-In terms of accuracy, its **F1 is 89.19**, compared with 88.5 for
+In terms of accuracy, its **F1 is 89.19**, compared with 88.5 for bert-base-uncased, a **F1 gain of 0.69**.

 ## Fine-Pruning details
-This model was fine-tuned from the HuggingFace [
+This model was fine-tuned from the HuggingFace [model](https://huggingface.co/bert-base-uncased) checkpoint on [SQuAD1.1](https://rajpurkar.github.io/SQuAD-explorer), and distilled from the model [bert-large-uncased-whole-word-masking-finetuned-squad](https://huggingface.co/bert-large-uncased-whole-word-masking-finetuned-squad)
 This model is case-insensitive: it does not make a difference between english and English.

 A side-effect of the block pruning is that some of the attention heads are completely removed: 55 heads were removed on a total of 144 (38.2%).
 Here is a detailed view on how the remaining heads are distributed in the network after pruning.
-<div class="graph"><script src="/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/pruning_info.js" id="
+<div class="graph"><script src="/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/pruning_info.js" id="7de38b6d-774c-4313-a5a4-8e32f554d9ec"></script></div>

 ## Details of the SQuAD1.1 dataset

@@ -65,7 +65,7 @@ GPU driver: 455.23.05, CUDA: 11.1

 ### Results

-**Pytorch model file size**: `
+**Pytorch model file size**: `374MB` (original BERT: `420MB`)

 | Metric | # Value | # Original ([Table 2](https://www.aclweb.org/anthology/N19-1423.pdf))| Variation |
 | ------ | --------- | --------- | --------- |
@@ -89,11 +89,11 @@ qa_pipeline = pipeline(
 tokenizer="madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1"
 )

-print("
-print(f"Parameters count (includes head pruning)={int(qa_pipeline.model.num_parameters() / 1E6)}M")
+print("bert-base-uncased parameters: 200.0M")
+print(f"Parameters count (includes only head pruning, not feed forward pruning)={int(qa_pipeline.model.num_parameters() / 1E6)}M")
 qa_pipeline.model = optimize_model(qa_pipeline.model, "dense")

-print(f"Parameters count after optimization={int(qa_pipeline.model.num_parameters() / 1E6)}M")
+print(f"Parameters count after complete optimization={int(qa_pipeline.model.num_parameters() / 1E6)}M")
 predictions = qa_pipeline({
 'context': "Frédéric François Chopin, born Fryderyk Franciszek Chopin (1 March 1810 – 17 October 1849), was a Polish composer and virtuoso pianist of the Romantic era who wrote primarily for solo piano.",
 'question': "Who is Frederic Chopin?",
eval/eval_metrics.json
ADDED
@@ -0,0 +1,4 @@
+{
+"exact_match": 82.21381267738883,
+"f1": 89.18801010717891
+}
eval/evaluate_timing.json
ADDED
@@ -0,0 +1 @@
+{"eval_elapsed_time": 98.45825179666281, "cuda_eval_elapsed_time": 90.40731259918213}
eval/nbest_predictions.json.tgz
ADDED
Binary file (6.6 MB).
eval/predictions.json
ADDED
The diff for this file is too large to render.
eval/sparsity_report.json
ADDED
@@ -0,0 +1 @@
+
{"total": 108882626, "nnz": 49662908, "linear_total": 84934656, "linear_nnz": 25746432, "layers": {"0": {"total": 7086912, "nnz": 2116894, "linear_total": 7077888, "linear_nnz": 2110464, "linear_attention_total": 2359296, "linear_attention_nnz": 1376256, "linear_dense_total": 4718592, "linear_dense_nnz": 734208}, "1": {"total": 7086528, "nnz": 1803218, "linear_total": 7077888, "linear_nnz": 1797120, "linear_attention_total": 2359296, "linear_attention_nnz": 983040, "linear_dense_total": 4718592, "linear_dense_nnz": 814080}, "2": {"total": 7087296, "nnz": 2797913, "linear_total": 7077888, "linear_nnz": 2790912, "linear_attention_total": 2359296, "linear_attention_nnz": 1769472, "linear_dense_total": 4718592, "linear_dense_nnz": 1021440}, "3": {"total": 7087296, "nnz": 2741044, "linear_total": 7077888, "linear_nnz": 2734080, "linear_attention_total": 2359296, "linear_attention_nnz": 1769472, "linear_dense_total": 4718592, "linear_dense_nnz": 964608}, "4": {"total": 7087488, "nnz": 2808736, "linear_total": 7077888, "linear_nnz": 2801664, "linear_attention_total": 2359296, "linear_attention_nnz": 1966080, "linear_dense_total": 4718592, "linear_dense_nnz": 835584}, "5": {"total": 7086912, "nnz": 2239854, "linear_total": 7077888, "linear_nnz": 2233344, "linear_attention_total": 2359296, "linear_attention_nnz": 1376256, "linear_dense_total": 4718592, "linear_dense_nnz": 857088}, "6": {"total": 7087296, "nnz": 2516642, "linear_total": 7077888, "linear_nnz": 2509824, "linear_attention_total": 2359296, "linear_attention_nnz": 1769472, "linear_dense_total": 4718592, "linear_dense_nnz": 740352}, "7": {"total": 7086912, "nnz": 1946287, "linear_total": 7077888, "linear_nnz": 1939968, "linear_attention_total": 2359296, "linear_attention_nnz": 1376256, "linear_dense_total": 4718592, "linear_dense_nnz": 563712}, "8": {"total": 7087296, "nnz": 2058616, "linear_total": 7077888, "linear_nnz": 2052096, "linear_attention_total": 2359296, "linear_attention_nnz": 1769472, 
"linear_dense_total": 4718592, "linear_dense_nnz": 282624}, "9": {"total": 7086720, "nnz": 1386755, "linear_total": 7077888, "linear_nnz": 1380864, "linear_attention_total": 2359296, "linear_attention_nnz": 1179648, "linear_dense_total": 4718592, "linear_dense_nnz": 201216}, "10": {"total": 7086528, "nnz": 1494281, "linear_total": 7077888, "linear_nnz": 1488384, "linear_attention_total": 2359296, "linear_attention_nnz": 983040, "linear_dense_total": 4718592, "linear_dense_nnz": 505344}, "11": {"total": 7086720, "nnz": 1913946, "linear_total": 7077888, "linear_nnz": 1907712, "linear_attention_total": 2359296, "linear_attention_nnz": 1179648, "linear_dense_total": 4718592, "linear_dense_nnz": 728064}}, "total_sparsity": 54.38858353765275, "linear_sparsity": 69.68677662037037, "pruned_heads": {"0": [0, 2, 4, 5, 6], "1": [0, 2, 3, 5, 6, 7, 8], "2": [8, 4, 7], "3": [2, 4, 6], "4": [1, 2], "5": [1, 2, 6, 7, 11], "6": [3, 10, 2], "7": [1, 3, 6, 7, 11], "8": [0, 3, 4], "9": [1, 4, 5, 7, 9, 10], "10": [1, 2, 4, 5, 6, 7, 8], "11": [0, 5, 7, 8, 10, 11]}}
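The headline figures in the README can be cross-checked against this report. The sketch below is illustrative only: it hard-codes the top-level values copied from `eval/sparsity_report.json` and assumes 12 attention heads per layer, which is the standard bert-base-uncased layout and is not stated in the report itself. The helper name `density_percent` is hypothetical. It derives the linear-layer density, the overall density, and the number of pruned heads.

```python
# Top-level values copied from eval/sparsity_report.json in this commit.
report = {
    "total": 108882626,
    "nnz": 49662908,
    "linear_total": 84934656,
    "linear_nnz": 25746432,
    "total_sparsity": 54.38858353765275,
    "linear_sparsity": 69.68677662037037,
    "pruned_heads": {
        "0": [0, 2, 4, 5, 6], "1": [0, 2, 3, 5, 6, 7, 8], "2": [8, 4, 7],
        "3": [2, 4, 6], "4": [1, 2], "5": [1, 2, 6, 7, 11], "6": [3, 10, 2],
        "7": [1, 3, 6, 7, 11], "8": [0, 3, 4], "9": [1, 4, 5, 7, 9, 10],
        "10": [1, 2, 4, 5, 6, 7, 8], "11": [0, 5, 7, 8, 10, 11],
    },
}

# Assumption: 12 heads per layer, as in the standard bert-base-uncased config.
HEADS_PER_LAYER = 12


def density_percent(nnz: int, total: int) -> float:
    """Share of non-zero weights, as a percentage (hypothetical helper)."""
    return 100.0 * nnz / total


# Density is the complement of the sparsity values stored in the report.
linear_density = density_percent(report["linear_nnz"], report["linear_total"])
overall_density = density_percent(report["nnz"], report["total"])

# Count pruned heads across all layers.
removed_heads = sum(len(heads) for heads in report["pruned_heads"].values())
total_heads = HEADS_PER_LAYER * len(report["pruned_heads"])
removed_percent = 100.0 * removed_heads / total_heads

print(f"linear density:  {linear_density:.2f}%")
print(f"overall density: {overall_density:.2f}%")
print(f"pruned heads:    {removed_heads}/{total_heads} ({removed_percent:.1f}%)")
```

Under these assumptions the densities come out near the 30.0% and 45.0% quoted in the README, and the head count matches the 55 of 144 (38.2%) stated there.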
eval/speed_report.json
ADDED
@@ -0,0 +1 @@
+{"timings": {"eval_elapsed_time": 26.312922549434006, "cuda_eval_elapsed_time": 19.22297591018677}, "metrics": {"exact_match": 82.21381267738883, "f1": 89.18874369381042}}
model_card/density_info.js
CHANGED
|
@@ -16,9 +16,9 @@
|
|
| 16 |
|
| 17 |
|
| 18 |
|
| 19 |
-
var element = document.getElementById("
|
| 20 |
if (element == null) {
|
| 21 |
-
console.warn("Bokeh: autoload.js configured with elementid '
|
| 22 |
}
|
| 23 |
|
| 24 |
|
|
@@ -115,8 +115,8 @@
|
|
| 115 |
(function(root) {
|
| 116 |
function embed_document(root) {
|
| 117 |
|
| 118 |
-
var docs_json = '{"bb99ed32-eb33-4d5e-b318-7d785b4d7302":{"roots":{"references":[{"attributes":{"data":{"density":["58.3%","41.7%","75.0%","75.0%","83.3%","58.3%","75.0%","58.3%","75.0%","50.0%","41.7%","50.0%"],"height":[0.344064,0.24576,0.442368,0.442368,0.49152,0.344064,0.442368,0.344064,0.442368,0.294912,0.24576,0.294912],"img_height":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"img_width":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"name":["0.attention.key","1.attention.key","2.attention.key","3.attention.key","4.attention.key","5.attention.key","6.attention.key","7.attention.key","8.attention.key","9.attention.key","10.attention.key","11.attention.key"],"parameters":["0.34","0.25","0.44","0.44","0.49","0.34","0.44","0.34","0.44","0.29","0.25","0.29"],"url":["/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_0_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_1_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_2_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_3_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_4_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_5_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_6_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_7_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/im
ages/layer_8_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_9_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_10_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_11_attention_self_key.png"],"x":[0.25,1.25,2.25,3.25,4.25,5.25,6.25,7.25,8.25,9.25,10.25,11.25]},"selected":{"id":"1153"},"selection_policy":{"id":"1152"}},"id":"1122","type":"ColumnDataSource"},{"attributes":{},"id":"1111","type":"BasicTicker"},{"attributes":{"start":0},"id":"1100","type":"DataRange1d"},{"attributes":{"source":{"id":"1128"}},"id":"1133","type":"CDSView"},{"attributes":{"axis_label":"Layer","formatter":{"id":"1146"},"minor_tick_line_color":null,"ticker":{"id":"1107"}},"id":"1106","type":"LinearAxis"},{"attributes":{},"id":"1098","type":"DataRange1d"},{"attributes":{"text":"Transformer Layers"},"id":"1096","type":"Title"},{"attributes":{},"id":"1102","type":"LinearScale"},{"attributes":{"fill_color":{"value":"#6573f7"},"line_color":{"value":"#6573f7"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1118","type":"VBar"},{"attributes":{"label":{"value":"query"},"renderers":[{"id":"1120"}]},"id":"1141","type":"LegendItem"},{"attributes":{"source":{"id":"1134"}},"id":"1139","type":"CDSView"},{"attributes":{},"id":"1155","type":"Selection"},{"attributes":{"items":[{"id":"1141"},{"id":"1142"},{"id":"1143"},{"id":"1144"}],"location":[10,0],"orientation":"horizontal"},"id":"1140","type":"Legend"},{"attributes":{"axis_label":"Parameters 
(M)","formatter":{"id":"1148"},"minor_tick_line_color":null,"ticker":{"id":"1111"}},"id":"1110","type":"LinearAxis"},{"attributes":{"label":{"value":"value"},"renderers":[{"id":"1132"}]},"id":"1143","type":"LegendItem"},{"attributes":{"data_source":{"id":"1134"},"glyph":{"id":"1136"},"hover_glyph":null,"muted_glyph":null,"name":"fully connected","nonselection_glyph":{"id":"1137"},"selection_glyph":null,"view":{"id":"1139"}},"id":"1138","type":"GlyphRenderer"},{"attributes":{"fill_alpha":{"value":0.1},"fill_color":{"value":"#ed5642"},"line_alpha":{"value":0.1},"line_color":{"value":"#ed5642"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1125","type":"VBar"},{"attributes":{},"id":"1151","type":"Selection"},{"attributes":{},"id":"1154","type":"UnionRenderers"},{"attributes":{"data":{"density":["58.3%","15.6%","15.6%","41.7%","17.3%","17.3%","75.0%","21.6%","21.6%","75.0%","20.4%","20.4%","83.3%","17.7%","17.7%","58.3%","18.2%","18.2%","75.0%","15.7%","15.7%","58.3%","11.9%","11.9%","75.0%","6.0%","6.0%","50.0%","4.3%","4.3%","41.7%","10.7%","10.7%","50.0%","15.4%","15.4%"],"height":[0.344064,0.367104,0.367104,0.24576,0.40704,0.40704,0.442368,0.51072,0.51072,0.442368,0.482304,0.482304,0.49152,0.417792,0.417792,0.344064,0.428544,0.428544,0.442368,0.370176,0.370176,0.344064,0.281856,0.281856,0.442368,0.141312,0.141312,0.294912,0.100608,0.100608,0.24576,0.252672,0.252672,0.294912,0.364032,0.364032],"img_height":["96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px"],"img_width":["96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px"]
,"name":["0.attention.output","0.intermediate","0.output","1.attention.output","1.intermediate","1.output","2.attention.output","2.intermediate","2.output","3.attention.output","3.intermediate","3.output","4.attention.output","4.intermediate","4.output","5.attention.output","5.intermediate","5.output","6.attention.output","6.intermediate","6.output","7.attention.output","7.intermediate","7.output","8.attention.output","8.intermediate","8.output","9.attention.output","9.intermediate","9.output","10.attention.output","10.intermediate","10.output","11.attention.output","11.intermediate","11.output"],"parameters":["0.34","0.37","0.37","0.25","0.41","0.41","0.44","0.51","0.51","0.44","0.48","0.48","0.49","0.42","0.42","0.34","0.43","0.43","0.44","0.37","0.37","0.34","0.28","0.28","0.44","0.14","0.14","0.29","0.10","0.10","0.25","0.25","0.25","0.29","0.36","0.36"],"url":["/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_0_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_0_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_0_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_1_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_1_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_1_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_2_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_2_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/ra
w/main/model_card/images/layer_2_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_3_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_3_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_3_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_4_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_4_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_4_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_5_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_5_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_5_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_6_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_6_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_6_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_7_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_7_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_7_output_dense
.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_8_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_8_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_8_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_9_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_9_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_9_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_10_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_10_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_10_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_11_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_11_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_11_output_dense.png"],"x":[0.5833333333333334,0.75,0.9166666666666667,1.5833333333333333,1.75,1.9166666666666665,2.5833333333333335,2.75,2.916666666666667,3.5833333333333335,3.75,3.916666666666667,4.583333333333333,4.75,4.916666666666666,5.583333333333333,5.75,5.916666666666666,6.583333333333333,6.75,6.916666666666666,7.583333333333333,7.75,7.916666666666666,8.583333333333334,8.75,8.916666666666668,9.583333333333334,9.75,9.916666666666668,10
.583333333333334,10.75,10.916666666666668,11.583333333333334,11.75,11.916666666666668]},"selected":{"id":"1157"},"selection_policy":{"id":"1156"}},"id":"1134","type":"ColumnDataSource"},{"attributes":{"fill_alpha":{"value":0.1},"fill_color":{"value":"#20cb97"},"line_alpha":{"value":0.1},"line_color":{"value":"#20cb97"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1131","type":"VBar"},{"attributes":{"axis":{"id":"1110"},"dimension":1,"ticker":null},"id":"1113","type":"Grid"},{"attributes":{"fill_color":{"value":"#ed5642"},"line_color":{"value":"#ed5642"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1124","type":"VBar"},{"attributes":{},"id":"1156","type":"UnionRenderers"},{"attributes":{"data_source":{"id":"1128"},"glyph":{"id":"1130"},"hover_glyph":null,"muted_glyph":null,"name":"value","nonselection_glyph":{"id":"1131"},"selection_glyph":null,"view":{"id":"1133"}},"id":"1132","type":"GlyphRenderer"},{"attributes":{"fill_alpha":{"value":0.1},"fill_color":{"value":"#aa69f7"},"line_alpha":{"value":0.1},"line_color":{"value":"#aa69f7"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1137","type":"VBar"},{"attributes":{},"id":"1157","type":"Selection"},{"attributes":{},"id":"1152","type":"UnionRenderers"},{"attributes":{},"id":"1153","type":"Selection"},{"attributes":{"data_source":{"id":"1116"},"glyph":{"id":"1118"},"hover_glyph":null,"muted_glyph":null,"name":"query","nonselection_glyph":{"id":"1119"},"selection_glyph":null,"view":{"id":"1121"}},"id":"1120","type":"GlyphRenderer"},{"attributes":{"source":{"id":"1122"}},"id":"1127","type":"CDSView"},{"attributes":{},"id":"1104","type":"LinearScale"},{"attributes":{"fill_color":{"value":"#20cb97"},"line_color":{"value":"#20cb97"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1130","type":"VBar"},{"attributes":{"active_drag":"auto","active_inspect":"auto","active_multi":null,"active_scroll":"auto","active_tap":
"auto","tools":[{"id":"1094"}]},"id":"1114","type":"Toolbar"},{"attributes":{"data_source":{"id":"1122"},"glyph":{"id":"1124"},"hover_glyph":null,"muted_glyph":null,"name":"key","nonselection_glyph":{"id":"1125"},"selection_glyph":null,"view":{"id":"1127"}},"id":"1126","type":"GlyphRenderer"},{"attributes":{"axis":{"id":"1106"},"grid_line_color":null,"ticker":null},"id":"1109","type":"Grid"},{"attributes":{},"id":"1150","type":"UnionRenderers"},{"attributes":{"data":{"density":["58.3%","41.7%","75.0%","75.0%","83.3%","58.3%","75.0%","58.3%","75.0%","50.0%","41.7%","50.0%"],"height":[0.344064,0.24576,0.442368,0.442368,0.49152,0.344064,0.442368,0.344064,0.442368,0.294912,0.24576,0.294912],"img_height":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"img_width":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"name":["0.attention.value","1.attention.value","2.attention.value","3.attention.value","4.attention.value","5.attention.value","6.attention.value","7.attention.value","8.attention.value","9.attention.value","10.attention.value","11.attention.value"],"parameters":["0.34","0.25","0.44","0.44","0.49","0.34","0.44","0.34","0.44","0.29","0.25","0.29"],"url":["/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_0_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_1_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_2_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_3_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_4_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/mode
l_card/images/layer_5_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_6_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_7_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_8_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_9_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_10_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_11_attention_self_value.png"],"x":[0.41666666666666663,1.4166666666666665,2.416666666666667,3.416666666666667,4.416666666666666,5.416666666666666,6.416666666666666,7.416666666666666,8.416666666666668,9.416666666666668,10.416666666666668,11.416666666666668]},"selected":{"id":"1155"},"selection_policy":{"id":"1154"}},"id":"1128","type":"ColumnDataSource"},{"attributes":{"label":{"value":"fully 
connected"},"renderers":[{"id":"1138"}]},"id":"1144","type":"LegendItem"},{"attributes":{},"id":"1107","type":"BasicTicker"},{"attributes":{"fill_color":{"value":"#aa69f7"},"line_color":{"value":"#aa69f7"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1136","type":"VBar"},{"attributes":{"above":[{"id":"1140"}],"below":[{"id":"1106"}],"center":[{"id":"1109"},{"id":"1113"}],"left":[{"id":"1110"}],"outline_line_color":null,"plot_height":300,"plot_width":505,"renderers":[{"id":"1120"},{"id":"1126"},{"id":"1132"},{"id":"1138"}],"title":{"id":"1096"},"toolbar":{"id":"1114"},"x_range":{"id":"1098"},"x_scale":{"id":"1102"},"y_range":{"id":"1100"},"y_scale":{"id":"1104"}},"id":"1095","subtype":"Figure","type":"Plot"},{"attributes":{"label":{"value":"key"},"renderers":[{"id":"1126"}]},"id":"1142","type":"LegendItem"},{"attributes":{"source":{"id":"1116"}},"id":"1121","type":"CDSView"},{"attributes":{"callback":null,"tooltips":"\\n <div>\\n <div style=\\"margin-bottom:10px\\">\\n <span style=\\"font-size: 15px;\\"><b>@name</b><br/>density=@density</span>\\n </div>\\n <div> \\n <img\\n src=\\"@url\\" height=\\"@img_height\\" width=\\"@img_width\\" alt=\\"@url\\"\\n style=\\"float: left; margin: 0px 15px 15px 0px;\\"\\n border=\\"0\\"\\n />\\n </div>\\n </div>\\n 
"},"id":"1094","type":"HoverTool"},{"attributes":{"fill_alpha":{"value":0.1},"fill_color":{"value":"#6573f7"},"line_alpha":{"value":0.1},"line_color":{"value":"#6573f7"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1119","type":"VBar"},{"attributes":{},"id":"1146","type":"BasicTickFormatter"},{"attributes":{"data":{"density":["58.3%","41.7%","75.0%","75.0%","83.3%","58.3%","75.0%","58.3%","75.0%","50.0%","41.7%","50.0%"],"height":[0.344064,0.24576,0.442368,0.442368,0.49152,0.344064,0.442368,0.344064,0.442368,0.294912,0.24576,0.294912],"img_height":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"img_width":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"name":["0.attention.query","1.attention.query","2.attention.query","3.attention.query","4.attention.query","5.attention.query","6.attention.query","7.attention.query","8.attention.query","9.attention.query","10.attention.query","11.attention.query"],"parameters":["0.34","0.25","0.44","0.44","0.49","0.34","0.44","0.34","0.44","0.29","0.25","0.29"],"url":["/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_0_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_1_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_2_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_3_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_4_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_5_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/ma
in/model_card/images/layer_6_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_7_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_8_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_9_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_10_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_11_attention_self_query.png"],"x":[0.08333333333333333,1.0833333333333333,2.0833333333333335,3.0833333333333335,4.083333333333333,5.083333333333333,6.083333333333333,7.083333333333333,8.083333333333334,9.083333333333334,10.083333333333334,11.083333333333334]},"selected":{"id":"1151"},"selection_policy":{"id":"1150"}},"id":"1116","type":"ColumnDataSource"},{"attributes":{},"id":"1148","type":"BasicTickFormatter"}],"root_ids":["1095"]},"title":"Bokeh Application","version":"2.2.3"}}';
|
| 119 |
-
var render_items = [{"docid":"
|
| 120 |
root.Bokeh.embed.embed_items(docs_json, render_items);
|
| 121 |
|
| 122 |
}
|
|
|
|
| 16 |
|
| 17 |
|
| 18 |
|
| 19 |
+
var element = document.getElementById("c3b978cc-6d18-4fd0-a24b-e4369569d64d");
|
| 20 |
if (element == null) {
|
| 21 |
+
console.warn("Bokeh: autoload.js configured with elementid 'c3b978cc-6d18-4fd0-a24b-e4369569d64d' but no matching script tag was found.")
|
| 22 |
}
|
| 23 |
|
| 24 |
|
|
|
|
| 115 |
(function(root) {
|
| 116 |
function embed_document(root) {
|
| 117 |
|
| 118 |
+
var docs_json = '{"632a3500-8560-49a8-827c-31134be1ae4f":{"roots":{"references":[{"attributes":{},"id":"1153","type":"UnionRenderers"},{"attributes":{"axis_label":"Layer","formatter":{"id":"1147"},"minor_tick_line_color":null,"ticker":{"id":"1107"}},"id":"1106","type":"LinearAxis"},{"attributes":{"text":"Transformer Layers"},"id":"1096","type":"Title"},{"attributes":{"axis_label":"Parameters (M)","formatter":{"id":"1149"},"minor_tick_line_color":null,"ticker":{"id":"1111"}},"id":"1110","type":"LinearAxis"},{"attributes":{"fill_alpha":{"value":0.1},"fill_color":{"value":"#aa69f7"},"line_alpha":{"value":0.1},"line_color":{"value":"#aa69f7"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1137","type":"VBar"},{"attributes":{"data":{"density":["58.3%","41.7%","75.0%","75.0%","83.3%","58.3%","75.0%","58.3%","75.0%","50.0%","41.7%","50.0%"],"height":[0.344064,0.24576,0.442368,0.442368,0.49152,0.344064,0.442368,0.344064,0.442368,0.294912,0.24576,0.294912],"img_height":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"img_width":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"name":["0.attention.key","1.attention.key","2.attention.key","3.attention.key","4.attention.key","5.attention.key","6.attention.key","7.attention.key","8.attention.key","9.attention.key","10.attention.key","11.attention.key"],"parameters":["0.34","0.25","0.44","0.44","0.49","0.34","0.44","0.34","0.44","0.29","0.25","0.29"],"url":["/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_0_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_1_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_2_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_c
ard/images/layer_3_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_4_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_5_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_6_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_7_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_8_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_9_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_10_attention_self_key.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_11_attention_self_key.png"],"x":[0.25,1.25,2.25,3.25,4.25,5.25,6.25,7.25,8.25,9.25,10.25,11.25]},"selected":{"id":"1152"},"selection_policy":{"id":"1153"}},"id":"1122","type":"ColumnDataSource"},{"attributes":{"data_source":{"id":"1128"},"glyph":{"id":"1130"},"hover_glyph":null,"muted_glyph":null,"name":"value","nonselection_glyph":{"id":"1131"},"selection_glyph":null,"view":{"id":"1133"}},"id":"1132","type":"GlyphRenderer"},{"attributes":{},"id":"1098","type":"DataRange1d"},{"attributes":{"fill_color":{"value":"#aa69f7"},"line_color":{"value":"#aa69f7"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1136","type":"VBar"},{"attributes":{},"id":"1150","type":"Selection"},{"attributes":{"data":{"density":["58.3%","15.6%","15.6%","41.7%","17.3%","17.3%","75.0%","21.6%","21.6%","75.0%","20.4%","20.4%","83.3%","17.7%","17.7%","58.3%","18.2%","18.2%","75.0%","15.7%","15.7%","58.3%","11.9%","11.9%","75
.0%","6.0%","6.0%","50.0%","4.3%","4.3%","41.7%","10.7%","10.7%","50.0%","15.4%","15.4%"],"height":[0.344064,0.367104,0.367104,0.24576,0.40704,0.40704,0.442368,0.51072,0.51072,0.442368,0.482304,0.482304,0.49152,0.417792,0.417792,0.344064,0.428544,0.428544,0.442368,0.370176,0.370176,0.344064,0.281856,0.281856,0.442368,0.141312,0.141312,0.294912,0.100608,0.100608,0.24576,0.252672,0.252672,0.294912,0.364032,0.364032],"img_height":["96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px"],"img_width":["96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px","96px","96px","384px"],"name":["0.attention.output","0.intermediate","0.output","1.attention.output","1.intermediate","1.output","2.attention.output","2.intermediate","2.output","3.attention.output","3.intermediate","3.output","4.attention.output","4.intermediate","4.output","5.attention.output","5.intermediate","5.output","6.attention.output","6.intermediate","6.output","7.attention.output","7.intermediate","7.output","8.attention.output","8.intermediate","8.output","9.attention.output","9.intermediate","9.output","10.attention.output","10.intermediate","10.output","11.attention.output","11.intermediate","11.output"],"parameters":["0.34","0.37","0.37","0.25","0.41","0.41","0.44","0.51","0.51","0.44","0.48","0.48","0.49","0.42","0.42","0.34","0.43","0.43","0.44","0.37","0.37","0.34","0.28","0.28","0.44","0.14","0.14","0.29","0.10","0.10","0.25","0.25","0.25","0.29","0.36","0.36"],"url":["/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_0_attention_output_dense.png","/madlag/bert
-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_0_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_0_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_1_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_1_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_1_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_2_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_2_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_2_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_3_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_3_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_3_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_4_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_4_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_4_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_5_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-
rewind-opt-v1/raw/main/model_card/images/layer_5_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_5_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_6_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_6_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_6_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_7_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_7_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_7_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_8_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_8_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_8_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_9_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_9_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_9_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_10_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/lay
er_10_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_10_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_11_attention_output_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_11_intermediate_dense.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_11_output_dense.png"],"x":[0.5833333333333334,0.75,0.9166666666666667,1.5833333333333333,1.75,1.9166666666666665,2.5833333333333335,2.75,2.916666666666667,3.5833333333333335,3.75,3.916666666666667,4.583333333333333,4.75,4.916666666666666,5.583333333333333,5.75,5.916666666666666,6.583333333333333,6.75,6.916666666666666,7.583333333333333,7.75,7.916666666666666,8.583333333333334,8.75,8.916666666666668,9.583333333333334,9.75,9.916666666666668,10.583333333333334,10.75,10.916666666666668,11.583333333333334,11.75,11.916666666666668]},"selected":{"id":"1156"},"selection_policy":{"id":"1157"}},"id":"1134","type":"ColumnDataSource"},{"attributes":{"fill_color":{"value":"#20cb97"},"line_color":{"value":"#20cb97"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1130","type":"VBar"},{"attributes":{},"id":"1149","type":"BasicTickFormatter"},{"attributes":{"data":{"density":["58.3%","41.7%","75.0%","75.0%","83.3%","58.3%","75.0%","58.3%","75.0%","50.0%","41.7%","50.0%"],"height":[0.344064,0.24576,0.442368,0.442368,0.49152,0.344064,0.442368,0.344064,0.442368,0.294912,0.24576,0.294912],"img_height":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"img_width":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"name":["0.attention.query","1.attention.query","2.attention.query","3.attention.query","4.attention.query","5.attention.query","6.attention.query","7.attention.q
uery","8.attention.query","9.attention.query","10.attention.query","11.attention.query"],"parameters":["0.34","0.25","0.44","0.44","0.49","0.34","0.44","0.34","0.44","0.29","0.25","0.29"],"url":["/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_0_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_1_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_2_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_3_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_4_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_5_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_6_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_7_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_8_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_9_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_10_attention_self_query.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_11_attention_self_query.png"],"x":[0.08333333333333333,1.0833333333333333,2.0833333333333335,3.0833333333333335,4.083333333333333,5.083333333333333,6.083333333333333,7.083333333333333,8.083333333333334,9.083333333333334,10.0833333333333
34,11.083333333333334]},"selected":{"id":"1150"},"selection_policy":{"id":"1151"}},"id":"1116","type":"ColumnDataSource"},{"attributes":{"start":0},"id":"1100","type":"DataRange1d"},{"attributes":{"axis":{"id":"1110"},"dimension":1,"ticker":null},"id":"1113","type":"Grid"},{"attributes":{"source":{"id":"1134"}},"id":"1139","type":"CDSView"},{"attributes":{"source":{"id":"1128"}},"id":"1133","type":"CDSView"},{"attributes":{"active_drag":"auto","active_inspect":"auto","active_multi":null,"active_scroll":"auto","active_tap":"auto","tools":[{"id":"1094"}]},"id":"1114","type":"Toolbar"},{"attributes":{"label":{"value":"query"},"renderers":[{"id":"1120"}]},"id":"1141","type":"LegendItem"},{"attributes":{"axis":{"id":"1106"},"grid_line_color":null,"ticker":null},"id":"1109","type":"Grid"},{"attributes":{"above":[{"id":"1140"}],"below":[{"id":"1106"}],"center":[{"id":"1109"},{"id":"1113"}],"left":[{"id":"1110"}],"outline_line_color":null,"plot_height":300,"plot_width":505,"renderers":[{"id":"1120"},{"id":"1126"},{"id":"1132"},{"id":"1138"}],"title":{"id":"1096"},"toolbar":{"id":"1114"},"x_range":{"id":"1098"},"x_scale":{"id":"1102"},"y_range":{"id":"1100"},"y_scale":{"id":"1104"}},"id":"1095","subtype":"Figure","type":"Plot"},{"attributes":{"source":{"id":"1122"}},"id":"1127","type":"CDSView"},{"attributes":{},"id":"1151","type":"UnionRenderers"},{"attributes":{"label":{"value":"key"},"renderers":[{"id":"1126"}]},"id":"1142","type":"LegendItem"},{"attributes":{},"id":"1107","type":"BasicTicker"},{"attributes":{},"id":"1154","type":"Selection"},{"attributes":{"label":{"value":"fully connected"},"renderers":[{"id":"1138"}]},"id":"1144","type":"LegendItem"},{"attributes":{},"id":"1111","type":"BasicTicker"},{"attributes":{"data_source":{"id":"1134"},"glyph":{"id":"1136"},"hover_glyph":null,"muted_glyph":null,"name":"fully 
connected","nonselection_glyph":{"id":"1137"},"selection_glyph":null,"view":{"id":"1139"}},"id":"1138","type":"GlyphRenderer"},{"attributes":{},"id":"1155","type":"UnionRenderers"},{"attributes":{"data_source":{"id":"1122"},"glyph":{"id":"1124"},"hover_glyph":null,"muted_glyph":null,"name":"key","nonselection_glyph":{"id":"1125"},"selection_glyph":null,"view":{"id":"1127"}},"id":"1126","type":"GlyphRenderer"},{"attributes":{"callback":null,"tooltips":"\\n <div>\\n <div style=\\"margin-bottom:10px\\">\\n <span style=\\"font-size: 15px;\\"><b>@name</b><br/>density=@density</span>\\n </div>\\n <div> \\n <img\\n src=\\"@url\\" height=\\"@img_height\\" width=\\"@img_width\\" alt=\\"@url\\"\\n style=\\"float: left; margin: 0px 15px 15px 0px;\\"\\n border=\\"0\\"\\n />\\n </div>\\n </div>\\n "},"id":"1094","type":"HoverTool"},{"attributes":{"items":[{"id":"1141"},{"id":"1142"},{"id":"1143"},{"id":"1144"}],"location":[10,0],"orientation":"horizontal"},"id":"1140","type":"Legend"},{"attributes":{"fill_alpha":{"value":0.1},"fill_color":{"value":"#6573f7"},"line_alpha":{"value":0.1},"line_color":{"value":"#6573f7"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1119","type":"VBar"},{"attributes":{"data":{"density":["58.3%","41.7%","75.0%","75.0%","83.3%","58.3%","75.0%","58.3%","75.0%","50.0%","41.7%","50.0%"],"height":[0.344064,0.24576,0.442368,0.442368,0.49152,0.344064,0.442368,0.344064,0.442368,0.294912,0.24576,0.294912],"img_height":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"img_width":["96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px","96px"],"name":["0.attention.value","1.attention.value","2.attention.value","3.attention.value","4.attention.value","5.attention.value","6.attention.value","7.attention.value","8.attention.value","9.attention.value","10.attention.value","11.attention.value"],"parameters":["0.34","0.25","0.44","0.44","0.49","0.34","0.44","0.34","0.44","0.29","0.2
5","0.29"],"url":["/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_0_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_1_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_2_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_3_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_4_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_5_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_6_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_7_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_8_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_9_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_10_attention_self_value.png","/madlag/bert-base-uncased-squadv1-x2.01-f89.2-d30-hybrid-rewind-opt-v1/raw/main/model_card/images/layer_11_attention_self_value.png"],"x":[0.41666666666666663,1.4166666666666665,2.416666666666667,3.416666666666667,4.416666666666666,5.416666666666666,6.416666666666666,7.416666666666666,8.416666666666668,9.416666666666668,10.416666666666668,11.416666666666668]},"selected":{"id":"1154"},"selection_policy":{"id":"1155"}},"id":"1128","type":"ColumnDataSource"},{"attributes":{},"id":"1152","type":"Selection"},{"attrib
utes":{"fill_alpha":{"value":0.1},"fill_color":{"value":"#20cb97"},"line_alpha":{"value":0.1},"line_color":{"value":"#20cb97"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1131","type":"VBar"},{"attributes":{},"id":"1104","type":"LinearScale"},{"attributes":{"fill_alpha":{"value":0.1},"fill_color":{"value":"#ed5642"},"line_alpha":{"value":0.1},"line_color":{"value":"#ed5642"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1125","type":"VBar"},{"attributes":{"data_source":{"id":"1116"},"glyph":{"id":"1118"},"hover_glyph":null,"muted_glyph":null,"name":"query","nonselection_glyph":{"id":"1119"},"selection_glyph":null,"view":{"id":"1121"}},"id":"1120","type":"GlyphRenderer"},{"attributes":{},"id":"1147","type":"BasicTickFormatter"},{"attributes":{"label":{"value":"value"},"renderers":[{"id":"1132"}]},"id":"1143","type":"LegendItem"},{"attributes":{},"id":"1156","type":"Selection"},{"attributes":{},"id":"1102","type":"LinearScale"},{"attributes":{"source":{"id":"1116"}},"id":"1121","type":"CDSView"},{"attributes":{},"id":"1157","type":"UnionRenderers"},{"attributes":{"fill_color":{"value":"#6573f7"},"line_color":{"value":"#6573f7"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1118","type":"VBar"},{"attributes":{"fill_color":{"value":"#ed5642"},"line_color":{"value":"#ed5642"},"top":{"field":"height"},"width":{"value":0.125},"x":{"field":"x"}},"id":"1124","type":"VBar"}],"root_ids":["1095"]},"title":"Bokeh Application","version":"2.2.3"}}';
|
| 119 |
+
var render_items = [{"docid":"632a3500-8560-49a8-827c-31134be1ae4f","root_ids":["1095"],"roots":{"1095":"c3b978cc-6d18-4fd0-a24b-e4369569d64d"}}];
|
| 120 |
root.Bokeh.embed.embed_items(docs_json, render_items);
|
| 121 |
|
| 122 |
}
|
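Note on the plot data above: each `height` value appears to be the non-zero parameter count of the matrix in millions, and `density` the same count divided by the dense matrix size. The sketch below checks that reading against a few values copied from the `docs_json` string. It assumes a hidden size of 768 and an intermediate size of 3072 (the values recorded in `model_info.json` in this commit) and that the feed-forward matrices are pruned one whole neuron at a time; the helper names `density_pct` and `kept_neurons` are hypothetical and not part of nn_pruning.

```python
HIDDEN = 768          # hidden_size from model_info.json
INTERMEDIATE = 3072   # intermediate_size from model_info.json


def density_pct(params_m, rows, cols):
    """Percentage of a dense rows x cols matrix that params_m million weights represent."""
    return 100.0 * params_m * 1e6 / (rows * cols)


def kept_neurons(params_m, hidden=HIDDEN):
    """Number of surviving neurons, assuming each kept neuron owns `hidden` weights."""
    return round(params_m * 1e6 / hidden)


# (name, height in millions of parameters, dense rows, dense cols), copied from the plot data
samples = [
    ("0.attention.query", 0.344064, HIDDEN, HIDDEN),
    ("0.intermediate", 0.367104, INTERMEDIATE, HIDDEN),
    ("9.intermediate", 0.100608, INTERMEDIATE, HIDDEN),
]

for name, height, rows, cols in samples:
    print(name, round(density_pct(height, rows, cols), 1), kept_neurons(height))
```

Under these assumptions the rounded percentages match the `density` strings in the plot data, which suggests the bars show remaining parameters rather than removed ones.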
model_card/pruning_info.js
CHANGED
|
@@ -16,9 +16,9 @@
|
|
| 16 |
|
| 17 |
|
| 18 |
|
| 19 |
-
var element = document.getElementById("
|
| 20 |
if (element == null) {
|
| 21 |
-
console.warn("Bokeh: autoload.js configured with elementid '
|
| 22 |
}
|
| 23 |
|
| 24 |
|
|
@@ -115,8 +115,8 @@
|
|
| 115 |
(function(root) {
|
| 116 |
function embed_document(root) {
|
| 117 |
|
| 118 |
-
var docs_json = '{"
|
| 119 |
-
var render_items = [{"docid":"
|
| 120 |
root.Bokeh.embed.embed_items(docs_json, render_items);
|
| 121 |
|
| 122 |
}
|
|
|
|
| 16 |
|
| 17 |
|
| 18 |
|
| 19 |
+
var element = document.getElementById("7de38b6d-774c-4313-a5a4-8e32f554d9ec");
|
| 20 |
if (element == null) {
|
| 21 |
+
console.warn("Bokeh: autoload.js configured with elementid '7de38b6d-774c-4313-a5a4-8e32f554d9ec' but no matching script tag was found.")
|
| 22 |
}
|
| 23 |
|
| 24 |
|
|
|
|
| 115 |
(function(root) {
|
| 116 |
function embed_document(root) {
|
| 117 |
|
| 118 |
+
var docs_json = '{"1042e3b9-9a00-47e5-8b3c-ffb68f29c13e":{"roots":{"references":[{"attributes":{"bottom":{"expr":{"id":"1020"}},"fill_alpha":{"value":0.1},"fill_color":{"value":"#0000ff"},"line_alpha":{"value":0.1},"line_color":{"value":"#0000ff"},"top":{"expr":{"id":"1021"}},"width":{"value":0.9},"x":{"field":"layers"}},"id":"1027","type":"VBar"},{"attributes":{"data_source":{"id":"1024"},"glyph":{"id":"1026"},"hover_glyph":null,"muted_glyph":null,"name":"active","nonselection_glyph":{"id":"1027"},"selection_glyph":null,"view":{"id":"1029"}},"id":"1028","type":"GlyphRenderer"},{"attributes":{"axis_label":"Layer index","formatter":{"id":"1032"},"minor_tick_line_color":null,"ticker":{"id":"1013"}},"id":"1012","type":"CategoricalAxis"},{"attributes":{"data":{"active":[7,5,9,9,10,7,9,7,9,6,5,6],"layers":["0","1","2","3","4","5","6","7","8","9","10","11"],"pruned":[5,7,3,3,2,5,3,5,3,6,7,6]},"selected":{"id":"1052"},"selection_policy":{"id":"1053"}},"id":"1039","type":"ColumnDataSource"},{"attributes":{"start":0},"id":"1006","type":"DataRange1d"},{"attributes":{"items":[{"id":"1038"},{"id":"1054"}],"location":null},"id":"1037","type":"Legend"},{"attributes":{},"id":"1008","type":"CategoricalScale"},{"attributes":{},"id":"1035","type":"Selection"},{"attributes":{"axis":{"id":"1012"},"grid_line_color":null,"ticker":null},"id":"1014","type":"Grid"},{"attributes":{},"id":"1036","type":"UnionRenderers"},{"attributes":{},"id":"1013","type":"CategoricalTicker"},{"attributes":{},"id":"1052","type":"Selection"},{"attributes":{"fields":[]},"id":"1020","type":"Stack"},{"attributes":{},"id":"1053","type":"UnionRenderers"},{"attributes":{"factors":["0","1","2","3","4","5","6","7","8","9","10","11"],"range_padding":0.1},"id":"1004","type":"FactorRange"},{"attributes":{"bottom":{"expr":{"id":"1020"}},"fill_color":{"value":"#0000ff"},"line_color":{"value":"#0000ff"},"top":{"expr":{"id":"1021"}},"width":{"value":0.9},"x":{"field":"layers"}},"id":"1026","type":"VBar"},{"attributes":{"sour
ce":{"id":"1024"}},"id":"1029","type":"CDSView"},{"attributes":{"fields":["active"]},"id":"1022","type":"Stack"},{"attributes":{"above":[{"id":"1055"}],"below":[{"id":"1012"}],"center":[{"id":"1014"},{"id":"1018"},{"id":"1037"}],"left":[{"id":"1015"}],"outline_line_color":null,"plot_height":400,"renderers":[{"id":"1028"},{"id":"1043"}],"title":{"id":"1002"},"toolbar":{"id":"1019"},"toolbar_location":null,"x_range":{"id":"1004"},"x_scale":{"id":"1008"},"y_range":{"id":"1006"},"y_scale":{"id":"1010"}},"id":"1001","subtype":"Figure","type":"Plot"},{"attributes":{},"id":"1032","type":"CategoricalTickFormatter"},{"attributes":{"text":"Pruned Transformer Heads"},"id":"1002","type":"Title"},{"attributes":{},"id":"1010","type":"LinearScale"},{"attributes":{"items":[{"id":"1056"},{"id":"1057"}],"location":[10,0],"orientation":"horizontal"},"id":"1055","type":"Legend"},{"attributes":{"label":{"value":"active"},"renderers":[{"id":"1028"}]},"id":"1056","type":"LegendItem"},{"attributes":{"label":{"value":"active"},"renderers":[{"id":"1028"}]},"id":"1038","type":"LegendItem"},{"attributes":{"source":{"id":"1039"}},"id":"1044","type":"CDSView"},{"attributes":{"bottom":{"expr":{"id":"1022"}},"fill_color":{"value":"#ffcccc"},"line_color":{"value":"#ffcccc"},"top":{"expr":{"id":"1023"}},"width":{"value":0.9},"x":{"field":"layers"}},"id":"1041","type":"VBar"},{"attributes":{"data_source":{"id":"1039"},"glyph":{"id":"1041"},"hover_glyph":null,"muted_glyph":null,"name":"pruned","nonselection_glyph":{"id":"1042"},"selection_glyph":null,"view":{"id":"1044"}},"id":"1043","type":"GlyphRenderer"},{"attributes":{"bottom":{"expr":{"id":"1022"}},"fill_alpha":{"value":0.1},"fill_color":{"value":"#ffcccc"},"line_alpha":{"value":0.1},"line_color":{"value":"#ffcccc"},"top":{"expr":{"id":"1023"}},"width":{"value":0.9},"x":{"field":"layers"}},"id":"1042","type":"VBar"},{"attributes":{"fields":["active"]},"id":"1021","type":"Stack"},{"attributes":{"axis":{"id":"1015"},"dimension":1,"ticker":null},"id
":"1018","type":"Grid"},{"attributes":{"label":{"value":"pruned"},"renderers":[{"id":"1043"}]},"id":"1054","type":"LegendItem"},{"attributes":{"fields":["active","pruned"]},"id":"1023","type":"Stack"},{"attributes":{},"id":"1034","type":"BasicTickFormatter"},{"attributes":{"active_drag":"auto","active_inspect":"auto","active_multi":null,"active_scroll":"auto","active_tap":"auto"},"id":"1019","type":"Toolbar"},{"attributes":{},"id":"1016","type":"BasicTicker"},{"attributes":{"axis_label":"Heads count","formatter":{"id":"1034"},"minor_tick_line_color":null,"ticker":{"id":"1016"}},"id":"1015","type":"LinearAxis"},{"attributes":{"label":{"value":"pruned"},"renderers":[{"id":"1043"}]},"id":"1057","type":"LegendItem"},{"attributes":{"data":{"active":[7,5,9,9,10,7,9,7,9,6,5,6],"layers":["0","1","2","3","4","5","6","7","8","9","10","11"],"pruned":[5,7,3,3,2,5,3,5,3,6,7,6]},"selected":{"id":"1035"},"selection_policy":{"id":"1036"}},"id":"1024","type":"ColumnDataSource"}],"root_ids":["1001"]},"title":"Bokeh Application","version":"2.2.3"}}';
|
| 119 |
+
var render_items = [{"docid":"1042e3b9-9a00-47e5-8b3c-ffb68f29c13e","root_ids":["1001"],"roots":{"1001":"7de38b6d-774c-4313-a5a4-8e32f554d9ec"}}];
|
| 120 |
root.Bokeh.embed.embed_items(docs_json, render_items);
|
| 121 |
|
| 122 |
}
|
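Note on the plot data above: the `active` and `pruned` arrays can be cross-checked against the `pruned_heads` map that this commit records in `model_info.json`. The following is a minimal sketch, assuming 12 heads per layer and a head dimension of 64 (derived from `num_attention_heads` 12 and `hidden_size` 768 in the same file); the function name `head_summary` is hypothetical.

```python
# pruned_heads copied from the "config" entry of model_info.json in this commit.
PRUNED_HEADS = {
    "0": [0, 2, 4, 5, 6],
    "1": [0, 2, 3, 5, 6, 7, 8],
    "10": [1, 2, 4, 5, 6, 7, 8],
    "11": [0, 5, 7, 8, 10, 11],
    "2": [4, 7, 8],
    "3": [2, 4, 6],
    "4": [1, 2],
    "5": [1, 2, 6, 7, 11],
    "6": [2, 3, 10],
    "7": [1, 3, 6, 7, 11],
    "8": [0, 3, 4],
    "9": [1, 4, 5, 7, 9, 10],
}

NUM_HEADS = 12                       # num_attention_heads
HIDDEN_SIZE = 768                    # hidden_size
HEAD_DIM = HIDDEN_SIZE // NUM_HEADS  # 64, assumed standard BERT-base layout


def head_summary(pruned_heads, num_heads=NUM_HEADS):
    """Return (layer, active, pruned, query_params_in_millions) per layer, in layer order."""
    rows = []
    for layer in sorted(pruned_heads, key=int):  # keys are strings, so sort numerically
        pruned = len(pruned_heads[layer])
        active = num_heads - pruned
        # Each remaining head keeps HEAD_DIM rows of width HIDDEN_SIZE in the query matrix.
        query_params_m = active * HEAD_DIM * HIDDEN_SIZE / 1e6
        rows.append((int(layer), active, pruned, query_params_m))
    return rows


summary = head_summary(PRUNED_HEADS)
for layer, active, pruned, params_m in summary:
    print(f"layer {layer:2d}: active={active:2d} pruned={pruned} query params={params_m:.2f}M")
```

The numeric sort matters because the JSON keys are strings, so `"10"` and `"11"` would otherwise sort before `"2"`, as they do in the file itself.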
model_info.json
ADDED
|
@@ -0,0 +1,303 @@
+{
+    "checkpoint_path": "/data_2to/devel_data/nn_pruning/output/squad_test_9_fullpatch6/hp_od-__data_2to__devel_data__nn_pruning__output__squad_test_9_fullpatch6___es-steps_nte20_ls250_stl50_est5000_rn-__data_2to__devel_data__nn_pruning__output__squad_test_9_fullpatch6_--5f772c87c5edbc85/checkpoint-100000",
+    "config": {
+        "_name_or_path": "/tmp/tmpcklouvey",
+        "architectures": ["BertForQuestionAnswering"],
+        "attention_probs_dropout_prob": 0.1,
+        "gradient_checkpointing": false,
+        "hidden_act": "relu",
+        "hidden_dropout_prob": 0.1,
+        "hidden_size": 768,
+        "initializer_range": 0.02,
+        "intermediate_size": 3072,
+        "layer_norm_eps": 1e-12,
+        "layer_norm_type": "no_norm",
+        "max_position_embeddings": 512,
+        "model_type": "bert",
+        "num_attention_heads": 12,
+        "num_hidden_layers": 12,
+        "pad_token_id": 0,
+        "position_embedding_type": "absolute",
+        "pruned_heads": {
+            "0": [0, 2, 4, 5, 6],
+            "1": [0, 2, 3, 5, 6, 7, 8],
+            "10": [1, 2, 4, 5, 6, 7, 8],
+            "11": [0, 5, 7, 8, 10, 11],
+            "2": [4, 7, 8],
+            "3": [2, 4, 6],
+            "4": [1, 2],
+            "5": [1, 2, 6, 7, 11],
+            "6": [2, 3, 10],
+            "7": [1, 3, 6, 7, 11],
+            "8": [0, 3, 4],
+            "9": [1, 4, 5, 7, 9, 10]
+        },
+        "type_vocab_size": 2,
+        "vocab_size": 30522
+    },
+    "eval_metrics": {
+        "exact_match": 82.21381267738883,
+        "f1": 89.18874369381042,
+        "main_metric": 89.18874369381042
+    },
+    "model_args": {
+        "cache_dir": null,
+        "config_name": null,
+        "model_name_or_path": "bert-base-uncased",
+        "tokenizer_name": null,
+        "use_fast_tokenizer": true
+    },
+    "sparse_args": {
+        "ampere_pruning_method": "disabled",
+        "attention_block_cols": 32,
+        "attention_block_rows": 32,
+        "attention_lambda": 1.0,
+        "attention_output_with_dense": 0,
+        "attention_pruning_method": "sigmoied_threshold",
+        "bias_mask": true,
+        "dense_block_cols": 1,
+        "dense_block_rows": 1,
+        "dense_lambda": 1.0,
+        "dense_pruning_method": "sigmoied_threshold:1d_alt",
+        "distil_alpha_ce": 0.1,
+        "distil_alpha_teacher": 0.9,
+        "distil_teacher_name_or_path": "bert-large-uncased-whole-word-masking-finetuned-squad",
+        "distil_temperature": 2.0,
+        "final_ampere_temperature": 20.0,
+        "final_finetune": false,
+        "final_threshold": 0.1,
+        "final_warmup": 10,
+        "gelu_patch": 1,
+        "gelu_patch_steps": 50000,
+        "initial_ampere_temperature": 0.0,
+        "initial_threshold": 0,
+        "initial_warmup": 1,
+        "layer_norm_patch": 1,
+        "layer_norm_patch_start_delta": 0.99,
+        "layer_norm_patch_steps": 50000,
+        "linear_min_parameters": 0,
+        "mask_init": "constant",
+        "mask_scale": 0.0,
+        "mask_scores_learning_rate": 0.01,
+        "regularization": "l1",
+        "regularization_final_lambda": 10,
+        "rewind_model_name_or_path": "madlag/bert-base-uncased-squadv1-x1.96-f88.3-d27-hybrid-filled-opt-v1"
+    },
+    "speed": {
+        "cuda_eval_elapsed_time": 19.22297591018677,
+        "eval_elapsed_time": 26.312922549434006
+    },
+    "speedup": 2.00772207100977,
+    "stats": {
+        "layers": {
+            "0": {
+                "linear_attention_nnz": 1376256,
+                "linear_attention_total": 2359296,
+                "linear_dense_nnz": 734208,
+                "linear_dense_total": 4718592,
+                "linear_nnz": 2110464,
+                "linear_total": 7077888,
+                "nnz": 2116894,
+                "total": 7086912
+            },
+            "1": {
+                "linear_attention_nnz": 983040,
+                "linear_attention_total": 2359296,
+                "linear_dense_nnz": 814080,
+                "linear_dense_total": 4718592,
+                "linear_nnz": 1797120,
+                "linear_total": 7077888,
+                "nnz": 1803218,
+                "total": 7086528
+            },
+            "10": {
+                "linear_attention_nnz": 983040,
+                "linear_attention_total": 2359296,
+                "linear_dense_nnz": 505344,
+                "linear_dense_total": 4718592,
+                "linear_nnz": 1488384,
+                "linear_total": 7077888,
+                "nnz": 1494281,
+                "total": 7086528
+            },
+            "11": {
+                "linear_attention_nnz": 1179648,
+                "linear_attention_total": 2359296,
+                "linear_dense_nnz": 728064,
+                "linear_dense_total": 4718592,
+                "linear_nnz": 1907712,
+                "linear_total": 7077888,
+                "nnz": 1913946,
+                "total": 7086720
+            },
+            "2": {
+                "linear_attention_nnz": 1769472,
+                "linear_attention_total": 2359296,
+                "linear_dense_nnz": 1021440,
+                "linear_dense_total": 4718592,
+                "linear_nnz": 2790912,
+                "linear_total": 7077888,
+                "nnz": 2797913,
+                "total": 7087296
+            },
+            "3": {
+                "linear_attention_nnz": 1769472,
+                "linear_attention_total": 2359296,
+                "linear_dense_nnz": 964608,
+                "linear_dense_total": 4718592,
+                "linear_nnz": 2734080,
+                "linear_total": 7077888,
+                "nnz": 2741044,
+                "total": 7087296
+            },
+            "4": {
+                "linear_attention_nnz": 1966080,
+                "linear_attention_total": 2359296,
+                "linear_dense_nnz": 835584,
+                "linear_dense_total": 4718592,
+                "linear_nnz": 2801664,
+                "linear_total": 7077888,
+                "nnz": 2808736,
+                "total": 7087488
+            },
+            "5": {
+                "linear_attention_nnz": 1376256,
+                "linear_attention_total": 2359296,
+                "linear_dense_nnz": 857088,
+                "linear_dense_total": 4718592,
+                "linear_nnz": 2233344,
+                "linear_total": 7077888,
+                "nnz": 2239854,
+                "total": 7086912
+            },
+            "6": {
+                "linear_attention_nnz": 1769472,
+                "linear_attention_total": 2359296,
+                "linear_dense_nnz": 740352,
+                "linear_dense_total": 4718592,
+                "linear_nnz": 2509824,
+                "linear_total": 7077888,
+                "nnz": 2516642,
+                "total": 7087296
+            },
+            "7": {
+                "linear_attention_nnz": 1376256,
+                "linear_attention_total": 2359296,
+                "linear_dense_nnz": 563712,
+                "linear_dense_total": 4718592,
+                "linear_nnz": 1939968,
+                "linear_total": 7077888,
+                "nnz": 1946287,
+                "total": 7086912
+            },
+            "8": {
+                "linear_attention_nnz": 1769472,
+                "linear_attention_total": 2359296,
+                "linear_dense_nnz": 282624,
+                "linear_dense_total": 4718592,
+                "linear_nnz": 2052096,
+                "linear_total": 7077888,
+                "nnz": 2058616,
+                "total": 7087296
+            },
+            "9": {
+                "linear_attention_nnz": 1179648,
+                "linear_attention_total": 2359296,
+                "linear_dense_nnz": 201216,
+                "linear_dense_total": 4718592,
+                "linear_nnz": 1380864,
+                "linear_total": 7077888,
+                "nnz": 1386755,
+                "total": 7086720
+            }
+        },
+        "linear_nnz": 25746432,
+        "linear_sparsity": 69.68677662037037,
+        "linear_total": 84934656,
+        "nnz": 49662908,
+        "pruned_heads": {
+            "0": [0, 2, 4, 5, 6],
+            "1": [0, 2, 3, 5, 6, 7, 8],
+            "10": [1, 2, 4, 5, 6, 7, 8],
+            "11": [0, 5, 7, 8, 10, 11],
+            "2": [8, 4, 7],
+            "3": [2, 4, 6],
+            "4": [1, 2],
+            "5": [1, 2, 6, 7, 11],
+            "6": [3, 10, 2],
+            "7": [1, 3, 6, 7, 11],
+            "8": [0, 3, 4],
+            "9": [1, 4, 5, 7, 9, 10]
+        },
+        "total": 108882626,
+        "total_sparsity": 54.38858353765275
+    },
+    "training_args": {
+        "_n_gpu": -1,
+        "adafactor": false,
+        "adam_beta1": 0.9,
+        "adam_beta2": 0.999,
+        "adam_epsilon": 1e-08,
+        "dataloader_drop_last": false,
+        "dataloader_num_workers": 0,
+        "dataloader_pin_memory": true,
+        "ddp_find_unused_parameters": null,
+        "debug": false,
+        "deepspeed": null,
+        "disable_tqdm": false,
+        "do_eval": 1,
+        "do_predict": false,
+        "do_train": 1,
+        "eval_accumulation_steps": null,
+        "eval_steps": 5000,
+        "evaluation_strategy": "steps",
+        "fp16": false,
+        "fp16_backend": "auto",
+        "fp16_full_eval": false,
+        "fp16_opt_level": "O1",
+        "gradient_accumulation_steps": 1,
+        "greater_is_better": null,
+        "group_by_length": false,
+        "ignore_data_skip": false,
+        "label_names": null,
+        "label_smoothing_factor": 0.0,
+        "learning_rate": 3e-05,
+        "length_column_name": "length",
+        "load_best_model_at_end": false,
+        "local_rank": -1,
+        "logging_dir": "/data_2to/devel_data/nn_pruning/output/squad_test_9_fullpatch6/",
+        "logging_first_step": false,
+        "logging_steps": 250,
+        "logging_strategy": "steps",
+        "lr_scheduler_type": "linear",
+        "max_grad_norm": 1.0,
+        "max_steps": -1,
+        "metric_for_best_model": null,
+        "mp_parameters": "",
+        "no_cuda": false,
+        "num_train_epochs": 20,
+        "optimize_model_before_eval": "disabled",
+        "output_dir": "/data_2to/devel_data/nn_pruning/output/squad_test_9_fullpatch6/",
+        "overwrite_output_dir": 1,
+        "past_index": -1,
+        "per_device_eval_batch_size": 8,
+        "per_device_train_batch_size": 16,
+        "per_gpu_eval_batch_size": null,
+        "per_gpu_train_batch_size": null,
+        "prediction_loss_only": false,
+        "remove_unused_columns": true,
+        "report_to": null,
+        "run_name": "/data_2to/devel_data/nn_pruning/output/squad_test_9_fullpatch6/",
+        "save_steps": 5000,
+        "save_strategy": "steps",
+        "save_total_limit": 50,
+        "seed": 17,
+        "sharded_ddp": "",
+        "skip_memory_metrics": false,
+        "tpu_metrics_debug": false,
+        "tpu_num_cores": null,
+        "warmup_ratio": 0.0,
+        "warmup_steps": 5400,
+        "weight_decay": 0.0
+    }
+}
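The aggregate sparsity figures in the `stats` section of `model_info.json` can be cross-checked from the non-zero counts. The sketch below is not part of the commit; it assumes that sparsity is defined as `100 * (1 - nnz / total)`, which is consistent with the stored values, and it copies the per-layer `linear_nnz` counts from the file. The helper name `sparsity_percent` is made up for illustration.

```python
# Per-layer "linear_nnz" values copied from stats.layers in model_info.json.
layer_linear_nnz = {
    "0": 2110464, "1": 1797120, "2": 2790912, "3": 2734080,
    "4": 2801664, "5": 2233344, "6": 2509824, "7": 1939968,
    "8": 2052096, "9": 1380864, "10": 1488384, "11": 1907712,
}

# Aggregate counts copied from the top level of the "stats" section.
linear_nnz, linear_total = 25746432, 84934656
nnz, total = 49662908, 108882626


def sparsity_percent(nonzero: int, size: int) -> float:
    """Assumed definition: percentage of entries that are zero."""
    return 100.0 * (1.0 - nonzero / size)


# The per-layer counts add up to the reported aggregate.
summed_linear_nnz = sum(layer_linear_nnz.values())

linear_sparsity = sparsity_percent(linear_nnz, linear_total)
total_sparsity = sparsity_percent(nnz, total)

print(f"linear layers: {linear_sparsity:.2f}% sparse")
print(f"whole model:   {total_sparsity:.2f}% sparse")
```

Under this definition, the linear layers keep roughly 30% of their weights and the whole model roughly 45%, matching the densities quoted in the README.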
training/data_args.json
ADDED
|
@@ -0,0 +1,16 @@
+{
+    "dataset_cache_dir": "dataset_cache",
+    "dataset_config_name": null,
+    "dataset_name": "squad",
+    "doc_stride": 128,
+    "max_answer_length": 30,
+    "max_seq_length": 384,
+    "n_best_size": 20,
+    "null_score_diff_threshold": 0.0,
+    "overwrite_cache": 0,
+    "pad_to_max_length": true,
+    "preprocessing_num_workers": null,
+    "train_file": null,
+    "validation_file": null,
+    "version_2_with_negative": false
+}
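To illustrate how `doc_stride` and `max_seq_length` interact, here is a minimal pure-Python sketch of splitting a long context into overlapping windows. It assumes `doc_stride` is the number of tokens shared between consecutive windows, as in the Transformers question-answering example scripts; the function name `window_spans` and the context budget of 350 tokens (384 minus a hypothetical question length and special tokens) are illustrative assumptions, not values from this commit.

```python
from typing import List, Tuple


def window_spans(n_context_tokens: int, context_budget: int,
                 doc_stride: int) -> List[Tuple[int, int]]:
    """Return (start, end) token spans covering the context.

    Consecutive spans overlap by `doc_stride` tokens, so each new window
    starts `context_budget - doc_stride` tokens after the previous one.
    """
    if doc_stride >= context_budget:
        raise ValueError("doc_stride must be smaller than the context budget")
    step = context_budget - doc_stride
    spans = []
    start = 0
    while True:
        end = min(start + context_budget, n_context_tokens)
        spans.append((start, end))
        if end == n_context_tokens:
            break
        start += step
    return spans


# Hypothetical 600-token context with a 350-token budget per window.
spans = window_spans(600, context_budget=350, doc_stride=128)
print(spans)
```

An answer that straddles the end of one window still appears whole in the next one as long as it is shorter than the overlap, which is why the stride is usually much larger than `max_answer_length`.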
training/model_args.json
ADDED
|
@@ -0,0 +1,7 @@
+{
+    "cache_dir": null,
+    "config_name": null,
+    "model_name_or_path": "bert-base-uncased",
+    "tokenizer_name": null,
+    "use_fast_tokenizer": true
+}
training/sparse_args.json
ADDED
|
@@ -0,0 +1,36 @@
+{
+    "ampere_pruning_method": "disabled",
+    "attention_block_cols": 32,
+    "attention_block_rows": 32,
+    "attention_lambda": 1.0,
+    "attention_output_with_dense": 0,
+    "attention_pruning_method": "sigmoied_threshold",
+    "bias_mask": true,
+    "dense_block_cols": 1,
+    "dense_block_rows": 1,
+    "dense_lambda": 1.0,
+    "dense_pruning_method": "sigmoied_threshold:1d_alt",
+    "distil_alpha_ce": 0.1,
+    "distil_alpha_teacher": 0.9,
+    "distil_teacher_name_or_path": "bert-large-uncased-whole-word-masking-finetuned-squad",
+    "distil_temperature": 2.0,
+    "final_ampere_temperature": 20.0,
+    "final_finetune": false,
+    "final_threshold": 0.1,
+    "final_warmup": 10,
+    "gelu_patch": 1,
+    "gelu_patch_steps": 50000,
+    "initial_ampere_temperature": 0.0,
+    "initial_threshold": 0,
+    "initial_warmup": 1,
+    "layer_norm_patch": 1,
+    "layer_norm_patch_start_delta": 0.99,
+    "layer_norm_patch_steps": 50000,
+    "linear_min_parameters": 0,
+    "mask_init": "constant",
+    "mask_scale": 0.0,
+    "mask_scores_learning_rate": 0.01,
+    "regularization": "l1",
+    "regularization_final_lambda": 10,
+    "rewind_model_name_or_path": "madlag/bert-base-uncased-squadv1-x1.96-f88.3-d27-hybrid-filled-opt-v1"
+}
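The `distil_alpha_ce`, `distil_alpha_teacher` and `distil_temperature` values above suggest a standard knowledge-distillation objective. The sketch below shows one common formulation in pure Python: a weighted sum of the cross-entropy against the gold label and a temperature-scaled KL divergence against the teacher. Whether nn_pruning combines the terms exactly this way (in particular the `temperature ** 2` factor) is an assumption; the function names are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]


def distillation_loss(student_logits: List[float],
                      teacher_logits: List[float],
                      label: int,
                      alpha_ce: float = 0.1,
                      alpha_teacher: float = 0.9,
                      temperature: float = 2.0) -> float:
    """Assumed form: alpha_ce * CE + alpha_teacher * T^2 * KL(teacher || student)."""
    # Hard-label cross-entropy on the unscaled student distribution.
    ce = -math.log(softmax(student_logits)[label])
    # Soft-label KL divergence on temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0)
    return alpha_ce * ce + alpha_teacher * kl * temperature ** 2


# Toy logits over four candidate answer positions (made-up numbers).
student = [2.0, 0.5, -1.0, 0.1]
teacher = [3.0, 0.2, -2.0, 0.0]
loss = distillation_loss(student, teacher, label=0)
print(f"distillation loss: {loss:.4f}")
```

With weights of 0.9 and 0.1, the student is trained mostly to imitate the teacher named in `distil_teacher_name_or_path`, with the gold labels acting as a small correction.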
training/training_args.bin
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cbb5a04ceb10745ffd9468f13ee72ea9540da473e8a7efa3bb1aaf19e8880b75
+size 1967