Update README.md
README.md CHANGED

@@ -67,7 +67,7 @@ We adopt the architecture of FLM-101B as the backbone for Tele-FLM, with several
 - SwiGLU for activation function
 - Linear bias disabled
 - Embedding and language model head untied
-- Input and output
+- Input and output multiplier

 Consequently, Tele-FLM is largely compatible with Llama architecturally.
 To maximize convenience for the community, we made minimal adjustments to Llama's code to adapt it to Tele-FLM and released it as open source.
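The four modifications listed in the diff above can be sketched in a few lines of NumPy. This is an illustrative toy, not Tele-FLM's actual implementation: all dimensions, weight values, and the multiplier values are made-up assumptions, and the real model applies these pieces inside full transformer blocks.

```python
import numpy as np

def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: SiLU(x @ W_gate) * (x @ W_up), projected back down.
    Note there are no bias terms, matching the 'linear bias disabled' choice."""
    silu = lambda z: z / (1.0 + np.exp(-z))
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down

rng = np.random.default_rng(0)
d_model, d_ff, vocab = 8, 16, 32  # hypothetical tiny sizes for illustration

# Untied embedding and LM head: two independent matrices,
# rather than reusing embed.T as the output projection.
embed = rng.standard_normal((vocab, d_model))
lm_head = rng.standard_normal((d_model, vocab))

# Input and output multipliers: scalars scaling the embedding output and the
# pre-softmax logits (placeholder values; the real ones are model-specific).
mult_in, mult_out = 1.0, 1.0

tokens = np.array([3, 7, 1])
h = embed[tokens] * mult_in                      # (3, d_model), scaled input
h = swiglu_ffn(h,
               rng.standard_normal((d_model, d_ff)),
               rng.standard_normal((d_model, d_ff)),
               rng.standard_normal((d_ff, d_model)))
logits = (h @ lm_head) * mult_out                # (3, vocab), scaled output
print(logits.shape)
```

The shape flow (tokens → embedding → SwiGLU FFN → untied head) is the point here; a real forward pass would interleave attention and normalization layers around the FFN.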