Update src/about.py
src/about.py +13 -3
src/about.py
CHANGED
@@ -37,7 +37,7 @@ TITLE = f"""
 INTRODUCTION_TEXT = """
 Persian LLM Leaderboard is designed to be a challenging benchmark and provide a reliable evaluation of LLMs in the Persian language.
 
-Note: This is a demo version of the leaderboard.
+Note: This is a demo version of the leaderboard. Two new benchmarks are introduced: *PeKA* and *PersBETS*, challenging the native knowledge of the models along with
 linguistic skills and their level of bias, ethics, and trustworthiness. **These datasets are not yet public, but they will be uploaded onto huggingface along with a detailed paper
 explaining the data and performance of relevant models.**
 
@@ -54,7 +54,13 @@ To reproduce our results, here are the commands you can run:
 """
 
 EVALUATION_QUEUE_TEXT = """
-
+
+Right now, the models added **are not automatically evaluated**. We may support automatic evaluation in the future on our own clusters.
+An evaluation framework will be available in the future to help reproduce the results.
+
+## Don't forget to read the FAQ and the About tabs for more information!
+
+## First steps before submitting a model
 
 ### 1) Make sure you can load your model and tokenizer using AutoClasses:
 ```python
@@ -79,7 +85,11 @@ When we add extra information about models to the leaderboard, it will be automa
 ## In case of model failure
 If your model is displayed in the `FAILED` category, its execution stopped.
 Make sure you have followed the above steps first.
-
+
+### 5) Select the correct precision
+Not all models are converted properly from `float16` to `bfloat16`, and selecting the wrong precision can sometimes cause evaluation errors (as loading a `bf16` model in `fp16` can sometimes generate NaNs, depending on the weight range).
+
+
 """
 
 CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
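The second hunk cuts off at the opening ```python fence, so the AutoClass snippet itself is not visible in this diff. A minimal sketch of what such a load check typically looks like, assuming the standard transformers AutoClass API (the model id and revision below are placeholders, not values from this commit):

```python
from transformers import AutoConfig, AutoModel, AutoTokenizer

# Placeholder identifiers; substitute the model you intend to submit.
MODEL = "your-org/your-model"
REVISION = "main"

# If any of these calls fail, the evaluation backend will not be able
# to load the submission either.
config = AutoConfig.from_pretrained(MODEL, revision=REVISION)
model = AutoModel.from_pretrained(MODEL, revision=REVISION)
tokenizer = AutoTokenizer.from_pretrained(MODEL, revision=REVISION)
```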
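The new "Select the correct precision" step warns that loading a `bf16` checkpoint in `fp16` can produce NaNs. A quick way to sanity-check a checkpoint in the precision you plan to request is to load it and scan the weights for non-finite values; this is a hypothetical check, not part of the leaderboard code:

```python
import torch
from transformers import AutoModelForCausalLM

# Placeholder model id; load in the precision you will select on submission.
model = AutoModelForCausalLM.from_pretrained(
    "your-org/your-model", torch_dtype=torch.float16
)

# Flag any parameter tensor containing NaN or Inf after the dtype cast.
bad = [name for name, p in model.named_parameters() if not torch.isfinite(p).all()]
print("non-finite tensors:", bad if bad else "none")
```

If this prints any tensor names, submitting in `bfloat16` (or the checkpoint's native dtype) is the safer choice.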