---
license: apache-2.0
language:
- zh
base_model:
- hfl/chinese-lert-base
tags:
- punctuation-restoration
---
<div align="center">
<h1>FireRedChat-punc</h1>
</div>
<div align="center">
<a href="https://fireredteam.github.io/demos/firered_chat/">Demo</a> •
<a href="https://arxiv.org/pdf/2509.06502">FireRedChat Paper</a> •
<a href="https://huggingface.co/FireRedTeam">Huggingface</a>
</div>

## Description

FireRedChat-punc is a punctuation restoration model fine-tuned from `hfl/chinese-lert-base`. It is designed primarily for post-processing transcripts produced by [FireRedASR](https://github.com/FireRedTeam/FireRedASR).

The model restores four punctuation marks: `,`, `。`, `?`, and `!`. It supports both Chinese and English text and improves the readability of transcribed text.
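
Restored punctuation also makes a transcript easier to segment downstream. As an illustration only, the sketch below splits already-punctuated text into sentences at the sentence-ending marks listed above. The function name `split_sentences` is hypothetical and not part of this model or RedPost, and including the full-width forms `？` and `！` is an assumption made because Chinese text commonly uses them.

```python
import re

# Sentence-ending marks: 。 ? ! from the list above,
# plus the full-width forms ？ ！ (an assumption, see the note above).
_SENTENCE = re.compile(r"[^。?!？！]+[。?!？！]*")


def split_sentences(text: str) -> list:
    """Split punctuated text into sentences, keeping the closing mark."""
    # Commas are not sentence boundaries, so they stay inside each sentence.
    return [s.strip() for s in _SENTENCE.findall(text) if s.strip()]


if __name__ == "__main__":
    for sentence in split_sentences("今天天气不错。你吃饭了吗?Great!"):
        print(sentence)
```

The helper operates on text that has already been punctuated; it does not call the model itself.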

## Roadmap

- [x] 2025/09
    - [x] Release the fine-tuned punctuation restoration model.

## Usage

The RedPost source code is available on [GitHub](https://github.com/FireRedTeam/FireRedChat/tree/main/fireredasr-server/server/redpost).

The example below shows how to use the FireRedChat-punc model for punctuation restoration. First, clone the base model `hfl/chinese-lert-base` into the `FireRedChat-punc` directory:

```bash
git clone https://huggingface.co/hfl/chinese-lert-base FireRedChat-punc/chinese-lert-base
```

Then load the model with RedPost and process the text:

```python
import re

from redpost import RedPost, RedPostConfig

# Local directory that holds the FireRedChat-punc files and the cloned base model
punc_model_dir = "./FireRedChat-punc"
post_config = RedPostConfig(
    use_gpu=True,
    sentence_max_length=30
)
post_model = RedPost.from_pretrained(punc_model_dir, post_config)

# Example unpunctuated input; replace it with your own ASR transcript
text = "你好请问今天天气怎么样"
batch_post_results = post_model.process([text], ["text"])
text = "".join([r["punc_text"] for r in batch_post_results])
# Remove unknown-token placeholders from the output
text = re.sub(r"<unk>|<UNK>|\[unk\]|\[UNK\]", "", text)
print(text)
```
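
The final cleanup step in the example above can be factored into a small reusable helper. The following is a minimal sketch: the function name `strip_unk_tokens` is hypothetical and not part of RedPost, and it applies the same regular expression as the example.

```python
import re

# Hypothetical helper (not part of RedPost).
# The pattern is the one used in the usage example: <unk>, <UNK>, [unk], [UNK].
_UNK_PATTERN = re.compile(r"<unk>|<UNK>|\[unk\]|\[UNK\]")


def strip_unk_tokens(text: str) -> str:
    """Remove unknown-token placeholders from punctuated text."""
    return _UNK_PATTERN.sub("", text)


if __name__ == "__main__":
    print(strip_unk_tokens("你好<unk>,今天[UNK]天气不错。"))
```

Compiling the pattern once avoids rebuilding it for every transcript when the helper runs over a large batch.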

## Use with FireRedASR

This punctuation restoration model can be used together with FireRedASR. Refer to the [fireredasr-server README](https://github.com/FireRedTeam/FireRedChat/blob/main/fireredasr-server/README.md) for setup instructions. The server source code is at https://github.com/FireRedTeam/FireRedChat/tree/main/fireredasr-server.

## License

The model and source code are licensed under the Apache-2.0 license.

### Acknowledgment

- Base model: `hfl/chinese-lert-base` (license: apache-2.0)
- Designed for integration with [FireRedASR](https://github.com/FireRedTeam/FireRedASR).