FireRedTeam committed on
Commit d9f70c4 · verified · 1 Parent(s): 158770a

Update README.md

  1. README.md +45 -3
README.md CHANGED
@@ -4,9 +4,51 @@ language:
  - zh
  base_model:
  - hfl/chinese-lert-base
  ---
- ## FireRedChat-punc
 
- This is a chinese-lert-base model finetuned for punctuation restoration, release mainly for [FireRedASR](https://github.com/FireRedTeam/FireRedASR) postprocessing.
 
- Model restores the following punctuations -- [, 。 ? !]
  - zh
  base_model:
  - hfl/chinese-lert-base
+ tags:
+ - punctuation-restoration
  ---
+ <div align="center">
+ <h1>FireRedChat-punc</h1>
+ </div>
+ <div align="center">
+ <a href="https://fireredteam.github.io/demos/firered_chat/">Demo</a> •
+ <a href="https://arxiv.org/pdf/2509.06502">FireRedChat Paper</a> •
+ <a href="https://huggingface.co/FireRedTeam">Hugging Face</a>
+ </div>
 
+ ## Description
+ FireRedChat-punc is an `hfl/chinese-lert-base` model fine-tuned for punctuation restoration, intended primarily for post-processing [FireRedASR](https://github.com/FireRedTeam/FireRedASR) output.
 
+ The model restores the following punctuation marks: [, 。 ? !]. It supports both Chinese and English text and improves the readability of transcribed text.
+
+ ## Roadmap
+ - [x] 2025/09
+   - [x] Release the fine-tuned punctuation restoration model.
+
+ ## Usage
+
+ The RedPost source code is available on [GitHub](https://github.com/FireRedTeam/FireRedChat/tree/main/fireredasr-server/server/redpost).
+ The example below shows how to use the FireRedChat-punc model for punctuation restoration:
+
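+ ```python
+ import os
+ import re
+
+ from redpost import RedPost, RedPostConfig
+
+ # Local directory holding the FireRedChat-punc model files (assumed to be downloaded already).
+ punc_model_dir = os.path.join("FireRedChat-punc")
+ post_config = RedPostConfig(
+     use_gpu=True,
+     sentence_max_length=30,
+ )
+ post_model = RedPost.from_pretrained(punc_model_dir, post_config)
+
+ # Unpunctuated input; this sample sentence is a placeholder, not part of the original snippet.
+ text = "今天天气怎么样我们去公园走走吧"
+ batch_post_results = post_model.process([text], ["text"])
+ text = "".join([r["punc_text"] for r in batch_post_results])
+ # Strip unknown-token placeholders from the punctuated output.
+ text = re.sub(r"<unk>|<UNK>|\[unk\]|\[UNK\]", "", text)
+ print(text)
+ ```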
+
+ ## License
+ The model and source code are licensed under the Apache-2.0 license.
+
+ ### Acknowledgment
+ - Base model: `hfl/chinese-lert-base` (license: apache-2.0)
+ - Designed for integration with [FireRedASR](https://github.com/FireRedTeam/FireRedASR).
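
Review note: the last step of the added usage snippet, which strips unknown-token placeholders with `re.sub`, can be checked on its own without loading the model. The following is a minimal sketch using only the standard library. The helper name `strip_unk` and the sample strings are hypothetical and are not part of the README or of `redpost`; only the pattern itself is taken from the snippet.

```python
import re

# Same alternatives as the pattern in the README snippet, written as a raw
# string so that the escaped brackets are passed to the regex engine unchanged.
UNK_PATTERN = re.compile(r"<unk>|<UNK>|\[unk\]|\[UNK\]")


def strip_unk(text: str) -> str:
    """Remove unknown-token placeholders (hypothetical helper for illustration)."""
    return UNK_PATTERN.sub("", text)


if __name__ == "__main__":
    # Illustrative input, not real model output.
    print(strip_unk("你好<unk>,世界[UNK]。"))
```

Using a raw string matters here: in a plain string literal, `\[` is an invalid escape sequence that recent Python versions warn about, even though the resulting pattern is the same.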