Update model card with new paper, add pipeline tag and library_name

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +54 -2
README.md CHANGED
@@ -1,14 +1,39 @@
1
  ---
2
- license: cc-by-sa-4.0
3
  datasets:
4
  - Homie0609/MatchTime
5
  language:
6
  - en
 
7
  tags:
8
  - sports
9
  - soccer
 
 
10
  ---
11
 
12
  ## Requirements
13
  - Python >= 3.8 (we recommend [Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html))
14
  - [PyTorch >= 2.0.0](https://pytorch.org/) (if using an A100)
@@ -53,6 +78,8 @@ with the format of features is adjusted by
53
  ```
54
  python ./features/preprocess.py directory_path_of_feature
55
  ```
 
 
56
  After preparing the data and features, you can pre-train (or fine-tune) with the following terminal command (check the hyper-parameters at the bottom of *train.py*):
57
  ```
58
  python train.py
@@ -134,4 +161,29 @@ python ./evaluation/scoer_single.py --csv_path ./inference_result/sample.csv
134
  python ./evaluation/scoer_group.py
135
  # for gpt score (need OpenAI API Key)
136
  python ./evaluation/scoer_gpt.py ./inference_result/sample.csv
137
- ```
1
  ---
 
2
  datasets:
3
  - Homie0609/MatchTime
4
  language:
5
  - en
6
+ license: cc-by-sa-4.0
7
  tags:
8
  - sports
9
  - soccer
10
+ pipeline_tag: video-text-to-text
11
+ library_name: transformers
12
  ---
13
 
14
+ # Commentary Generation for Soccer Highlights
15
+
16
+ This repository contains the code and model for **Commentary Generation for Soccer Highlights**, as presented in our paper:
17
+
18
+ **[Commentary Generation for Soccer Highlights](https://huggingface.co/papers/2508.07543)**
19
+
20
+ ## Abstract
21
+ Automated soccer commentary generation has evolved from template-based systems to advanced neural architectures, aiming to produce real-time descriptions of sports events. While frameworks like SoccerNet-Caption laid foundational work, their inability to achieve fine-grained alignment between video content and commentary remains a significant challenge. Recent efforts such as MatchTime, with its MatchVoice model, address this issue through coarse and fine-grained alignment techniques, achieving improved temporal synchronization. In this paper, we extend MatchVoice to commentary generation for soccer highlights using the GOAL dataset, which emphasizes short clips over entire games. We conduct extensive experiments to reproduce the original MatchTime results and evaluate our setup, highlighting the impact of different training configurations and hardware limitations. Furthermore, we explore the effect of varying window sizes on zero-shot performance. While MatchVoice exhibits promising generalization capabilities, our findings suggest the need for integrating techniques from broader video-language domains to further enhance performance.
22
+
23
+ <div align="center">
24
+
25
+ [▶️Demo Video (YouTube)](https://www.youtube.com/watch?v=E3RxHR-M6y0) [▶️Demo Video (bilibili)](https://www.bilibili.com/video/BV1L4421U76m) · [🏠Project Page](https://haoningwu3639.github.io/MatchTime/) · [💻Code](https://github.com/Homie0609/MatchTime) · [📝Original Paper (MatchTime)](https://arxiv.org/abs/2406.18530/) · [📊Dataset](https://drive.google.com/drive/folders/14tb6lV2nlTxn3VygwAPdmtKm7v0Ss8wG) · [📥Checkpoint](https://huggingface.co/Homie0609/MatchVoice)
26
+
27
+ </div>
28
+
29
+ <div align="center">
30
+ <img src="https://github.com/Homie0609/MatchTime/raw/main/assets/teaser.png">
31
+ </div>
32
+
33
+ <div align="center">
34
+ <img src="https://github.com/Homie0609/MatchTime/raw/main/assets/commentary.png">
35
+ </div>
36
+
37
  ## Requirements
38
  - Python >= 3.8 (we recommend [Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html))
39
  - [PyTorch >= 2.0.0](https://pytorch.org/) (if using an A100)
 
78
  ```
79
  python ./features/preprocess.py directory_path_of_feature
80
  ```
81
+ The example above shows the format of the Baidu features. In our experiments we also used the ResNET_PCA_512 and C3D_PCA_512 features from the official website. If you want to use [CLIP](https://github.com/openai/CLIP) (2 FPS) or [InternVideo](https://github.com/OpenGVLab/InternVideo/tree/main/InternVideo1) (1 FPS) features, you can follow their official websites to extract them, or contact us for the features.
82
+
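Because the features are extracted at different frame rates (2 FPS for CLIP, 1 FPS for InternVideo), the same time window maps to a different number of feature vectors depending on the backbone. The sketch below is only an illustration of that arithmetic. The function name `window_to_indices`, the `FEATURE_FPS` table, and the convention that the window extends `window_s` seconds on each side of the timestamp are assumptions made here, not taken from the MatchTime code, whose data loader may slice features differently.

```python
from typing import Tuple

# Extraction rates quoted in the feature note; the dict itself is a
# hypothetical helper, not part of the repository.
FEATURE_FPS = {"CLIP": 2.0, "InternVideo": 1.0}


def window_to_indices(
    timestamp_s: float, window_s: float, fps: float, num_features: int
) -> Tuple[int, int]:
    """Return the half-open [start, end) range of feature indices covering
    `window_s` seconds on each side of `timestamp_s`, clamped to the clip.

    Assumption: feature i describes the video at time i / fps seconds.
    """
    if fps <= 0 or window_s < 0 or num_features < 0:
        raise ValueError("fps must be positive; window_s and num_features non-negative")
    start = int(round((timestamp_s - window_s) * fps))
    end = int(round((timestamp_s + window_s) * fps))
    # Clamp both ends so the range never leaves [0, num_features].
    start = max(0, min(start, num_features))
    end = max(start, min(end, num_features))
    return start, end


# A 15 s window around the one-minute mark of a 45-minute half.
clip_range = window_to_indices(60.0, 15.0, FEATURE_FPS["CLIP"], 5400)
internvideo_range = window_to_indices(60.0, 15.0, FEATURE_FPS["InternVideo"], 2700)
```

With these assumptions the CLIP window spans 60 feature vectors and the InternVideo window spans 30, so window sizes stated in seconds need to be converted per backbone before comparing runs.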
83
  After preparing the data and features, you can pre-train (or fine-tune) with the following terminal command (check the hyper-parameters at the bottom of *train.py*):
84
  ```
85
  python train.py
 
161
  python ./evaluation/scoer_group.py
162
  # for gpt score (need OpenAI API Key)
163
  python ./evaluation/scoer_gpt.py ./inference_result/sample.csv
164
+ ```
165
+
166
+ ## Citation
167
+ If you use this code for your research or project, please cite:
168
+
169
+ ```bibtex
170
+ @article{rao2024matchtimeautomaticsoccergame,
171
+ title={MatchTime: Towards Automatic Soccer Game Commentary Generation},
172
+ author={Jiayuan Rao and Haoning Wu and Chang Liu and Yanfeng Wang and Weidi Xie},
173
+ year={2024},
174
+ journal={arXiv preprint arXiv:2406.18530},
175
+ }
176
+
177
+ @article{rao2024commentary,
178
+ title={Commentary Generation for Soccer Highlights},
179
+ author={Rao, Jiayuan and Wu, Haoning and Liu, Chang and Wang, Yanfeng and Xie, Weidi},
180
+ journal={arXiv preprint arXiv:2508.07543},
181
+ year={2025},
182
+ }
183
+ ```
184
+
185
+ ## Acknowledgements
186
+ Many thanks to the code bases from [Video-LLaMA](https://github.com/DAMO-NLP-SG/Video-LLaMA) and source data from [SoccerNet-Caption](https://arxiv.org/abs/2304.04565).
187
+
188
+ ## Contact
189
+ If you have any questions, please feel free to contact jy_rao@sjtu.edu.cn or haoningwu3639@gmail.com.
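
The metadata fields this change adds to the YAML front matter (`pipeline_tag`, `library_name`) and the relocated `license` field can be sanity-checked before merging. The following is a minimal sketch that uses only the Python standard library. `parse_front_matter` is a hypothetical helper written for this illustration: it handles only flat keys and simple `- item` lists, which is all this card uses, and it is not a general YAML parser or part of any Hugging Face tooling.

```python
from typing import Dict, List, Optional, Union

Value = Union[str, List[str]]


def parse_front_matter(text: str) -> Dict[str, Value]:
    """Parse a flat front-matter block delimited by '---' lines.

    Supports 'key: value' scalars and 'key:' followed by '- item' lines only.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---'")
    meta: Dict[str, Value] = {}
    current_key: Optional[str] = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter: metadata is complete
            return meta
        if not stripped:
            continue
        if stripped.startswith("- "):
            if current_key is None or not isinstance(meta[current_key], list):
                raise ValueError(f"list item without a list key: {stripped!r}")
            meta[current_key].append(stripped[2:].strip())
        else:
            key, sep, value = stripped.partition(":")
            if not sep:
                raise ValueError(f"expected 'key: value', got {stripped!r}")
            current_key = key.strip()
            value = value.strip()
            # An empty value means a list follows on the next lines.
            meta[current_key] = value if value else []
    raise ValueError("missing closing '---'")


# Front matter as it appears on the new side of this diff.
CARD = """---
datasets:
- Homie0609/MatchTime
language:
- en
license: cc-by-sa-4.0
tags:
- sports
- soccer
pipeline_tag: video-text-to-text
library_name: transformers
---

# Commentary Generation for Soccer Highlights
"""

meta = parse_front_matter(CARD)
missing = [k for k in ("license", "pipeline_tag", "library_name") if k not in meta]
```

An empty `missing` list indicates that all three fields are present. For real cards with nested or quoted values, a full YAML parser would be the safer choice.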