diaslmb committed on
Commit fa0f4cd (verified)
1 Parent(s): 8287f40

Update README.md

Files changed (1)
  1. README.md +32 -5
README.md CHANGED
@@ -1,10 +1,37 @@
1
  ---
2
- title: README
3
- emoji: 🐠
4
- colorFrom: pink
5
  colorTo: indigo
6
- sdk: static
7
  pinned: false
 
8
  ---
9
 
10
- Edit this `README.md` markdown file to author your organization card.
1
  ---
2
+ title: Inflexion Lab
3
+ emoji: 🚀
4
+ colorFrom: blue
5
  colorTo: indigo
6
+ sdk: docker
7
  pinned: false
8
+ app_file: app.py
9
  ---
10
 
11
+ # Inflexion Lab
12
+
13
+ **Advancing State-of-the-Art NLP for the Kazakh Language**
14
+
15
+ Inflexion Lab is an AI research and development group dedicated to solving the challenges of low-resource language processing. Our primary focus is building robust, industrial-grade Natural Language Processing (NLP) and Automatic Speech Recognition (ASR) systems for the Kazakh language and the Central Asian region.
16
+
17
+ ### 🎯 Our Mission
18
+ To bridge the digital divide for the Kazakh language by developing open-source models, datasets, and tools that enable seamless human-AI interaction. We combine advanced Deep Learning techniques with linguistic precision to create models that truly understand the nuances of the language.
19
+
20
+ ### 🔬 Key Areas of Research
21
+ * **Automatic Speech Recognition (ASR):** Fine-tuning large-scale models (Whisper) for mixed-language environments (Kazakh/Russian).
22
+ * **Data Engineering:** Syntactic restructuring of raw speech corpora using LLMs (Gemma, Llama) to create high-quality training data.
23
+ * **Large Language Models (LLMs):** Adapting and aligning foundation models for Turkic languages.
24
+
25
+ ### 👥 Team
26
+ We are a team of engineers and researchers passionate about AI infrastructure and linguistics.
27
+
28
+ * **Askhat Sabitkhanov**
29
+ * **Dias Ilyas**
30
+ * **Sergey Klimov**
31
+
32
+ ### 🚀 Featured Projects
33
+ * **Sybyrla (Whisper Large V3):** A robust ASR model achieving ~12% WER on the KSC2 benchmark, optimized for code-switching.
34
+ * **KSC2 Structured:** An enhanced version of the ISSAI KSC2 corpus with punctuation and capitalization restored via LLM post-processing.
35
+
36
+ ---
37
+ *Open Science. Open Source. Inflexion.*
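
For readers unfamiliar with the WER (word error rate) metric quoted for Sybyrla in the added README, the sketch below shows one common way it is computed: the word-level edit distance between a reference transcript and a model hypothesis, divided by the number of reference words. The commit does not say how the ~12% figure was measured, so the function name `word_error_rate`, the whitespace tokenization, and the absence of any casing or punctuation normalization are all assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenization is plain whitespace splitting, and no
    text normalization is applied (both are assumptions, not the authors'
    documented evaluation setup).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Because punctuation and capitalization count as word mismatches under this definition, a corpus with restored punctuation (such as the KSC2 Structured project mentioned above) would typically be scored after an agreed normalization step; which normalization applies here is not stated in the commit.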