Upload README.md with huggingface_hub
README.md CHANGED
@@ -15,50 +15,36 @@ library_name: sentence-transformers
 
 # Ko-Qwen: Korean Embedding Model
 
-
+Korean-optimized embedding model based on [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B).
 
-##
+## Model Details
+
+- **Parameters**: 600M
+- **Embedding Dimension**: 1024
+- **Max Sequence Length**: 512 tokens
+- **Language**: Korean (ko)
 
-
+## Usage
 
 ```bash
 pip install -U sentence-transformers
 ```
 
-Then load and use the model:
-
 ```python
 from sentence_transformers import SentenceTransformer
 
 # Load model
-model = SentenceTransformer("
+model = SentenceTransformer("gihong99/qwen3-embedding-ko-v1")
 
 # Encode sentences
-sentences = [
-]
-
+sentences = ["인공지능은 미래를 바꿀 것입니다.", "오늘 날씨가 좋습니다."]
 embeddings = model.encode(sentences)
-print(embeddings.shape) # (3, 1024)
 
 # Compute similarities
 similarities = model.similarity(embeddings, embeddings)
 print(similarities)
 ```
 
-## Training
-
-This model was trained through a 6-stage progressive pipeline
-
-
-## Framework Versions
-
-- Python: 3.11.9
-- Sentence Transformers: 3.3.1
-- Transformers: 4.54.1
-- PyTorch: 2.7.1+cu126
-- Datasets: 4.3.0
-- Tokenizers: 0.21.4
-
 ## License
 
 Apache 2.0
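Note on the usage snippet: to my understanding, `model.similarity` in Sentence Transformers computes cosine similarity unless the model's configuration selects a different function. The sketch below illustrates what the resulting matrix would contain under that assumption. It uses hypothetical 3-dimensional vectors as stand-ins for the 1024-dimensional embeddings, and the helper name `cosine_similarity_matrix` is illustrative, not part of the library.

```python
import math


def cosine_similarity_matrix(vectors):
    """Return the pairwise cosine similarity matrix for a list of vectors."""
    # Euclidean norm of each vector, computed once up front.
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    return [
        [
            # Dot product of u and v divided by the product of their norms.
            sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
            for v, norm_v in zip(vectors, norms)
        ]
        for u, norm_u in zip(vectors, norms)
    ]


# Hypothetical toy vectors, NOT real model output.
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.6, 0.8, 0.0],
]

similarities = cosine_similarity_matrix(toy_embeddings)
for row in similarities:
    print([round(score, 3) for score in row])
```

Each diagonal entry compares a vector with itself, and the off-diagonal entries are symmetric, so for two input sentences only one score in the 2x2 matrix carries new information.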