---
base_model:
- microsoft/codebert-base
datasets:
- devngho/the-stack-llm-annotations-v2
language:
- code
library_name: transformers
license: mit
metrics:
- f1
---

# devngho/code_edu_classifier-v3-microsoft_codebert-base

This model adds a classification head to [microsoft/codebert-base](https://huggingface.co/microsoft/codebert-base). Intended as a code-focused counterpart to [HuggingFaceFW/fineweb-edu-classifier](https://huggingface.co/HuggingFaceFW/fineweb-edu-classifier), it scores the educational value of code. It was trained on [devngho/the-stack-llm-annotations-v2](https://huggingface.co/datasets/devngho/the-stack-llm-annotations-v2), a dataset of samples extracted from [bigcode/the-stack-dedup](https://huggingface.co/datasets/bigcode/the-stack-dedup) and rated by [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct).

This research was supported with Cloud TPUs from Google's TPU Research Cloud [(TRC)](https://sites.research.google/trc/about/). ⚡

## Details

- **Developed by:** devngho
- **Language(s):** code
- **License:** mit
- **Base model:** [microsoft/codebert-base](https://huggingface.co/microsoft/codebert-base)

## Training details

- learning_rate: 3e-4 (cosine schedule)
- warmup_ratio: 0.1
- batch_size: 2048 (512 × 4)
- optimizer: AdamW (b1=0.9, b2=0.98, eps=1e-8, weight_decay=0.01)
- duration: 4h 41m
- steps: 6080
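
The schedule above can be sketched in pure Python. This is an illustrative reconstruction, assuming linear warmup to the peak rate over the first 10% of steps followed by cosine decay to zero; the exact decay floor used in the actual run is not stated on this card:

```python
import math

# Values taken from the training details above; the schedule shape
# (warmup-then-cosine, decaying to 0) is an assumption.
PEAK_LR = 3e-4
TOTAL_STEPS = 6080
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # warmup_ratio: 0.1 -> 608 steps


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to the peak rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Cosine decay from the peak rate down to 0.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```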

## Training hardware

TPU v4-8

## Performance

```
Validation Report:
              precision    recall  f1-score   support

           0       0.80      0.06      0.10        72
           1       0.62      0.40      0.48       835
           2       0.61      0.62      0.61      2722
           3       0.48      0.72      0.58      1891
           4       0.62      0.02      0.05       623
           5       0.00      0.00      0.00         1

    accuracy                           0.55      6144
   macro avg       0.52      0.30      0.30      6144
weighted avg       0.58      0.55      0.52      6144

Confusion Matrix:
[[   4   36   30    2    0    0]
 [   1  330  464   40    0    0]
 [   0  157 1684  881    0    0]
 [   0    5  516 1361    9    0]
 [   0    0   71  537   15    0]
 [   0    0    0    1    0    0]]
```

Splitting predictions into scores below 3 versus 3 and above gives an F1 score of about 0.72.