salakash committed on
Commit 6edcce8 · verified · 1 parent: 3e7dab3

Upload folder using huggingface_hub

0000100_adapters.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b8968d14e7792a2feebf6e0a346db20bb8b1f7e0bff0d7ce180f128ca7f43fe5
+ size 11754630
LICENSE-THIRD-PARTY.md ADDED
@@ -0,0 +1,116 @@
+ # Third-Party Licenses and Attribution
+
+ This project uses and builds upon the following third-party components:
+
+ ## Base Model
+
+ **Qwen/Qwen2.5-Coder-0.5B-Instruct**
+ - Source: https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B-Instruct
+ - License: Apache License 2.0
+ - Copyright: Qwen Team, Alibaba Cloud
+ - Description: Base language model for code generation
+
+ ### Apache License 2.0 Summary
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
+ ## MLX Model Weights
+
+ **mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit**
+ - Source: https://huggingface.co/mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit
+ - License: Apache License 2.0 (inherited from base model)
+ - Description: MLX-optimized 4-bit quantized version of Qwen2.5-Coder-0.5B-Instruct
+ - Conversion: Community contribution for Apple Silicon optimization
+
+ ## Training Dataset
+
+ **flwrlabs/code-alpaca-20k**
+ - Source: https://huggingface.co/datasets/flwrlabs/code-alpaca-20k
+ - License: Apache License 2.0
+ - Description: Code instruction dataset based on Stanford Alpaca methodology
+ - Size: 20,000 code instruction-following examples
+
+ ## Python Dependencies
+
+ ### MLX-LM
+ - License: MIT License
+ - Description: MLX language model utilities
+ - Source: https://github.com/ml-explore/mlx-lm
+
+ ### Hugging Face Datasets
+ - License: Apache License 2.0
+ - Description: Dataset loading and processing library
+ - Source: https://github.com/huggingface/datasets
+
+ ### Hugging Face Hub
+ - License: Apache License 2.0
+ - Description: Hugging Face Hub client library
+ - Source: https://github.com/huggingface/huggingface_hub
+
+ ### PyYAML
+ - License: MIT License
+ - Description: YAML parser and emitter
+ - Source: https://github.com/yaml/pyyaml
+
+ ## Disclaimers
+
+ ### No Endorsement
+ This project is not endorsed by, affiliated with, or sponsored by:
+ - Qwen Team or Alibaba Cloud
+ - The MLX community
+ - flwrlabs or the code-alpaca-20k dataset authors
+ - Hugging Face
+
+ ### Attribution Requirements
+ When using this model or its derivatives:
+ 1. Maintain attribution to the base model (Qwen2.5-Coder-0.5B-Instruct)
+ 2. Maintain attribution to the training dataset (code-alpaca-20k)
+ 3. Include this license file or equivalent attribution
+ 4. Do not imply endorsement by the original authors
+
+ ### Modifications
+ This project provides:
+ - LoRA adapter weights (fine-tuning on top of the base model)
+ - Training and serving infrastructure
+ - Documentation and usage examples
+
+ This project does NOT redistribute:
+ - Base model weights (users download from the original source)
+ - Complete fine-tuned model weights
+ - Training dataset (users download from the original source)
+
+ ## License Compliance
+
+ All components used in this project are licensed under permissive open-source licenses (Apache-2.0, MIT) that allow:
+ - Commercial use
+ - Modification
+ - Distribution
+ - Private use
+
+ Users must:
+ - Include copyright notices
+ - Include license text
+ - State changes made
+ - Not use trademarks without permission
+
+ ## Full License Texts
+
+ ### Apache License 2.0
+ Full text available at: http://www.apache.org/licenses/LICENSE-2.0
+
+ ### MIT License
+ Full text available at: https://opensource.org/licenses/MIT
+
+ ## Questions
+
+ For questions about licensing or attribution, please open an issue at:
+ https://github.com/salakash/Minimalism/issues
MODEL_CARD.md ADDED
@@ -0,0 +1,167 @@
+ ---
+ license: apache-2.0
+ base_model: Qwen/Qwen2.5-Coder-0.5B-Instruct
+ tags:
+ - code
+ - coding-assistant
+ - mlx
+ - lora
+ - qwen2.5
+ language:
+ - en
+ pipeline_tag: text-generation
+ ---
+ **Developed by Samiya Kashif, Kashif Salahuddin, Rohan Bhangale & Robert Rojek**
+
+ # Minimalism
+
+ Minimalism is a practical coding assistant fine-tuned with LoRA on the code-alpaca-20k dataset. It provides runnable-first responses with structured **Solution**, **Usage**, and **Sanity test** sections.
+
+ ## Model Details
+
+ - **Base Model**: [Qwen/Qwen2.5-Coder-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B-Instruct)
+ - **MLX Weights**: [mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit](https://huggingface.co/mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit)
+ - **Training Dataset**: [flwrlabs/code-alpaca-20k](https://huggingface.co/datasets/flwrlabs/code-alpaca-20k)
+ - **Training Method**: LoRA (Low-Rank Adaptation)
+ - **Framework**: MLX (Apple Silicon optimized)
+ - **License**: Apache-2.0
+
+ ## Intended Use
+
+ Minimalism is designed for:
+ - Code generation and completion
+ - Programming assistance and tutoring
+ - Quick prototyping and examples
+ - Learning programming concepts
+
+ ### Response Format
+
+ When asked for code, Minimalism structures responses with:
+
+ 1. **Solution**: The main implementation
+ 2. **Usage**: A minimal runnable example
+ 3. **Sanity test**: A tiny test snippet (when appropriate)
+
+ This format ensures responses are immediately actionable and testable.
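+
+ For example, asking "Write a Python function to add two numbers" might produce a response organized like this (an illustrative sketch, not verbatim model output):
+
+ ```python
+ # Solution
+ def add(a, b):
+     return a + b
+
+ # Usage
+ print(add(2, 3))  # 5
+
+ # Sanity test
+ assert add(2, 3) == 5
+ ```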
+
+ ## Training Details
+
+ - **Dataset Size**: 2,000 examples (configurable)
+ - **Training Iterations**: 100 (configurable; see `adapter_config.json`)
+ - **LoRA Rank**: 8
+ - **LoRA Alpha**: 16
+ - **Learning Rate**: 2e-5
+ - **Hardware**: Apple Silicon M1 with 32GB RAM
+
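+ A comparable run could be launched with the `mlx_lm.lora` CLI; the following sketch mirrors the values recorded in `adapter_config.json` (flag names may differ across mlx-lm releases, so verify against your installed version):
+
+ ```bash
+ python -m mlx_lm.lora \
+   --model mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit \
+   --train \
+   --data data/training_ready \
+   --iters 100 \
+   --batch-size 4 \
+   --learning-rate 2e-5 \
+   --num-layers 16 \
+   --adapter-path outputs/adapters/dev
+ ```
+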
+ ### Data Processing
+
+ The training data underwent the following steps (sketched below):
+ 1. Secret redaction (API keys, private keys, tokens)
+ 2. Deduplication by content hash
+ 3. Train/validation split (98/2)
+ 4. Deterministic truncation for efficiency
+
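+ A minimal sketch of the redaction and deduplication steps; the regular expression below is illustrative, not the project's actual pattern set:
+
+ ```python
+ import hashlib
+ import re
+
+ # Illustrative credential pattern (hypothetical; the real pipeline's patterns are not published here)
+ SECRET_RE = re.compile(r"(api[_-]?key|token|secret)\s*[:=]\s*\S+", re.IGNORECASE)
+
+ def redact(text: str) -> str:
+     # Replace anything that looks like a credential assignment
+     return SECRET_RE.sub("[REDACTED]", text)
+
+ def dedupe(examples):
+     # Keep the first occurrence of each content hash, redacting as we go
+     seen, out = set(), []
+     for ex in examples:
+         h = hashlib.sha256(ex.encode("utf-8")).hexdigest()
+         if h not in seen:
+             seen.add(h)
+             out.append(redact(ex))
+     return out
+ ```
+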
+ ## Usage
+
+ ### Installation
+
+ ```bash
+ pip install mlx-lm
+ ```
+
+ ### Running the Server
+
+ ```bash
+ python -m mlx_lm.server \
+   --model mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit \
+   --adapter-path salakash/Minimalism \
+   --host 127.0.0.1 \
+   --port 8080
+ ```
+
+ ### API Example
+
+ ```bash
+ curl http://127.0.0.1:8080/v1/chat/completions \
+   -H 'Content-Type: application/json' \
+   -d '{
+     "model": "Minimalism",
+     "messages": [
+       {"role": "user", "content": "Write a Python function to add two numbers"}
+     ],
+     "max_tokens": 256
+   }'
+ ```
+
+ ### Python Example
+
+ ```python
+ from mlx_lm import load, generate
+
+ # Load model with adapter
+ model, tokenizer = load(
+     "mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit",
+     adapter_path="salakash/Minimalism"
+ )
+
+ # Generate response
+ prompt = "Write a Python function to reverse a string"
+ response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
+ print(response)
+ ```
+
+ ## Limitations
+
+ - **Model Size**: 0.5B parameters; suitable for quick tasks but not complex reasoning
+ - **Context Length**: Limited by the base model's context window
+ - **Domain**: Primarily trained on Python code examples
+ - **Hardware**: Optimized for Apple Silicon; may not perform optimally on other platforms
+ - **Accuracy**: May generate incorrect or insecure code; always review outputs
+
+ ## Ethical Considerations
+
+ - **Code Review**: Always review generated code before use in production
+ - **Security**: Do not use for security-critical applications without thorough review
+ - **Bias**: May reflect biases present in training data
+ - **Attribution**: Generated code should be reviewed for licensing implications
+
+ ## Attribution
+
+ This model is built upon:
+
+ 1. **Base Model**: Qwen/Qwen2.5-Coder-0.5B-Instruct
+    - License: Apache-2.0
+    - Authors: Qwen Team, Alibaba Cloud
+    - No endorsement by original authors is implied
+
+ 2. **MLX Conversion**: mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit
+    - Converted for Apple Silicon optimization
+    - Community contribution
+
+ 3. **Training Dataset**: flwrlabs/code-alpaca-20k
+    - License: Apache-2.0
+    - Based on Stanford Alpaca methodology
+    - No endorsement by dataset authors is implied
+
+ ## Citation
+
+ If you use Minimalism in your research or applications, please cite:
+
+ ```bibtex
+ @misc{minimalism2024,
+   title={Minimalism: A Practical Coding Assistant},
+   author={Kashif Salahuddin},
+   year={2024},
+   publisher={Hugging Face},
+   howpublished={\url{https://huggingface.co/salakash/Minimalism}}
+ }
+ ```
+
+ ## Contact
+
+ - Repository: [github.com/salakash/Minimalism](https://github.com/salakash/Minimalism)
+ - Issues: [github.com/salakash/Minimalism/issues](https://github.com/salakash/Minimalism/issues)
+
+ ## Disclaimer
+
+ This adapter is provided "as is" without warranty. The authors are not responsible for any damages or issues arising from its use. Always review and test generated code before deployment.
README.md ADDED
@@ -0,0 +1,243 @@
+ ---
+ language:
+ - en
+ license: apache-2.0
+ base_model: Qwen/Qwen2.5-Coder-0.5B-Instruct
+ tags:
+ - code
+ - coding-assistant
+ - lora
+ - mlx
+ - apple-silicon
+ - qwen2.5
+ datasets:
+ - flwrlabs/code-alpaca-20k
+ - m-a-p/Code-Feedback
+ library_name: mlx-lm
+ pipeline_tag: text-generation
+ ---
+ **Developed by Samiya Kashif, Kashif Salahuddin & Rohan Bhangale**
+ ## Executive Summary
+
+ **Minimalism** is a specialized coding assistant built as a LoRA (Low-Rank Adaptation) adapter for the Qwen2.5-Coder-0.5B-Instruct base model. Unlike generic coding assistants, Minimalism implements a "runnable-first" philosophy: when users request code, responses are structured with clear **Solution**, **Usage**, and **Sanity test** sections, ensuring developers receive immediately executable code with minimal friction.
+
+ ### What Minimalism Is
+
+ - **A LoRA adapter** trained on the code-alpaca-20k dataset
+ - **OpenAI-compatible API** for local inference
+ - **Lightweight distribution** (~12MB adapter vs. multi-GB full models)
+ - **Production-engineered** with automated pipelines, evaluation, and publishing
+
+ ## Why Minimalism
+
+ Minimalism is built for a simple, practical goal: **deliver the same outcome with fewer lines of code**.
+
+ Most coding assistants tend to "over-achieve" by producing large, multi-step solutions, even when a smaller, clearer implementation would do. That extra code isn't free: it increases review effort, maintenance cost, and the surface area where defects can hide.
+
+ **Too much code, too fast.** Many teams report a sharp jump in lines of code (LOC): developers, from interns to seniors, are suddenly writing **5 to 7 times more** code than before. At first, it looks like higher productivity. In reality, it often means more bugs.
+
+ There's a long-standing rule in software engineering:
+
+ > "The more lines of code you have, the higher your probability of introducing bugs."
+
+ And AI-generated code tends to be **verbose and repetitive**, which can inflate LOC without adding real value.
+
+ Minimalism is designed for teams that value **minimalism, clarity, and correctness** over volume.
+
+
+ ### What makes Minimalism different
+
+ * **Minimal LoC by default**
+   Minimalism is optimized to **minimize lines of code while preserving behavior**: it prefers the smallest correct solution that meets the user's objective.
+
+ * **Internal governance behavior**
+   The model follows a lightweight internal "governance layer" in its response style: avoid unnecessary scaffolding, avoid over-abstraction, keep code focused, and don't introduce complexity that doesn't improve the result. The governance layer sits between the user request and the model's final output to enforce **minimalism as a constraint**. It evaluates candidate solutions by measuring **lines of code** and selects the smallest implementation that still satisfies the original requirements. If a shorter variant fails, it automatically falls back to the next-smallest passing candidate, ensuring fewer lines **without sacrificing correctness** (a sketch of this loop follows this list).
+
+ * **Practical, runnable output**
+   When you ask for code, Minimalism is tuned toward "runnable-first" answers: a clear implementation, a minimal usage example, and a quick sanity check when appropriate.
+
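+ A minimal sketch of that selection loop, assuming a hypothetical list of candidate implementations and a `passes_tests` callback (neither is part of the published artifacts):
+
+ ```python
+ def pick_minimal(candidates, passes_tests):
+     """Return the shortest candidate that still passes its tests."""
+     def loc(code):
+         # Count non-empty, non-comment lines
+         return sum(1 for line in code.splitlines()
+                    if line.strip() and not line.strip().startswith("#"))
+     for code in sorted(candidates, key=loc):  # smallest first
+         if passes_tests(code):
+             return code
+         # otherwise fall back to the next-smallest candidate
+     return None  # no candidate passed
+ ```
+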
+ ### Early validation
+
+ Minimalism was evaluated in a small developer study comparing it with popular coding models on a shared set of tasks. In this pilot, Minimalism showed a **clear reduction in lines of code (up to ~30%)** while producing solutions that **executed correctly and achieved the same intended outcomes** under the evaluation harness.
+
+ > Note: Results depend on task selection, constraints, and how "equivalence" is measured. We recommend validating on your own codebase and standards.
+
+
+ ### Why It Exists
+
+ Developers need coding assistance that:
+ 1. Provides **runnable code immediately** without extensive explanation
+ 2. Runs **locally** without cloud dependencies
+ 3. Maintains a **small footprint** for fast iteration
+ 4. Offers **structured, predictable responses** for automation
+
+ ### Who It's For
+
+ - **Individual developers** working on personal projects
+ - **Small teams** needing local, private coding assistance
+ - **Educators** teaching programming with consistent code examples
+ - **Researchers** experimenting with LoRA fine-tuning on MLX
+
+
+ ## Quick Start
+
+ ### Option 1: Use with MLX
+
+ Install MLX and load the model with the adapter:
+
+ ```bash
+ pip install mlx-lm
+ ```
+
+ ```python
+ from mlx_lm import load, generate
+
+ # Load base model with Minimalism adapter
+ model, tokenizer = load(
+     "mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit",
+     adapter_path="salakash/Minimalism"
+ )
+
+ # Generate code
+ prompt = "Write a Python function to calculate factorial"
+ response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
+ print(response)
+ ```
+
+ ### Option 2: Use with Transformers
+
+ ```bash
+ pip install transformers torch peft
+ ```
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from peft import PeftModel
+
+ # Load base model
+ base_model = AutoModelForCausalLM.from_pretrained(
+     "Qwen/Qwen2.5-Coder-0.5B-Instruct",
+     trust_remote_code=True
+ )
+
+ # Load adapter
+ model = PeftModel.from_pretrained(base_model, "salakash/Minimalism")
+ tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-0.5B-Instruct")
+
+ # Generate
+ messages = [{"role": "user", "content": "Write a Python function to add two numbers"}]
+ text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ inputs = tokenizer(text, return_tensors="pt")
+ outputs = model.generate(**inputs, max_new_tokens=256)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
+ ### Option 3: Web UI with MLX
+
+ Start an OpenAI-compatible server:
+
+ ```bash
+ # Install mlx-lm if not already installed
+ pip install mlx-lm
+
+ # Start server with adapter
+ mlx_lm.server \
+   --model mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit \
+   --adapter-path salakash/Minimalism \
+   --port 8080
+ ```
+
+ Then use it with any OpenAI-compatible client:
+
+ ```bash
+ curl http://localhost:8080/v1/chat/completions \
+   -H "Content-Type: application/json" \
+   -d '{
+     "model": "mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit",
+     "messages": [
+       {"role": "user", "content": "Write a Python function to reverse a string"}
+     ],
+     "max_tokens": 512
+   }'
+ ```
+
+ Or use any OpenAI-compatible web UI, such as:
+ - [Open WebUI](https://github.com/open-webui/open-webui)
+ - [LibreChat](https://github.com/danny-avila/LibreChat)
+ - [ChatGPT-Next-Web](https://github.com/ChatGPTNextWeb/ChatGPT-Next-Web)
+
+ Configure the UI to point to `http://localhost:8080` as the API endpoint.
+
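+ The official OpenAI Python SDK also works against this endpoint; a minimal sketch (the API key is a placeholder, since the local server does not validate it):
+
+ ```python
+ from openai import OpenAI
+
+ # Point the client at the local mlx_lm server; the key is unused locally
+ client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")
+
+ resp = client.chat.completions.create(
+     model="mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit",
+     messages=[{"role": "user", "content": "Write a Python function to reverse a string"}],
+     max_tokens=512,
+ )
+ print(resp.choices[0].message.content)
+ ```
+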
+ ### Option 4: Hugging Face Inference API
+
+ Use the adapter directly via Hugging Face's Inference API (requires an HF token):
+
+ ```python
+ import requests
+
+ API_URL = "https://api-inference.huggingface.co/models/salakash/Minimalism"
+ headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}
+
+ def query(payload):
+     response = requests.post(API_URL, headers=headers, json=payload)
+     return response.json()
+
+ output = query({
+     "inputs": "Write a Python function to check if a number is prime",
+     "parameters": {"max_new_tokens": 256}
+ })
+ print(output)
+ ```
+
+ ## Response Format
+
+ Minimalism provides structured, runnable-first responses:
+
+ - **Solution**: The main implementation code
+ - **Usage**: A minimal runnable example
+ - **Sanity test**: A tiny test snippet (when appropriate)
+
+ ## Comparison
+ Minimalism achieved the same objective in **~8–10 lines of code**, while a standard LLM typically produced **22–26 lines** for the equivalent solution.
+
+ ### Minimalism
+
+ ![Minimalism solution](image-1.png)
+
+ ### Standard Coding Agent
+
+ ![Standard coding agent solution](image.png)
+
+ ## Documentation
+
+ For comprehensive technical details, see:
+ - **[PYTHON_DEVELOPMENT_GUIDE.md](PYTHON_DEVELOPMENT_GUIDE.md)**: Complete Python guide covering the concepts, libraries, and techniques used in the project
+ - **[ARCHITECTURE.md](ARCHITECTURE.md)**: Complete system architecture, building blocks, epics & stories, technical stack, and design decisions
+ - **[HUGGINGFACE_UPLOAD_GUIDE.md](HUGGINGFACE_UPLOAD_GUIDE.md)**: Step-by-step guide for uploading to the Hugging Face Hub
+ - **[MODEL_CARD.md](MODEL_CARD.md)**: Model details, training configuration, and usage guidelines
+ - **[QUICK_RUN_GUIDE.md](QUICK_RUN_GUIDE.md)**: Quick start guide for getting up and running
+
+ ## Base Model & Dataset
+
+ - **Base Model**: [Qwen/Qwen2.5-Coder-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B-Instruct)
+ - **MLX Weights**: [mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit](https://huggingface.co/mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit)
+ - **Dataset**: [flwrlabs/code-alpaca-20k](https://huggingface.co/datasets/flwrlabs/code-alpaca-20k)
+ - **Dataset**: [m-a-p/Code-Feedback](https://huggingface.co/datasets/m-a-p/Code-Feedback)
+
+ ## License
+
+ This project publishes only adapter artifacts and configuration. The base model and dataset have their own licenses:
+
+ - Base Model: Apache-2.0 (Qwen/Qwen2.5-Coder-0.5B-Instruct)
+ - Dataset: Apache-2.0 (flwrlabs/code-alpaca-20k)
+
+ See `LICENSE-THIRD-PARTY.md` for complete attribution.
+
+ ## Acknowledgments
+
+ - Qwen team for the excellent base model
+ - MLX community for the Apple Silicon optimizations
+ - flwrlabs for the code-alpaca-20k dataset
+ - Multimodal Art Projection (M-A-P) for m-a-p/Code-Feedback
USAGE.md ADDED
@@ -0,0 +1,38 @@
+ # Minimalism Usage
+
+ ## Quick Start
+
+ ### 1. Install dependencies
+ ```bash
+ pip install mlx-lm
+ ```
+
+ ### 2. Start the server
+ ```bash
+ # Using the base model with this adapter
+ python -m mlx_lm.server \
+   --model mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit \
+   --adapter-path . \
+   --host 127.0.0.1 \
+   --port 8080
+ ```
+
+ ### 3. Test with curl
+ ```bash
+ curl http://127.0.0.1:8080/v1/chat/completions \
+   -H 'Content-Type: application/json' \
+   -d '{
+     "model": "Minimalism",
+     "messages": [
+       {"role": "user", "content": "Write a Python function to add two numbers"}
+     ],
+     "max_tokens": 256
+   }'
+ ```
+
+ ## Response Format
+
+ Minimalism provides runnable-first responses with these sections:
+ - **Solution**: Main implementation
+ - **Usage**: Smallest runnable example
+ - **Sanity test**: Tiny test snippet (when appropriate)
adapter_config.json ADDED
@@ -0,0 +1,40 @@
+ {
+   "adapter_path": "outputs/adapters/dev",
+   "batch_size": 4,
+   "config": null,
+   "data": "data/training_ready",
+   "fine_tune_type": "lora",
+   "grad_accumulation_steps": 1,
+   "grad_checkpoint": false,
+   "iters": 100,
+   "learning_rate": 2e-05,
+   "lora_parameters": {
+     "rank": 8,
+     "dropout": 0.0,
+     "scale": 20.0
+   },
+   "lr_schedule": null,
+   "mask_prompt": false,
+   "max_seq_length": 2048,
+   "model": "mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit",
+   "num_layers": 16,
+   "optimizer": "adam",
+   "optimizer_config": {
+     "adam": {},
+     "adamw": {},
+     "muon": {},
+     "sgd": {},
+     "adafactor": {}
+   },
+   "project_name": null,
+   "report_to": null,
+   "resume_adapter_file": null,
+   "save_every": 100,
+   "seed": 0,
+   "steps_per_eval": 200,
+   "steps_per_report": 10,
+   "test": false,
+   "test_batches": 500,
+   "train": true,
+   "val_batches": 25
+ }
adapters.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b8968d14e7792a2feebf6e0a346db20bb8b1f7e0bff0d7ce180f128ca7f43fe5
+ size 11754630
config.json ADDED
@@ -0,0 +1,17 @@
+ {
+   "model_type": "qwen2",
+   "adapter_type": "lora",
+   "base_model": "mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit",
+   "base_model_reference": "Qwen/Qwen2.5-Coder-0.5B-Instruct",
+   "task": "text-generation",
+   "framework": "mlx",
+   "lora_rank": 8,
+   "lora_alpha": 16,
+   "lora_dropout": 0.05,
+   "trained_on": "flwrlabs/code-alpaca-20k",
+   "training_samples": 2000,
+   "training_iterations": 100,
+   "model_name": "Minimalism",
+   "description": "LoRA adapter for Qwen2.5-Coder-0.5B-Instruct trained on code-alpaca-20k dataset. Provides runnable-first coding assistance.",
+   "license": "apache-2.0"
+ }
run_meta.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "model_id": "mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit",
+   "dataset_id": "flwrlabs/code-alpaca-20k",
+   "iters": 100,
+   "rank": 8,
+   "alpha": 16,
+   "dropout": 0.05,
+   "learning_rate": 2e-05,
+   "timestamp": "2025-12-31T15:18:04.451022Z"
+ }