---
tags:
- merge
- task_wise
- llm-adamerge
base_model: deepseek-ai/deepseek-coder-7b-base-v1.5
---

# Merged Model using LLM-AdaMerge (task_wise)

This model was created by merging multiple fine-tuned models using the LLM-AdaMerge approach with task_wise merging.

## Merge Details

- **Merge Type**: task_wise
- **Base Model**: deepseek-ai/deepseek-coder-7b-base-v1.5
- **Number of Models Merged**: 2
- **Models Merged**: math, code
- **Final Training Loss**: N/A
- **Training Epochs**: 0

## Lambda Coefficients

The task-wise lambda coefficients learned during training are stored in the `learned_lambdas.json` file.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("your-username/model-name")
tokenizer = AutoTokenizer.from_pretrained("your-username/model-name")

# Use the model
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0]))
```

## Training Configuration

See the uploaded `training_config.json` file for detailed training configuration.

## Citation

If you use this model, please cite the LLM-AdaMerge paper:

```bibtex
@article{llmadamerge2024,
  title={LLM-AdaMerge: Adaptive Model Merging for Large Language Models},
  author={...},
  year={2024}
}
```
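
## How the Merge Is Applied (Sketch)

For readers who want to inspect or reproduce the merge offline, the sketch below shows one way the learned task-wise coefficients could be applied: each fine-tuned model contributes a task vector (its parameter delta from the base model), scaled by its lambda. The expert repository names (`MATH_MODEL`, `CODE_MODEL`) and the exact JSON layout of `learned_lambdas.json` are illustrative assumptions, not part of this model card.

```python
# Minimal sketch of task-wise merging with learned lambdas.
# Assumptions (not confirmed by this card): learned_lambdas.json maps task
# names to floats, and MATH_MODEL / CODE_MODEL are placeholder repo ids.
import json

import torch
from transformers import AutoModelForCausalLM

BASE = "deepseek-ai/deepseek-coder-7b-base-v1.5"
EXPERTS = {
    "math": "MATH_MODEL",  # hypothetical fine-tuned math checkpoint
    "code": "CODE_MODEL",  # hypothetical fine-tuned code checkpoint
}

with open("learned_lambdas.json") as f:
    lambdas = json.load(f)  # assumed format: {"math": <float>, "code": <float>}

base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.float32)
base_state = base.state_dict()
merged_state = {name: p.clone() for name, p in base_state.items()}

for task, repo in EXPERTS.items():
    expert = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.float32)
    for name, param in expert.state_dict().items():
        # Task vector = expert weights minus base weights, scaled by the
        # task-wise lambda learned for this expert.
        merged_state[name] += lambdas[task] * (param - base_state[name])
    del expert  # free memory before loading the next expert

base.load_state_dict(merged_state)
base.save_pretrained("merged-model")
```

Under these assumptions the result follows the usual task-arithmetic form, theta_merged = theta_base + sum_k lambda_k * (theta_k - theta_base), with one lambda per merged task.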