---
license: mit
base_model: microsoft/Phi-3-mini-128k-instruct
tags:
- phi-3
- fine-tuned
- distributed-training
- pytorch
language:
- en
---

# Fine-tuned Phi-3-mini Model

This is a fine-tuned version of microsoft/Phi-3-mini-128k-instruct trained with a distributed setup.

## Model Details

- **Base Model**: microsoft/Phi-3-mini-128k-instruct
- **Training Method**: Distributed fine-tuning with Ray
- **Shards Used**: 2
- **Parameters**: ~3.8B

## Training Information

The model was fine-tuned using a distributed approach across multiple shards. The base architecture is preserved; only the weights were updated through a fine-tuning process targeting specific downstream tasks.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("a-k-aAiMGoD/phi3-mini-distributed-fine-tune")
model = AutoModelForCausalLM.from_pretrained("a-k-aAiMGoD/phi3-mini-distributed-fine-tune")

# Example usage
text = "Hello, how are you?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Training Configuration

- Distributed across 2 shards
- Optimized for large-scale deployment
- Enhanced with Ray-based parallelization
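
The actual training script is not included in this repository, so the following is only a minimal sketch of how Ray Train could drive data-parallel fine-tuning of the base model across 2 workers (assuming "2 shards" corresponds to `num_workers=2`). The dataset, hyperparameters, and `train_loop_config` keys below are illustrative placeholders, not the configuration used for this model.

```python
# Hypothetical Ray Train sketch: data-parallel fine-tuning across 2 workers.
# The corpus, batch size, learning rate, and epoch count are assumptions.
import torch
from torch.utils.data import DataLoader, TensorDataset
import ray.train
import ray.train.torch
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer
from transformers import AutoTokenizer, AutoModelForCausalLM


def train_loop_per_worker(config):
    tokenizer = AutoTokenizer.from_pretrained(config["base_model"])
    model = AutoModelForCausalLM.from_pretrained(config["base_model"])

    # Wrap the model in DistributedDataParallel and move it to this worker's device.
    model = ray.train.torch.prepare_model(model)

    # config["texts"] is a placeholder for the real training corpus.
    enc = tokenizer(config["texts"], truncation=True, max_length=512,
                    padding="max_length", return_tensors="pt")
    dataset = TensorDataset(enc["input_ids"], enc["attention_mask"])
    # prepare_data_loader adds a DistributedSampler and moves batches to the device.
    loader = ray.train.torch.prepare_data_loader(
        DataLoader(dataset, batch_size=config["batch_size"], shuffle=True))

    optimizer = torch.optim.AdamW(model.parameters(), lr=config["lr"])
    model.train()
    for epoch in range(config["epochs"]):
        for input_ids, attention_mask in loader:
            # Causal LM objective: the labels are the input ids themselves.
            outputs = model(input_ids=input_ids,
                            attention_mask=attention_mask,
                            labels=input_ids)
            outputs.loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        ray.train.report({"epoch": epoch, "loss": outputs.loss.item()})


trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={
        "base_model": "microsoft/Phi-3-mini-128k-instruct",
        "texts": ["placeholder example text"],  # replace with real data
        "batch_size": 1,
        "lr": 2e-5,
        "epochs": 1,
    },
    scaling_config=ScalingConfig(num_workers=2, use_gpu=True),  # 2 workers ("shards")
)
result = trainer.fit()
```

In this pattern each worker holds a full replica of the model and gradients are synchronized via DistributedDataParallel; sharding the optimizer state or model weights themselves would require a different strategy (e.g. FSDP or DeepSpeed) than what is shown here.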