aiplanet/panda-coder-13B


lucifertrj committed on Mar 30, 2024

Commit 529e50e · verified · 1 Parent(s): 799730b

add vLLM inference

Files changed (1)

README.md +52 -0

@@ -23,6 +23,46 @@ base_model: codellama/CodeLlama-13b-Instruct-hf
 ## Inference
 ```python
 import torch
 import transformers
@@ -70,6 +110,18 @@ hello_constant = tf.constant('Hello, World!')
 # Print the value of the constant
 print(hello_constant)
 ```
  ## 🔗 Key Features:

 ## Inference
+> Hardware requirements:
+>
+> 30 GB VRAM (A100 preferred)
+### vLLM - For Faster Inference
+#### Installation
+```
+!pip install vllm
+```
+#### Implementation
+```python
+from vllm import LLM, SamplingParams
+
+# Load the model; max_model_len caps the context length the engine allocates for.
+llm = LLM(model="aiplanet/panda-coder-13B", gpu_memory_utilization=0.95, max_model_len=4096)
+
+# Prompts follow the "### Instruction / ### Input / ### Response" template.
+prompts = [
+    """### Instruction: Write a Java code to add 15 numbers randomly generated.
+### Input: [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
+### Response:
+""",
+    "### Instruction: write a neural network complete code in Keras ### Input: Use cifar dataset ### Response:",
+]
+
+sampling_params = SamplingParams(temperature=0.1, top_p=0.95, repetition_penalty=1.1, max_tokens=1000)
+outputs = llm.generate(prompts, sampling_params)
+
+# Each result carries one or more completions; print the first one per prompt.
+for output in outputs:
+    generated_text = output.outputs[0].text
+    print(generated_text)
+    print("\n\n")
+```
+### Transformers - Basic Implementation
 ```python
 import torch
 import transformers
 # Print the value of the constant
 print(hello_constant)
+```
+## Prompt Template for Panda Coder 13B
+```
+### Instruction:
+{<add your instruction here>}
+### Input:
+{<can be empty>}
+### Response:
 ```
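
The template above can be filled in programmatically. The following is a minimal sketch, not part of the commit: `build_prompt` and its parameter names are hypothetical, and it assumes each section header sits on its own line, as in the template.

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Fill the Panda Coder 13B prompt template (hypothetical helper).

    input_text may be left empty, as the template allows.
    """
    return (
        "### Instruction:\n"
        f"{instruction.strip()}\n"
        "### Input:\n"
        f"{input_text.strip()}\n"
        "### Response:\n"
    )


# Example: an instruction with no additional input.
prompt = build_prompt("Write a Python function that reverses a string.")
print(prompt)
```

The resulting string could then be passed as one element of the `prompts` list given to `llm.generate`.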
  ## 🔗 Key Features: