---
license: mit
pipeline_tag: text-generation
library_name: transformers.js
tags:
- ONNX
- DML
- ONNXRuntime
- nlp
- conversational
---

# Phi-3 Mini-4K-Instruct ONNX model for onnxruntime-web
This is the same model as the [official Phi-3 ONNX model](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx), with a few changes to make it work with onnxruntime-web (see the loading sketch after the list):

1. The model is fp16 with int4 block quantization for the weights.
2. The `logits` output is fp32.
3. The model uses MHA (multi-head attention) instead of GQA (grouped-query attention).
4. The ONNX file and the external data file each need to stay below 2GB to be cacheable in Chromium.
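For context, the sketch below shows what loading this model directly with [onnxruntime-web](https://www.npmjs.com/package/onnxruntime-web) might look like. The URL and file names are assumptions for illustration (check the repository's file listing), and the exact shape of the `externalData` session option may vary between onnxruntime-web versions:

```js
import * as ort from "onnxruntime-web";

// Hypothetical file layout; check the repository's file listing for the
// actual names and paths before using this.
const base =
  "https://huggingface.co/Xenova/Phi-3-mini-4k-instruct/resolve/main/onnx/";

const session = await ort.InferenceSession.create(`${base}model.onnx`, {
  // Point the runtime at the external weight file. Keeping each file
  // below 2GB (point 4 above) is what makes them cacheable.
  externalData: [{ path: "model.onnx.data", data: `${base}model.onnx.data` }],
  // Try WebGPU first, fall back to WASM.
  executionProviders: ["webgpu", "wasm"],
});
```

Running generation at this level means building the decode loop yourself (feeding `input_ids`, `attention_mask`, and past key/value tensors on every step), so the Transformers.js pipeline shown below is the more practical route.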
## Usage (Transformers.js)
If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
```bash
npm i @huggingface/transformers
```
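Alternatively, if you are not using a bundler, the library can be loaded straight from a CDN as an ES module. This follows the jsDelivr pattern from the Transformers.js docs; pinning a specific version is recommended in production:

```js
// Load Transformers.js directly in the browser via a CDN (ES module).
// Consider pinning a version, e.g. .../npm/@huggingface/transformers@<version>.
import { pipeline } from "https://cdn.jsdelivr.net/npm/@huggingface/transformers";
```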
You can then use the model to generate text like this:
```js
import { pipeline, TextStreamer } from "@huggingface/transformers";

// Create a text generation pipeline
const generator = await pipeline(
  "text-generation",
  "Xenova/Phi-3-mini-4k-instruct",
);

// Define the list of messages
const messages = [
  { role: "user", content: "Solve the equation: x^2 - 3x + 2 = 0" },
];

// Create a text streamer so tokens are printed as they are generated
const streamer = new TextStreamer(generator.tokenizer, {
  skip_prompt: true,
  // callback_function: (text) => { }, // Optional callback function
});

// Generate a response
const output = await generator(messages, { max_new_tokens: 512, do_sample: false, streamer });
console.log(output[0].generated_text.at(-1).content);
```
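Since this build targets onnxruntime-web, you may want to run it on WebGPU. A minimal sketch, assuming a Transformers.js version that supports the `device` and `dtype` pipeline options (check the library docs for the exact values available):

```js
import { pipeline } from "@huggingface/transformers";

// Request the WebGPU execution provider and 4-bit/fp16 weights.
// Both option values are assumptions; consult the Transformers.js docs.
const generator = await pipeline(
  "text-generation",
  "Xenova/Phi-3-mini-4k-instruct",
  { device: "webgpu", dtype: "q4f16" },
);
```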