zegen-1

NOTE: This is an alpha version for early testing; the model weights, and even the architecture, may change.

This is a keyword generation model that rewrites natural language queries into arrays of keyword search expressions, intended for use in retrieval pipelines where sparse keyword search is available.
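
In such a pipeline, the generated expressions are fanned out across a keyword search backend and the hits merged. Below is a minimal, self-contained sketch of that step; keyword_search and the toy corpus are hypothetical stand-ins for a real Slack search API or sparse index, which this card does not specify.

# Hypothetical stand-in for a Slack search API or sparse keyword index.
def keyword_search(messages: list[str], expression: str) -> list[str]:
    terms = expression.lower().split()
    return [m for m in messages if all(t in m.lower() for t in terms)]

corpus = [
    "Reminder: security training is due by the end of the quarter",
    "Anyone up for lunch on Friday?",
]

# Fan out each generated expression and merge hits; a dict gives
# ordered de-duplication, so a message matched by several
# expressions is kept only once.
hits: dict[str, None] = {}
for expression in ["security training", "training deadline"]:
    for message in keyword_search(corpus, expression):
        hits[message] = None

print(list(hits))
# ['Reminder: security training is due by the end of the quarter']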

How to Use

from vllm import LLM, SamplingParams

query = "When am I supposed to complete my security training?"

llm = LLM(
    model="zeroentropy/zegen-1",
    tensor_parallel_size=1,
    dtype="auto",
    gpu_memory_utilization=0.8,
)

messages = [
    {
        "role": "system",
        "content": """
Your goal is to take a User natural language query, and make a list of Slack API searches for that User query. The presumption is that there are slack messages inside of the User's slack, that answer the User's query, and you are trying to find them.

- Consider synonyms for each of the words in the query
- Consider alternative ways a slack message could be written, that would match a query term, even if it's not an exact synonym. For example, if the query is "How do I reset my password", a target word could be "forgot". Even though, in isolation, no query word is synonymous with "forgot".
- Make a final list of 7 search queries, that you believe will be specific enough to pull 1-10 options, but not too specific that you get 0 results (It's very common to get 0 results from the slack API, so be conservative, prefer 1 word or 2 word searches, but have some searches at every length).
  - Note that capitalization doesn't make a difference, neither does suffixes such as -s or -ing. Don't include stop words either.
- Output your answer as a JSON string array. Do not output anything else.
""".strip(),
    },
    {"role": "user", "content": query},
]

max_new_tokens = 512  # generation cap; the keyword list is short (assumed value)

outputs = llm.chat(
    messages=[messages],
    sampling_params=SamplingParams(
        max_tokens=max_new_tokens,
        temperature=0.0,
    ),
    chat_template_kwargs={"enable_thinking": False},
)

response = outputs[0].outputs[0].text
print(response)
# Expected output:
# [
#     "security training",
#     "security due",
#     "training deadline",
#     "security training deadline",
#     "required training",
#     "onboarding",
#     "compliance"
# ]

Evaluations

On a human-annotated set of evaluations over real private Slack repositories, zegen-1 shows significant accuracy gains over similarly small models and 10x lower latency than similarly accurate models.

Model details

- Model size: 15B params
- Tensor type: BF16 (Safetensors)
- Finetuned from: Qwen/Qwen3-14B