aiconta posted an update 7 days ago
Hello, who can help me set up a local LLM and RAG for my job? I can pay.

What do you do, and how sensitive is the data?

I can do it

Me too

There should be some off-the-shelf software and pipelines you can use; then it's just a matter of following the tutorials.
For example, I've heard of LangChain and Haystack.

In the past I've used LangChain; it's good and easy to use. I suggest starting with that.
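
A minimal local RAG sketch with LangChain might look like the following. This is just a sketch under assumptions not stated in the thread: a local Ollama server is running, and the file path and model names are placeholders.

```python
# Minimal local RAG sketch: LangChain + FAISS + a locally served model.
# Assumptions (not from the thread): an Ollama server is running locally;
# "report.pdf" and the model names are placeholders.
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.llms import Ollama
from langchain.chains import RetrievalQA

# 1. Load and chunk a document (placeholder path).
pages = PyPDFLoader("report.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(pages)

# 2. Embed the chunks locally and build a vector index (nothing leaves the machine).
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
index = FAISS.from_documents(chunks, embeddings)

# 3. Answer questions with retrieval over the index.
llm = Ollama(model="llama3")  # any model already pulled into the local Ollama install
qa = RetrievalQA.from_chain_type(llm=llm, retriever=index.as_retriever(search_kwargs={"k": 4}))
print(qa.invoke({"query": "Summarize the key figures in this report."})["result"])
```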

@aiconta, could you add the following details so the community can help?

  1. What is your base infrastructure? Hardware: VRAM, RAM, CPU, storage. These are needed to understand the kind of workloads you can host.

  2. What kind of workloads are you trying to run? Multimodal, LLM, VLM? This helps identify the right model for your hardware.

  3. What kind of inference are you looking at? Will it be self-hosted as a service endpoint handling multiple parallel requests? This will help set your model and hardware expectations.

  4. On RAG: will it always be document based, and what kind of documents? What kind of features do you want to extract? If it involves tables and similar structures, you may need a stronger model for RAG.

  5. For local hosting: what context window do you expect? For example, an 8B-parameter model quantized to Q8_0 gives roughly a 24K context window within 16 GB of VRAM, taking around 12-13 GB of VRAM (see the sketch after this list).

  6. What base OS are you running, Linux or Windows? This determines the approach to Docker containerization, since WSL and native Linux Docker installations differ a bit.

  7. What GPU are you using, and which make: NVIDIA, Intel, or AMD? This will tell us whether you only need help with setup, or with the relevant stack: ipex-llm (Intel), cuDNN/NVCC/CUDA (NVIDIA), or ROCm (AMD).

Once these questions are answered, the community will be able to help you much more effectively, and in a focused direction.
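
As a rough sanity check for point 5, a back-of-envelope VRAM estimate might look like this. The layer, head, and dimension values below are typical Llama-3-8B-style assumptions, not measured figures.

```python
# Back-of-envelope VRAM estimate for point 5: model weights + KV cache.
# The layer/head/dim defaults are typical Llama-3-8B-style values (assumptions).
def estimate_vram_gb(params_billion=8.0,
                     bytes_per_weight=1.06,   # Q8_0 is ~1 byte/weight plus block overhead
                     n_layers=32, n_kv_heads=8, head_dim=128,
                     kv_bytes=2,              # fp16 K and V cache entries
                     context_tokens=24_000):
    weights = params_billion * 1e9 * bytes_per_weight
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * kv_bytes * context_tokens
    return (weights + kv_cache) / 1e9

print(f"~{estimate_vram_gb():.1f} GB")  # ~11.6 GB, before runtime/activation overhead
```

With runtime and activation overhead on top, that lands in the 12-13 GB range quoted above.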


My computer is:
Processor: AMD Ryzen Threadripper PRO 7995WX, 96 cores
512 GB RAM
GPU: NVIDIA RTX PRO 6000 Blackwell Workstation Edition, 96 GB
Storage: 16 TB SSD

Self-hosted; I have sensitive documents I don't want to send online.
I run Windows, but I can do Linux too if that's better.
I do financial expertise work and have to extract information from documents.
The documents may be Word, Excel, PDF, or scanned; some may be handwritten and scanned.
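
For mixed sources like these, the loading step might look like the sketch below. It assumes the langchain_community loaders and pytesseract are installed; the file names are placeholders, and Tesseract handles printed scans far better than handwriting, which may need a vision-language model instead.

```python
# Sketch: load Word, Excel, PDF and scanned pages into one document list for RAG.
# Assumptions: langchain_community loaders + pytesseract installed; paths are placeholders.
from langchain_community.document_loaders import (
    PyPDFLoader,
    UnstructuredWordDocumentLoader,
    UnstructuredExcelLoader,
)
from langchain_core.documents import Document
from PIL import Image
import pytesseract

docs = []
docs += PyPDFLoader("statement.pdf").load()                     # digital PDF
docs += UnstructuredWordDocumentLoader("contract.docx").load()  # Word
docs += UnstructuredExcelLoader("ledger.xlsx").load()           # Excel

# Scanned or handwritten pages need OCR before indexing; Tesseract is fine for
# printed scans, handwriting usually needs a stronger (vision-language) model.
text = pytesseract.image_to_string(Image.open("scan_page1.png"))
docs.append(Document(page_content=text, metadata={"source": "scan_page1.png"}))
```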

Put me in coach.

I can help you set it up!
