aiconta posted an update 7 days ago
Hello, who can help me set up a local LLM and RAG for my job? I can pay.

What do you do, and how sensitive is the data?

I can do it

Me too

There should be some off-the-shelf software and pipelines you can use; then it's just a matter of following the tutorials.
For example, I've heard of LangChain and Haystack.

In the past I've used LangChain; it's good and easy to use. I suggest starting with that.
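
A minimal local RAG sketch with LangChain might look like the following. This is just a sketch under assumptions not stated in the thread: a local Ollama server is running, and the file path and model names are placeholders.

```python
# Minimal local RAG sketch: LangChain + FAISS + a locally served model.
# Assumptions (not from the thread): an Ollama server is running locally;
# "report.pdf" and the model names are placeholders.
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.llms import Ollama
from langchain.chains import RetrievalQA

# 1. Load and chunk a document (placeholder path).
pages = PyPDFLoader("report.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(pages)

# 2. Embed the chunks locally and build a vector index (nothing leaves the machine).
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
index = FAISS.from_documents(chunks, embeddings)

# 3. Answer questions with retrieval over the index.
llm = Ollama(model="llama3")  # any model already pulled into the local Ollama install
qa = RetrievalQA.from_chain_type(llm=llm, retriever=index.as_retriever(search_kwargs={"k": 4}))
print(qa.invoke({"query": "Summarize the key figures in this report."})["result"])
```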

@aiconta, could you add the following details so the community can help?

  1. What is your base infrastructure? Hardware: VRAM, RAM, CPU, storage. These are needed to understand the kind of workloads you can host.

  2. What kind of workloads are you trying to run? Multimodal, LLM, VLM? This helps identify the right model for your hardware.

  3. What kind of inference are you looking at? Will it be self-hosted as a service endpoint handling multiple parallel requests? This will help set your model and hardware expectations.

  4. On RAG: will it always be document based, and what kind of documents? What kind of features do you want to extract? If it involves tables and similar structures, you may need a stronger model for RAG.

  5. For local hosting: what context window do you expect? For example, an 8B-parameter model quantized to Q8_0 gives roughly a 24K context window within 16 GB of VRAM, taking around 12-13 GB of VRAM (see the sketch after this list).

  6. What base OS are you running, Linux or Windows? This determines the approach to Docker containerization, since WSL and native Linux Docker installations differ a bit.

  7. What GPU are you using, and which make: NVIDIA, Intel, or AMD? This will tell us whether you only need help with setup, or with the relevant stack: ipex-llm (Intel), cuDNN/NVCC/CUDA (NVIDIA), or ROCm (AMD).

Once these questions are answered, the community will be able to help you much more effectively, and in a focused direction.
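
As a rough sanity check for point 5, a back-of-envelope VRAM estimate might look like this. The layer, head, and dimension values below are typical Llama-3-8B-style assumptions, not measured figures.

```python
# Back-of-envelope VRAM estimate for point 5: model weights + KV cache.
# The layer/head/dim defaults are typical Llama-3-8B-style values (assumptions).
def estimate_vram_gb(params_billion=8.0,
                     bytes_per_weight=1.06,   # Q8_0 is ~1 byte/weight plus block overhead
                     n_layers=32, n_kv_heads=8, head_dim=128,
                     kv_bytes=2,              # fp16 K and V cache entries
                     context_tokens=24_000):
    weights = params_billion * 1e9 * bytes_per_weight
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * kv_bytes * context_tokens
    return (weights + kv_cache) / 1e9

print(f"~{estimate_vram_gb():.1f} GB")  # ~11.6 GB, before runtime/activation overhead
```

With runtime and activation overhead on top, that lands in the 12-13 GB range quoted above.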


My computer is:
Processor: AMD Ryzen Threadripper PRO 7995WX, 96 cores
512 GB RAM
GPU: NVIDIA RTX PRO 6000 Blackwell Workstation Edition, 96 GB
Storage: 16 TB SSD

Self-hosted; I have sensitive documents I don't want to send online.
I run Windows, but I can do Linux too if that's better.
I do financial expertise work and have to extract information from documents.
The documents may be Word, Excel, PDF, or scanned; some may be handwritten and scanned.
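
For mixed sources like these, the loading step might look like the sketch below. It assumes the langchain_community loaders and pytesseract are installed; the file names are placeholders, and Tesseract handles printed scans far better than handwriting, which may need a vision-language model instead.

```python
# Sketch: load Word, Excel, PDF and scanned pages into one document list for RAG.
# Assumptions: langchain_community loaders + pytesseract installed; paths are placeholders.
from langchain_community.document_loaders import (
    PyPDFLoader,
    UnstructuredWordDocumentLoader,
    UnstructuredExcelLoader,
)
from langchain_core.documents import Document
from PIL import Image
import pytesseract

docs = []
docs += PyPDFLoader("statement.pdf").load()                     # digital PDF
docs += UnstructuredWordDocumentLoader("contract.docx").load()  # Word
docs += UnstructuredExcelLoader("ledger.xlsx").load()           # Excel

# Scanned or handwritten pages need OCR before indexing; Tesseract is fine for
# printed scans, handwriting usually needs a stronger (vision-language) model.
text = pytesseract.image_to_string(Image.open("scan_page1.png"))
docs.append(Document(page_content=text, metadata={"source": "scan_page1.png"}))
```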

Put me in coach.

I can help you set it up!
