/r/LocalLLaMA
Using a cached 20k context with a cheap used 4th-gen EPYC for CPU and 4x3090 GPUs for inference? Please review my build plan and the alternative API costs.