From HackerNews

Distributing LLM Inference in DwarfStar

Salvatore Sanfilippo introduces DwarfStar, a proof-of-concept that distributes LLM inference across machines using a protocol that runs over Unix sockets, stdout, and HTTP. Pipeline parallelism lets it run models larger than a single GPU's VRAM.
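
For readers unfamiliar with pipeline parallelism, here is a minimal Python sketch of the general idea. It is not DwarfStar's code or protocol: the toy "layers" and every function name below are hypothetical, and the hand-off between stages is an ordinary function call where a real system would send activations over a socket or HTTP.

```python
from typing import Callable, List

# Hypothetical stand-in for a model layer: maps a vector to a vector.
Layer = Callable[[List[float]], List[float]]


def make_layer(scale: float, shift: float) -> Layer:
    """Build a toy elementwise affine 'layer' (illustration only)."""
    def layer(x: List[float]) -> List[float]:
        return [scale * v + shift for v in x]
    return layer


def split_into_stages(layers: List[Layer], num_stages: int) -> List[List[Layer]]:
    """Assign contiguous runs of layers to stages, as evenly as possible."""
    base, extra = divmod(len(layers), num_stages)
    stages, start = [], 0
    for i in range(num_stages):
        size = base + (1 if i < extra else 0)
        stages.append(layers[start:start + size])
        start += size
    return stages


def run_stage(stage: List[Layer], activations: List[float]) -> List[float]:
    """Run the layers held by one stage (one machine or GPU) in order."""
    for layer in stage:
        activations = layer(activations)
    return activations


def run_pipeline(stages: List[List[Layer]], inputs: List[float]) -> List[float]:
    """Pass activations from stage to stage.

    Each stage only needs its own layers in memory, so the full model
    never has to fit on a single device.
    """
    activations = inputs
    for stage in stages:
        # Assumption: in a distributed setup this hand-off crosses a network link.
        activations = run_stage(stage, activations)
    return activations


layers = [make_layer(1.0, float(i)) for i in range(6)]
stages = split_into_stages(layers, 3)
x = [0.5, -1.0]

# Splitting into stages must not change the result.
assert run_pipeline(stages, x) == run_stage(layers, x)
```

The final assertion checks the property that makes this kind of split safe: running the layers stage by stage gives the same output as running them all in one place.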
