LLMKube – A Kubernetes operator for local LLMs across Nvidia and Mac fleets
LLMKube is a Kubernetes operator for running large language models locally on fleets of Nvidia GPU machines and Macs. It aims to simplify deploying and managing those models across heterogeneous hardware.