Running llama.cpp across multiple CPU nodes on Discoverer: possibilities and expectations
Most people running llama.cpp are familiar with its single-node CPU mode, where inference is spread across cores using multithreading. Less commonly known is that llama.cpp can also be distributed across multiple machines, but understanding what that actually means in practice is essential before building a cluster setup around it. llama.cpp includes a built-in RPC backend that connects multiple nodes over TCP: a master node holds the model file and coordinates inference, while worker nodes…
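As a rough sketch of how such a setup is launched (hostnames, port numbers, and the model path below are placeholder assumptions, and the exact flags may differ between llama.cpp versions):

```shell
# On each worker node: start the RPC server
# (llama.cpp must be built with the RPC backend enabled, e.g. -DGGML_RPC=ON).
# Port 50052 is an arbitrary example choice.
rpc-server --host 0.0.0.0 --port 50052

# On the master node, which holds the GGUF model file:
# point the --rpc option at the worker endpoints.
llama-cli -m ./model.gguf -p "Hello" --rpc worker1:50052,worker2:50052
```

In this arrangement the workers never need a copy of the model on disk; the master streams the tensor data to them over TCP when the session starts.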