How Idle GPUs Make Cheap Inference Possible
Lilac serves Kimi K2.5 inference on idle enterprise GPUs and, in Lilac's current OpenRouter benchmark snapshot, came in at the lowest price among providers at comparable speed.
A direct comparison of GPU inference API pricing across major providers, and how idle-GPU economics enable Lilac to offer lower per-token rates.
The GPU shortage isn't what you think. The industry doesn't have a supply problem — it has a utilization problem masquerading as one.
Most Kubernetes clusters run GPUs at 30-50% utilization. We built a single operator to change that.