Blog
Updates, thinking, and technical deep-dives from the Lilac team.
We are partnering with MiniMax to bring commercially licensed MiniMax M2.7 access to Lilac.
Why Lilac supports open-weight licensing, and why commercial rights can help more frontier models stay open.
Kimi K2.6 is now available on Lilac with OpenAI-compatible chat completions, 262K context, cache-read pricing, and no commitments.
Supported Lilac models now show lower cache read rates for repeated context, making long-context and agent workloads cheaper to run.
No more waitlist. Sign up, grab an API key, and start running inference. GLM 5.1 is live at $0.90/M input, and Gemma 4 is live at $0.11/M input.
We benchmarked our GLM 5.1 endpoint against every GLM 5.1 provider listed on OpenRouter. Competitive throughput at the lowest per-token price in the comparison.
Lilac serves Kimi K2.6 inference on idle enterprise GPUs with OpenAI-compatible, pay-per-token shared endpoints.
A direct comparison of GPU inference API pricing across major providers. How idle GPU economics enable Lilac to offer lower per-token rates.
The GPU shortage isn't what you think. The industry doesn't have a supply problem; it has a utilization problem masquerading as one.
Most Kubernetes clusters run GPUs at 30-50% utilization. We built a single operator to change that.