LLMFit

FAQ

Common questions from teams evaluating local AI fit analysis.

These are the questions that usually come up once people understand that LLMFit is a fit-analysis layer, not just another model list or runtime.

Is LLMFit an inference runtime?

No. LLMFit helps decide which models and runtimes make sense for a machine. It does not replace tools like Ollama, MLX, or llama.cpp that actually run inference.

Who is the product really for?

It is useful for solo builders, platform teams, consultants, and homelab users who repeatedly need a defensible answer to “what can this machine run well?”

Why not just use a benchmark chart?

Benchmarks say a lot about relative model quality, but very little about whether a given machine can run a model with acceptable memory headroom and throughput.

Does it help with Apple Silicon?

Yes. The project includes Apple-oriented logic, MLX coverage, and unified-memory-aware fit analysis.

Can I use it for internal schedulers or portals?

Yes. That is one of the strongest use cases. Run `llmfit serve` on each node and aggregate the results in a separate service or control plane.
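That aggregation step can be sketched as follows. The port (`8080`), the `/fit` endpoint path, and the one-model-per-line report format are illustrative assumptions for this sketch, not documented LLMFit defaults — substitute whatever your deployment actually exposes:

```shell
#!/bin/sh
# Sketch: gather a fit report from each node running `llmfit serve`.
# Port 8080 and the /fit path are assumptions, not documented defaults:
#
#   for node in gpu-01 gpu-02; do
#     curl -s --max-time 5 "http://$node:8080/fit" > "reports/$node.txt"
#   done
#
# Once each node's list of fitting models is on disk (assumed here to be
# one model name per line), the models that fit on EVERY node are the
# lines duplicated across the files. Sample data for two nodes:
printf 'llama3-8b\nmixtral-8x7b\n' > gpu-01.txt
printf 'llama3-8b\n'               > mac-mini-03.txt

# `uniq -d` prints lines appearing more than once in the sorted merge.
# With exactly two nodes that is the intersection; for N nodes, count
# occurrences with `uniq -c` and keep lines seen N times instead.
sort gpu-01.txt mac-mini-03.txt | uniq -d   # prints: llama3-8b

rm gpu-01.txt mac-mini-03.txt
```

A control plane can then schedule only the models that survive the intersection, rather than trusting any single node's report.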

Can it still be useful on CPU-only machines?

Yes. CPU-only machines are exactly where fit analysis matters most, because an unrealistic model choice quickly becomes an expensive mistake.

Does the site have to live on the same host as the runtime tool?

No. The site is static and can be deployed independently. The runtime tooling and the documentation site can live on different machines if that fits your operations better.

What should I do before pointing the final domain?

Validate direct host routing with `curl --resolve`, confirm TLS coverage for every hostname you intend to expose, and keep a backup of the previous site and reverse proxy config.
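The first of those checks can be sketched like this. The hostname and origin IP below are placeholders, not real values — substitute your own before running:

```shell
# Placeholder values -- substitute your real hostname and origin IP.
HOST=llmfit.example.com
ORIGIN_IP=203.0.113.10

# --resolve forces $HOST to resolve to $ORIGIN_IP, bypassing DNS, so you
# can exercise the new origin before the final domain is pointed at it.
# -I fetches headers only; a clean TLS handshake plus the expected status
# code means routing and certificate coverage are correct before cutover.
curl -sS -I --resolve "${HOST}:443:${ORIGIN_IP}" "https://${HOST}/"
```

Repeat the command once per hostname you intend to expose, since a certificate that covers the apex domain does not necessarily cover every subdomain.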