Ship
Get an Ertas-trained GGUF into an iOS, Android, desktop, or web app, including the model delivery, first-run UX, and performance habits that survive production.
This section covers everything that happens after Export: how the GGUF reaches the user's device, how your app loads and queries it, and the performance habits that keep it responsive in production. The Ertas bundle is Ollama-ready out of the box, so a desktop user can run your model in three clicks. Mobile and web take more work, but the patterns are consistent across platforms.
Model delivery and UX
Read the Model delivery and UX guide.
iOS
Read the iOS guide.
Android
Read the Android guide.
Desktop
Read the desktop guide.
Web
Read the web guide.
Performance Tips
Read the performance tips guide.
The shortest possible summary
If you read nothing else in this section, read this:
- Default to post-install download. The Ertas GGUF is 0.5 to 9 GB; bundling it in your app's install package wrecks install conversion on most stores. See Model delivery and UX.
- Pick your integration path by app shape, not by model. Flutter via llamadart. React Native via llama.rn. Native iOS via llama.cpp's SwiftUI example or another maintained Swift wrapper. Native Android via llama.cpp's Android example and JNI. Desktop is easiest via Ollama (the Ertas bundle is Ollama-ready), or via node-llama-cpp (Electron) and llama-cpp-2 (Tauri) for embedded inference. Browser via wllama (WebAssembly + WebGPU, GGUF-native) or WebLLM (WebGPU, conversion required).
- Load once at startup, reuse the session, dispose on teardown. Loading a model costs 1 to 3 seconds and 0.5 to 9 GB of native RAM. Re-using the loaded model across calls is the biggest single performance win available.
- Cache outputs aggressively in file storage (not SharedPreferences or UserDefaults). A ~10 ms cache hit beats a 1 to 3 second inference every time.
- Verify before shipping. Run the smoke test on every release build. Templating mismatches and integrity failures are the two biggest sources of post-export surprises.
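The "load once, reuse, dispose" and "cache outputs in file storage" habits from the summary above can be sketched in a platform-neutral way. This is a minimal illustration in Python, not the API of any runtime named on this page: `ModelSession`, `load_model`, `model_id`, and the fake loader are all hypothetical names, and the callable that `load_model` returns stands in for whatever completion call your chosen binding exposes.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable, Optional

Generate = Callable[[str], str]  # hypothetical: prompt in, completion text out


class ModelSession:
    """Hypothetical holder: one loaded model per app session plus a file cache."""

    def __init__(self, load_model: Callable[[], Generate], cache_dir: Path, model_id: str):
        self._load_model = load_model        # the expensive step (seconds, GBs of RAM)
        self._generate: Optional[Generate] = None
        self._cache_dir = cache_dir
        self._model_id = model_id            # part of the key, so a new model invalidates old outputs
        self._cache_dir.mkdir(parents=True, exist_ok=True)

    def load(self) -> None:
        # Idempotent: call at startup; later calls reuse the loaded model.
        if self._generate is None:
            self._generate = self._load_model()

    def _cache_path(self, prompt: str) -> Path:
        # Hash model id + prompt into a fixed-length, filesystem-safe file name.
        key = f"{self._model_id}\0{prompt}".encode("utf-8")
        return self._cache_dir / f"{hashlib.sha256(key).hexdigest()}.json"

    def complete(self, prompt: str) -> str:
        path = self._cache_path(prompt)
        if path.exists():                    # cache hit: skip inference entirely
            return json.loads(path.read_text(encoding="utf-8"))["output"]
        self.load()                          # no-op if already loaded
        output = self._generate(prompt)
        path.write_text(json.dumps({"output": output}), encoding="utf-8")
        return output

    def dispose(self) -> None:
        # With a real binding, call its own free/close method here as well.
        self._generate = None


# Usage with a fake loader that counts how often the expensive steps run.
stats = {"loads": 0, "inferences": 0}


def fake_loader() -> Generate:
    stats["loads"] += 1

    def generate(prompt: str) -> str:
        stats["inferences"] += 1
        return prompt.upper()                # placeholder for real inference

    return generate


session = ModelSession(fake_loader, Path(tempfile.mkdtemp()), model_id="demo-v1")
session.load()                               # at startup
first = session.complete("hello")            # runs inference, writes the cache file
second = session.complete("hello")           # served from the cache file
session.dispose()                            # on teardown
```

Keying the cache on a model identifier as well as the prompt is a design choice of this sketch: it keeps outputs from an older GGUF from being served after an update. Sampling settings would belong in the key too if they vary between calls.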
Start with Model delivery and UX if you have not picked a delivery path yet, or jump to your target platform's page if you already know what you are shipping into.