Jan + Ertas
Export fine-tuned GGUF models from Ertas Studio and import them into Jan for a private, offline AI assistant experience with a clean chat interface and extension ecosystem.
Overview
Jan is an open-source desktop AI assistant designed to run large language models entirely on your local machine. Built with privacy as a core principle, Jan ensures that conversations, documents, and data never leave your device. Its clean, ChatGPT-style interface makes local AI accessible to anyone, while its extension system and local API server provide the flexibility that developers need. Jan supports GGUF models natively and runs on Windows, macOS, and Linux with optimized backends for NVIDIA, AMD, and Apple Silicon GPUs.
What sets Jan apart from other local AI tools is its focus on the assistant experience. Beyond simple chat, Jan supports conversation threads, system prompt customization, knowledge retrieval from local files, and an extension marketplace for adding capabilities like web search and code interpretation. For teams that fine-tune models with Ertas for specific domains, Jan provides a polished end-user experience that feels like a commercial AI product running entirely on local infrastructure.
How Ertas Integrates
The workflow from Ertas to Jan is straightforward: after completing a fine-tuning job in Ertas Studio, download your model in GGUF format and import it into Jan through the model management interface. Jan reads the embedded metadata from the GGUF file — including the chat template, tokenizer settings, and model architecture — so the imported model works correctly out of the box without manual configuration. You can set custom system prompts and inference parameters per model to tailor the assistant behavior to your specific use case.
This integration is particularly valuable for organizations that need to distribute fine-tuned models to non-technical users. A data science team can iterate on model quality in Ertas Studio, export the best version as GGUF, and share the file with business users who simply import it into Jan on their workstations. The entire inference pipeline stays local, meeting compliance requirements for industries like healthcare, legal, and finance where data cannot be sent to external servers.
Getting Started
1. Complete fine-tuning in Ertas Studio
Upload your training dataset, configure LoRA or full-parameter training on the Ertas canvas, and run the job on managed cloud GPUs until your validation metrics converge.
2. Download the GGUF model
Export your fine-tuned model in GGUF format from Ertas Studio. Select a quantization level appropriate for your target hardware; Q4_K_M is recommended for most consumer devices.
3. Import into Jan
Open Jan, navigate to the Model Hub, and select 'Import Model'. Choose your downloaded GGUF file. Jan automatically detects the model architecture and configures the runtime.
4. Configure model settings
Set a custom system prompt and adjust temperature, context length, and GPU offloading parameters in Jan's model settings panel to match your use case requirements.
5. Start chatting locally
Select your imported model from the model list and begin a conversation. All inference runs locally on your hardware with zero network requests.
# After downloading your GGUF model from Ertas Studio,
# import it into Jan via the models directory
cp ./my-model-Q4_K_M.gguf ~/jan/models/my-model/
# Or use Jan's built-in import dialog:
# Model Hub → Import Model → Select GGUF file
# Jan also exposes a local API server (enable in Settings → Advanced)
curl http://localhost:1337/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "my-model",
"messages": [{"role": "user", "content": "Hello, how can you help?"}]
  }'

Benefits
- Open-source and fully offline — conversations never leave your device
- Clean ChatGPT-style interface accessible to non-technical team members
- Automatic model metadata detection from GGUF files for zero-config import
- Extension ecosystem for adding retrieval, web search, and tool-use capabilities
- Local API server compatible with OpenAI SDK for application development
- Cross-platform support with optimized GPU backends for all major hardware
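Because Jan's local API server speaks the OpenAI chat-completions wire format, any HTTP client can drive an imported model. A minimal sketch using only the Python standard library, assuming the server is enabled (Settings → Advanced) and using the default port and placeholder model name from the curl example above:

```python
# Sketch of a client for Jan's OpenAI-compatible local endpoint.
# Assumes Jan's API server is running on the default port 1337;
# "my-model" is a placeholder for your imported model's name.
import json
import urllib.request

JAN_API = "http://localhost:1337/v1/chat/completions"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        JAN_API,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def chat(model: str, prompt: str) -> str:
    # Sends the request to the local server; requires Jan to be running.
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

# Example (needs a running Jan instance):
# print(chat("my-model", "Hello, how can you help?"))
```

The same endpoint also works with the official OpenAI SDKs by pointing their base URL at `http://localhost:1337/v1`, which is what makes local application development against a fine-tuned Ertas model a drop-in change.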
Related Resources
Fine-Tuning
GGUF
Inference
Quantization
Getting Started with Ertas: Fine-Tune and Deploy Custom AI Models
Privacy-Conscious AI Development: Fine-Tune in the Cloud, Run on Your Terms
Self-Hosted AI for Indie Apps: Replace GPT-4 with Your Own Model
GPT4All
llama.cpp
LM Studio
Ollama
Ertas for Healthcare
Ertas for Customer Support
Ertas for Indie Developers & Vibe-Coded Apps
Ship AI that runs on your users' devices.
Early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.