Unsloth AI

Unsloth Studio

A no-code local web app for fine-tuning and running open models on your own hardware.

No-code local fine-tuning and chat

Visit Site View on GitHub

GitHub Stars

—

Contributors

—

Release Downloads

—

Latest Version

—

Maintained by: Unsloth AI
First released: Mar 2026
Last commit: —
Pricing: Open Source
License: Other

Runs on This Stack

The engines this app runs on and the models it ships with, linked into the rest of the research stack.

Runs on These Engines

Overview

Unsloth Studio is an open-source, no-code desktop app that lets you train, run, and export open models entirely on your own hardware. It is built and maintained by Unsloth AI, the team behind the popular Unsloth fine-tuning library. What sets it apart from a typical local chat GUI is that it is first and foremost a fine-tuning tool. You do not need to write training scripts, configure loss functions, or debug CUDA errors. The app brings Unsloth’s performance optimizations — 2x faster training with up to 70% lower VRAM usage — into a point-and-click interface.

The app sits at the intersection of several categories: a fine-tuning UI, a local chat client, and an OpenAI-compatible API server. It competes with tools like LM Studio and Ollama for local inference, but no other desktop app offers no-code fine-tuning as a core feature. It also overlaps with Open WebUI and AnythingLLM for chat and document interaction, but those do not expose training capabilities.

Unsloth Studio launched in March 2026 and quickly gained traction on Hacker News and GitHub (the parent Unsloth repository has over 67,000 stars). The team maintains a responsive Discord community and publishes regular updates on their changelog. The app is still in beta, so you should expect rough edges and breaking changes.

Strengths

Brings fine-tuning to a point-and-click UI, no training code required.
Unsloth's optimizations mean a single consumer GPU often does the job.
Runs fully offline and exposes an OpenAI-compatible server for other tools.
Genuinely open source, with broad open-model coverage.

Trade-offs

Key Features

What the app gives you out of the box, in plain language.

macOS
Windows
Linux
Apple Silicon

Where It Shines

The jobs this app is best suited for.

Fine-tune on your own data
Teach an open model your domain, tone, or task without writing training code.
A private local model server
Run open models offline and point tools like Claude Code or Codex at the local endpoint.
Compare candidate models
Test models side by side in the Model Arena before you commit to one.

Pricing

Open Source

Free and open source. Paid Pro and Enterprise speed tiers.

Side-by-Side

Compare Unsloth Studio With Another App

Add a second or third app and see stars, downloads, platforms, and capabilities lined up next to each other.

Open the Comparator

Related Apps

Close alternatives worth a look before you decide.

LM Studio

Discover, download, and run open models on your own computer, no command line needed.

Running open models from a polished desktop GUI

Download from lmstudio.ai

Stars

—

Downloads

—

ProprietaryFreemacOS

Frequently Asked Questions

What Is a Desktop AI App?

A desktop AI app is a program you install on your computer to chat with, code with, run agents on, or fine-tune AI models. It sits on top of the inference engines and models that do the real work and gives you a friendly interface instead of a command line.

Is Unsloth Studio free to use?

Unsloth Studio is offered under a Open Source model. Check the pricing section on this page and the app’s own site for the latest details, since tiers and limits change over time.

What platforms does Unsloth Studio run on?

Unsloth Studio is maintained by Unsloth AI. See the capabilities section above for the exact list of platforms it supports, along with whether it runs models locally, connects to cloud APIs, or both.

Free Monthly Report

The AI Build Report

The state of AI models, API prices, and what to run where. New every month, free.

What You Can Do With It

Download and Run Models Locally

You can search for, download, and run open models directly from the app. It supports both GGUF (the format used by llama.cpp) and safetensors. The underlying inference engine is llama.cpp with Hugging Face integration, and it handles multi-GPU inference, automatic memory offloading, and model fitting. You can run text, vision, audio, and embedding models.

Fine-Tune Without Code

This is the core differentiator. You select a base model from a supported list (over 500 models including Gemma, Qwen, DeepSeek, and others), upload your data, and hit train. The app uses Unsloth’s custom CUDA kernels to optimize LoRA, FP8, full fine-tuning, and QAT. Training happens with real-time observability — you can watch loss curves and validation metrics in the UI.

Chat With Tool Calling and Web Search

After training or downloading, you can chat with the model in a local web UI. The chat interface supports self-healing tool calling (the model can recover from malformed tool calls), unlimited web search, and code execution in Bash and Python. Results from web search are surfaced inside the model’s thinking trace. The code execution environment is sandboxed like Claude Artifacts, so the model can test code, generate files, and verify answers with real computation.

You can export any model — including your fine-tuned ones — to GGUF or 16-bit safetensors. These can then be used with llama.cpp, vLLM, Ollama, or any other inference engine. The app also exposes an OpenAI-compatible API endpoint, meaning you can point tools like Claude Code, Codex, or any custom agent to your local model.

Compare Models Side by Side

The Model Arena lets you load two models and compare their responses to the same prompt. This is useful for evaluating candidate models before committing to one for a specific task.

Platforms, Pricing, and Requirements

Platforms: macOS, Windows (native, no WSL required), and Linux.

Pricing: The app is free and open source for all features. Paid Pro and Enterprise tiers exist for higher speed limits on the API endpoint (e.g., rate limiting, priority inference) but the local training and inference capabilities are not gated.

Hardware requirements:

Apple Silicon: Training, MLX, and GGUF inference are fully supported.
NVIDIA GPU: Required for training. Unsloth’s optimizations work on a single consumer GPU (e.g., RTX 3090/4090). Inference can fall back to CPU with offloading.
AMD GPU: Inference is supported; training is not yet ready as of the beta.
CPU-only: You can run models via llama.cpp (GGUF) for inference, but training is not available without a GPU.

Installation: Not a one-click store download. You install via a script (curl -fsSL https://unsloth.ai/install.sh | sh on macOS/Linux, or the PowerShell equivalent on Windows). It launches a local web server (default port 8000) that you open in your browser. The installation pulls Python dependencies, Unsloth’s library, and the UI.

License: Other (the main Unsloth library is available under a permissive license; the Studio UI is also open source via the GitHub repository).

Key Features and Capabilities

Upload PDFs, CSVs, JSON, DOCX, or TXT files. The Data Recipes feature automatically converts these into training datasets via a graph-node workflow — no manual preprocessing. You can then train with LoRA, FP8, full-finetuning, or prefix tuning. The UI shows real-time training metrics. This is the only desktop app that offers this capability out of the box.

Local Chat and Inference

Run any downloaded model in a chat interface with autonomous tool calling, web search, and code execution. The model can call Bash and Python, not just JavaScript, making it usable for programming tasks. The API endpoint is OpenAI-compatible, so you can replace a cloud model in your existing toolchain with a local one.

Turns unstructured documents into labeled datasets. For example, upload a PDF of internal documentation and the app can generate question-answer pairs or instruction-following examples for fine-tuning. Manual prep is not required.

Export to GGUF for use with Ollama or llama.cpp, or to safetensors for vLLM. This means your fine-tuned model is not locked into Unsloth’s ecosystem.

Real-World Use Cases

Fine-tune on your own data. A startup training a customer support model on internal transcripts can do it entirely offline on a single RTX 4090, with no cloud costs or data-leakage risk. The Data Recipes feature reduces the time from raw documents to usable dataset to hours.

Private local model server. A developer using Claude Code or Codex for code generation can point those tools at Unsloth Studio’s local API endpoint. The model (e.g., Qwen 3.5-4B) runs offline, with tool calling and web search enabled. No data ever leaves the machine.

Model evaluation. Before committing to a base model for fine-tuning, a practitioner can compare Gemma 4, Qwen 3.6, and DeepSeek in the Model Arena using the same prompts and see side-by-side responses.

Who should look elsewhere: If you need one-click install from the Mac App Store, want a hosted cloud solution, or need AMD training support, Unsloth Studio is not ready for you yet. If you only need inference (no training), LM Studio or Ollama offer simpler installation and more polished chat UIs.

Getting Started With Unsloth Studio

Install Unsloth Studio by running the following in a terminal:

macOS/Linux/WSL: curl -fsSL https://unsloth.ai/install.sh | sh
Windows PowerShell: irm https://unsloth.ai/install.ps1 | iex

Launch the app: After install, run unsloth studio (or the appropriate binary). The app starts a local web server and opens your browser.

Download a model: From the UI’s model browser, search for a model (e.g., Qwen3.5-4B) and click download. GGUF files are typically a few GB.

Start a chat: Once the model is downloaded, open the chat interface. You can enable tool calling, web search, and code execution with toggles.

Fine-tune: Upload a PDF or CSV under Data Recipes, let the system generate a dataset, select a base model and training hyperparameters, then click train. Monitor loss in real time.

Export or serve: After training, export the LoRA adapter or merged model. Or enable the API endpoint and point your tools to http://localhost:8000/v1.

For documentation, visit [unsloth.ai/docs/new/studio](https://unsloth.ai/docs/new/studio). For community support, join the Unsloth Discord (linked on the GitHub repo).

Unsloth Studio vs. LM Studio / Ollama. Both are excellent for local inference: they are easy to install, have polished GUIs, and support a wide range of models. However, neither offers fine-tuning. If your only goal is to run models locally, start with LM Studio or Ollama. If you need to train or adapt a model, Unsloth Studio is the only option that does both in one interface.

Unsloth Studio vs. Open WebUI / AnythingLLM. Those tools focus on chat, RAG, and document integration. They do not support training. Unsloth Studio’s chat interface is less full-featured for document Q&A, but it includes tool calling and code execution. Choose Unsloth Studio if you need fine-tuning; choose Open WebUI if you need a robust chat-with-documents pipeline.

Unsloth Studio vs. cloud fine-tuning platforms (e.g., Replicate, Hugging Face AutoTrain). Cloud platforms handle infrastructure but charge per hour of GPU usage and require internet. Unsloth Studio runs fully offline with no per-token costs, but you need your own GPU. For one-off experiments with free Colab credits, the cloud might be easier. For repeated training on sensitive data, local is the better bet.

In summary, Unsloth Studio is the first desktop app that makes local fine-tuning practical for non-Python developers. It is not a polished consumer app yet, but it fills a genuine gap for teams that want to train models on their own terms.

Still in beta, so expect rough edges and change.

NVIDIA-first. AMD training is not ready and there is no GPU-free training.

A local web server you install via script, not a one-click desktop binary.

OpenAI-Compatible Server

No-code fine-tuning
Train and adapt open models through a UI, with much lower GPU memory use than the standard stack.
Local chat and inference
Run your models locally with tool calling, web search, and an OpenAI-compatible API.
Data Recipes
Turn PDFs, CSVs, and other files into training datasets without manual prep.

Unsloth Studio

Runs on This Stack

Runs on These Engines

Overview

Overview

Strengths

Trade-offs

Key Features

Where It Shines

Fine-tune on your own data

A private local model server

Compare candidate models

Pricing

Compare Unsloth Studio With Another App

Related Apps

LM Studio

Frequently Asked Questions

What Is a Desktop AI App?

Is Unsloth Studio free to use?

What platforms does Unsloth Studio run on?

The AI Build Report

What You Can Do With It

Download and Run Models Locally

Fine-Tune Without Code

Chat With Tool Calling and Web Search

Export and Serve

Compare Models Side by Side

Platforms, Pricing, and Requirements

Key Features and Capabilities

No-Code Fine-Tuning

Local Chat and Inference

Data Recipes

Export Flexibility

Real-World Use Cases

Getting Started With Unsloth Studio

How It Compares

No-code fine-tuning

Local chat and inference

Data Recipes

Odysseus