
@awni
Last active May 5, 2026 13:55
OpenCode with MLX

This guide shows how to connect a local model served with MLX to OpenCode for local coding.

1. Install OpenCode

curl -fsSL https://opencode.ai/install | bash

2. Install mlx-lm

pip install mlx-lm

3. Make a custom provider for OpenCode

Open ~/.config/opencode/opencode.json and paste the following (if you already have a config, just add the mlx provider):

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "mlx": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "MLX (local)",
      "options": {
        "baseURL": "http://127.0.0.1:8080/v1"
      },
      "models": {
        "mlx-community/NVIDIA-Nemotron-3-Nano-30B-A3B-4bit": {
          "name": "Nemotron 3 Nano"
        }
      }
    }
  }
}

4. Start the mlx-lm server

mlx_lm.server
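By default the server binds to 127.0.0.1:8080, which matches the baseURL in the config above. If you need a different port, a sketch of an explicit invocation (run mlx_lm.server --help to confirm the flags on your version of mlx-lm):

```shell
# Bind address and port must match the baseURL in opencode.json
# ("http://127.0.0.1:8080/v1" in the config above).
mlx_lm.server --host 127.0.0.1 --port 8080
```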

5. Start OpenCode and select the provider

In the repo you plan to work in, run opencode.

Once inside the OpenCode TUI:

  1. Enter /connect
  2. Type MLX and select it
  3. For the API key, enter none (the local server doesn't require one)
  4. Select the model
  5. Start planning and building
@wangkuiyi

wangkuiyi commented Jan 3, 2026

Nice! So I don't need to specify the model checkpoint as a command-line option of mlx_lm.server, correct? Will OpenCode attach the model name in the request and trigger the server to load the model?

@JoeJoe1313

> Nice! So I don't need to specify the model checkpoint as a command-line option of mlx_lm.server, correct? Will OpenCode attach the model name in the request and trigger the server to load the model?

@wangkuiyi, correct! And if you list multiple models in the config, for example:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "mlx": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "MLX (local)",
      "options": {
        "baseURL": "http://127.0.0.1:8080/v1"
      },
      "models": {
        "mlx-community/Qwen2.5-Coder-7B-Instruct-8bit": {
            "name": "Qwen 2.5 Coder"
        },
        "mlx-community/NVIDIA-Nemotron-3-Nano-30B-A3B-4bit": {
            "name": "Nemotron 3 Nano"
        }
      }
    }
  }
}

you can switch between these models directly in opencode:

[Screenshot: switching between the two models in the opencode model picker]

The first time you prompt a model, the server will need to download it if you haven't downloaded it before (you can see the progress on the left):

[Screenshot: mlx_lm.server logging the model download during the first prompt]
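Under the hood, OpenCode talks to the server through the OpenAI-compatible chat completions endpoint, and it is the model field in the request body that tells mlx_lm.server which checkpoint to serve. A minimal sketch of such a request (the helper function and prompt are illustrative, not part of either tool):

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8080/v1"  # same baseURL as in opencode.json

def build_chat_request(model: str, prompt: str):
    """Build an OpenAI-compatible chat completion request (hypothetical helper)."""
    url = f"{BASE_URL}/chat/completions"
    payload = {
        "model": model,  # this name tells the server which checkpoint to load
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, payload

url, payload = build_chat_request(
    "mlx-community/NVIDIA-Nemotron-3-Nano-30B-A3B-4bit", "Write a haiku."
)
# To actually send it (requires the server to be running):
# req = urllib.request.Request(url, data=json.dumps(payload).encode(),
#                              headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())
```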

@awni
Author

awni commented Jan 3, 2026

FYI, in addition to this guide you need to be aware of a couple of things:

  1. Until this PR lands (ml-explore/mlx-lm#711), tool calling is basically broken in mlx_lm.server
  2. Even with the above PR, tool-calling support is limited to certain models. If you aren't sure, post the model here or open an issue. We will keep adding more tool parsers to support more models as needed

@wangkuiyi

Thank you so much @JoeJoe1313 and @awni !

@electroheadfx

> > Nice! So I don't need to specify the model checkpoint as a command-line option of mlx_lm.server, correct? Will OpenCode attach the model name in the request and trigger the server to load the model?
>
> @wangkuiyi, correct! And if you list multiple models in the config, for example: [two-model config quoted in full above]

How can you give a separate baseURL for each model? Because it looks like both models are served on the same base URL?

Thanks bro !

@awni
Author

awni commented Jan 6, 2026

> because I can load 2 models on the same base URL?

Each provider (e.g. MLX) has one base URL (localhost for local providers).
Each provider can have an arbitrary number of models.
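That said, if you genuinely want a separate baseURL per model, one option (a sketch, untested; the provider keys and names are illustrative) is to run two mlx_lm.server instances on different ports and define one provider per instance:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "mlx-a": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "MLX (port 8080)",
      "options": { "baseURL": "http://127.0.0.1:8080/v1" },
      "models": {
        "mlx-community/Qwen2.5-Coder-7B-Instruct-8bit": { "name": "Qwen 2.5 Coder" }
      }
    },
    "mlx-b": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "MLX (port 8081)",
      "options": { "baseURL": "http://127.0.0.1:8081/v1" },
      "models": {
        "mlx-community/NVIDIA-Nemotron-3-Nano-30B-A3B-4bit": { "name": "Nemotron 3 Nano" }
      }
    }
  }
}
```

Each server instance would then be started with its own --port (e.g. mlx_lm.server --port 8081 for the second one).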

@haishengXie0712

I connected to OpenCode using the method described above, but it couldn't write code to the file.

@sirolf99

sirolf99 commented May 5, 2026

What if the model is already downloaded? How do I serve it from a local repo (and with what name)?
mlx_lm.server --model /Volumes//huggingface/hub/mlx-community/Qwen3.6-35B-A3B-nvfp4/ --host 0.0.0.0 --


It's still trying to redownload from Hugging Face instead of using the local folder, and it searches in another location:
/users/.cache/huggingface/hub/models--mlx-community--Qwen3.6-35B-A3B-nvfp4/snapshots/9c1a3a223ddd8a3425212cc421056614f149cf0f
It fails with the error Not Found: No safetensors found, but the safetensors are in the provided --model folder.
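For anyone hitting a similar error, one thing worth checking (a sketch, not a confirmed fix; the model path below is illustrative) is whether --model points at the snapshot directory that actually contains the safetensors, rather than the models--…/ repo root. HF_HUB_OFFLINE is a standard huggingface_hub environment variable that forces the library to use only the local cache:

```shell
# Force huggingface_hub to use the local cache only (no network lookups).
export HF_HUB_OFFLINE=1
# Point --model at the snapshot directory containing the *.safetensors
# files, not the models--.../ repo root (path is illustrative).
MODEL_DIR="$HOME/.cache/huggingface/hub/models--mlx-community--SomeModel/snapshots/abc123"
# mlx_lm.server --model "$MODEL_DIR"
```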

@sirolf99

sirolf99 commented May 5, 2026

qwen 3.6 -> 35b-a3b + opencode

2026-05-05 15:38:49,121 - INFO - Prompt processing progress: 43835/43836
2026-05-05 15:38:49,287 - INFO - Prompt processing progress: 43836/43836
2026-05-05 15:38:58,984 - INFO - Prompt Cache: 4 sequences, 3.46 GB
2026-05-05 15:38:58,984 - INFO - - assistant: 2 sequences, 1.93 GB
2026-05-05 15:38:58,984 - INFO - - user: 1 sequences, 0.96 GB
2026-05-05 15:38:58,984 - INFO - - system: 1 sequences, 0.57 GB
192.168.1.73 - - [05/May/2026 15:38:58] "POST /v1/chat/completions HTTP/1.1" 200 -
2026-05-05 15:39:00,384 - INFO - Prompt processing progress: 228/229
2026-05-05 15:39:00,433 - INFO - Prompt processing progress: 229/229
2026-05-05 15:39:05,155 - INFO - Prompt Cache: 6 sequences, 5.40 GB
2026-05-05 15:39:05,155 - INFO - - assistant: 4 sequences, 3.87 GB
2026-05-05 15:39:05,155 - INFO - - user: 1 sequences, 0.96 GB
2026-05-05 15:39:05,155 - INFO - - system: 1 sequences, 0.57 GB
192.168.1.73 - - [05/May/2026 15:39:05] "POST /v1/chat/completions HTTP/1.1" 200 -
2026-05-05 15:39:06,142 - INFO - Prompt processing progress: 89/90
libc++abi: terminating due to uncaught exception of type std::runtime_error: [METAL] Command buffer execution failed: Insufficient Memory (00000008:kIOGPUCommandBufferCallbackErrorOutOfMemory)

Trying a workaround with these parameters:

--prompt-cache-bytes 8589934592
--prompt-cache-size 6
--prompt-concurrency 2
--decode-concurrency 2
--prefill-step-size 1024

Same crash (I don't have that issue with llama.cpp / LM Studio :| )
