Tracked issues relevant to the Jetson Orin Nano (and related AGX Thor/Orin) platform. All issues are currently OPEN.
NemoClaw #404 — GPU detection fails + k3s panics on iptables (Orin Nano Specific)
Labels: bug, Platform: AGX Thor/Orin
Author: realkim93 (Ryeol.Kim)
Two blockers on Jetson Orin Nano Super (8GB, JetPack 6.x, kernel 5.15.148-tegra):
- GPU not detected: `nvidia-smi --query-gpu=memory.total` returns `[N/A]` on unified memory devices; `detectGpu()` returns `null` and preflight reports "No GPU detected".
- k3s panic on startup: OpenShell gateway ships iptables v1.8.10 (nf_tables mode), but the Tegra kernel lacks `nft_chain_filter` and related modules. The gateway logs `iptables v1.8.10 (nf_tables): RULE_INSERT failed (No such file or directory): rule in chain FORWARD` and `panic: ...network_policy_controller.go:412`.
- Default model too large: `nemotron-3-nano:30b` does not fit in 8GB unified memory.
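The first item can be illustrated from a shell. The sketch below is not the `detectGpu()` code; it only shows one way to handle a non-numeric `memory.total` value by falling back to system RAM, which the GPU shares on unified memory devices. The helper name `gpu_memory_mib` and the fallback to `MemTotal` are assumptions made for illustration.

```shell
# Hypothetical helper: resolve GPU memory in MiB.
# $1: value printed by
#     nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits
# $2: meminfo-style file to fall back to (defaults to /proc/meminfo)
gpu_memory_mib() {
  smi_value="$1"
  meminfo="${2:-/proc/meminfo}"
  case "$smi_value" in
    ''|*[!0-9]*)
      # Not a plain number (for example "[N/A]"): assume unified memory and
      # report total system RAM, converted from kB to MiB.
      awk '/^MemTotal:/ { printf "%d\n", $2 / 1024; found = 1 }
           END { exit !found }' "$meminfo"
      ;;
    *)
      echo "$smi_value"
      ;;
  esac
}

# Example usage on a device:
#   gpu_memory_mib "$(nvidia-smi --query-gpu=memory.total \
#       --format=csv,noheader,nounits 2>/dev/null)"
```

The fallback reports all system RAM, so a caller would still need to leave headroom for the OS and other processes.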
NemoClaw #539 — OpenShell gateway crashes: missing nf_tables NAT kernel modules
Labels: bug, Platform: AGX Thor/Orin, status: triage
Author: garnetsoft
Onboarding fails at step [2/7] Starting OpenShell gateway because k3s uses iptables-nft but the Jetson kernel has no nft_* modules at all.
Error:
E0320 Failed to bootstrap IPTables: failed to apply partial iptables-restore
Warning: Extension MASQUERADE revision 0 not supported, missing kernel module?
iptables v1.8.10 (nf_tables): CHAIN_ADD failed (No such file or directory)
Root cause: `find /lib/modules/5.15.185-tegra -name "nft*"` returns nothing. The host uses iptables-legacy but the container uses nf_tables.
Suggested fix: OpenShell gateway should detect missing nf_tables support and fall back to iptables-legacy, similar to the cgroup v2 fix in spark-install.md. A dedicated Jetson setup path (like setup-spark) would also address this.
Workaround: None — requires kernel recompile or patching the gateway container image.
Environment:
- Device: Jetson Orin Nano (Engineering Reference Developer Kit)
- Kernel: 5.15.185-tegra (aarch64)
- OS: Ubuntu 22.04.5 LTS
- Docker: 29.3.0, NemoClaw: 0.1.0, OpenShell CLI: 0.0.12
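The `find` check from the root-cause line can be wrapped in a small function that picks a backend. This is a minimal sketch under stated assumptions: `iptables_backend_for` is a hypothetical name, and the check only looks for loadable `nft*` modules, so a kernel with nf_tables compiled in (rather than built as modules) would be misreported as legacy.

```shell
# Hypothetical helper: print "nft" when the modules tree contains any nft*
# module, otherwise print "legacy".
# $1: kernel modules directory, e.g. /lib/modules/$(uname -r)
iptables_backend_for() {
  modules_dir="$1"
  if [ -n "$(find "$modules_dir" -name 'nft*' -print 2>/dev/null | head -n 1)" ]; then
    echo nft
  else
    echo legacy
  fi
}

# Example usage on the host:
#   iptables_backend_for "/lib/modules/$(uname -r)"
```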
NemoClaw #300 — GPU detection fails on Jetson Thor and Orin
Labels: Platform: AGX Thor/Orin
Author: dfry-lhzn (Daniel Fry)
detectGpu() in nemoclaw/bin/lib/nim.js only checks for "GB10" (Grace Blackwell), missing all Jetson-class hardware.
nvidia-smi output on Orin:
$ nvidia-smi --query-gpu=name --format=csv,noheader,nounits
Orin (nvgpu)
Offending code (nim.js:52):
if (nameOutput && nameOutput.includes("GB10")) {
Proposed fix:
if (nameOutput && (nameOutput.includes("GB10") || nameOutput.includes("Thor") || nameOutput.includes("Orin"))) {
NemoClaw #511 — Installation fails on Jetson AGX (aarch64): npm errors + k3s networking
Labels: bug, Platform: AGX Thor/Orin, status: triage
Author: rmegi (Roy Megidish)
Multiple failure stages on Jetson AGX (Tegra kernel 5.15.136-tegra, aarch64):
- npm install errors: `ENOTEMPTY: directory not empty, rename '/usr/lib/node_modules/nemoclaw'` and `npm ERR! enoent spawn sh ENOENT`. Missing files: `openclaw/dist`, `openclaw/extensions`.
- Gateway fails: `K8s namespace not ready — timed out waiting for namespace 'openshell'`, `RULE_APPEND failed (No such file or directory)`, `iptables v1.8.10 (nf_tables) — missing kernel module?`
- k3s pods crash: `metrics-server` and `local-path-provisioner` in `CrashLoopBackOff`.
Environment: Ubuntu 22.04, aarch64, Node.js v22.22.1, npm 10.9.4, Docker 27.5.1
NemoClaw #360 — Install issues on Jetson Thor (multiple)
Labels: Docker, Platform: AGX Thor/Orin
Author: theguybieber
Checklist of install issues encountered on Jetson Thor:
- Installer `curl` command includes a spurious `$`, so copy-paste fails.
- Docker must be enabled for the installing user.
- `openshell` not added to PATH after install.
- Port conflict check needs `sudo` (`lsof -i :8080`).
- Preflight GPU detection fails: "No GPU detected — will use cloud inference".
- Install hangs at `[3/7] Creating sandbox` with a custom sandbox name.
- `nemoclaw onboard` fails with custom sandbox names.
- `nemoclaw start` requires an NVIDIA API key even in local-inference mode.
NemoClaw #65 — Feature request: Jetson Nano support
Labels: Platform: AGX Thor/Orin
Author: prafsoni
Request for official Jetson Nano support, with particular interest in resource-aware local model loading for the 8GB unified memory constraint.
OpenShell #407 — k3s gateway fails on Jetson Orin Nano: iptables nf_tables / legacy conflict
Labels: state:triage-needed, topic:compatibility
Author: guinava — 7 comments
This is the upstream root cause of the gateway crash seen in NemoClaw #404 and #539. The conflict is unresolvable at the NemoClaw level — it requires a fix in the OpenShell gateway container image.
The trap: the iptables state is broken in both possible configurations on Jetson:
| Host nf_tables state | Result |
|---|---|
| Loaded (default) | kube-router panics: RULE_INSERT failed — nf_tables extensions in container conflict with legacy host tables |
| Blacklisted | kube-proxy exits: Could not fetch rule set generation id: Invalid argument — container's iptables v1.8.10 (nf_tables) still calls nf_tables which is now absent |
Root cause: Gateway container bundles iptables v1.8.10 (nf_tables). On L4T/Tegra, the host kernel nat table is in legacy mode. The container's nf_tables iptables cannot read or write legacy tables regardless of host module state.
Attempted workarounds (all failed):
- `update-alternatives --set iptables /usr/sbin/iptables-legacy`
- Blacklisting `nf_tables` + reboot
- Loading all `iptable_*` and `xt_*` legacy modules manually
- `nemoclaw setup-spark` (configures `cgroupns=host`)
Required fix (in OpenShell gateway container):
- Bundle `iptables-legacy` in the cluster image, OR
- Use `--prefer-bundled-bin` k3s arg with legacy mode, OR
- Add entrypoint logic: run `update-alternatives --set iptables /usr/sbin/iptables-legacy` before `exec k3s` when legacy tables are detected on host
Environment: Jetson Orin Nano, Ubuntu 22.04, kernel 5.15.148-tegra, Docker 28.x, OpenShell v0.0.7, gateway image ghcr.io/nvidia/openshell/cluster:0.0.8
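The third option in the required-fix list (entrypoint logic) could look roughly like the sketch below. It is not the OpenShell entrypoint. It assumes the image ships both `iptables-legacy-save` and `iptables-nft-save`, and it uses a common heuristic: prefer whichever backend already holds more rules. The function names `count_rules` and `choose_iptables_mode` are hypothetical.

```shell
# Count rule lines (those starting with "-", e.g. "-A FORWARD ...") in
# iptables-save style output read from stdin. Prints 0 when there are none.
count_rules() {
  grep -c '^-' || true
}

# Print "legacy" when the legacy backend holds more rules than the nft
# backend, otherwise print "nft".
# $1: rule count from the legacy backend
# $2: rule count from the nft backend
choose_iptables_mode() {
  legacy_rules="$1"
  nft_rules="$2"
  if [ "$legacy_rules" -gt "$nft_rules" ]; then
    echo legacy
  else
    echo nft
  fi
}

# Example usage inside a container entrypoint (assumed layout):
#   legacy=$(iptables-legacy-save 2>/dev/null | count_rules)
#   nft=$(iptables-nft-save 2>/dev/null | count_rules)
#   if [ "$(choose_iptables_mode "$legacy" "$nft")" = legacy ]; then
#     update-alternatives --set iptables /usr/sbin/iptables-legacy
#   fi
#   exec k3s "$@"
```

With host networking, the container sees the rules Docker has already written on the host, so a Jetson host in legacy mode would normally produce a non-zero legacy count and a zero nft count.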
OpenShell #467 — Gateway fails on Jetson Orin / L4T: kube-proxy exits with "Could not fetch rule set generation id"
Labels: (none yet)
Author: tangc3022-hub — 1 comment
Independent report of the same issue, with additional context about host network mode being required on Jetson due to missing veth / Docker NAT limitations.
Key finding from investigation:
- On Orin, Docker cannot use default bridge/NAT (veth missing), so the gateway must use `network_mode: host`
- With host networking, the container's `iptables-nft` invokes the same kernel as the host, which only has legacy nat tables → immediate failure
Mitigations implemented in reporter's fork:
- Pass `--prefer-bundled-bin` to k3s server args (uses k3s's own bundled iptables)
- In cluster image entrypoint, before `exec k3s`: `update-alternatives --set iptables /usr/sbin/iptables-legacy`
- Improved error reporting to not misclassify this as a "Network connectivity issue"
Environment: Jetson Orin (L4T / tegra-ubuntu), Docker Engine, OpenShell v0.x.x, host uses iptables-legacy
| Issue | Root Cause | Repo |
|---|---|---|
| GPU not detected | `detectGpu()` in `nim.js:52` only checks "GB10", missing "Orin" and "Thor" | NemoClaw |
| Gateway crash | Container bundles iptables v1.8.10 (nf_tables); Tegra kernel only supports legacy iptables | OpenShell |
| Default model OOM | `nemotron-3-nano:30b` too large for 8GB unified memory | NemoClaw |
| npm install failure | Possible aarch64 package or file-system issue during install | NemoClaw |
Current workarounds:
- GPU detection: None upstream; NemoClaw falls back to cloud inference. The fix in `nim.js:52` is simple and proposed in NemoClaw #300.
- iptables/k3s (gateway crash): See the official NV workaround guide below.
- Model size: Manually select a smaller model during onboarding (e.g., a model that fits in 8GB).
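For the model-size workaround, a rough arithmetic check can help rule out models before onboarding. The sketch below is an assumption-laden illustration: `model_fits` is a hypothetical helper, the 2048 MiB default reserve is an arbitrary placeholder for OS and runtime overhead, and on-disk model size understates what inference actually needs at run time.

```shell
# Hypothetical helper: succeed when a model of the given size plus a reserve
# fits in total memory. All values are in MiB.
# $1: model size
# $2: total (unified) memory
# $3: reserve for OS and runtime overhead (default 2048, an assumption)
model_fits() {
  model_mib="$1"
  total_mib="$2"
  reserve_mib="${3:-2048}"
  [ $((model_mib + reserve_mib)) -le "$total_mib" ]
}

# Example usage with illustrative numbers:
#   if model_fits 4096 7619; then echo "candidate"; else echo "too large"; fi
```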
Source: NVIDIA Developer Forums — NemoClaw + OpenShell: Jetson AGX Orin, Orin Super, Nano, and NX
Target: Jetson AGX Orin, Orin Super, Orin Nano, Orin NX
Tested versions: JetPack 6.2.2 (L4T R36.5.0), Ubuntu 22.04.5 LTS, OpenShell 0.0.13, Node.js v22.22.1
Three issues must be addressed before running the NemoClaw installer:
| # | Issue | Impact |
|---|---|---|
| 1 | iptables nf_tables incompatibility | k3s panics, gateway never starts |
| 2 | Missing `br_netfilter` kernel module | DNS resolution fails inside sandbox |
| 3 | Ollama bound to `127.0.0.1` only | Containers can't reach local Ollama |
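Rows 2 and 3 of the table can be checked from a shell before and after applying the fixes. The two functions below are a sketch with hypothetical names. The first assumes `br_netfilter` is built as a loadable module (a built-in would not appear in `/proc/modules`); the second assumes the column layout printed by `ss -tln`, where the fourth field is the local address.

```shell
# Succeed when a /proc/modules-style file lists br_netfilter.
# $1: modules file (defaults to /proc/modules)
br_netfilter_loaded() {
  grep -q '^br_netfilter ' "${1:-/proc/modules}"
}

# Read `ss -tln` output on stdin and succeed only when port 11434 has a
# listener that is not bound to a loopback address.
ollama_reachable_from_containers() {
  awk '$1 == "LISTEN" && $4 ~ /:11434$/ && $4 !~ /^(127\.|\[::1\])/ { ok = 1 }
       END { exit !ok }'
}

# Example usage on the host:
#   br_netfilter_loaded || echo "br_netfilter is not loaded"
#   ss -tln | ollama_reachable_from_containers \
#     || echo "Ollama is missing or bound to loopback only"
```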
Install Docker (if not already installed):
sudo apt-get remove docker docker-engine docker.io containerd runc
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
Install NVIDIA Container Toolkit:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
| sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
| sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
| sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
Set NVIDIA as the default Docker runtime by editing /etc/docker/daemon.json:
{
"runtimes": {
"nvidia": {
"path": "nvidia-container-runtime",
"runtimeArgs": []
}
},
"default-runtime": "nvidia"
}
Add user to docker group:
sudo usermod -aG docker $USER
newgrp docker
Load the br_netfilter kernel module and enable bridge netfilter for iptables:
sudo modprobe br_netfilter
sudo sysctl -w net.bridge.bridge-nf-call-iptables=1
Make persistent across reboots:
echo br_netfilter | sudo tee /etc/modules-load.d/k8s.conf
echo 'net.bridge.bridge-nf-call-iptables = 1' | sudo tee /etc/sysctl.d/k8s.conf
This is the core fix for the gateway crash. It patches the container image in-place so k3s uses legacy iptables instead of nf_tables.
IMAGE_NAME="ghcr.io/nvidia/openshell/cluster:0.0.13"
docker run --entrypoint sh --name fix-iptables "$IMAGE_NAME" -c '
update-alternatives --set iptables /usr/sbin/iptables-legacy
update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
ln -sf /usr/sbin/iptables-legacy /usr/sbin/iptables
ln -sf /usr/sbin/ip6tables-legacy /usr/sbin/ip6tables
iptables --version
'
docker commit \
--change 'ENTRYPOINT ["/usr/local/bin/cluster-entrypoint.sh"]' \
fix-iptables "$IMAGE_NAME"
docker rm fix-iptables
The `iptables --version` output should show `(legacy)` to confirm the patch worked.
Note: Update `IMAGE_NAME` to match the cluster image version pulled by your OpenShell version.
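One way to find the tag to use for `IMAGE_NAME` is to filter the local image list. This is a sketch, assuming the cluster image has already been pulled (for example by an earlier onboarding attempt) and that it lives under the `ghcr.io/nvidia/openshell/cluster` repository named above; `list_cluster_images` is a hypothetical helper.

```shell
# Read `docker images --format '{{.Repository}}:{{.Tag}}'` output on stdin and
# print only the OpenShell cluster image references. Prints nothing when no
# cluster image is present.
list_cluster_images() {
  grep '^ghcr\.io/nvidia/openshell/cluster:' || true
}

# Example usage on the host:
#   docker images --format '{{.Repository}}:{{.Tag}}' | list_cluster_images
```

If more than one tag is listed, the patch would need to be applied to the one your OpenShell version actually starts.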
Bind Ollama to all interfaces so containers can reach it:
sudo mkdir -p /etc/systemd/system/ollama.service.d
echo -e '[Service]\nEnvironment="OLLAMA_HOST=0.0.0.0:11434"' \
| sudo tee /etc/systemd/system/ollama.service.d/override.conf
sudo systemctl daemon-reload
sudo systemctl restart ollama
Verify Ollama is listening on all interfaces:
ss -tlnp | grep 11434
Run the NemoClaw installer:
curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
During onboarding, select the local Ollama option.
Check sandbox pod health:
docker exec openshell-cluster-nemoclaw kubectl get pods -n openshell
Verify iptables version inside the container (should show legacy):
docker exec openshell-cluster-nemoclaw iptables --version
Confirm br_netfilter is loaded:
lsmod | grep br_netfilter
cat /proc/sys/net/bridge/bridge-nf-call-iptables
Test DNS resolution inside the cluster:
docker exec openshell-cluster-nemoclaw kubectl run dns-test \
--namespace=openshell \
--image=rancher/mirrored-library-busybox:1.37.0 \
--restart=Never \
-- nslookup openshell.openshell.svc.cluster.local
sleep 10
docker exec openshell-cluster-nemoclaw kubectl logs -n openshell dns-test
docker exec openshell-cluster-nemoclaw kubectl delete pod -n openshell dns-test
Clean up / fresh start:
source ~/.bashrc
nemoclaw destroy 2>/dev/null
docker rm -f openshell-cluster-nemoclaw 2>/dev/null
docker volume prune -f