This document describes how Canvas behaves when the host or container is under memory pressure, what mechanisms exist to bound memory usage, and the configuration recommended for a production deployment.
Canvas does not gracefully recover from out-of-memory conditions. It does not rescue `NoMemoryError` anywhere in the Ruby code. Instead, the application takes a defensive containment approach:

- Risky operations are wrapped in a per-process rlimit so a runaway block fails fast instead of taking the host down.
- Background-job workers can be configured to self-recycle on RSS growth or job count, returning memory to the OS between jobs.
- The web tier relies on the container orchestrator (Kubernetes / Docker / systemd) to OOM-kill and restart processes as needed.
If memory is exhausted, the in-flight request or job fails. The process or worker is then either recycled by Canvas itself (jobs) or by the orchestrator (web).
File: lib/memory_limit.rb

```ruby
MemoryLimit.apply(2.gigabytes) do
  # work that should not be allowed unbounded memory
end
```

Implementation: calls `Process.setrlimit(:DATA, allowed, max)` around the block and restores the prior limit on exit. The cap applies to the entire process, not just the block or thread, for the duration of the block. If the OS rejects the limit (`Errno::EINVAL`), the block runs without the cap and a warning is logged.
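A minimal sketch of the mechanism just described (illustrative, not the exact Canvas source):

```ruby
# Sketch of an rlimit guard consistent with the behavior described above.
# Module and variable names are illustrative, not the Canvas source.
module MemoryLimitSketch
  def self.apply(bytes)
    soft, hard = Process.getrlimit(:DATA)   # remember the prior limit
    begin
      Process.setrlimit(:DATA, bytes, hard) # cap the whole process's data segment
    rescue Errno::EINVAL
      warn "setrlimit(:DATA) rejected; running without the cap"
      return yield                          # OS refused the limit: run unguarded
    end
    begin
      yield                                 # a runaway allocation here raises NoMemoryError
    ensure
      Process.setrlimit(:DATA, soft, hard)  # restore the prior limit on exit
    end
  end
end
```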
When the cap is hit, Ruby raises `NoMemoryError`. Canvas does not rescue it, so the request or job fails. The point of the guard is to prevent a single runaway operation from consuming all process memory, not to recover.
Current callers:
| File | Why it is guarded |
|---|---|
| `app/services/file_text_extraction_service.rb:38` | PDF / Office document text extraction |
| `app/services/rubric_llm_service.rb` | LLM rubric generation |
| `app/controllers/services_api_controller.rb` | Services API endpoints |
| `app/controllers/application_controller.rb` | Selected controller actions |
| `lib/cuty_capt.rb` | Headless screenshot / capture |
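Guarding a new hot spot follows the same pattern as these callers; a hypothetical example (the action and helper names are placeholders):

```ruby
# Hypothetical controller action guarded like the callers above.
# `generate_large_export` is a placeholder, not a Canvas method.
def export
  MemoryLimit.apply(4.gigabytes) do
    generate_large_export # bounded: a runaway allocation fails this request, not the host
  end
end
```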
Files: config/delayed_jobs.yml.example, config/initializers/delayed_job.rb
Canvas uses inst-jobs (a Delayed Job fork). Two settings control recycling:
| Setting | Purpose |
|---|---|
| `worker_max_memory_usage` | Worker exits cleanly between jobs if RSS exceeds the byte threshold. Pool respawns a new worker. |
| `worker_max_job_count` | Worker exits after processing N jobs, regardless of memory. |
Both are commented out in the example file. Workers are never killed mid-job by these mechanisms — they only check at job boundaries.
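Conceptually, the boundary check looks like this; a simplified sketch, not inst-jobs' actual worker loop (the callables stand in for real collaborators):

```ruby
# Simplified sketch of job-boundary recycling; not inst-jobs' actual code.
# max_job_count / max_memory_bytes mirror worker_max_job_count /
# worker_max_memory_usage; fetch_job and rss_bytes are injected stand-ins.
def run_worker(max_job_count:, max_memory_bytes:, fetch_job:, rss_bytes:)
  jobs_done = 0
  loop do
    job = fetch_job.call # pop the next job from the queue
    job.call             # a mid-job OOM never reaches the checks below
    jobs_done += 1

    # Both controls are evaluated only here, between jobs:
    break if max_job_count && jobs_done >= max_job_count
    break if max_memory_bytes && rss_bytes.call > max_memory_bytes
  end
  # Clean exit between jobs; the pool respawns a fresh worker.
end
```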
The `:perform` lifecycle callback samples memory before and after each job and emits a `[STAT]` log line:

```
[STAT] <start_kb> <end_kb> <delta_kb> <user_cpu> <system_cpu>
```

This gives you per-job memory deltas in the logs but does not, by itself, trigger any action.
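The fixed format makes the line easy to parse for alerting; a sketch assuming the five fields shown above (the threshold is illustrative):

```ruby
# Sketch: flag jobs whose RSS delta suggests a leak. Assumes the
# five-field [STAT] format above; the 200 MB threshold is illustrative.
STAT_LINE = /\[STAT\]\s+(\d+)\s+(\d+)\s+(-?\d+)/

def leaky_job?(log_line, delta_threshold_kb: 200_000)
  m = STAT_LINE.match(log_line) or return false
  m[3].to_i > delta_threshold_kb
end

leaky_job?("[STAT] 512000 780000 268000 1.2 0.3") # => true
```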
`Delayed::Settings.max_attempts = 1`, so a job that crashes its worker mid-execution (for example, an OS OOM-kill) is effectively dropped rather than retried. Keep this in mind when sizing memory.
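If a specific critical job must survive a worker crash, the attempts ceiling can be raised per job at enqueue time. Treat the option as an assumption to verify against your vendored inst-jobs; the class and method are illustrative:

```ruby
# Assumption: this inst-jobs version accepts max_attempts: as a delay option.
# GradeRecalculator / recalculate are illustrative placeholders.
GradeRecalculator.delay(max_attempts: 3).recalculate(course_id)
```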
File: lib/base/canvas.rb:200
On Linux, reads `/proc/<pid>/statm` and converts pages to KB. Falls back to `ps -o rss=` on other Unixes. Used by the job lifecycle logger and by tests. Does not include swapped-out memory.
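A sketch of the sampling approach described above (field positions per proc(5); not the exact Canvas code):

```ruby
require "etc"

# Sketch of the RSS sampler described above; not the exact Canvas code.
# /proc/<pid>/statm field 2 is resident pages (see proc(5)).
def sample_rss_kb(pid = Process.pid)
  pages = File.read("/proc/#{pid}/statm").split[1].to_i
  pages * Etc.sysconf(Etc::SC_PAGESIZE) / 1024
rescue Errno::ENOENT
  # No procfs (macOS / BSD): ps reports RSS in KB directly.
  `ps -o rss= -p #{pid}`.to_i
end
```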
File: config/puma.rb

```ruby
threads 0, 1
if ENV["RAILS_ENV"] == "production"
  preload_app! false
  worker_boot_timeout 240
end
```

There is no `puma_worker_killer` or `unicorn-worker-killer` gem in the Gemfile, and no application-level memory monitor. Memory pressure on the web tier is delegated entirely to the container orchestrator's OOM handling.
File: lib/health_checks.rb
Component readiness and liveness checks are timeout-based, not memory-based. There is no liveness probe that fails when RSS or available memory crosses a threshold. If you want Kubernetes / your orchestrator to recycle a hot pod before it OOMs, you must configure that externally.
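A minimal sketch of that external configuration for Kubernetes; the image, port, probe paths, and sizes are assumptions to adapt to your deployment:

```yaml
# Sketch of a pod spec fragment; every value here is illustrative.
containers:
  - name: canvas-web
    image: canvas-lms:example   # placeholder image
    resources:
      requests:
        memory: "2Gi"
      limits:
        memory: "4Gi"           # the orchestrator OOM-kills above this
    livenessProbe:
      httpGet:
        path: /health_check     # assumed path; verify your health endpoint
        port: 80
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /readiness        # assumed path; verify your health endpoint
        port: 80
      periodSeconds: 10
```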
No application-level handling for `QuotaExceededError`, `navigator.deviceMemory`, or `performance.memory` was found in `ui/` or `packages/`. Browser behavior under memory pressure (tab discard, page crash) is whatever the browser does by default.
| Tier | Behavior |
|---|---|
| Web request inside `MemoryLimit.apply` | `NoMemoryError` raised; not rescued; request returns 500. |
| Web request outside the guard | Whole process may be OOM-killed by the OS / orchestrator; pod restarts. In-flight requests on that worker fail. |
| Job worker with `worker_max_memory_usage` set | Worker exits cleanly between jobs; pool respawns. No job loss. |
| Job worker without recycling configured | OS OOM-kills mid-job; the job is not retried (`max_attempts = 1`). |
| Browser tab | No app-level handling; tab may be discarded or crash. |
Enable both recycling controls. Without them, a single leaky job can drag a worker into mid-job OOM, which silently drops work.
```yaml
production:
  workers:
    - queue: canvas_queue
      workers: 2
      max_priority: 10
    - queue: canvas_queue
      workers: 4
  # Recycle a worker after processing this many jobs. Returns
  # fragmented heap memory to the OS. Tune to your job mix; 20-100
  # is a reasonable starting range.
  worker_max_job_count: 50
  # Recycle a worker if its RSS exceeds this many bytes between
  # jobs. 1 GiB shown; size to (pod_memory_limit / workers_per_pod)
  # with headroom for the parent pool process.
  worker_max_memory_usage: 1073741824
```

Sizing guideline: set `worker_max_memory_usage` to about 70–80% of the per-worker share of your container's memory limit. The threshold needs enough headroom that a worker can still allocate to finish its current job after crossing the line; recycling is between-jobs, not immediate.
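A worked example of that guideline (all numbers illustrative):

```ruby
# Worked example of the 70-80% sizing guideline; numbers are illustrative.
pod_limit_bytes = 8 * 1024**3   # 8 GiB container memory limit
workers_per_pod = 4
parent_overhead = 512 * 1024**2 # headroom for the pool parent process

per_worker_share = (pod_limit_bytes - parent_overhead) / workers_per_pod
threshold        = (per_worker_share * 0.75).to_i

puts threshold # => 1509949440 (~1.4 GiB) for worker_max_memory_usage
```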
Canvas does not ship a per-worker memory ceiling for Puma. Options, in order of preference:

- Set a Kubernetes / container memory limit sized to your peak working set with headroom. Combine with `livenessProbe` and `readinessProbe` pointing at the existing health endpoints (see the manifest sketch above). The orchestrator will OOM-kill and restart, and load balancing will route around the restart.
- Add `puma_worker_killer` if you want application-level recycling on memory growth without relying on the orchestrator. This is not currently in the Gemfile and would be a new dependency; evaluate carefully. See the sketch after this list.
- Wrap additional risky controller paths in `MemoryLimit.apply`. This is the cheapest mitigation for a known offender (large file processing, big exports, expensive report generation) and matches the pattern Canvas already uses.
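If you do adopt `puma_worker_killer`, the configuration is small; a sketch using the gem's documented interface (sizes are illustrative):

```ruby
# config/puma.rb addition; requires adding puma_worker_killer to the Gemfile.
# Sizes are illustrative. `ram` is the total budget for all workers, in MB.
before_fork do
  require "puma_worker_killer"

  PumaWorkerKiller.config do |config|
    config.ram           = 4096 # MB across all Puma workers
    config.frequency     = 20   # seconds between checks
    config.percent_usage = 0.90 # cull the largest worker above 90% of ram
  end
  PumaWorkerKiller.start
end
```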
- Alert on the `[STAT]` log lines or scrape RSS from `/proc` to spot leaky jobs early.
- Treat any `NoMemoryError` in error reporting (Sentry / equivalent) as a capacity signal, not a code bug; it usually means a guard fired.
- If you raise `MemoryLimit.apply` caps to make a feature work, prefer fixing the underlying allocation pattern. The cap is a containment device, not a budget.
- No `rescue NoMemoryError` anywhere: there is no graceful degradation path, only fail-fast and recycle.
- No memory-aware health check. A pod can be near OOM and still report healthy.
- The example `delayed_jobs.yml` ships with both recycling controls commented out, so a deployment that copies the example without editing it has no protection on the job tier.
- `max_attempts = 1` plus mid-job OOM means lost work. Recycling is the primary defense.
- No frontend handling for browser memory pressure.