## Primary tuning levers
- pool behavior (`poolStrategy`, `poolSize`)
- global concurrency cap (`maxConcurrent`)
- execution resource limits (`cpuLimit`, `memoryLimit`, `timeoutMs`)
- dependency approach (runtime install vs pre-baked images)
- streaming choice (`executeStream` / CLI default streaming)
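Most of these levers live in `isol8.config.json` as server and CLI defaults. A sketch of how they might be combined (the key names come from this page; the exact nesting, value units, and numbers shown are assumptions, not a confirmed schema):

```json
{
  "poolStrategy": "fast",
  "poolSize": { "clean": 4, "dirty": 8 },
  "maxConcurrent": 10,
  "defaults": {
    "timeoutMs": 30000,
    "memoryLimit": "512m",
    "cpuLimit": 1
  }
}
```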
## Where to set performance values
| Value | CLI | Config | API (`/execute`) | Library |
|---|---|---|---|---|
| `poolStrategy` | not exposed on `run` | `poolStrategy` (serve defaults) | not request-configurable | `new DockerIsol8({ poolStrategy })` |
| `poolSize` | not exposed on `run` | `poolSize` (serve defaults) | not request-configurable | `new DockerIsol8({ poolSize })` |
| `maxConcurrent` | no `run` flag | `maxConcurrent` | server-level (from config) | `new DockerIsol8(opts, maxConcurrent)` |
| `timeoutMs` | `--timeout` | `defaults.timeoutMs` | `request.timeoutMs` or `options.timeoutMs` | request-level or engine default |
| `memoryLimit` | `--memory` | `defaults.memoryLimit` | `options.memoryLimit` | `memoryLimit` |
| `cpuLimit` | `--cpu` | `defaults.cpuLimit` | `options.cpuLimit` | `cpuLimit` |
On `isol8 run`, concurrency is indirectly controlled by `config.maxConcurrent` (used when creating the local Docker engine). There is no dedicated `run` flag for it.

## Pool strategy trade-off
| Strategy | Runtime behavior | Best fit |
|---|---|---|
| `fast` (default) | instant acquire from the clean pool; background cleanup through the dirty pool | low-latency interactive and API workloads |
| `secure` | cleanup happens in the acquire path | stricter cleanup semantics over raw latency |
## Fast mode internals
In `fast` mode:

- `clean` pool: ready-to-run containers
- `dirty` pool: returned containers awaiting cleanup
- the acquire path prefers clean containers for lower start latency
- each clean acquire kicks off async replenishment so warm capacity refills in the background
- simple no-artifact requests can execute inline (no code-file write/exec), reducing warm-path overhead
For a typical one-shot `isol8 run`, the process exits after execution and calls `engine.stop()`, which drains and removes pooled containers. Background cleanup is mainly beneficial in long-lived processes (for example `isol8 serve` or an app that keeps a `DockerIsol8` instance alive).

## Setting pool and limit options
Config-level `poolStrategy` and `poolSize` are server defaults (`isol8 serve`). API request options cannot override them; library engine options can. Pool tuning examples are intentionally shown in Library and Config contexts only.
## Concurrency and throughput
`maxConcurrent` acts as a semaphore cap:

- local engine: passed to `new DockerIsol8(options, maxConcurrent)`
- server mode: loaded from `isol8.config.json` and applied globally
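The capping behavior can be sketched as a minimal semaphore-style counter (an illustration of the concept, not the library's internals):

```typescript
// Minimal sketch of a maxConcurrent-style cap: admit work while under the
// limit, refuse (so the caller queues) once in-flight executions hit it.
class ConcurrencyCap {
  private active = 0;
  constructor(private readonly max: number) {}

  tryAcquire(): boolean {
    if (this.active >= this.max) return false; // at the cap: caller must wait
    this.active++;
    return true;
  }

  release(): void {
    this.active = Math.max(0, this.active - 1);
  }

  get inFlight(): number {
    return this.active;
  }
}
```

With `max = 2`, two acquires succeed, a third is refused until one of the first two releases.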
## How `poolSize` and `maxConcurrent` work together
They control different bottlenecks:

- `maxConcurrent`: how many executions can run at once.
- `poolSize`: how many warm containers are available per runtime image.
- If `maxConcurrent` is `10` but the Python `poolSize.clean` is `1`, a burst of Python requests can still hit cold container creation once warm capacity is exhausted.
- Increasing `poolSize.clean` for your hottest runtime reduces cold misses and improves p95/p99 latency.
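The interaction reduces to simple arithmetic: warm capacity absorbs the first `poolSize.clean` requests of a simultaneous burst, and the remainder pay a cold start. A back-of-envelope sketch (ignores replenishment happening during the burst):

```typescript
// How many requests in a simultaneous burst miss the warm pool.
function coldStartsInBurst(burst: number, cleanPoolSize: number): number {
  return Math.max(0, burst - cleanPoolSize);
}
```

With `maxConcurrent` admitting a burst of 10 and a Python clean pool of 1, nine requests would hit cold container creation.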
Rules of thumb:

- Set `maxConcurrent` to host capacity (CPU/memory).
- Set `poolSize.clean` to the expected per-runtime burst.
- Set `poolSize.dirty` to absorb release spikes while background cleanup catches up.
## Dependency installation cost
Per-request installs (`installPackages` / `--install`) are usually the biggest latency contributor.
For stable workloads:

- Move dependencies to `dependencies` in config.
- Build custom images via `isol8 setup`.
- Run requests without per-request installs when possible.
Pre-baked dependencies shift cost from request time to build time and produce much more stable tail latency.
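For example, dependencies could be declared in config so images can be pre-baked with `isol8 setup` (the `dependencies` key comes from this page; the per-runtime shape and package names shown are assumptions, not a confirmed schema):

```json
{
  "dependencies": {
    "python": ["numpy", "pandas"]
  }
}
```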
## Streaming and perceived latency
- CLI streams output by default (`--no-stream` disables streaming).
- Library can use `executeStream()` for chunked output.
- Streaming improves user-perceived responsiveness even if total wall-clock duration is unchanged.
## FAQ
### Should I always use `fast` pool strategy?

Usually yes for production latency. Use `secure` only if your security/compliance posture requires cleanup in the acquire path.
### Why does first request feel slow?

Cold image/container startup and dependency installation dominate first-run latency. Warm pools and pre-baked images reduce this.
### Does increasing memory always improve speed?

Not always. Raising the memory limit helps only when the workload is memory-constrained or under memory pressure; CPU time and package-install overhead can still dominate.
## Troubleshooting
- High p95/p99 despite low p50: increase `dirty` pool capacity and verify background cleanup can keep up.
- Every request is slow: check for per-request `--install` usage; move dependencies into custom images.
- Queued requests on server: inspect `maxConcurrent` and host saturation before raising limits.
- Fast mode not helping: verify the workload is ephemeral and not bottlenecked by runtime package downloads.
## Related pages
- **Execution guide**: request lifecycle, mode behavior, and streaming internals.
- **Packages and images**: build pre-baked images and avoid per-request install overhead.
- **Configuration reference**: defaults and server-level knobs, including `maxConcurrent`.
- **Benchmarks**: all local benchmark suites, including ComputeSDK-style TTI.
- **Troubleshooting**: symptom-driven fixes for latency and throughput issues.