Overview
This article is a central troubleshooting reference for both runner types:
Machine Runner 3.x — an agent installed directly on a VM or physical machine (Linux, macOS, Windows)
Container Runner — a Helm-deployed agent that schedules jobs as pods in a Kubernetes cluster
If you are still using Launch Agent 1.x, stop here and migrate first — see Issue 4: Launch Agent 1.x jobs are failing (EOL) below.
Quick Pre-Checks
Before diving into specific issues, confirm the following:
Check | How |
Runner is registered and visible | Org Settings → Self-Hosted Runners → confirm resource class appears and shows a runner |
Runner version | |
Resource class name in config matches exactly | Names are case-sensitive: |
Runner has outbound internet access to | Port 443 required |
Runner token is valid and not rotated | If token was recently rotated, restart the runner process with the new token |
Issue 1: "We cannot run this job using the selected resource class"
Symptom: The job fails immediately with:
We cannot run this job using the selected resource class.
Cause A — Resource class does not exist
Verify the resource class was created:
circleci runner resource-class list <your-namespace>
If missing, create it:
circleci runner resource-class create <your-namespace>/<resource-class-name> "description"
Cause B — Runner is not enabled for your plan
Self-hosted runners require a Scale, Custom, or Server plan. Performance and Free plans do not have access. Check at Org Settings → Plan.
Cause C — Typo in config.yml
The resource class in your config must exactly match what was created. Check for capitalization differences, leading/trailing spaces, or namespace mismatches:
# Must match the registered resource class exactly
resource_class: my-org/my-runner-name
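The mismatch types listed under Cause C can be told apart mechanically. The following is a minimal sketch, not part of the CircleCI CLI: check_resource_class is a hypothetical helper that takes the value from config.yml and the registered name (for example, copied from the resource-class list output) and reports which kind of mismatch it sees.

```shell
# Hypothetical helper (assumption, not a CircleCI command): compare the
# resource_class value from config.yml with the registered name.
check_resource_class() {
  configured=$1
  registered=$2
  if [ "$configured" = "$registered" ]; then
    echo "match"
    return 0
  fi
  # Strip leading/trailing whitespace and compare again.
  trimmed=$(printf '%s' "$configured" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
  if [ "$trimmed" = "$registered" ]; then
    echo "whitespace mismatch"
    return 1
  fi
  # Compare case-insensitively.
  lc_conf=$(printf '%s' "$trimmed" | tr '[:upper:]' '[:lower:]')
  lc_reg=$(printf '%s' "$registered" | tr '[:upper:]' '[:lower:]')
  if [ "$lc_conf" = "$lc_reg" ]; then
    echo "case mismatch"
    return 1
  fi
  # Compare the namespace (text before the first slash).
  if [ "${trimmed%%/*}" != "${registered%%/*}" ]; then
    echo "namespace mismatch"
    return 1
  fi
  echo "name mismatch"
  return 1
}

# Example:
# check_resource_class "My-Org/my-runner-name" "my-org/my-runner-name"
```

The function reports only the first difference it finds, so a value with both stray spaces and wrong capitalization is reported as a case mismatch after trimming.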
Issue 2: Jobs Queued or Stuck in "Not Running" / "Preparing Environment"
Check 1 — Confirm at least one runner is online
Go to Org Settings → Self-Hosted Runners. If the resource class shows "No runners" or all runners appear offline, the runner process has stopped or lost connectivity.
Check 2 — Review maxConcurrentTasks
Each resource class has a maxConcurrentTasks limit (default: 20). If this limit is reached, additional jobs queue even if runner machines appear idle. Contact CircleCI Support to request an increase.
Check 3 — Inspect runner logs
See Runner Log File Locations below. Look for:
failed to claim task: runner cannot reach the CircleCI backend
context deadline exceeded: network timeout to runner.circleci.com
token is invalid: runner token was rotated; restart the runner with the new token
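When the log is long, searching for the three messages above can be scripted. This is a rough sketch: scan_runner_log is a hypothetical function name, the hint text simply repeats the explanations given here, and the log path in the example is a placeholder.

```shell
# Sketch only: reads runner log text on stdin and prints one hint per known
# error string. The hint wording is illustrative, not official runner output.
scan_runner_log() {
  log=$(cat)
  found=1   # exit status stays 1 until a known pattern is seen
  case $log in *"failed to claim task"*)
    echo "failed to claim task: runner cannot reach the CircleCI backend"; found=0 ;;
  esac
  case $log in *"context deadline exceeded"*)
    echo "context deadline exceeded: network timeout to runner.circleci.com"; found=0 ;;
  esac
  case $log in *"token is invalid"*)
    echo "token is invalid: restart the runner with the new token"; found=0 ;;
  esac
  return "$found"
}

# Example (the log path is a placeholder):
# scan_runner_log < /path/to/runner.log
```

The function exits non-zero when none of the patterns appear, which makes it usable in a monitoring script.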
Check 4 — For container runner, check pod status
kubectl get pods -n <namespace>
kubectl logs deployment/container-agent -n <namespace>
If the container-agent pod is not in Running state, see Issues 5 and 6 below.
Issue 3: Runner Appears Online but Jobs Are Not Being Claimed
Cause A — Runner is at maxConcurrentTasks capacity
If a previous batch of jobs did not release cleanly (e.g., machine rebooted mid-job), tasks may still be counted as active in the backend. Contact Support to clear stuck task claims.
Cause B — Runner cannot reach the task assignment endpoint
The runner must be able to reach:
runner.circleci.com:443
*.circle-artifacts.com (for artifact and cache operations)
Test from the runner machine:
curl -I https://runner.circleci.com/api/v3/runner/unclaim
Cause C — Clock skew on the runner machine
TLS certificate validation requires the system clock to be within a few minutes of actual time. If the clock is skewed, authentication will fail silently. Verify NTP is configured and the clock is accurate (timedatectl on Linux).
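One way to estimate skew without relying on NTP tooling is to compare the local clock with the Date header of an HTTPS response. The sketch below makes several assumptions: clock_skew_ok is a hypothetical helper, the 120-second default tolerance is an illustrative stand-in for "a few minutes", and the commented usage relies on GNU date (-d), which is not available on macOS.

```shell
# Succeeds when two epoch timestamps differ by at most max_skew seconds.
# The 120-second default is an assumption, not a documented CircleCI limit.
clock_skew_ok() {
  local_epoch=$1
  remote_epoch=$2
  max_skew=${3:-120}
  diff=$((local_epoch - remote_epoch))
  if [ "$diff" -lt 0 ]; then
    diff=$((0 - diff))
  fi
  [ "$diff" -le "$max_skew" ]
}

# Example (GNU date assumed; the server is assumed to send a Date header):
# remote=$(curl -sI https://runner.circleci.com | tr -d '\r' | sed -n 's/^[Dd]ate: //p')
# clock_skew_ok "$(date +%s)" "$(date -d "$remote" +%s)" || echo "clock skew too large"
```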
Issue 4: Launch Agent 1.x Jobs Are Failing (EOL)
Support for Launch Agent 1.x ended on September 17, 2024. Any runner still running a 1.x version will fail.
Symptoms:
Jobs fail immediately with no useful error in the job output
Runner logs show authentication or connection errors with no clear cause
Action required: Migrate to Machine Runner 3.x
The migration is straightforward — the configuration file is 1:1 compatible. No config changes are required.
# macOS (Homebrew)
brew install circleci-runner
# Linux (Debian/Ubuntu)
apt install circleci-runner
# Linux (RHEL/CentOS)
yum install circleci-runner
After installing, your existing config file (launch-agent-config.yaml) works without modification:
circleci-runner start --config launch-agent-config.yaml
Full migration docs: https://circleci.com/docs/guides/execution-runner/migrate-from-launch-agent-to-machine-runner-3-on-linux/
Issue 5: Container Runner — Jobs Stuck in "Task Lifecycle" Stage (K8s Throttling)
Symptom: Jobs hang in the "Task lifecycle" stage. Container-agent logs show:
waited for 3s due to client-side throttling, not priority and fairness, request: ...
Cause: The single container-agent pod is saturating the Kubernetes API rate limits under high task concurrency.
Fix: Increase the replica count in values.yaml:
agent:
  replicaCount: 2
Apply the change:
helm upgrade container-agent container-agent/container-agent -n <namespace> -f values.yaml
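To judge whether the extra replica helped, it can be useful to count throttling messages before and after the upgrade. A minimal sketch, assuming the log line contains the literal text "client-side throttling" as in the symptom above; count_throttling is a hypothetical helper name.

```shell
# Reads container-agent log text on stdin and prints how many lines mention
# client-side throttling. grep -c exits non-zero on zero matches, so the
# "|| true" keeps the function's own exit status at 0.
count_throttling() {
  grep -c 'client-side throttling' || true
}

# Example:
# kubectl logs deployment/container-agent -n <namespace> --tail=200 | count_throttling
```

With more than one replica, kubectl logs against a deployment reads from a single pod, so it may be necessary to repeat the check per pod.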
Issue 6: Container Runner — Pods Remain in "Pending" State
Cause | How to check |
Node out of memory (OOM) | |
Node disk pressure | |
No nodes match pod affinity/tolerations | |
Image pull failure | |
For image pull issues with a private registry, see How to use imagePullSecrets on Container Runner.
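The four causes above usually surface in the Events section of kubectl describe pod. The sketch below maps common event text to those causes. The substrings are assumptions based on typical scheduler and kubelet messages, and the exact wording varies across Kubernetes versions; classify_pending is a hypothetical helper.

```shell
# Reads `kubectl describe pod` output (or just its events) on stdin and
# prints the first matching cause. Substrings are assumed, not exhaustive.
classify_pending() {
  events=$(cat)
  case $events in
    *"Insufficient memory"*|*"memory-pressure"*) echo "memory" ;;
    *"disk-pressure"*) echo "disk pressure" ;;
    *"ErrImagePull"*|*"ImagePullBackOff"*) echo "image pull" ;;
    *"didn't match"*|*"untolerated taint"*|*"didn't tolerate"*) echo "affinity/tolerations" ;;
    *) echo "unknown" ;;
  esac
}

# Example:
# kubectl describe pod <task-pod-name> -n <namespace> | classify_pending
```

Only the first match is reported, so a pod with several scheduling problems still needs a manual read of the full event list.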
Issue 7: OIDC Tokens Not Available in Runner Jobs
Symptom: $CIRCLE_OIDC_TOKEN is empty or the job fails when trying to use it.
Cause: OIDC token generation writes a file to /tmp. If /tmp is mounted with the noexec flag (common in hardened environments), this fails silently.
Diagnose:
mount | grep /tmp # Look for "noexec" in the output
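A plain grep for /tmp also matches paths such as /var/tmp. A stricter check, sketched here for the Linux mount output format ("<device> on <mountpoint> type <fs> (<options>)"), looks only at the /tmp mount point itself; tmp_is_noexec is a hypothetical helper name.

```shell
# Reads Linux `mount` output on stdin; succeeds only if the /tmp mount point
# itself carries the noexec option. If /tmp is not a separate mount, the
# function reports failure, and the parent filesystem should be checked.
tmp_is_noexec() {
  awk '$2 == "on" && $3 == "/tmp" && /[(,]noexec[,)]/ { found = 1 }
       END { exit !found }'
}

# Example:
# mount | tmp_is_noexec && echo "/tmp is mounted noexec"
```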
Fix options:
Remove the noexec flag from /tmp if your security policy permits.
Configure the runner to use an alternative working directory that allows execution.
Use a native credential mechanism (AWS IAM instance profiles, GCP Workload Identity) instead of OIDC on that runner.
Issue 8: "fork/exec /bin/bash: bad file descriptor" (Container Runner)
Symptom:
failed to start cmd: fork/exec /bin/bash: bad file descriptor
Cause: The job's Docker image does not have /bin/bash, or the image entrypoint conflicts with the runner's task agent.
Fix:
Ensure the image includes bash (RUN apt-get install -y bash), or use an image that includes it.
Explicitly set the shell in your job config:
jobs:
  my-job:
    shell: /bin/sh -eo pipefail
Issue 9: SSH Debugging Not Working on Self-Hosted Runners
Container Runner does not support SSH debugging. This is a current product limitation — "Rerun job with SSH" is not available for container runner jobs.
Machine Runner does support SSH reruns. If it's not working, verify:
Project Settings → Advanced → Enable SSH reruns is turned on
The runner machine is network-accessible from your IP on the SSH port
Runner Log File Locations
Machine Runner 3.x
OS | Log location |
Linux (systemd) | |
Linux (file) | |
macOS | |
Windows | |
To increase log verbosity, set log_level: debug in the runner config file and restart the service.
Container Runner
# Container agent logs
kubectl logs deployment/container-agent -n <namespace> --tail=200
# Logs for a specific task pod
kubectl logs <task-pod-name> -n <namespace>
# Events (most useful for Pending pods)
kubectl describe pod <task-pod-name> -n <namespace>
When Escalating to Support
Include the following in your ticket to avoid back-and-forth:
Runner type: Machine Runner or Container Runner
Runner version: circleci-runner --version or Helm chart version (helm list -n <namespace>)
Resource class name exactly as it appears in config.yml
OS and version (machine runner) or Kubernetes version and cloud provider (container runner)
Runner logs from the time window of the failure
The specific failing job URL from app.circleci.com
Output of circleci runner resource-class list <namespace>
Whether the issue is intermittent or consistent