Troubleshooting

CI still failing

A pipeline cycles between CI and CI fixing without converging. Decide whether to keep waiting, cancel, or fix the underlying CI issue.

What this means

The pipeline reached the CI state, your CI provider returned red checks, and the agent transitioned to CI fixing to read the failing logs and push a fix. CI then ran again and failed again.

A few CI loops are healthy: real CI failures often take an iteration or two to fix, and a tricky one can take three or four. A loop that runs five or more times without progress usually means the agent can't fix the underlying problem.

When to use this page

  • The state badge cycles CI → CI fixing → CI → CI fixing.
  • The pipeline has been running for longer than your normal CI cycle without finishing.
  • The pipeline ended in Failed with a CI-related reason on the Diagnostics tab.

Before you start

Open the pipeline detail page. The two tabs that matter:

  • CI runs - every CI run the pipeline triggered, including the failing job and its log.
  • Timeline - the full history of state transitions and agent actions.

Steps

1. Read the failing job log

Click the CI runs tab. The most recent failed run shows the failing jobs and their log output. Read it as you would any other CI failure.

What to look for:

Pattern in the log | What it usually means
--- | ---
Permission denied, 403, 401 | A secret is missing or wrong on your CI provider.
command not found | The runner image doesn't have a tool the project needs.
Test name fails on every run | A real test failure, not a flake. The agent should be able to fix it.
Test name passes sometimes, fails sometimes | A flake. The agent will keep pushing fixes that don't help.
Out of memory, timeout, or runner offline | A CI infrastructure issue. The agent can't do anything about it.
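
As a rough illustration of the first pass over a log, the text patterns above can be checked with a few regular expressions. This is a minimal sketch, not a product feature: `LOG_PATTERNS` and `classify_log` are hypothetical names, and the patterns are assumptions you would tune for your own CI output (a bare 401 or 403 can also appear in unrelated lines). The two test-related rows are left out because they need results from several runs, not a single log.

```python
import re

# Hypothetical pattern table mirroring the log patterns listed above.
# Order matters: the first matching pattern wins.
LOG_PATTERNS = [
    (re.compile(r"permission denied|\b40[13]\b", re.IGNORECASE),
     "missing or wrong CI secret"),
    (re.compile(r"command not found", re.IGNORECASE),
     "runner image is missing a tool"),
    (re.compile(r"out of memory|timeout|timed out|runner offline", re.IGNORECASE),
     "CI infrastructure issue"),
]


def classify_log(log_text: str) -> str:
    """Return the first matching likely cause, or 'unclassified' if none match."""
    for pattern, cause in LOG_PATTERNS:
        if pattern.search(log_text):
            return cause
    return "unclassified"
```

Anything that comes back as "unclassified" still needs a human read of the log; the sketch only speeds up spotting the obvious cases.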

2. Decide what kind of failure it is

Failure kind | What the agent can do
--- | ---
Real test failure | Fix it. Give it a few iterations.
Linting / formatting | Fix it. Usually one pass is enough.
Type errors | Fix it.
Flaky test | Cannot fix. The agent will loop forever or until the budget runs out.
Missing CI secret | Cannot fix. The agent has no way to add a secret.
Wrong CI image / missing tool | Cannot fix. Owner of the CI config has to change it.
Runner outage | Cannot fix. Wait for your CI provider.
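
The decision above can be written down as a small lookup. The sketch below is illustrative only: `FailureKind`, `AGENT_FIXABLE`, `is_flaky`, and `recommended_action` are hypothetical names, and treating a test as flaky when it has both passed and failed across runs of the same code is an assumption about how you collect results.

```python
from enum import Enum


class FailureKind(Enum):
    REAL_TEST_FAILURE = "real test failure"
    LINT_OR_FORMAT = "linting / formatting"
    TYPE_ERROR = "type errors"
    FLAKY_TEST = "flaky test"
    MISSING_SECRET = "missing CI secret"
    WRONG_IMAGE = "wrong CI image / missing tool"
    RUNNER_OUTAGE = "runner outage"


# Failure kinds the agent can fix by changing the code.
AGENT_FIXABLE = {
    FailureKind.REAL_TEST_FAILURE,
    FailureKind.LINT_OR_FORMAT,
    FailureKind.TYPE_ERROR,
}


def is_flaky(outcomes: list[bool]) -> bool:
    """Assumed heuristic: a test that both passed and failed on the same code is a flake."""
    return True in outcomes and False in outcomes


def recommended_action(kind: FailureKind) -> str:
    """Map a failure kind to the next step described in step 3."""
    if kind in AGENT_FIXABLE:
        return "leave the pipeline running"
    return "cancel, fix the CI issue yourself, dispatch a fresh pipeline"
```

For example, `recommended_action(FailureKind.MISSING_SECRET)` points you to the cancel path, because no code change can add a secret.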

3. Act on the decision

If the agent can fix it, leave the pipeline running. The loop is part of the design.

If the agent cannot fix it:

  1. Open the pipeline detail page.
  2. Click Cancel in the header.
  3. Confirm in the dialog.
  4. Fix the underlying CI issue yourself: add the secret, fix the runner, or mark the flaky test.
  5. Dispatch a fresh pipeline.

If you're not sure, give it two more iterations. If it still fails, cancel.

Why the agent can loop forever

The agent reads the failing log and pushes a fix. CI runs again. If CI still fails, the agent reads the new log and pushes another fix. The loop only ends when:

  • CI passes.
  • The budget cap halts the run.
  • An admin cancels.

There is no built-in iteration limit. With a flaky test or an environmental failure, the loop keeps running until the budget cap halts the pipeline or someone cancels it.
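
To make the termination conditions concrete, here is a toy simulation of the loop. It is a sketch under stated assumptions, not the product's implementation: `run_ci_loop`, the flat `cost_per_iteration`, and the way spend is checked against `budget_cap` are all hypothetical, and manual cancellation is left out because it happens outside the loop.

```python
from typing import Callable


def run_ci_loop(
    ci_passes: Callable[[int], bool],
    cost_per_iteration: float,
    budget_cap: float,
) -> tuple[str, int]:
    """Simulate CI -> CI fixing until CI passes or the budget cap is hit.

    ci_passes(n) reports whether the n-th CI run is green.
    Note that there is no iteration limit: only a pass or the budget ends the loop.
    """
    spent = 0.0
    iteration = 0
    while True:
        iteration += 1
        if ci_passes(iteration):
            return ("passed", iteration)
        # Assumed budget model: each fix attempt has a flat cost.
        if spent + cost_per_iteration > budget_cap:
            return ("halted by budget", iteration)
        spent += cost_per_iteration
```

A failure the agent can fix ends the loop early, for instance `run_ci_loop(lambda n: n >= 3, 1.0, 10.0)` stops at the third run. A failure that never goes green, such as `lambda n: False`, only stops once the budget is exhausted.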

What does and doesn't count as "still failing"

What you see | What it is
--- | ---
Loop runs once or twice, then CI passes | Healthy. The agent fixed the failure.
Loop runs three to four times, then passes | Healthy on a tricky failure.
Loop runs five or more times without progress | Probably stuck. Read the logs.
The error message changes between iterations | Healthy. The agent is making progress.
The error message is identical every iteration | Stuck. The agent's fixes aren't moving the needle.
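
The same judgment can be sketched as a heuristic over the error messages you read on each iteration. `loop_status` and its `stuck_after` threshold are hypothetical, and the comparison assumes you have already stripped timestamps, run IDs, and other noise so that the same underlying error compares equal.

```python
def loop_status(error_messages: list[str], stuck_after: int = 5) -> str:
    """Classify a CI fixing loop from one error summary per failed run, oldest first."""
    if len(error_messages) < 2:
        return "too early to tell"
    # Identical error on every iteration: the fixes are not changing anything.
    if len(set(error_messages)) == 1:
        return "stuck"
    # Many iterations and the latest error repeated: progress has stalled.
    if len(error_messages) >= stuck_after and error_messages[-1] == error_messages[-2]:
        return "probably stuck"
    return "making progress"
```

For example, `loop_status(["E1", "E2", "E3"])` reports progress because the error keeps changing, while `loop_status(["E1", "E1", "E1"])` reports a stuck loop.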

Permissions

Action | Who can do it
--- | ---
Read CI runs and logs | Any role.
Cancel the pipeline | Subject to your role and product settings.
Fix the CI config | Subject to your CI provider's permissions, not Bilbis.

Problems and fixes

Problem | What to check
--- | ---
The CI runs tab is empty even though the badge says CI. | Your CI provider hasn't reported back yet. Wait a minute and refresh.
The CI runs tab shows runs but no log. | The provider returned a run reference without an embedded log. Click out to the provider and read the log there.
The agent keeps pushing the same fix. | The agent isn't seeing new context between iterations. Cancel and dispatch a fresh pipeline with a clearer task description.
The same test passes locally but fails in CI. | An environment difference. Cancel, fix the environment, dispatch fresh.
Pipeline ended in Failed during a CI loop. | The budget cap likely halted the run. Open LLM calls to confirm spend.
