Task types and engines

What this means

When you create a pipeline, two fields tell Bilbis what kind of work to expect:

Type - the shape of the task, like a feature or a bug fix.
Engine - the AI engine that will run the work.

Both fields appear on the New Pipeline form. Both are also stored on templates so saved presets carry your defaults.

When to use it

Every time you create a pipeline. The fields are required.
When you save a template and want consistent runs.
When a pipeline produced a poor result and you want to try a different engine on the next run.

Before you start

The New Pipeline form renders only after credentials, a product, and at least one repo are set up.
The engines listed below need no setup beyond the LLM credential you set up in Integrations. That credential decides which engines are available.

Task types

Pick the value that best describes what you're asking for. The type affects how the agents plan the work and how engine recommendations are scored.

Type	Use when
Feature	Adding new behavior - a new endpoint, a new screen, a new option.
Bug fix	Fixing broken behavior.
Refactor	Changing how the code is organized without changing behavior.
Frontend	Visual or UI work. Often paired with a Figma URL in Advanced options.
Test	Adding or updating tests.
Documentation	Writing or updating prose, README files, or code comments.

If your task spans multiple types, pick the closest match. The agents read the task description verbatim regardless of the type.

The dropdown shows these labels in the order above. Default is Feature.

Engines

The engine is the AI model that does the actual coding work. The dropdown lists six options.

Engine	Strengths	When to pick it
Auto (let Bilbis pick)	Bilbis routes per task using a small routing model.	Default. Reasonable on almost every task.
Claude Opus	Most capable. Best reasoning on hard or ambiguous tasks.	Big refactors, unfamiliar codebases, tricky bug hunts.
Claude Sonnet	Balanced cost and quality.	Most everyday work. Safe default if you want to pin one.
Claude Haiku	Fast and cheap.	Trivial or repetitive tasks where you don't need deep reasoning.
OpenAI Codex	OpenAI's coding model.	When you specifically want OpenAI rather than Anthropic.
Workers AI	Smallest, cheapest, lowest quality.	Very simple tasks where cost matters most.

The labels in the dropdown spell out the trade-off - for example, "Claude Opus (most capable)" and "Claude Haiku (fast / cheap)".

Auto is a request-time choice, not a stored value. The Work Planner picks one of the concrete engines per task based on the task description and the repo's history. Once picked, the chosen engine is what runs and what shows up on the pipeline detail page.

Recommendations

If the form has enough information - both a repo and a task type - Bilbis requests an engine recommendation in the background. When one comes back, the form switches to it automatically, unless you've already touched the Engine field. The recommended option in the dropdown shows a "recommended" tag.

You can override the recommendation at any time. Touching the field tells Bilbis to leave your choice alone for the rest of the form's lifetime.

If the repository is set to "Let Bilbis pick", recommendations don't fire - there isn't a single repo to score against until the Repo Router picks one at runtime.
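
The rules above can be sketched as a small state model. This is an illustration only, not Bilbis's actual form code: the class, the function, the engine identifiers, and the "let-bilbis-pick" sentinel are all hypothetical names, and the choice to keep updating the "recommended" tag after a manual pick is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

AUTO = "auto"
LET_BILBIS_PICK = "let-bilbis-pick"  # hypothetical sentinel for the repo field


@dataclass
class EngineField:
    """Hypothetical model of the Engine dropdown's state on the form."""

    value: str = AUTO
    touched: bool = False               # set once the user picks a value
    recommended: Optional[str] = None   # option carrying the "recommended" tag

    def pick(self, engine: str) -> None:
        # A manual pick marks the field as touched for the form's lifetime.
        self.value = engine
        self.touched = True

    def apply_recommendation(self, engine: str) -> None:
        # Assumption: the tag always updates; the value switches only
        # while the user has not touched the field.
        self.recommended = engine
        if not self.touched:
            self.value = engine


def should_request_recommendation(repo: Optional[str], task_type: Optional[str]) -> bool:
    # Both a concrete repo and a task type are needed before asking.
    return bool(repo) and bool(task_type) and repo != LET_BILBIS_PICK
```

In this sketch, a recommendation that arrives after pick() leaves the chosen value alone, which matches the "leave your choice alone" behavior described above.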

Templates lock the engine

Templates store one concrete engine and don't allow Auto. If you save the New Pipeline form as a template while Engine is set to Auto, the template falls back to Claude Sonnet. Edit the template later if you want a different engine.

The reasoning: a template promises a consistent run. Auto would defeat that.
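
A minimal sketch of that fallback, assuming engines are identified by short strings. The identifiers and the function name are hypothetical, not Bilbis's real values:

```python
AUTO = "auto"
TEMPLATE_FALLBACK_ENGINE = "claude-sonnet"  # hypothetical identifier for Claude Sonnet


def engine_for_template(form_engine: str) -> str:
    """Return the engine a template would store for a given form value."""
    # Templates never store Auto; a concrete engine passes through unchanged.
    return TEMPLATE_FALLBACK_ENGINE if form_engine == AUTO else form_engine
```

Saving with Engine on Auto would store Claude Sonnet, while saving with Claude Opus selected would store Claude Opus.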

Cost implications

Engine choice is the biggest cost lever on a pipeline.

Claude Opus is the most expensive. Use sparingly.
Claude Sonnet sits in the middle.
Claude Haiku is roughly an order of magnitude cheaper than Opus.
Workers AI is cheapest but produces the lowest-quality code.
Auto routes per task and often picks a mid-tier engine.

The Budget cap on the New Pipeline form halts a run if cost exceeds the cap regardless of engine. See Budgets, dry runs, and priority.
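
As a rough illustration of how a cap limits spend whatever the engine, the sketch below adds up per-call costs and stops once the running total exceeds the cap. The function name and the per-call check are assumptions. Bilbis may measure and enforce the cap differently.

```python
from typing import Iterable, Tuple


def run_with_budget(call_costs: Iterable[float], budget_cap: float) -> Tuple[float, bool]:
    """Add up per-call costs and stop once the total exceeds the cap.

    Returns (total_spent, halted).
    """
    spent = 0.0
    for cost in call_costs:
        spent += cost
        if spent > budget_cap:
            # The cap applies to total spend, not to which engine ran.
            return spent, True
    return spent, False
```

Because the check here runs after each call, the final total can land slightly above the cap, so it helps to leave headroom when pairing a low cap with an expensive engine.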

Where you see the engine after dispatch

The pipeline detail page header shows the engine in plain language ("Claude Sonnet (balanced)").
The LLM calls tab on the detail page shows every prompt and which model handled it.
Analytics shows spend and quality grouped by engine, so you can see which engine is doing what across your organization.

Steps - picking values on the form

Go to Pipelines → New.
Fill in the Task field.
Pick a Repository.
Pick a Type that matches the work.
Leave Engine on Auto and wait a moment - if a recommendation arrives, the form switches the field for you. Or pick an engine yourself, and the form keeps your choice.
Adjust the Budget cap to match the engine you chose.
Continue with the rest of the form.

Problems and fixes

Problem	What to check
The engine I want isn't in the list.	The dropdown lists every supported engine, and the LLM credential in Integrations decides which ones are available. If a label looks unfamiliar, you may be on a stale tab - refresh.
The "Looking for a recommendation" spinner never resolves.	The recommendation endpoint returned nothing. Default to Auto or pick manually.
The engine flipped on its own after I picked one.	The recommendation only flips the field if you haven't touched it. If it flipped after you picked, the field hadn't been marked dirty yet - just pick again.
Auto is missing from my template.	Templates lock to a concrete engine. Edit the template and pick one.
The pipeline cost more than expected.	Engine choice and budget cap are the two levers. Read LLM calls on the pipeline detail page to see which calls were expensive. See Budgets, dry runs, and priority.
The pipeline produced wrong code on a hard task.	Try Claude Opus on a fresh run. Read LLM calls to see where the previous engine went wrong.

Related

Create a pipeline - every form field, including how Type and Engine fit into the rest of the form.
Budgets, dry runs, and priority - control cost and queue order.
Templates - save Type, Engine, and budget as a reusable preset.
Pipeline states - what happens once you dispatch.