Picasso keeps the objective alive across turns, tools, tests, failures, and approvals until the criteria pass or a boundary is reached.
Every goal starts by stating the work, the proof, the limits, the permissions, the budgets, and the stop conditions. A goal is not a loose prompt. It is a bounded agreement about what done means and when Picasso must pause.
objective: Migrate all .js files to .ts
completionCriteria: [all-tests-pass, lint-clean]
budget: { dollars: 5, hours: 4, steps: 1000 }
scope: packages/*
permissions: safe
Build the next stage from the objective, completion criteria, retrieved memory, and current workspace state.
Run edits, tools, tests, MCP calls, and subagents inside permission, sandbox, and budget limits.
Read test results, diffs, logs, build output, provider responses, cache state, and failures.
Continue, re-plan, pause, ask for approval, or stop against the contract.
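The four steps above can be read as one control loop. What follows is a minimal sketch of that shape, not Picasso's actual implementation: `run_goal`, `Budget`, and the callback names are hypothetical, and only a step budget is modeled (dollars and hours are left out for brevity).

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Budget:
    steps: int  # hypothetical: remaining loop iterations only


def run_goal(
    plan: Callable[[], Any],
    act: Callable[[Any], Any],
    observe: Callable[[Any], Any],
    criteria_pass: Callable[[Any], bool],
    budget: Budget,
    needs_approval: Callable[[Any], bool] = lambda stage: False,
) -> str:
    """Plan, act, observe, decide; return the reason the loop stopped."""
    while True:
        if budget.steps <= 0:
            return "paused: budget exhausted"   # a boundary was reached
        stage = plan()                          # plan the next stage
        if needs_approval(stage):
            return "paused: approval required"  # stop instead of pushing through
        budget.steps -= 1
        evidence = observe(act(stage))          # act, then read the results
        if criteria_pass(evidence):             # decide against the contract
            return "done"


# Toy workspace: three failing tests, one fixed per step.
state = {"failing_tests": 3}
budget = Budget(steps=10)
outcome = run_goal(
    plan=lambda: "fix one failing test",
    act=lambda stage: state.update(failing_tests=state["failing_tests"] - 1),
    observe=lambda _result: state["failing_tests"],
    criteria_pass=lambda failing: failing == 0,
    budget=budget,
)
print(outcome, budget.steps)
```

The loop never decides on its own that the work is finished. It returns only when the criteria check passes or a limit forces a pause.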
Every goal needs proof a machine can check — tests that pass, commands that exit clean, a score that must move. "Looks done" is not done. Picasso critiques the contract before you approve it, runs the baseline before the loop starts, and when the loop says finished, a final audit re-runs every criterion fresh and checks every deliverable against the working tree. Anything less gets sent back to work.
A goal without a verifiable finish line is refused, not attempted. Self-assessment alone never closes a goal.
The baseline runs before the work. A red baseline stops the goal before it spends a single token.
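A minimal sketch of those two pre-flight rules, under stated assumptions: a criterion is taken to be a command that passes when it exits 0, and a red baseline is taken to mean that such a command fails before any work starts. `GoalContract`, `Criterion`, `GoalRefused`, and `preflight` are hypothetical names, not Picasso's API.

```python
import subprocess
import sys
from dataclasses import dataclass, field
from typing import List


class GoalRefused(Exception):
    """Raised when a contract has nothing a machine can check."""


@dataclass
class Criterion:
    name: str
    command: List[str]  # assumption: the criterion passes when this exits 0


@dataclass
class GoalContract:
    objective: str
    criteria: List[Criterion] = field(default_factory=list)


def preflight(contract: GoalContract) -> str:
    """Refuse unverifiable goals, then run every criterion once as a baseline."""
    if not contract.criteria:
        raise GoalRefused("no machine-checkable completion criteria")
    red = [
        c.name
        for c in contract.criteria
        if subprocess.run(c.command, capture_output=True).returncode != 0
    ]
    if red:
        return "stopped: red baseline (" + ", ".join(red) + ")"
    return "ready"


# Stand-in commands: real criteria would be a test runner or a linter.
passing = Criterion("all-tests-pass", [sys.executable, "-c", "raise SystemExit(0)"])
failing = Criterion("lint-clean", [sys.executable, "-c", "raise SystemExit(1)"])
print(preflight(GoalContract("Migrate all .js files to .ts", [passing, failing])))
```

Both checks run before the loop, so a goal that cannot be verified or that starts from a broken state costs nothing beyond the baseline commands themselves.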
Every machine criterion re-verified fresh. Every deliverable checked on disk. And a visible coverage score for everything a test can't prove.
Debug prints, dead imports, and stray TODOs are flagged before a goal may call itself complete.
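To make the hygiene check concrete, here is a rough heuristic for Python sources built on the standard `ast` module. It is an illustration only: `hygiene_findings` is a hypothetical name, the rules Picasso actually applies are not specified in this text, and the TODO match is a plain regex that can misfire on `#` inside string literals.

```python
import ast
import re
from typing import Dict, List, Set


def hygiene_findings(source: str) -> List[str]:
    """Flag TODO comments, bare print() calls, and imports that are never used."""
    findings: List[str] = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if re.search(r"#.*\bTODO\b", line):
            findings.append(f"line {lineno}: stray TODO")

    imported: Dict[str, int] = {}  # bound name -> line of the import
    used: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a"
                imported[(alias.asname or alias.name).split(".")[0]] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported[alias.asname or alias.name] = node.lineno
        elif isinstance(node, ast.Name):
            used.add(node.id)
        elif (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "print"
        ):
            findings.append(f"line {node.lineno}: debug print")

    for name, lineno in imported.items():
        if name not in used:
            findings.append(f"line {lineno}: dead import '{name}'")
    return findings


sample = "import os\nimport sys\n\nprint(sys.argv)  # TODO: remove before merge\n"
for finding in hygiene_findings(sample):
    print(finding)
```

A non-empty result would keep the goal open until the leftovers are removed.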
An edit that does not parse is rejected and reverted the moment it is written, with the error handed straight back to the agent. And every iteration that resolves a blocker writes the lesson to memory — so the next goal starts smarter than the last.
Unparseable code never lands. The bad write rolls back; the agent learns why.
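The parse gate can be sketched in a few lines. This version assumes Python sources and uses `ast.parse` as the parser; `guarded_write` is a hypothetical helper, and a real gate would need one parser per language.

```python
import ast
import tempfile
from pathlib import Path
from typing import Optional


def guarded_write(path: Path, new_source: str) -> Optional[str]:
    """Write the edit; if it does not parse, roll back and return the error."""
    previous = path.read_text() if path.exists() else None
    path.write_text(new_source)
    try:
        ast.parse(new_source, filename=str(path))
    except SyntaxError as err:
        # Revert immediately so the broken edit never stays on disk.
        if previous is None:
            path.unlink()
        else:
            path.write_text(previous)
        return f"rejected: {err.msg} (line {err.lineno})"
    return None  # the edit parsed and stays in place


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "module.py"
    target.write_text("x = 1\n")
    error = guarded_write(target, "def broken(:\n")
    restored = target.read_text()

print(error)
```

The returned string is what gets handed back to the agent, so the next attempt starts from the parser's own message rather than a guess.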
Numeric fitness criteria capture a baseline at pre-flight and demand measurable improvement — in the right direction, by the margin you set.
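One way to express such a criterion, as a sketch: the value captured at pre-flight is stored with a direction and a margin, and a later measurement passes only if it moved the right way by at least that margin. `FitnessCriterion` and its fields are hypothetical names, and the bundle-size figures are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitnessCriterion:
    baseline: float  # value captured at pre-flight
    direction: str   # "higher" or "lower" is better
    margin: float    # minimum improvement, in the metric's own units

    def __post_init__(self) -> None:
        if self.direction not in ("higher", "lower"):
            raise ValueError("direction must be 'higher' or 'lower'")

    def satisfied(self, current: float) -> bool:
        """True only if the metric improved by at least the margin."""
        if self.direction == "higher":
            gain = current - self.baseline
        else:
            gain = self.baseline - current
        return gain >= self.margin


# Example: bundle size in KB must drop by at least 20 from a 420 KB baseline.
bundle_size = FitnessCriterion(baseline=420.0, direction="lower", margin=20.0)
print(bundle_size.satisfied(395.0))  # improved by 25
print(bundle_size.satisfied(410.0))  # improved by only 10
print(bundle_size.satisfied(440.0))  # moved the wrong way
```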
Resolved blockers become memory. Goals compound.
A running goal can be paused, resumed, cancelled, inspected, or replayed. Permission mode changes, approval requirements, exhausted budgets, missing credentials, and risky mutations stop the loop instead of silently pushing through.
Stop the loop without losing the objective or evidence.
End the goal with cancellation state and audit trail intact.
Sensitive steps require whatever operator proof the policy demands.
The goal closes only when completion criteria are satisfied.
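These lifecycle rules fit a small state machine. The sketch below is one possible reading: the state names, the `ALLOWED` transition table, and the `Goal` class are assumptions for illustration, and inspection and replay are not modeled.

```python
from enum import Enum
from typing import List, Tuple


class GoalState(Enum):
    RUNNING = "running"
    PAUSED = "paused"
    CANCELLED = "cancelled"
    COMPLETED = "completed"


# Assumed transition table: terminal states allow nothing further.
ALLOWED = {
    GoalState.RUNNING: {GoalState.PAUSED, GoalState.CANCELLED, GoalState.COMPLETED},
    GoalState.PAUSED: {GoalState.RUNNING, GoalState.CANCELLED},
    GoalState.CANCELLED: set(),
    GoalState.COMPLETED: set(),
}


class Goal:
    def __init__(self, objective: str) -> None:
        self.objective = objective  # survives every pause and resume
        self.state = GoalState.RUNNING
        self.audit_trail: List[Tuple[str, str, str]] = []

    def transition(self, target: GoalState, reason: str,
                   criteria_satisfied: bool = False) -> None:
        if target not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state.value} to {target.value}")
        if target is GoalState.COMPLETED and not criteria_satisfied:
            raise ValueError("completion criteria not satisfied")
        # Record every change so the evidence outlives the run.
        self.audit_trail.append((self.state.value, target.value, reason))
        self.state = target


goal = Goal("Migrate all .js files to .ts")
goal.transition(GoalState.PAUSED, "budget exhausted")
goal.transition(GoalState.RUNNING, "operator raised the budget")
goal.transition(GoalState.COMPLETED, "final audit passed", criteria_satisfied=True)
print(goal.state.value, len(goal.audit_trail))
```

Completion is the only transition that needs extra evidence. Pausing and cancelling are always available to the operator while the goal is live.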
The TUI and Mac app show goal progress from the real backend ledger: provider calls, tool calls, command output, step state, budgets, cache events, memory writes, subagent results, and completion checks.
Move a shared API, update callers, and keep tests green across packages.
Work through a queue of scoped tasks, each verified before the next begins.
Turn a product spec into staged implementation, tests, and final review.
Move frameworks, languages, or dependencies with repeatable checks at each stage.
The same goal contract can run through hosted, managed, BYO-key, subscription, or local routes. Anthropic, OpenAI, Microsoft Foundry, Google, xAI, Qwen, Xiaomi MiMo, Kimi, DeepSeek, Zhipu GLM, MiniMax, OpenRouter, Ollama, and vLLM stay behind the same harness.
Picasso tracks per-session, per-day, per-goal, sponsor, provider, alias, and route budgets, pauses when exhausted, and keeps the status line honest. Autonomy stays useful because the limit is visible.
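Tracking several budgets at once can be sketched as a ledger that charges one cost against every scope it belongs to. `BudgetLedger` and its methods are hypothetical, the scope names are examples, and only dollar limits are modeled here.

```python
from typing import Dict, Iterable, List


class BudgetLedger:
    def __init__(self, limits: Dict[str, float]) -> None:
        self.limits = dict(limits)                      # scope -> dollar limit
        self.spent = {scope: 0.0 for scope in limits}   # scope -> dollars used

    def charge(self, amount: float, scopes: Iterable[str]) -> List[str]:
        """Record one cost against each scope; return the exhausted scopes."""
        for scope in scopes:
            self.spent[scope] += amount
        return self.exhausted()

    def exhausted(self) -> List[str]:
        return sorted(s for s, limit in self.limits.items() if self.spent[s] >= limit)

    def status_line(self) -> str:
        return " | ".join(
            f"{scope} ${self.spent[scope]:.2f}/${limit:.2f}"
            for scope, limit in self.limits.items()
        )


ledger = BudgetLedger({"goal": 5.0, "session": 20.0})
ledger.charge(2.0, ["goal", "session"])
hit = ledger.charge(3.0, ["goal", "session"])
if hit:
    print("pause:", ", ".join(hit))
print(ledger.status_line())
```

Because a single provider call is charged to every scope it falls under, the tightest limit is the one that triggers the pause.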
Picasso for Mac is almost here — a coding agent that looks the way serious tools should, and costs what creative freedom should: nothing. Leave your email and be first on the canvas.