A Coding Agent for Production Issues

2026-03-01

by Uri Walevski

Your production system breaks at 3am. You wake up, stare at logs, fix it, go back to sleep. Or you give that job to an agent.

tl;dr:

  1. Write a prompt describing your production setup
  2. Customize the VM's Dockerfile with your tooling
  3. Add secrets (encrypted, never exposed to the LLM or the VM)
  4. Connect it to WhatsApp
  5. Have it monitor your systems on a schedule or trigger it from CI and monitoring scripts

The bot gets a persistent VM with a coding agent that can read logs, trace bugs, and open PRs.

This is a walkthrough of how to set one up using prompt2bot and supergreen. The result is a bot that lives on WhatsApp (or Telegram, or web), has access to your codebase and infrastructure, and can investigate and fix issues when you're not around.

The Prompt

The prompt is the bot's personality and job description. You write it in plain language in the dashboard's Overview tab.

Something like:
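A hypothetical example (the company, repo, stack, and runbook URL are all placeholders; write your own):

```
You are the on-call engineer for Acme's API (github.com/acme/api).
Stack: Node.js + Postgres, deployed on Fly.io. Logs come from `fly logs`.
When an alert comes in: pull recent logs, correlate with the last deploy,
and propose a fix as a PR. Never push to main directly.
If a fix needs a schema change, message me on WhatsApp before acting.
Runbook: https://example.com/runbook
```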

URLs in the prompt get automatically fetched and inlined, so you can point the bot at your docs, runbooks, or whatever context it needs. It'll read the page and include the content in its system message. GitHub repo URLs work too: if the bot has a GitHub token in its secrets, it fetches the README and inlines it, even for private repos.

The prompt is the first thing in the system message. prompt2bot appends anti-hallucination rules, tool documentation, conversation context, and formatting guidelines automatically. You just write the "what" and "how" of your agent's job.

The Dockerfile

Your bot gets a real VM with a Docker container running inside it. The base image comes with Ubuntu, Node.js, Deno, and OpenCode (an AI coding agent that does the actual work).

You can customize the container environment through the VM Dockerfile editor in the Automation tab. This is where you add project-specific tooling:
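For example, a hypothetical middle section for a Postgres + Fly.io stack (the specific tools are placeholders for whatever your workflow needs):

```dockerfile
# These lines go between the locked base and the locked suffix.
RUN apt-get update && apt-get install -y curl postgresql-client jq \
    && rm -rf /var/lib/apt/lists/*
# flyctl for tailing logs and inspecting deploys
RUN curl -L https://fly.io/install.sh | sh
ENV PATH="/root/.fly/bin:${PATH}"
RUN npm install -g pnpm
```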

Your instructions get sandwiched between a locked base (the OS and core tools) and a locked suffix (workdir and entrypoint). You can't break the container's fundamentals, but you can install anything your production workflow needs.

The coding agent running inside the container is OpenCode, which supports skills that can be added from npm or GitHub. These extend what the agent can do beyond shell commands: interacting with specific APIs, running specialized analysis, or integrating with services you use. You configure them in the agent's config, not the Dockerfile.

The VM spins up on demand. When you create it, cloud-init installs Docker on the host, builds your custom image, and starts the container. The coding agent runs inside it. On the host (outside the container), a small Node.js HTTP server listens for commands. Every request from prompt2bot's server is Ed25519-signed. The Node.js process verifies the signature, then runs the command inside the container via docker exec. Nothing unsigned gets through.
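A minimal sketch of the host agent's gate, assuming Node's built-in Ed25519 support (the wire format and key handling here are illustrative, not prompt2bot's actual code):

```typescript
import { verify } from "node:crypto";

// Verify an Ed25519 signature over the raw command bytes. Only a
// command that passes this check would be handed to `docker exec`.
// For Ed25519, Node's verify() takes a null digest: the algorithm
// hashes internally, so there is no separate SHA step.
function isValidCommand(
  publicKeyPem: string,
  body: Buffer,
  signature: Buffer,
): boolean {
  return verify(null, body, publicKeyPem, signature);
}
```

A caller would wrap this around `execFile("docker", ["exec", ...])`, so an unsigned or tampered command never reaches the container.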

Communication goes both ways. The server sends signed commands in, and when a task finishes, the host agent reads the output and posts it back to prompt2bot's API. The callback credentials live on the host, outside the container, so the LLM running inside never sees them. No persistent connection, no open ports beyond the one agent endpoint. The VM is a black box that accepts signed work and reports back when it's done.

Secrets

Your bot needs API keys to be useful. GitHub tokens, cloud provider credentials, database connection strings. You don't want those floating around in an LLM's context window or stored in plaintext on a VM.

The dashboard has a Secrets tab where you add them. You give each secret a name, a value, and a list of hostnames it's allowed to be sent to (e.g. api.github.com). The value gets AES-GCM encrypted immediately and is never shown again, not even to you. The tab only displays secret names and their allowed hosts.

You can also give secrets to the bot directly in chat. If you paste an API key into the conversation, it gets detected by regex patterns (covering GitHub PATs, OpenAI keys, AWS keys, JWTs, and more), replaced with a <<SECRET_...>> placeholder, and stored encrypted. The LLM only ever sees the placeholder, never the real value. The bot can then store it as a named secret via its set_secret tool.
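A sketch of what that chat-side detection could look like. The two patterns below are illustrative (the post says the real set covers GitHub PATs, OpenAI keys, AWS keys, JWTs, and more), and the placeholder format is an assumption beyond the `<<SECRET_...>>` shape:

```typescript
import { createHash } from "node:crypto";

// Illustrative patterns; the real list is much longer.
const SECRET_PATTERNS: RegExp[] = [
  /ghp_[A-Za-z0-9]{36}/g, // GitHub classic personal access token
  /sk-[A-Za-z0-9]{20,}/g, // OpenAI-style API key
];

// Replace detected secrets with placeholders before the message
// reaches the LLM. The captured values would be AES-GCM encrypted
// and stored; only the placeholders remain in the conversation.
function redact(message: string): { text: string; found: string[] } {
  const found: string[] = [];
  let text = message;
  for (const re of SECRET_PATTERNS) {
    text = text.replace(re, (match) => {
      found.push(match);
      const id = createHash("sha256").update(match).digest("hex").slice(0, 8);
      return `<<SECRET_${id}>>`;
    });
  }
  return { text, found };
}
```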

Either way, the secret ends up encrypted in the database. The VM never sees the real value. Instead it gets an HMAC placeholder, a deterministic hash tied to that specific machine. When the container makes an outbound request to an approved host, a reverse proxy swaps the placeholder for the real credential at the last moment.
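A sketch of how a machine-specific placeholder could be derived. HMAC-SHA256 and the exact placeholder format are assumptions; the post only says it's a deterministic HMAC tied to the machine:

```typescript
import { createHmac } from "node:crypto";

// Derive a placeholder from a per-VM key and the secret's name.
// The same secret name yields a different placeholder on every
// machine, so a leaked placeholder is useless anywhere else.
function placeholderFor(machineKey: string, secretName: string): string {
  const mac = createHmac("sha256", machineKey).update(secretName).digest("hex");
  return `<<SECRET_${mac.slice(0, 16)}>>`;
}
```

The reverse proxy holds the real values and does the inverse lookup: placeholder in, credential out, but only for requests bound to that host.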

So the actual secret never touches the VM. It never appears in the container's environment in cleartext. The placeholder is machine-specific, so even if it leaks, it's useless on any other machine. And tools that aren't explicitly allowlisted for secret handling will reject any input containing a secret placeholder, so the bot can't accidentally pass your credentials to a random API call.

Putting It on WhatsApp

This part is optional. You can use the bot through the web interface (via alice and bot) or Telegram. But WhatsApp is where most people already live.

Normally, getting a bot on WhatsApp means going through Meta's Business API. Application, approval, message templates, rate limits, the whole enterprise dance.

Supergreen skips that. You connect a phone number and get a webhook. Each account runs in its own isolated container with its own proxy.

In prompt2bot, you set this up by telling the builder bot to connect a Supergreen WhatsApp account. It provisions the account, gives you a pairing code, and you scan it with the phone you want to use. That's it. Your production agent is now reachable on WhatsApp.

You get a message at 3am that the API is returning 500s. You forward it to the bot. It pulls logs, finds the failing query, traces it to a recent migration, and opens a PR to fix it. You review the diff on your phone, approve, and go back to sleep.

Triggering It Without a Human

You don't have to wait for someone to message the bot. There are two ways to make it act on its own.

The first is a scheduled task. You tell the bot, in conversation, something like "every morning, check the error logs from the last 24 hours and message me if anything looks off." It creates a recurring task internally. Every day it wakes up, pulls logs, analyzes them, and reports back. No cron job, no external infrastructure. The bot just does it because you told it to.

You describe what to check and how often; the bot handles the rest. It keeps state between runs, so it can notice trends, like an error rate that's been creeping up over the last three days.

The second is the API. The @prompt2bot/client library lets you trigger tasks from code.
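The call shape below is a sketch: `createRemoteTask` and the `secret` and `recurrenceRule` fields appear in this post, but the `message` field and the rest of the shape are assumptions:

```typescript
// Hypothetical payload shape for @prompt2bot/client's createRemoteTask.
interface RemoteTask {
  secret: string;                   // your bot's API key from the dashboard
  message: string;                  // what you want the bot to do
  recurrenceRule?: string | number; // omit for a one-off task
}

function buildTask(
  secret: string,
  message: string,
  recurrenceRule?: string | number,
): RemoteTask {
  return {
    secret,
    message,
    ...(recurrenceRule !== undefined ? { recurrenceRule } : {}),
  };
}

// With @prompt2bot/client installed:
//   import { createRemoteTask } from "@prompt2bot/client";
//   await createRemoteTask(buildTask(apiKey,
//     "API returning 500s on /checkout; recent logs:\n" + logs));
```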

The bot gets the task, spins up its VM, and starts investigating. It messages you on WhatsApp when it has findings, same as if you'd texted it yourself.

This is useful for hooking into existing systems. A CI pipeline that fails can send a task with the error logs. A monitoring alert can trigger an investigation. A deploy webhook can ask the bot to run a smoke test. Anything that can make an HTTP call can put the bot to work.

A much more efficient approach than writing custom monitoring scripts is to use a service like anomalisa. You instrument your app to send events (signups, purchases, API calls, whatever matters), and anomalisa runs statistical anomaly detection on them automatically. When something unusual happens, it fires a webhook. Point that webhook at createRemoteTask and the bot starts investigating the moment your metrics go weird: no polling, no cron jobs checking logs every hour.
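A sketch of the glue in between. The webhook payload shape here is entirely hypothetical; check anomalisa's docs for the real fields:

```typescript
// Hypothetical anomaly webhook payload, for illustration only.
interface AnomalyEvent {
  metric: string;             // e.g. "checkout.purchases"
  direction: "spike" | "drop";
  observed: number;
  expected: number;
}

// Turn an anomaly into a task message the bot can act on.
function anomalyToTaskMessage(e: AnomalyEvent): string {
  return (
    `Anomaly on ${e.metric}: ${e.direction} ` +
    `(observed ${e.observed}, expected ~${e.expected}). ` +
    `Pull recent logs and the last deploy, and report likely causes.`
  );
}

// A webhook handler would pass this straight to the bot:
//   await createRemoteTask({ secret, message: anomalyToTaskMessage(payload) });
```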

You can also create recurring tasks through the API by passing a recurrenceRule:
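A sketch under the same assumptions as before: `secret` and `recurrenceRule` are named in this post, while `message` and the surrounding shape are guesses at the client's API:

```typescript
// Hypothetical recurring task for @prompt2bot/client.
const dailyLogCheck = {
  secret: process.env.PROMPT2BOT_SECRET ?? "", // bot API key from the dashboard
  message:
    "Check the error logs from the last 24 hours; message me if anything looks off.",
  recurrenceRule: "daily",
};

// await createRemoteTask(dailyLogCheck);  // with @prompt2bot/client installed
```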

The recurrenceRule field accepts "daily", "weekly", "monthly", an interval in milliseconds, or a full RFC 5545 RRULE string for more complex schedules.

The secret field is your bot's API key, found in the dashboard. No user tokens, no OAuth. One field, one call.

What This Actually Looks Like Day to Day

The bot isn't autonomous in a scary way. It has a VM, it has tools, but it always reports back through the conversation. You see what it's doing. It asks before pushing code. It shows you the diff before opening a PR.

The VM persists between conversations, so your repo stays cloned, your dependencies stay installed, your dotfiles stay configured. It's like having a junior engineer who never sleeps and never forgets the runbook.

The ceiling is set by the prompt. A vague prompt gets you a vague assistant. A detailed prompt with specific runbooks, escalation procedures, and repository context gets you something that can genuinely handle production incidents while you sleep.
