

2026-05-10
by Uri Walevski
The word "agent" gets thrown around a lot, often to mean very different things. This post is a short, opinionated tour of what an agent actually is, the shapes agents come in, and where agents end and "automation" begins.
At the core, an agent is two things: an LLM and a harness around it.
The LLM is the language model — the part that reads context and produces the next step. On its own it can only emit text.
The harness is everything that turns those text outputs into actions and feeds the results back: the loop that re-prompts the model, the tools it can call, the memory or history it sees, the environment it runs in (a browser, a shell, a VM), and the rules for when to stop. The harness is what makes the model an agent instead of a chatbot.
Most differences between agents are differences in the harness, not in the model.
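The harness described above can be sketched in a few lines. This is a minimal, illustrative loop, not any real product's implementation: `call_llm` is a stub standing in for a model call, and the message format and tool names are made up for the example.

```python
# Minimal sketch of an agent harness: a loop that re-prompts the model,
# turns its outputs into tool calls, feeds results back, and stops when
# the model says it's done. Everything here is a stub.

def run_shell(cmd: str) -> str:
    return f"(pretend output of: {cmd})"  # stub tool

TOOLS = {"run_shell": run_shell}

def call_llm(history: list) -> dict:
    # Stub "model": requests one tool call, then finishes.
    if not any(m["role"] == "tool" for m in history):
        return {"tool": "run_shell", "args": {"cmd": "ls"}}
    return {"final": "done"}

def agent(task: str, max_steps: int = 10) -> str:
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):                # the loop that re-prompts
        step = call_llm(history)
        if "final" in step:                   # stop rule
            return step["final"]
        result = TOOLS[step["tool"]](**step["args"])  # text -> action
        history.append({"role": "tool", "content": result})  # feed back
    return "step budget exhausted"
```

Swap the stubs for a real model and real tools and you have the skeleton of every agent in this post; the interesting differences live in what goes into `TOOLS`, what the history contains, and where the loop runs.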
The most common agents people meet first live in a browser tab: ChatGPT, Gemini, Claude.ai, and similar. The harness here is fairly contained — usually a chat loop with a small set of built-in tools (web search, code interpreter, image generation, sometimes a sandboxed file workspace).
These are great for one-off tasks, but the agent has no access to your machine, your repos, or your real environment. Each conversation is mostly a closed world.
The next category runs locally and can actually touch your files: Claude Code, OpenCode, Cursor's agent mode, Aider, and friends. The harness here is much richer — it can read and edit files, run shell commands, execute tests, and use whatever credentials you have on disk.
This is what makes them useful for real software work. The trade-off is that they're tied to your machine: if it goes to sleep or shuts down, the agent stops with it.
Some agents are built around a single model provider — for example, ChatGPT is tightly coupled to OpenAI's models, Claude Code to Anthropic's. The harness and the model ship together.
Others are model-agnostic: OpenCode, Aider, and most "agent frameworks" let you plug in whichever LLM you want. This matters because models change quickly, and the right model for a task today may not be the right one in six months. A model-agnostic harness lets you swap without rewriting the agent.
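One way to picture the model-agnostic boundary: the harness only ever talks to a `complete(prompt) -> str` function, and each provider is wrapped to fit that shape. The provider names and wrappers below are placeholders, not real client code.

```python
# Sketch of a model-agnostic harness boundary. Each provider hides behind
# the same tiny interface, so swapping models is a config change, not a
# rewrite of the agent. Provider names are illustrative.

def make_model(provider: str):
    models = {
        "provider_a": lambda prompt: f"[A] {prompt}",
        "provider_b": lambda prompt: f"[B] {prompt}",
    }
    return models[provider]

complete = make_model("provider_a")  # swap the string, keep the harness
```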
A third shape, which is what prompt2bot does, is an agent system where the execution environment isn't your laptop and isn't a fixed sandbox — it's provisioned per task. The agent gets a fresh VM (or sandbox, or container) when it needs one, lives there for as long as the task takes, and is torn down or kept around independently of the conversation.
The advantage: the agent isn't tied to a specific machine or session. It can run while you're asleep, hand a long-running job to a real Linux box, install whatever it needs, and survive across conversations. You also don't have to trust it with your local filesystem.
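The per-task lifecycle can be sketched like this. `provision` and `release` are hypothetical placeholders for whatever boots and tears down the environment; this is not prompt2bot's actual API.

```python
# Lifecycle sketch of a per-task environment: a fresh VM is provisioned
# for the task, and teardown is a separate decision from closing the chat.
import uuid

def provision() -> dict:
    # In a real system this would boot a fresh VM/container for the task.
    return {"id": uuid.uuid4().hex, "alive": True}

def release(vm: dict) -> None:
    vm["alive"] = False  # decided per task, not per conversation

def run_task(task: str) -> str:
    vm = provision()                         # fresh environment
    result = f"completed {task!r} in {vm['id'][:8]}"
    release(vm)  # or keep it around for follow-up work
    return result
```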
People often call any AI-powered workflow an "agent." It usually isn't.
Automation is a fixed sequence of steps. It can absolutely have an LLM inside it — "summarize this email, then post the summary to Slack" is a perfectly good automation. But the steps and the order are decided by you, ahead of time. The LLM is just one node in the graph.
An agent decides what to do next based on what it sees. It picks tools, reacts to results, retries, changes plans. The control flow comes from the model, not from a pre-drawn diagram.
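The difference shows up directly in code. In this toy contrast, `summarize` and `post_to_slack` are stubs standing in for an LLM call and a side effect; the names are illustrative.

```python
# Automation vs. agent: who owns the control flow?

def summarize(text: str) -> str:
    return text.split(".")[0]  # stub "LLM" call

def post_to_slack(msg: str) -> str:
    return f"posted: {msg}"    # stub side effect

# Automation: you fixed the steps and their order ahead of time.
# The LLM is just one node in the graph.
def automation(email: str) -> str:
    return post_to_slack(summarize(email))

# Agent: the next step depends on what is observed at runtime.
# (Here a stub branches on the observation; a real agent asks the model.)
def agent_step(observation: str) -> str:
    if "error" in observation:
        return "retry"
    if "done" in observation:
        return "stop"
    return "continue"
```

In the automation, the call graph is frozen before any input arrives; in the agent, each observation can change what happens next.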
Both are useful. They're not the same thing, and conflating them leads to bad expectations in both directions: people expect automations to be smart, and they expect agents to be predictable.
An agent's usefulness is mostly bounded by what it can do, which in practice means its tools.
A tool is a single capability the agent can invoke — "send an email", "query this database", "run a shell command". Tools are atomic.
A skill is a higher-level bundle: a set of instructions plus the scripts/tools that implement them, packaged so an agent can learn to do something specific (e.g. "deploy to Render", "post to Reddit", "use the AgentMail API"). A skill teaches the agent both how to think about a task and what tools it has for it.
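The distinction can be made concrete with a sketch. The dict layout below is illustrative, not any particular framework's format: a tool is one callable with a name and description, while a skill bundles instructions with the tools that implement them.

```python
# A tool: one atomic capability the agent can invoke.
run_shell_tool = {
    "name": "run_shell",
    "description": "Run a shell command and return its output.",
    "run": lambda cmd: f"(output of: {cmd})",  # stub implementation
}

# A skill: instructions that teach the agent how to think about a task,
# plus the tools it needs to carry it out.
deploy_skill = {
    "name": "deploy-to-render",
    "instructions": (
        "Build the project, push the image, trigger a deploy, and check "
        "the deploy status before reporting success."
    ),
    "tools": [run_shell_tool],
}
```

Because a skill is mostly text plus small scripts like this, nothing about it is specific to one harness, which is what makes the portability argument below work.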
Because skills are mostly just documents and scripts, they're portable across harnesses — which is why skill marketplaces are starting to appear. A few worth knowing about:
In prompt2bot, any skill from these sources can be hydrated into an agent on demand — you point at a skill (a repo, a package, a URL) and an agent spins up already knowing how to use it, without you having to wire anything up. That's the model we think makes the most sense: keep the agent generic, keep the skills portable, and combine them when there's a task to do.