Hermes as a personal agent runtime
In the first post in this series, I defined the split between OpenClaw and Hermes: OpenClaw was the earlier experiment, and Hermes is the runtime I am building on now.
This post goes one level deeper into Hermes.
A normal AI chat session is temporary by default. You paste context in, ask for help, copy something out, and move on. If the task is simple, that is fine.
But engineering work rarely stays simple. You need the assistant to inspect files, remember stable preferences, run commands, search older sessions, follow project conventions, and sometimes come back later when something happens.
That is the gap Hermes fills for me.
I think of Hermes as a personal agent runtime. That phrase is doing real work. I do not mean “a chatbot with more tools.” I mean an environment where the model can operate through a consistent set of capabilities: files, terminal commands, browser actions, memory, skills, scheduled jobs, session history, and delegated subagents.
The distinction matters because chat alone does not give you much structure.
OpenClaw helped me prove that I wanted an agent close to my actual work. Hermes is the version that makes the pieces feel more like a system. Instead of every capability being a clever trick bolted onto a conversation, the primitives are more explicit.
What Hermes gives me
The useful parts are not exotic. They are the boring pieces you need if you want an agent to do real work more than once.
The CLI gives you a terminal interface for direct work. The messaging gateway lets the same agent live in places like Discord or Telegram. Tools expose real capabilities: terminal commands, file edits, browser interaction, scheduled jobs, image generation, GitHub-related workflows, and more depending on what is enabled.
Skills package reusable procedures so the agent does not rediscover the same workflow every time. Memory stores stable facts and preferences. Session search lets the agent recall past work without pretending every old task belongs in permanent memory. Cron jobs let the agent run scheduled tasks and send the result back to the right channel. Subagents let one agent delegate bounded work to another isolated context.
That can sound like a feature catalog. The practical value is that the pieces compose.
For example, a blog-writing agent can search past sessions, inspect local repositories, draft Markdown, and leave files in a working directory. A coding agent can read a plan, spawn a review subagent, run tests, and patch files. A PR review service can receive a GitHub webhook, run a Hermes worker, and post one comment back to the pull request.
The same runtime supports all of those because the important abstraction is not “chat.” It is task execution with context.
The model is not the whole product
This is where I think a lot of AI tooling gets confused. The model is not the whole product. The model is one component inside a workflow.
Sometimes it is the most important component. Sometimes the boring parts matter more: the filesystem, the queue, the logs, the exact prompt loaded for a role, the command that verified a change, the policy that says “stop and ask a human.”
That is also why I like Hermes more as a runtime than as a novelty interface. The value is not just that I can talk to a model from Discord. The value is that the conversation can connect to work: files, repos, scheduled tasks, local services, and previous decisions.
My setup, generalized
I do think it is worth describing the setup, but not by publishing every operational detail.
The useful version for readers is the architecture, not the hostname. A simplified version looks like this:
Discord / CLI
  -> Hermes agent runtime
      -> tools, memory, skills, session search
      -> local repositories and notes
      -> scheduled jobs
      -> subagents for isolated tasks
      -> small webhook services for external events
          -> GitHub pull request review worker
That is enough to explain the pattern without turning the post into an infrastructure map.
The agent has two main front doors: terminal and Discord. The terminal is best when I am actively working in a repo and want tight feedback. Discord is better for asynchronous work: asking for a draft, kicking off a review, checking status, or having an automation report back when it finishes.
Behind that, Hermes runs with access to local files and selected tools. Some repos are just source material. Some are active projects. Some are operational notes, like sanitized infrastructure runbooks. The important boundary is that secrets and private operational details do not belong in blog posts, and they should not be casually copied into repos either.
For GitHub automation, I use small services as bridges. A webhook receiver should be boring: verify the request, normalize the event, queue work idempotently, and hand off to a worker. The agent should not be the HTTP server, the queue, the database, and the policy engine all mashed into one prompt. That is how demos become fragile.
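As a rough illustration of how boring that receiver can be, here is a minimal sketch of the four steps in plain Python. The `X-Hub-Signature-256` and `X-GitHub-Delivery` headers and the `pull_request` payload fields are GitHub's; everything else (`ReviewJob`, `IdempotentQueue`, `handle_webhook`, the in-memory dedup set) is a hypothetical stand-in, not how Hermes or any particular service does it.

```python
import hashlib
import hmac
import json
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewJob:
    """Normalized event: only the fields a worker needs (hypothetical shape)."""
    delivery_id: str
    repo: str
    pr_number: int
    action: str


def verify_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """GitHub signs the raw body: 'sha256=' + hex HMAC-SHA256 using the webhook secret."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature_header.encode())


def normalize(delivery_id: str, body: bytes) -> ReviewJob:
    """Reduce a pull_request payload to a small, stable job record."""
    payload = json.loads(body)
    return ReviewJob(
        delivery_id=delivery_id,
        repo=payload["repository"]["full_name"],
        pr_number=payload["pull_request"]["number"],
        action=payload["action"],
    )


class IdempotentQueue:
    """In-memory stand-in; a real service would persist the seen IDs."""

    def __init__(self):
        self._seen = set()
        self.jobs = deque()

    def enqueue(self, job: ReviewJob) -> bool:
        # GitHub reuses the delivery ID on redelivery, so it works as a dedup key.
        if job.delivery_id in self._seen:
            return False
        self._seen.add(job.delivery_id)
        self.jobs.append(job)
        return True


def handle_webhook(secret: bytes, headers: dict, body: bytes, queue: IdempotentQueue) -> int:
    """Return an HTTP status code. Assumes `headers` is a plain dict with exact casing."""
    if not verify_signature(secret, body, headers.get("X-Hub-Signature-256", "")):
        return 401
    queue.enqueue(normalize(headers["X-GitHub-Delivery"], body))
    return 202  # accepted; the worker picks the job up later
```

Nothing here calls a model. The agent only enters the picture when a worker pulls a `ReviewJob` off the queue, which keeps the HTTP-facing part small enough to test without any prompt in the loop.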
Why messaging changes the workflow
The messaging side matters more than I expected.
If an assistant only exists in a browser tab, I mostly use it for local reasoning: explain this code, rewrite this paragraph, help me think through an API. If it can run on my machine or a small server, talk through Discord, inspect files, call GitHub, and remember how I like things done, then it starts to become useful for recurring operational work.
The terminal still matters. I do not want to debug a failing test suite through chat bubbles if I am sitting at the machine. But I also do not want every useful automation trapped inside an interactive shell.
That split changes what feels worth automating. A daily note, a pull request review, a blog draft, a status check, a small research task, a reminder to inspect a log: these are more useful when the result can come back where I already am.
The boundaries matter
The more capabilities you give an agent, the more boundaries you need.
Hermes has tools, but tools should be scoped. It has memory, but memory should store stable facts, not every temporary task result. It can run scheduled jobs, but a scheduled prompt needs to be self-contained because nobody is there to answer a clarification question at 3am. It can spawn subagents, but those subagents need clear contracts and verifiable outputs.
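One way to make "clear contracts and verifiable outputs" concrete is to state up front which fields a subagent must return and to reject anything else. The sketch below assumes the subagent replies with a JSON object; `TaskContract`, `ContractViolation`, and `check_output` are hypothetical names for illustration, not part of Hermes.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskContract:
    """What the parent agent asks for and what it expects back (hypothetical shape)."""
    goal: str
    required_fields: dict  # field name -> expected Python type


class ContractViolation(Exception):
    """Raised when a subagent's output does not satisfy its contract."""


def check_output(contract: TaskContract, raw_output: str) -> dict:
    """Parse the subagent's reply and verify it before anything acts on it."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ContractViolation(f"output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ContractViolation("output must be a JSON object")
    for name, expected_type in contract.required_fields.items():
        if name not in data:
            raise ContractViolation(f"missing field: {name}")
        if not isinstance(data[name], expected_type):
            raise ContractViolation(
                f"field {name!r} should be {expected_type.__name__}"
            )
    return data


# Usage: a review subagent must return a verdict and a list of findings.
review_contract = TaskContract(
    goal="Review the diff and report problems",
    required_fields={"verdict": str, "findings": list},
)
result = check_output(review_contract, '{"verdict": "approve", "findings": []}')
```

A free-form reply fails this check loudly instead of being passed along as if it were a result, which is the property a scheduled or delegated job needs when nobody is watching.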
That is why I prefer the word runtime. A runtime is not magic. It has APIs, state, logs, permissions, failure modes, and conventions.
The failure modes are part of the point.
If a model gives bad advice in a one-off chat, the failure disappears when the tab closes. If an agent edits a file, posts a PR comment, or runs a scheduled task, the failure becomes part of a system. You need artifacts. You need tests. You need logs. You need a way to say “this job should stop here and ask a human.”
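The "stop here and ask a human" rule can be an ordinary function that sits between a proposed action and its execution, and that logs every decision it makes. This is only a sketch under assumed names: `ProposedAction`, `Decision`, the `AUTO_ALLOWED` set, and the specific rules are hypothetical examples of such a policy, not Hermes behavior.

```python
import logging
from dataclasses import dataclass
from enum import Enum

logger = logging.getLogger("agent.policy")


class Decision(Enum):
    PROCEED = "proceed"
    ASK_HUMAN = "ask_human"


@dataclass(frozen=True)
class ProposedAction:
    kind: str           # e.g. "edit_file", "post_pr_comment"
    target: str         # file path, PR reference, and so on
    tests_passed: bool  # whether verification ran and succeeded


# Hypothetical allowlist: anything not named here needs a human.
AUTO_ALLOWED = {"edit_file", "post_pr_comment"}


def decide(action: ProposedAction) -> tuple:
    """Return (Decision, reason) and leave a log line as the artifact."""
    if action.kind not in AUTO_ALLOWED:
        decision, reason = Decision.ASK_HUMAN, f"action kind {action.kind!r} is not allowlisted"
    elif not action.tests_passed:
        decision, reason = Decision.ASK_HUMAN, "verification did not pass"
    else:
        decision, reason = Decision.PROCEED, "allowlisted and verified"
    logger.info("policy %s for %s on %s: %s",
                decision.value, action.kind, action.target, reason)
    return decision, reason
```

The design choice is that the default is to ask: an action proceeds only when it is both on the allowlist and verified, so a new or unexpected action kind stops the job rather than slipping through.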
Where this series goes next
This post builds on “What are Hermes and OpenClaw?” The next posts should get more concrete: first the transition from OpenClaw to Hermes, then the guardrails I want around agentic workflows.
Hermes is the foundation, but the interesting posts are not going to be generic Hermes tutorials. They will be concrete workflows:
- setting up a GitHub webhook receiver for agentic PR review,
- splitting reviewer, fixer, and arbiter roles,
- using structured JSON instead of free-form agent chatter,
- deciding what belongs in memory versus a skill,
- scheduling recurring checks without creating spam,
- using Discord as a control plane for small automations,
- keeping infrastructure notes safe and reproducible without committing secrets.
The theme is simple: agents become useful when they are attached to real workflows and kept inside real constraints.
That is what I am trying to build.