Total Internal Reflection: Episode 4

By Edmund Cuthbert
April 16, 2026

Total Internal Reflection is an unscripted series about the agents we build for ourselves. Two cofounders show their actual setups and workflows, and what happens when you build agents as teammates, not as tools.

Transcript

Edmund: Initially, I thought, well, I can just send my agent to interview you on Slack. But then I realized — why does my agent have to interview you? Why don't we just have my agent talk to your agent?

Edmund: Okay, we're talking about meta skills for OpenClaw today. We briefly mentioned it in the previous podcast, so it's going to be interesting to actually dive in and show the audience where we got the concept, how we developed it, and how we're using it. So initially you were just using OpenClaw?

Li: No. Originally I was using Claude Code. This whole concept of meta skills is born out of the extra challenges that arise when you go from one person and their agent as a pair, to one person plus someone else's agent, plus another person. We experienced this just with the two of us — which implies there's going to be much more complexity for companies that try to do this with multiple parties, many agents, many people all interacting. What got challenging is that the company went from Li plus his agent to, all of a sudden, my agent also being an actor inside the company.

Li: And I think at a higher level there's this common myth that people are discussing — AI increased everyone's productivity by 10x, 100x. But no organization actually saw that amount of productivity improvement, at least not at the same magnitude. And I think the bottleneck is actually alignment. If you're just working with your own agent, your own AI companion, you just need to figure out your own workflow. Everyone has their own workflows.

Li: So you can increase your own workflow's productivity by 100x, but your workflow doesn't equal other people's workflow. Your way of doing certain things doesn't apply to someone else's workflow at all. You still need to bring humans together to drive the alignment to make sure they're on the same page. So the bottleneck is becoming less human alignment and more agent alignment.

Li: The biggest unlock now becomes: how can we avoid that human alignment? Can we make the agents drive the alignment by themselves?

Edmund: It's actually funny. A year ago when I was experimenting with Claude Code as my personal assistant, the strongest argument for Claude Code was that it's flexible. All of the AI workflow or AI assistant apps out there are trying to fit you into a static workflow that the founders of the company envisioned, because everyone has their own workflows.

Li: If we design an arbitrary workflow, we're likely to get it wrong.

Edmund: Right. And not only that — if we design a workflow for everyone, not for a hundred thousand users, just for the two of us, it still breaks. So it's even harder to design a workflow that's useful for a thousand users. Those apps end up being very generic, very constrained in the things they can do. Claude Code at the time was the biggest unlock. You can actually build custom flows, run it on your own computer, connect to all the different third-party apps to get the context, and build your own workflow on top of that. But fast forward to 2026, everyone has their own OpenClaw. Everyone has their own agents. Now that same strength, that same flexibility, becomes the curse. Now all of the organizations are saying internally, “Everyone has their own workflows, everyone has their own agents working in their own ways to pull the context to perform actions. On a company level, we still want everyone to follow the same process.” That's the thing that triggered us to think about this problem.

Edmund: How can we resolve that? Starting small — starting from driving alignment between our agents, with just the two of them. We actually had a couple of iterations on this. Originally we had this idea: let's just bake all the differences between our workflows into one skill. So on Edmund's machine, do this. On Li's machine, do this. But that just breaks immediately. Whenever I change a single setting on my laptop, it breaks. Now I have to push a change to our shared skill repo and ping Li — “Hey, I just updated this skill.”

Li: “Make sure you update it.” It's just not sustainable, even if it's just the two of us. And it's hard to imagine it working for any organization. And you are specifically talking about non-technical workflows here, which is interesting. Do you think non-technical workflows — for example, a sales funnel, or collaboration on some document that isn't code — are inherently harder to orchestrate between humans and agents? Because we haven't well-defined how even humans should collaborate. There is no git for documents. To what extent is the hardness of this a function of the agents, versus these domains being inherently harder to define? There's no historical tooling designed for even multiple humans working in parallel, diverging and then converging. There is no merge of slides. Or of a sales funnel. Or of anything inside Xero.

Edmund: That's super interesting. Probably, but I think the biggest bottleneck is that it's hard to validate. For a piece of business logic or a piece of program, validation is the easiest part. Does it compile? Does it error out? And you can write tests to test the business logic to make sure it follows the definitions. For workflows — what's the definition of a prospect? I guarantee you, within the same organization, there are probably 10 different definitions of what a prospect is.

Li: If you don't believe us and you work at a company of more than three people, go ask the other people how many customers you currently have. Watch everyone start flailing around with different definitions.

Edmund: Right. And there's no silver bullet for that. People have to come together and make sure they're aligned on those basic definitions. But once those basic definitions are set, then you want to automate. That's where the power of meta skills comes in.

Edmund: At this point we should probably talk about the definition of a meta skill. I think you have a more concise version of what it is.

Li: Let me use a crisp example of the kind of workflow we might be talking about here. Let's take a sales funnel. You're getting a prospect — someone who has expressed interest, submitted a form, or booked a sales call — to an activated and onboarded customer. Whatever your product, whatever your service, that's a funnel you're getting people through. In this context, a skill is something that takes an action about this workflow. So it could be some combination of updating a CRM, checking information on call transcripts, validating information inside your database.

Edmund: The meta skill doesn't have to be limited to non-technical flows. It can be anything, as long as you have a cadence or sequence you want the agent to follow, and executing that sequence requires context pulled in from different places or in different ways.

Li: So a generalized definition of a skill would be: a set of instructions for how to complete a unit of work. And a meta skill, as we think about it, is a set of instructions to define the parameters of the set of instructions that complete a unit of work.


Edmund: I can get that, but I think there's a better version. What's the language you've used to describe skills before? When you first explained to me what a skill was, and I was confused about the difference between a subagent and a skill — how did you explain it to me?

Li: And you can't use a whiteboard.

Edmund: It's an SOP with a bunch of supporting documents.

Li: I'm not familiar with that TLA.

Edmund: It's a markdown file — a description of how to get a number of things done, with supporting documents. It's like a playbook.

Li: It's like a user manual.

Edmund: Yeah. It's a playbook. So a skill is just a playbook. It tells you how to do a certain thing.

Edmund: Okay. So in generalized terms, a skill is a set of instructions to do a piece of work — essentially a playbook of how to get something done. And just like a playbook, it will have pointers to other stuff. Like, you might want to go look this up. Imagine you've just joined a company and you need to do something. Someone has sent you what is probably a Notion doc of how to do it. That is essentially a skill.
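To make that concrete, here's a minimal sketch of what such a playbook-style skill file might look like. The file layout, frontmatter fields, and database names below are all hypothetical illustrations, not a real OpenClaw schema:

```markdown
---
name: update-crm-pipeline
description: Move a prospect through our CRM stages and log follow-ups.
---

# Update CRM pipeline

1. Find the prospect's card in the Notion CRM database.
2. Check the latest call transcript for commitments or blockers.
3. Update the card's stage (Prospect → Activated → Onboarded).
4. Log any follow-up actions in the shared to-do database.

Supporting documents:
- `definitions.md` — what "prospect", "activated", and "onboarded" mean here.
- `crm-fields.md` — which fields are required at each stage.
```

Note that steps 1, 2, and 4 all silently assume one person's environment — which sources hold the transcripts, which database receives the to-dos — which is exactly the brittleness discussed above.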

Li: And so a meta skill is the instructions that can create a bespoke version of a skill configured to you. Rather than a generic playbook that inherently has to leave things vague — or has a bunch of if-this-then-that statements inside it — it's a bespoke version that still captures the generalized spirit of the outcome but has instructions that are unique to you.

Edmund: Right. Think about it like an install wizard — the kind that used to pop up when you downloaded a piece of software. A meta skill in some ways is just that installer. It converts a set of instructions into an actionable to-do list for your local agent, by prompting it to do a self-inspection and then fill the gaps, instead of shipping one static set of instructions for all environments.
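As a minimal sketch — with every name and field hypothetical rather than a real OpenClaw schema — an install-wizard meta skill written as a markdown file might look like:

```markdown
---
name: update-crm-pipeline (meta)
description: Generate a bespoke CRM-update skill for whatever agent installs me.
---

# Install: CRM pipeline skill

Shared definitions (the same for everyone):
- A "prospect" is anyone who booked a sales call or submitted the form.
- Stages move Prospect → Activated → Onboarded.

Before writing the local skill, inspect your own environment and answer:
1. How do you access the Notion CRM (API token, connector, other)?
2. Where do this user's call transcripts live (Granola, Fathom, plain notes)?
3. Which to-do database should receive follow-ups, if any?

Then write a local `update-crm-pipeline` skill that hardcodes those answers,
and ask the user only the questions you could not resolve yourself.
```

The shared definitions stay fixed across the company; everything environment-specific is discovered at install time rather than enumerated as if-else branches.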

Li: Ultimately what we're talking about here is: if I create a skill that does something inside the company, you need to be able to use it. Your agent needs to be able to use it. Making it extensible out of the box is really hard. I have to either guess at your workflow or your sources of truth. Just because I'm using Granola, maybe you're using Fathom — but the skill just needs to call transcripts. And even if we're both using the same tool, we're using it in different ways. We both use Notion, obviously, but how I manage my to-dos doesn't have to be the way you manage your to-dos.

Edmund: And so you're faced with forcing adherence to a shared workflow, which I think is the worst of all worlds. One concrete example — I have bespoke skills to get my credit card bill from my email. I get my calendar invitations from both my calendar and my email. So we have different ways to fetch the context. Even though we share the same source of truth — Notion, calendar, email inbox — we still have different ways to process that information.

Li: It's like the problem with the support docs for something like Salesforce. They're so generic, because everyone's instance of Salesforce is configured differently. That's why inside a company there are playbooks that humans at the company write that extend and also contradict the support docs of the actual thing. Because how we've set up our instance is different from how the generic out-of-the-box instance is configured. That's the gap that a meta skill tries to bridge.

Edmund: That's such a good metaphor. That's also why Notion doesn't ship with a playbook. When you onboard into a company, whatever it is — Jira, Salesforce — there are bespoke, weird configurations that are arcane and unique to that company. Which is why someone at some point has had to write a playbook of how to use that thing, even though it is just a generic piece of software.

Li: Should we talk about our workflow and how the meta skill configures skills for our agents? Here's a tangible example: we use Notion as our CRM. We both want to keep track of prospects moving through the pipeline, which results in a bunch of actions being taken.

Edmund: Stuff we have to configure in our product — follow-ups we have to send, billing, Stripe, MSAs to get signed, Slack notifications, call transcripts. Anyone building software used by other people, especially other companies, knows there's a bunch of configuration to go from “yeah, I want this thing” to “I'm using it.” So I had a skill to update our CRM, but that very quickly becomes brittle. You do onboardings, you do sales calls. Now what? You take my skill, but it doesn't work, because it's reliant on my workflow — the way context is structured in my emails. So we needed to create a meta skill, where my agent should be capable of going and updating the CRM and fetching the context it needs, but so should yours. And so should any arbitrary extra person in the company's agent. Without forcing them to change how they manage their own inbox, or what they use to record their calls, or how they actually interact with all the sources of context across the company.

Li: As long as the agent follows the same definition of what a prospect is, what it means when they're activated, how we move them between stages — you can do it. It's super brutal to build a catch-all skill that anticipates every way people have wired context into their agents. But it's much easier to build a meta skill that just describes the concept and then instructs the agent to figure it out: inspect its own environment, pull the relevant context, and use that to build a bespoke skill.

Li: So how does the meta skill help organizations gain productivity?

Edmund: Just like how it helped us. Our agents get aligned. It removes the bigger part of the human alignment. Those PM meetings asking, “Hey, how do you get your emails out of the inbox?” or “Where should I check for the context on this client?” The meta skill teaches the agent to create a bespoke skill for itself, pulling the context in as needed and asking questions if it doesn't have an answer. The result is a skill that's aligned with the bigger picture but also usable out of the box.

Edmund: And easier to maintain. If you need to change the definition of a client, or the definition of a stage of the client, just change the meta skill and tell everyone to reinstall it. That's the end of the story. Or if you change your own setup — if you're changing your own agent environment — you don't need to go back and change all your skills. You just reinstall the meta skill and everything is back to working.

Li: It's like if you were managing a sales team. It wouldn't actually matter where they went to click buttons to get information. It would matter that they're filling out the right fields in your CRM. It ultimately wouldn't matter if they were scribbling notes in a journal while talking to a prospect, versus using a call transcriber, versus typing notes into their notes app. What would matter is that the context then went into the correct place. A playbook that had a series of if-else statements saying “if you write in a journal, do this; if you use a call recorder, do this” — that would be preposterous. Unfortunately, that's how most skills work if you don't create a meta skill.

Edmund: And one interesting emergent property of this is that the agents can teach it to each other. This might be an interesting point to actually show this in practice. If the skill isn't a brittle, very bespoke skill — if it's generalizable, and it's a meta skill that has the inherent concept of a wizard to install itself baked in — then actually one agent can hold the other agent's hand through installing it.

Edmund: Initially, I thought, well, I can just send my agent to interview you on Slack. But then I realized — why does my agent have to interview you? Why don't we just have my agent talk to your agent? Ask it the right questions with the context of the skill it's trying to install, gather the right context, create the skill, and then hand your agent the actual skill.

Li: So what we've done is we've left our OpenClaws to their own devices to set this skill up for me. What this skill does is update our CRM as a prospect moves from a prospect to an actually onboarded customer. Which as you can imagine requires a lot of configuration and fetching of context.

Edmund: That's unique to me — how I have my inbox set up, how I use Slack, what I use to record my calls, where this information gets stored. What we have here is a conversation kicked off by me, where I'm tagging in my agent, called Lobster, to talk to Li's Claw, which is your agent. Together they're going to work through this to actually install it. We are not participating in this conversation. I'm triggering it in a Slack channel that just has the four of us in it — the four of us being us two and our respective OpenClaws. And hopefully they'll figure this out amongst themselves and get it installed.

Li: Do you want to step us through this conversation, and we can call out things that look interesting?

Edmund: The first thing my Claw is doing is figuring out if we're using the same system of record and using it the same way. It's checking: do you use the same Notion databases I've been saving this information into? Are you accessing Notion the same way? Are you using the API, or some other means?

Li: This is super smart. You can tell there could already be divergence on this step alone — given that we both use Notion as our source of truth, but our agents have different ways to access it. I think this is high-taste when it's saying, “Do you have a database where action items get logged?” — because it's totally plausible you might have your own shadow to-dos. We might have one shared one, you might have your own divergent one. It's asking for the database IDs so that —

Edmund: So that it can actually help you plumb this into your own system if you have a divergent way of tracking these things.

Li: My Claw's confirming that we're using the same CRM. We are.

Edmund: It already called out that I have my personal to-do task DB, and at the same time the pipeline CRM has a to-do section. So it's noting a divergence — there could be a fork in the road of which of these two to-do databases we end up using. And we also have a different way to access Notion — who would have thought of that?

Li: My response, noticing that we're both using the same CRM, is double-checking. It's flagged two different to-do databases. Which one do you want to use? Or do you want some third option, like using a Slack channel? And now it's going into its next phase of questions, asking your Claw: how does it access data in the production database? How does it get things from some of our other vendors? This is completely generative. “Is there any outreach monitoring or email sequencing?” It's trying to cover all the cases.

Edmund: Let's see my Claw's answer. It correctly called out that the to-do DB is the right place. It shouldn't leave it in my personal to-dos. It's more for Li-only stuff. So now that is opinionated. Is it correct? It is, because it has my context. It knows from Notion where's the best place to put those things. Now imagine capturing that in a generic skill — “if the user has their own to-do database, do this. If not, do this other thing.” Fuck that. You don't even know if they have access to Notion. Maybe they just use their own notepad to manage their to-dos, and they prefer that way. You just never know. People have their own workflows, and this is the way to get everyone on the same page without a human in the loop.

Li: People talk about agents increasing autonomy because they take away tedium. If you don't have things extensible like this, you have to restrict some autonomy by forcing people to do their work in a structured or standardized way.

Edmund: As long as the shared systems of record are correct and everyone uses them the same way, how you go about creating your own personal context before it makes it into a shared system of record should be completely up to you.

Edmund: They're very genial with each other as well — always surprisingly polite. Last set of questions from mine. I'm going to follow your lead and just shut up so I can actually read this and then think.

Edmund: Oh, this is super interesting. It talks about where to post notifications on Slack. It's double-checking who it's talking to, the identity of the bots, and it's double-checking business logic about who to route to-dos between. Which is really interesting, because your agent will have some context about areas of ownership you have. It might even have richer and more nuanced details about that than my agent does. So now they should, in theory, be able to collaborate to make sure it's really robust business logic that handles all these cases. It's already called out a potential collision — that your agent has what we know to be a deprecated version of this skill. Let's see how they handle this, as it's essentially trying to install something that ought to, in practice, overwrite it.

Li: So it says there's no separate ops alerts channel — skip the Slack proposal step and write directly to the to-do DB, reviewed from Notion. Which is correct: I prefer not to review them on Slack. If a task is assigned to me, I'd rather have it live in Notion and review it there. My agent made a high-taste judgment call because it has my context.

Edmund: The thing I'm really interested in is how they handle this collision. Let's see.

Edmund: This is super interesting. They've basically arrived at merging an existing skill with this incoming skill. It correctly called out that there's an existing skill running daily, and acknowledged there are gaps in the existing skill. It called out that the bespoke version should layer on top — use the same pipeline scope, use the same Notion log pattern, but add the to-do DB writes for the persistent blocker tracking. This is super high-taste. It would take us at least another 30 minutes to go over the skills and make the edits. But the agents are just doing it here by themselves. It's doing a diff between these two things, and it's not repeating itself.

Edmund: Now this is my agent wrapping things up. This is an incredibly good question to end on: “One last thing. Is there anything about your environment I haven't asked about that you think could affect how this skill runs?” That is the kind of question any good PM would ask at the end of discovery. I'd always close a product discovery call with, “Is there anything about this topic I didn't ask you that I should have?” I haven't hardcoded that into Lobster or the skill, but it's correctly asking a good question at the end. That is wild. That is actually kind of spooky. Your agent's always listening, always learning. Let's see what yours said in response.

Edmund: This is really cool. It's found the boundaries where it shouldn't operate, which is correct. This is a skill just about activation. So it's correctly saying: for any already onboarded customers, I'm not going to touch those. We happen to know we're building a separate skill to handle that, but it didn't need that context to figure out the boundaries. These are subtle edge cases in the interplay between a status and a property on a card in Notion, and it's working through them correctly.

Li: I didn't read this piece of response before we hopped on. It is very high-taste. Kind of spooky as well. It's calling out a potential discrepancy and the boundaries of the skill.

Edmund: Damn. How long do you think this would have taken us to do between ourselves?

Li: At least 30 minutes to an hour.

Edmund: Now my Claw is creating the skill for your Claw. Which makes sense, because my Claw has my version of the skill in its context — has full access to that and any other skills that might point to it. And it's writing Li's Claw's version and sharing it with your Claw once it's completed. And it's documented the changes it's made from my version.

Edmund: It's explained everything that's working. All of this is traceable. It's been annotating its learning as it goes along, so we can always go back and look at this. We've basically just thrown them together, let them figure it out, and they've now created a version of the skill that works for you.

Li: Beautiful. Now our agents are fully aligned. Tomorrow when the skill runs — and we are actually doing this — our agents are following the same workflow. They're fully aligned. There are no discrepancies between our workflows and we can all sleep better. This also saves both of us at least 30 minutes to an hour.

Edmund: For companies trying to put this into practice, one of the challenging things is actually creating crisp definitions of your own internal business logic. It feels like companies where the humans can't agree on shit together are going to pay an even bigger penalty and tax for that in this world. Because if you can't create some shared definition of things and agree on them, you can't unlock the full value of having these meta skills and having agents be able to create localized, extensible versions of them.

Li: Definitely. For organizations that actually spend time and effort on clear definitions of their workflows — assuming all the employees are using Claws — wrapping those definitions into a meta skill will be a huge unlock. All you need to do is define it once: your SOP, a clear set of stages, plus instructions telling the agent to figure out its environment and install the skill into its own workspace. Everyone will share the same workflow, and it'll be much easier to write as well.

Edmund: It'd be much easier to write a meta skill than to write a catch-all skill that predicts all the edge cases.

Li: You can actually use skills the way they were intended — to encapsulate a piece of logic and get things done without enumerating all the edge cases. A skill is not a place for stuffing a bunch of if-else statements. It's for describing a workflow. Leave the rest to the agents themselves.

Edmund: It's declarative rather than procedural. I like that, the way you put it. Let the agents do what they're good at — which is figuring out the procedure between themselves. Rather than you creating a fantasy choose-your-own-adventure procedure of “if you do this, then do that, and then do this, and then move it on.” Do what you are good at, which is defining the actual domain-specific logic of your company. Then just express that as a meta skill, and let the agents talk to themselves to figure out the actual procedure on a per-person basis.

Li: Yes.