Introduction to the 'building public' blog and an overview of the projects discussed.
00:45
Alfred: AI Butler Project
Details on the Alfred AI butler project, its evolution, challenges, and current state.
21:30
Dovetail: Dev Workflow
Exploring Dovetail, an AI-powered tool for automating developer workflows and context management.
39:03
Joe: Proactive Voice Agent
Insights into Joe, a proactive voice agent for construction, focusing on interaction and performance.
51:14
Jig: Cognitive Assembly Line
Understanding Jig, a cognitive assembly line for deterministic and agentic task orchestration.
01:08:50
Other Projects & Conclusion
A look at the 'No Code No Clue' project and concluding remarks on future updates.
Transcript
00:00
This is my very first, building public blog.
00:02
But I wanted to move away from just very well researched blog articles, because I'm not a pundit, I'm a builder.
00:09
So I want to show you what I'm building because if I did this, I would have a lot more content.
00:14
And I'm pretty sure that you guys are more interested in this one.
00:16
So today I want to talk about four things.
00:20
Alfred, Dovetail, Joe and Jig. Alfred is the project that you guys know about.
00:24
Dovetail is a new project that I teased in my latest blog article, which helps me be not just more productive, but also be a better engineer without having to be a better engineer.
00:35
And then Joe and Jig are client projects, which I'm going to talk about in just a few minutes.
00:39
And then there are a couple of other fun things that I'm going to show you.
00:42
So, Alfred.
00:44
I started working on Alfred last August, right when I said I'm building my own butler.
00:51
Because that's basically what I wanted. Everybody was promising that with AI, but it's not doing that.
00:56
And it's pretty annoying because of hallucinations.
01:00
And then even a couple of months later, MCP was introduced in November, and I'm like, yeah, that's what we need.
01:07
But even that is very brittle.
01:09
So anyway, I started building N8N workflows, because that's actually easier than coding.
01:15
At least that's what I thought.
01:16
And then basically I figured, hey, I want Alfred to just run these n8n workflows for me and build these n8n workflows for me.
01:27
And then really quickly I realized that a big problem of this is connection.
01:32
Like having MCP connections and then not paying a shit ton of money to make.com or Zapier or n8n.
01:39
So I need self hosting.
01:40
And then I went into this whole self hosting idea.
01:42
So I started building Alfred OS, which was supposed to set up a bunch of open source applications for you, so you just run a bunch of apps, but the open source alternatives.
01:55
So like instead of make.com, let's use n8n, because you can self host it.
01:59
Instead of Zoom, Jitsi because it's the same thing but open source.
02:03
And then I would have skills for Alfred, n8n workflows that can really do stuff well.
02:08
And then, you know, if I want to use MCP servers, I can just add them to skills.
02:13
But the point is that an Alfred skill would be deterministic, right?
02:17
That's the big problem that I've had, because you go into Claude and you create a bunch of connectors, and then it starts doing stuff with the MCP servers, but then it starts hallucinating, and I need someone to hold the agent's hand to not drift off.
02:32
Because the moment it starts working with lots of data points, it starts to drift off.
02:36
And what happens if I'm like, hey, get me data from this, from this, from this, from this, and then prepare a report for me.
02:42
And then it's just so much bloat and data that it drifts off.
02:46
So that was one big problem.
02:47
And the second big problem was that, I'm shit at DevOps, so I needed to figure this out somehow.
02:54
And I ended up creating a Railway template, which is, you click once and it deploys a bunch of apps, like n8n and NocoDB and cal.com.
03:04
Some of them work, some of them don't.
03:06
And then it also deploys a custom version of LibreChat, which is just a ChatGPT clone that would allow you to use your own API keys to have a chat with Alfred.
03:15
And then inside Alfred, Alfred would be connected to all the apps that you had.
03:20
And I'm like, yeah, that's pretty cool.
03:22
But I still want the N8N workflows to be running automatically.
03:25
And for that, I would want it to be hosted locally because I want the hosting part to be done, so I'm not paying for the cloud version.
03:33
And then the n8n licensing issue was a big problem, because that's something I cannot do unless I pay up 50 grand a year.
03:39
And I don't want to.
03:40
So I was kind of stuck, right?
03:42
I wanted Alfred to just know things and do things and save money.
03:47
And then the ultimate idea was that I would have this one click, and then, boom, it installs the whole Alfred OS and the agent and the skills and everything on a server, and then Alfred can do stuff for you.
04:00
Now, the problem is, even if I solve that problem, I still cannot work with n8n because I don't have the embed license.
04:07
So instead of doing that, I needed to figure out some other ways.
04:10
And there are a couple other ways.
04:11
Like, for example, what's really promising is Manus now has a thing.
04:17
It has an API, right?
04:19
I don't even know where the API key is.
04:22
Or maybe here.
04:23
So Manus has an API, which means that I can just give tasks to Manus, but the problem still remains.
04:29
It's like, okay, I don't have to do the N8N thing.
04:31
I can still probably figure out how to build the cloud bit, but I've never done that before.
04:36
But even then, this is not really going to be entirely free because you still need to pay for credits.
04:41
So I decided to kind of streamline the whole thing.
04:44
What I'm showing you is something that I built in the last three days, and this is how powerful my new workflow has become.
04:50
I was able to solve a problem that I don't have the skills for in three days.
04:55
So the idea of Alfred is basically, it's an execution layer for AI. You get an MCP server, you connect it to Claude or ChatGPT or whatever, and then it just knows things, does things, and saves money for you, and that's it.
05:08
And how it works, it is technical, but your experience of it is not technical.
05:15
So there's a big problem with AI execution, which is these things are very smart, but they're not logical.
05:22
They cannot employ actual, reliable reasoning, which is what you need to unlock deterministic workflows.
05:28
I tell you to follow step 1, 2, 3, 4, in this order.
05:32
And the AI may or may not do that.
05:34
There is no validation, there's no enforcement, there's no deterministic stuff.
05:38
For that, you need to write code, and that's kind of the plumbing that's missing.
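As a rough sketch of that missing plumbing, a fixed step order with validation after every step is something you can only get from code, not from a prompt. This is illustrative only, with hypothetical names, not Alfred's actual implementation:

```typescript
// Minimal sketch of deterministic orchestration: steps run in a fixed
// order, and each step's output is validated before the next one runs.
// All names here are illustrative, not from Alfred.
type Step = {
  name: string;
  run: (input: unknown) => unknown;
  validate: (output: unknown) => boolean; // the enforcement an LLM alone lacks
};

function runWorkflow(steps: Step[], input: unknown): unknown {
  let current = input;
  for (const step of steps) {
    current = step.run(current);
    if (!step.validate(current)) {
      throw new Error(`Step "${step.name}" produced invalid output`);
    }
  }
  return current;
}
```

The point is that the order and the checks live in code, so the workflow cannot drift, no matter what the model feels like doing that day.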
05:42
So basically we would need separation of concerns.
05:45
We want the agentic work, which is flexible and scalable, but also can hallucinate and be unreliable.
05:51
But it's really good with unstructured information.
05:54
And that unstructured information is the idea I have in my head.
05:58
but once it's clear enough, I want the AI to kind of lock it down.
06:02
It's like, hey, here's the deterministic way of getting things done, and here's how I'm going to do it again and again and again.
06:08
Claude skills are a good move in that direction, right?
06:12
There are agent skills inside claude, but these are just mega prompts and short scripts, and it's not entirely what I would want.
06:20
So it's like, okay, we're getting there, but it's not quite the flow I'm expecting.
06:25
And the original flow I was expecting is, I talk to Alfred via chat.
06:30
And then once I figure out what I want, Alfred builds an n8n workflow for me.
06:35
And then I can set when it should run, or I can just say, hey, Alfred, run this for me.
06:40
And that would turn Alfred into an operator, like a chief of staff, but AI. And it doesn't work, because all the n8n workflows Claude or any other AI tools can build are not very good right now.
06:51
And it's just brittle.
06:53
So we need something better.
06:55
And the MCP was an idea that, you know, MCP servers and I just give the tasks to Alfred, to Claude, and then it just calls the MCP tools.
07:05
But then it's unreliable and hallucinates.
07:08
So I either have spaghetti inside n8n that I need to plumb, or I use an agent that tries to kind of cosplay as Dexter from Dexter's Laboratory, but really it's Dee Dee, and that's the big problem.
07:21
So in order to solve that, I came up with two things called talk mode and work mode, in the version of Alfred that I'm building right now.
07:30
Basically what you get is that the moment you log in, you get provisioned a new server.
07:37
So let me show you.
07:38
This is the current version.
07:39
It's very early stages, but I'm using the Open SaaS template.
07:44
It's running on the Wasp framework, and I definitely recommend you take a look at it if you want to build apps, because it's complete: Stripe integration, authentication, everything.
07:55
So when you're in Alfred, you know, the naming and everything are not done yet.
08:00
It's all running locally.
08:01
But the idea was, I log in, and then when I log in, I would just click on Launch my Alfred OS, and then it starts getting created.
08:09
And then when it gets created, it's pending, which means that there is a server that's currently setting things up for me. Currently using Contabo, because I'm going through the verification process with Hetzner, which, I don't know.
08:26
I've opened bank accounts with less. But if you look at it now, it's provisioning.
08:31
And now I got a specific URL, DynamicPheasant, on the Alfred OS site.
08:37
This is set, the DNS configuration is set, but you also get the IP address.
08:43
Now currently nothing's installed on it, but there is an automatically generated virtual machine that sets itself up with a setup script, and inside that setup script I can add whatever app I want to add.
08:58
So that's the next step.
08:59
So I solved this problem.
09:01
And we're getting pretty close to the Alfred Cloud, because the big change is that I let go of the idea of installing n8n on this.
09:09
Okay, so once you have that installed, you will have basically two apps that come with the install, LibreChat and NocoDB.
09:17
And that's very intentional, because you're probably never going to have to open them.
09:22
What you're going to have as a quick start, you're going to get a simple URL and an API key and all you need to do is go to Claude or ChatGPT and add Alfred as a connector.
09:32
And once you add Alfred as a connector, then you can open Claude and say, hey, Alfred, do this and do that.
09:38
And it will actually use Alfred instead of Claude.
09:41
Like Claude turns into Alfred for that session.
09:43
this is the chat part.
09:45
If you want to go to LibreChat because you don't want to use Claude, you can still open LibreChat.
09:50
That's why it's there.
09:51
You can open LibreChat and use Alfred there.
09:54
It will have the MCP connections to NocoDB.
09:58
It will have all the MCP connections you want to, and then the Alfred MCP will do the same thing that LibreChat can do.
10:03
So there is the same thing, but the use case here is not that.
10:08
And that was my biggest pet peeve: I was getting distracted with the Alfred OS.
10:12
This is just an infrastructure thing.
10:14
And now that I figured out how to run this, I will just create Alfred Cloud where you pay, I don't know, probably $40 or something per month.
10:23
And then you get a server, full access to it if you want.
10:26
You get a server and then it runs Alfred, it runs all the apps, it runs LibreChat, and then you have your own little thing and all the data is yours.
10:34
But the interesting thing is happening inside work mode, because the idea I have is that I want to teach Alfred a skill once, and then I want Alfred to just remember how to run it forever.
10:47
So I would just describe something like, hey, process my emails and organize them into labels by topic and importance.
10:53
And then I want Alfred to automatically look at my connections and see what's what, get the context on my little world, my reality, and then come up with an idea of how this should actually work.
11:10
And then it comes up with a solution.
11:12
It's like, hey, I think what we should do is connect to Gmail and then fetch unread emails from the last 24 hours, etc., etc. And then also give me a couple of extra ideas, like, hey, what if we do a sentiment analysis with another MCP server as step 4?
11:29
Or what if I do this and that. And then you say, okay, you know what, I want you to do option C.
11:35
And then I just say learn it.
11:37
And then when I say learn it, Alfred doesn't just save the skill into memory.
11:42
It actually creates deterministic code that will trigger Alfred to run it every time the exact same way, at the exact same time, with the exact same tools.
11:55
It's deterministic and reliable and does the exact same thing all the time, much like an n8n workflow, but you just built it in four steps, because you let Alfred do the heavy lifting for you.
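To make the email example concrete, here is one way "learn it" could persist a skill: not a prompt, but a frozen, replayable sequence of tool calls plus a trigger. The tool names and the schema are hypothetical, not Alfred's actual format:

```typescript
// A "learned" skill as data: a cron trigger plus a fixed list of tool
// calls with their arguments locked in at learn time. Replaying this
// always produces the same sequence of calls.
interface ToolCall {
  tool: string;                  // MCP tool to invoke (hypothetical names)
  args: Record<string, unknown>; // arguments frozen when the skill is learned
}

interface LearnedSkill {
  name: string;
  schedule: string; // cron expression for work mode
  steps: ToolCall[];
}

function learnSkill(name: string, schedule: string, steps: ToolCall[]): LearnedSkill {
  return { name, schedule, steps };
}

// "Process my emails and organize them into labels by topic and importance"
const processEmails = learnSkill("process-emails", "0 7 * * 1", [
  { tool: "gmail.fetchUnread", args: { newerThan: "24h" } },
  { tool: "alfred.classify", args: { by: ["topic", "importance"] } },
  { tool: "gmail.applyLabels", args: {} },
]);
```

Because the steps are data, the skill can be versioned, inspected, and rerun identically, which is exactly what a chat history can't give you.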
12:05
So talk mode and work mode is really just separation of concerns, right?
12:09
Talk mode is about authoring.
12:11
It's like ad hoc conversations, exploration, hey Alfred, do this, do that.
12:15
And then whenever something happens, it's like, hey, for what we just did, give me a few-step walkthrough of what you did and what tools you used.
12:22
And then when Alfred gives you something, you're like, oh, this is a really good workflow, I want you to learn it.
12:27
And I want you to run it every Monday at 7:00 AM. And then Alfred actually transforms that into a skill, and that replaces an n8n workflow.
12:36
and that's it.
12:37
Inside work mode, you will have a trigger, and you will have the actual skill, which has the actual steps.
12:42
And then Alfred always runs it, completely deterministically all the time.
12:48
And then the idea here is that, you know, Claude skills are supposed to kind of do that, but the authoring bit is missing, the orchestration bit is missing, the execution layer is missing.
12:59
So let me give you a couple examples.
13:01
Let's say I just type in, hey, every Monday I want you to pull revenue from Stripe, support tickets, project updates, Slack mentions, and then send me a summary on Slack.
13:09
And then Alfred understands what I want and then it creates a workflow suggestion.
13:15
And then it's not like a 45 node spaghetti, it's really just six steps, right?
13:21
Because it's like agentic in execution, but deterministic in orchestration.
13:26
And then once I say, yeah, that's it, learn it, it saves it for itself and it will do those steps exactly the same way. Or another thing, like, hey, Alfred, I want you to analyze my support tickets, find buying motives, build our ICP, research leads in Apollo, and then save them into my database.
13:44
Like, this is a pretty big piece of work.
13:46
So I'm like, okay, let's see: we trigger this manually and pull support tickets via Zendesk.
13:53
Then I'm going to analyze it, and it says via Alfred OS, which means that Alfred is thinking instead of using external tools.
14:00
So analyze buying motives, build ICP.
14:02
Then based on that I'm going to use the Apollo MCP to research matching leads and then save them to the database.
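That support-ticket workflow can be written down as data, which shows the split between "agentic in execution, deterministic in orchestration": some steps call external MCP tools, some steps are Alfred thinking, but the order never changes. The step names and schema here are illustrative, not Alfred's real format:

```typescript
// Each step is either an external MCP tool call or an "alfred" step
// where the model itself does the analysis. Orchestration stays
// deterministic: this order is identical on every run.
type Engine = "mcp" | "alfred";

interface WorkflowStep {
  engine: Engine;
  action: string;
}

const icpWorkflow: WorkflowStep[] = [
  { engine: "mcp",    action: "zendesk.pullSupportTickets" },
  { engine: "alfred", action: "analyzeBuyingMotives" }, // model reasoning
  { engine: "alfred", action: "buildICP" },             // model reasoning
  { engine: "mcp",    action: "apollo.researchMatchingLeads" },
  { engine: "mcp",    action: "nocodb.saveLeads" },
];

// Only two of the five steps are agentic; the rest are plain tool calls.
const agenticStepCount = icpWorkflow.filter((s) => s.engine === "alfred").length;
```

So the hallucination-prone part is fenced into two clearly marked steps instead of being smeared across the whole run.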
14:09
And then I can also do stuff like self maintenance, right?
14:13
I can just say, Alfred, every Monday I want you to analyze our conversations from the last week and suggest skills to add or remove or improve.
14:21
And because Alfred can examine itself, it will pull conversations from the history from last week using its own MCP.
14:28
Then analyze skill usage patterns, identify gaps, generate improvement suggestions, and then send a report via Slack.
14:34
And then Alfred might suggest to relearn some skills if there are some errors or something.
14:40
So you don't need to deal with any of that.
14:42
Another example, which is something that I started working on: I have a Home Assistant here at home, and I started connecting everything.
14:50
And then there is the Home Assistant Voice Preview Edition, which is basically like Alexa, but for Home Assistant.
15:00
And then you can run anything on it.
15:02
So I have a Home Assistant voice preview.
15:04
It's downstairs in the kitchen.
15:06
It's connected to chat, it's connected to OpenAI.
15:09
I think it's using GPT-5, and it also has the iconic Alfred voice from ElevenLabs, and that's it.
15:18
So then I have cameras, I have smart switches and stuff.
15:22
So what I can do is I can say hey, if someone's at the door and I don't reply on Slack in 60 seconds, call my phone.
15:28
So then Alfred is like, okay, first I need to have a workflow created.
15:32
Trigger would be a doorbell event.
15:34
I need to figure out what that is.
15:35
I just connect to Home Assistant through an MCP connection and find that event.
15:39
And then I'm like, okay, that's the webhook that I need to set.
15:42
So you don't need to deal with any of that.
15:44
And then it's like, okay, step one, detect the doorbell ring and send a Slack alert.
15:50
Wait 60 seconds, no response.
15:52
If I reply, then, you know, it can see that on Slack.
15:55
So it doesn't do anything.
15:56
If no reply, then I'm going to initiate a phone call, which can be either via Twilio, or it can even be using Vapi, which, you know, just has a voice agent.
16:06
So Alfred can actually call me and say, David, someone's at the door.
16:09
Like go check it out.
16:10
You're not looking at Slack right now.
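The doorbell escalation is a nice example of a skill whose logic must be deterministic. A sketch, with the Slack and Twilio/Vapi integrations (and the 60-second wait) injected as plain functions so the flow is testable; all function names are hypothetical stand-ins:

```typescript
// Doorbell escalation: Slack alert, wait 60 seconds, and only call the
// phone if no reply arrived. In a real version wait60Seconds would be an
// async timer and the other deps would hit real APIs.
interface DoorbellDeps {
  sendSlackAlert: () => void;
  wait60Seconds: () => void;
  hasSlackReply: () => boolean;
  placeCall: (message: string) => void;
}

function onDoorbell(deps: DoorbellDeps): "acknowledged" | "called" {
  deps.sendSlackAlert();    // step 1: alert on Slack
  deps.wait60Seconds();     // step 2: give me time to react
  if (deps.hasSlackReply()) {
    return "acknowledged";  // I saw it on Slack; do nothing
  }
  deps.placeCall("Someone is at the door."); // step 3: escalate to a call
  return "called";
}
```

The branch is fixed in code, so the agent can never "decide" to skip the wait or call first.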
16:12
So the idea here is that this is the kind of stuff that I want to use, and it's pretty annoying to me that I can use Claude skills and everything, but it's not doing the stuff that I would want to do, because it's hallucinating.
16:24
Or I can build n8n workflows, which are again not doing what I want, because they're so complicated and spaghetti, and they're brittle and fragile and unreliable.
16:34
So the idea here is that Alfred will bring everything together, right?
16:40
You just click on deploy once like you did it here.
16:43
As you can see, it's running.
16:46
It got an IP address, it got a host and that's it and everything's there.
16:49
So login, sign up for Alfred, then you get an MCP server and API key so you can start immediately using it.
16:58
And then Alfred OS gets deployed into that server for you.
17:02
So Alfred has instant connection to open source tools and then you will have apps that you like.
17:08
Open source apps.
17:09
I'm probably going to build or add more apps to the App Store.
17:13
or you can also add your own MCP connections.
17:16
Like let's say you love Zoom, you don't want to use Jitsi.
17:19
So instead of installing Jitsi on Alfred OS, you're going to just create an MCP connection for Zoom to Alfred, and from that moment onwards Alfred understands what it's about. For the Alfred Cloud, we're probably going to have a solo version, which is going to be the lowest tier, and that's going to include, you know, the whole thing, very basic: LibreChat and NocoDB.
17:46
And then whenever there are new apps coming up in the App Store, you immediately get access to them, but you can start using Alfred immediately.
17:53
I'm not entirely sure about the pricing.
17:55
It may be, I don't know, usage based, like skill runs, or it may be a simple monthly fee.
18:03
The Alfred OS bit is just, you know, flat: you have full root access to the server, and on top of that you have Alfred as the agent.
18:14
The way the Alfred OS gets deployed, I'm still going to make that open source.
18:19
So if you don't want to use Alfred as the agent and just want to self host stuff,
18:24
you will get access to that.
18:26
Then I'm also thinking about another tier.
18:30
I'm not entirely sure how this would work.
18:33
I may just have the Alfred solo and that's it.
18:35
And then you know you can add as many MCP connections as you want.
18:38
But the whole point is you know, collaboration and skills library because the real value is going to be in skills.
18:44
And then for those who have existing businesses and they're like hey, I have a bunch of automation that I need here.
18:50
I have the Alfred Pro subscription, which would come with a white glove onboarding, and I will come on board and help you create the skills, help you set everything up for yourself.
19:00
And then you just get a URL to log in, and you also get the Alfred MCP set up as a one-pager, and then, you know, we would monitor the skills and the skill development.
19:12
So you really get sort of a performance supervisor for your Alfred AI Chief of staff.
19:18
I don't know about pricing yet.
19:20
I'm still thinking about it.
19:22
There's a lot of stuff to do.
19:23
I just want this to work first.
19:24
And finally everything sort of clicked together, and it works, and the skill gaps are closed.
19:30
So yeah, I'm going to launch this website soon and probably give some more updates. But also, because I don't want to just, you know, show you and talk to you about theoretical stuff.
19:44
I have started using Linear for tracking my work and inside Linear I have a bunch of projects.
19:52
Some of them are, you know, client projects, some of them are my own projects.
19:56
And here is Alfred, which connects to my project management tool, again Linear, and it has all the stuff that I'm working on.
20:04
So full building public mode.
20:07
Right?
20:08
You can see all the stuff that I'm working on.
20:11
you're going to see how these are created.
20:14
I'm building everything with Claude Code, so this is public.
20:18
And I'm going to share the links to the different roadmaps to the different products I have below.
20:24
And you will have the ability to keep track of things and ask questions and whatnot.
20:29
So again I want to build fully in public.
20:32
yeah, so this is Alfred and probably the first question most people are going to ask is when is this going to be ready?
20:40
honestly I don't know.
20:41
I keep building it, and now I feel like it's probably going to be closer than I figured. There are a couple of people who paid for the original Alfred, and I will give them the option to either have it for the Alfred Cloud, or use it for sort of a white glove onboarding session.
21:00
and then, you know, keep them on.
21:02
So I don't know, we'll figure things out.
21:04
A couple months ago I suggested that if you paid for it, you can just ping me and I will help you set it up for yourself however you want.
21:10
That still stands.
21:12
Okay.
21:12
So I'm going to keep posting about what I do.
21:16
The first video is like pretty blunt.
21:18
But I'm going to start releasing more updates to you on YouTube and here on the Lumberjack.
21:25
And it's just, you know, immediate stuff.
21:27
Hey, here's what I'm building, here's where I'm at.
21:29
So you will be able to see that.
21:30
Now how and why can I do this?
21:33
So then we get to Dovetail.
21:35
Dovetail is a really interesting project because first, here's the evolution, right?
21:40
I started with IFTTT in 2014.
21:44
That was the first time I built a no code automation.
21:48
And it was, I think it was archiving photos from my iPhone to Dropbox.
21:53
And back then I was running a software company.
21:55
I was designing and launching products and I was raising VC funds and doing stuff.
22:00
And I did a lot of technical stuff in my life, but I never really coded.
22:04
So this whole no code idea was pretty cool.
22:06
And then, as you know, life happened and the AI thing came, and then there was the prompt master era, where I really got into Make.
22:16
So, make.com: I started using it in 2023, and I got really, really big on that.
22:22
But the problem was it got really expensive really quickly.
22:25
So in 2024 I moved over to n8n.
22:28
And the interesting thing here is that these are progressively more and more complicated.
22:33
Somewhere in between these two I was using Zapier, but that's like super expensive.
22:38
So n8n alone wasn't enough.
22:40
So I had to add Jetbrains, which was not Jetbrains.
22:43
Sorry, let me check.
22:46
So I started using Jet Admin for creating a UI.
22:50
And then there are a couple of solutions like that, like Bubble and Softr and Framer and a couple of other things.
22:57
So I was looking around and then I started getting into using Airtable a bit more and then I moved over to Supabase.
23:05
So, you know, my stack started expanding, getting more complicated as I was building more and more complicated stuff.
23:11
And then, in 2025, I started using Lovable.
23:15
And basically Lovable plus Supabase replaced everything else.
23:20
So I didn't need to use n8n.
23:22
I could just create Supabase Edge Functions.
23:24
I didn't need to use Jet Admin, because Lovable was building the UI.
23:27
I didn't need to use Airtable, because Lovable had a native Supabase integration.
23:31
I felt powerful.
23:33
It was really good.
23:34
But then the problem is the same thing that we discussed with the Alfred problem, which is it's brittle.
23:40
Even Lovable UI gets really, really complex and brittle very quickly.
23:45
So it wasn't very reliable.
23:48
So I started focusing more and more on dev work.
23:51
I was learning a lot of stuff.
23:53
And then ultimately I ended up using Claude Code a lot, right?
23:58
So with Claude Code, I tried Warp, I tried Cursor, I even used Claude Code in the simple terminal or PowerShell, and, you know, it works perfectly.
24:08
It's fine.
24:09
So Claude Code is pretty cool.
24:11
And then I started going through like a productivity boom, right?
24:17
So then I went into Claude Flow, which is built by Reuven Cohen, and it basically launches a bunch of Claude Code instances.
24:26
And then you just, you know, have an agent swarm that builds stuff for you.
24:30
And I don't know, man, it got overwhelming really quickly, because the hallucinations problem was still there, right?
24:38
So Claude Code was really great, but it was still hallucinating and making stuff up.
24:44
Claude Flow is the same.
24:45
I love it.
24:46
It's pretty cool.
24:46
But there is a lot of it that's just smoke and mirrors.
24:49
There's a lot of, you know, I don't know.
24:51
There's a lot of parlor tricks in there, I think.
24:54
so after a while I realized that basically, I have the same stack every time I work, right?
25:02
Almost all of my projects are almost always built on the same stack.
25:05
I use Supabase for db.
25:08
I, use Supabase for authentication.
25:10
I use Supabase for serverless functions, and for APIs as well.
25:15
It just gives you, like, you create a project in Supabase and then you're done.
25:19
I started using Linear for documentation and project management.
25:24
Because I wasn't doing anything correctly.
25:28
So I had to start forcing myself to start documenting stuff.
25:33
And then I started, using Fly IO for hosting.
25:37
It's pretty cool.
25:37
It's pretty lightweight, pretty thin.
25:40
And you just create an app on Fly IO, deploy it, and it runs, and Claude Code can deal with it pretty easily.
25:46
So I'm like Supabase Linear Fly IO.
25:49
And then I was still using Lovable for UI, but immediately, once I have the UI more or less nailed down, I will move it to GitHub, and then the GitHub repo into Claude Code, and keep working from there.
26:04
But again, there were a couple problems, namely around code changes and commits, comments and auditability, etc.
26:13
So the big problem I had with this is that I just kept working and working, and in the beginning I was looking at the stuff that Claude Code was changing.
26:22
But then after a while I got complacent and then, you know, it started replacing entire working features with placeholders for no particular reason.
26:31
So I needed to learn a bit more about branches and pull requests and everything in GitHub.
26:36
But then all of a sudden I had this whole idea that, okay, so I have an idea.
26:42
I go to Claude Code and I say, hey, do this and change that.
26:46
And then Claude Code, out of context, just pulls stuff from its memory, and then it starts building something, and then it pushes it to GitHub, which then pushes stuff to Fly IO and it gets deployed.
26:57
And then if it deleted something from the database, like I need to go back, but I don't actually realize it until like an hour later.
27:05
So in order to make this work, I needed to have a pretty strict flow, right?
27:11
I needed to.
27:12
First of all, when I start working on a project, I don't want to deal with any of this shit, because everything needed to be connected.
27:20
But let's say I have this set up; everything is set up.
27:23
So then what I do is I send the prompt.
27:26
Let's say I start Claude Code. And then Claude Code, or me,
27:33
somebody needed to check the Git repo status, latest commits, latest updates in Linear, connections, and basically get context on what's what, right?
27:49
And then I say okay, send the prompt on what to build.
27:53
But if I'm not doing it right, if it's manual, then inside the prompt I would need to do stuff like check Linear, check the DB, do this, do that.
28:06
And even then, in order to do everything properly, I would need to say, check Linear to see if my request is related to any issues.
28:18
Then check the latest commits on GitHub to see if we worked on this before.
28:25
Then figure out what I want, right?
28:28
And then if it figures out what I want, instead of just starting to build it, I would need to say, create a new branch or move to a new branch in Git and start working on that.
28:42
Once done, create a pull request and update Linear.
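That manual preamble is really an ordered checklist wrapped around a coding session: context steps before the prompt, cleanup steps after. A sketch of how a Dovetail-style hook could represent it (the schema is illustrative, not Dovetail's actual internals):

```typescript
// The pre- and post-prompt steps from the flow above, as data a hook
// runner could walk through around a Claude Code session.
interface HookStep {
  phase: "before" | "after";
  description: string;
}

const promptHooks: HookStep[] = [
  { phase: "before", description: "check Linear for issues related to the request" },
  { phase: "before", description: "check the latest GitHub commits for prior work" },
  { phase: "before", description: "create or move to a new Git branch" },
  { phase: "after",  description: "create a pull request" },
  { phase: "after",  description: "update Linear" },
];

// Select the steps for one phase, in their fixed order.
function stepsFor(phase: HookStep["phase"]): string[] {
  return promptHooks.filter((h) => h.phase === phase).map((h) => h.description);
}
```

Encoding the checklist as data is what makes it enforceable: it runs every time, instead of depending on me remembering to paste it into the prompt.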
28:46
So if it's manual, then I do all of this before I send the prompt.
28:50
Or I could just have Claude Code do it.
28:55
And then we could have skills, which are automatically invoked, or not.
28:59
So I don't have control over that.
29:00
That's not really good.
29:01
I can use a CLAUDE.md prompt, which again works.
29:06
Or not.
29:08
So then we have the new Claude plugins, which was pretty cool.
29:12
You know, it kind of does what I want.
29:16
But I was like, okay, so Claude plugins are basically commands and hooks.
29:21
But what is the hook that we're running here?
29:23
Right?
29:24
how do we make it work?
29:25
so I had the idea of launching sub agents, but then again, it works or it doesn't because then the agent is doing the orchestration.
29:34
So anyway, so I decided that none of the current solutions solve this.
29:39
So I started working on Dovetail.
29:41
And what Dovetail does is, first off, I can just create a project. I say dovetail init.
29:48
Let me show you.
29:48
So basically the way you install Dovetail is, it's on npm.
29:53
So npm install -g and the package name.
29:57
So, Dovetail. And then you install it, and it pulls the package from npm, it pulls the code, and then you can check the version, and this is the current version, and you can take a look at all the different commands Dovetail currently has.
30:15
If it's the first time you're running it, you can go to dovetail onboard, or if it's not the first time, you can go to dovetail config.
30:23
And in the config you can see that there are a bunch of different connections put together.
30:30
There is GitHub, the GitHub CLI, and the Linear API.
30:34
Linear doesn't have a command line interface.
30:36
A Supabase connection, a Fly IO connection, and then also a couple of defaults, like what's the default organization, what's the Linear team key, what's the Supabase default organization.
30:47
It just walks you through that, right?
30:49
And then that's it.
30:51
Now the interesting thing about all this is, if something's not working, you can also just launch dovetail doctor.
30:58
And that's it.
31:00
Now, if I want to create a new project from scratch, what I have to do is just say dovetail init and then the project name.
31:08
I'm not going to start it now because I just started, I just ran it before.
31:11
So it starts, let's say the project is Dovetail Live.
31:15
And then what happens?
31:16
It runs the authentication, right?
31:19
It asks me to confirm if this is the repository, do I want to make this public?
31:23
What's the Linear team key, what's the Fly.io region, et cetera.
31:28
And then it configures the project and creates the scaffolding.
31:32
So it creates simple PERN-stack application code.
31:36
Like, the basics are set up for running an app that can be hosted on Fly.io and whatever.
31:42
It's a container, you can build whatever you want in it.
31:45
Then it creates a GitHub repository. So it creates the scaffolding, creates a GitHub repo, creates a Linear project, creates a Supabase project, creates a Fly.io app, then installs all the dependencies, wires everything together, sets everything up.
32:00
And then once the whole thing is put together, the last step is it installs a bunch of hooks into Claude, right?
32:07
And then if I want to, I can go ahead and take a look and see that.
32:12
Here is my scaffolded project that I just created.
32:16
I can go to Linear and see my linear project that was just created with a couple of basic issues.
32:24
Or I can go to Supabase, and I don't think this is going to load because that's the project URL, but let me open it for you.
32:33
So I go to my project here, and as you can see, the Dovetail Live project, which ends with Xyle.
32:44
So this was just created.
32:47
And then, if I go to Fly.io, you can see that there's Dovetail Live staging and Dovetail Live production.
32:54
It's still pending because it hasn't been committed yet.
32:57
Now I have this here, and if I want to, I can just go over to Dovetail Live, and then let's see what's what.
33:06
So I just go to Dovetail Live.
33:10
Let's go back here.
33:10
There you go.
33:11
And I can just say dovetail status.
33:13
And I can see we don't have an active issue.
33:16
So I can start running stuff here, like dovetail check issue, and then it checks all the issues.
33:23
Are there any open?
33:24
No.
33:24
So I'm going to create a new one.
33:26
What's the issue title?
33:27
And let's say I want to create a DIY gardening landing page, no description, medium priority.
33:35
So here is the manual mode, right?
33:37
And it doesn't even work.
33:38
Right.
33:39
Okay, so what we can do is: I just cd into Dovetail Live, and then what I can do is, you know, I can check dovetail status.
33:50
It shows no active issue.
33:52
So there are a couple stuff that I can do here, or if I don't want to do stuff manually, what I can do is I can just launch claude.
33:59
And then when I launch Claude, what actually happens is that because I installed a bunch of hooks, Claude now automatically runs those hooks every time I want them to run.
34:08
So for example, every time a session starts up, there is a dovetail command that runs which gets us: what's the project, what branch are we on?
34:17
What are the services, are there any issues?
34:20
Let's see what commits are there.
34:22
What are the latest updates on Linear, and that's it.
34:25
Right?
34:26
And then no MCP servers are configured.
34:28
That's something that I will need to add to Dovetail.
34:31
So that it automatically configures the Linear MCP server and the Supabase MCP server and everything for you, so you don't have to deal with that.
34:38
But the idea here is that when I say build a DIY gardening blog landing page, then before the prompt actually goes through, it first checks if there is an existing issue. There is none.
34:56
So the first hook fails.
34:58
So now it starts looking at the actual code inside the codebase, right?
35:05
And then it fills its own context, but it's not doing anything, it's not changing anything.
35:11
It's just learning about your code, learning about the environment it's in.
35:15
And it says: this is a PERN stack with Vite and React, et cetera.
35:19
So now it understands what's up.
35:21
And the idea is that all of these things are done automatically by a hook.
35:28
So the idea here is that Dovetail has a SessionStart hook, a UserPromptSubmit hook, a PreToolUse hook, and a PostToolUse hook.
35:40
And that means that every time a session starts, check what's up with the project.
35:47
Get context on the last work; when I send the prompt, double-check if we are on an issue.
35:53
Is there an issue?
35:54
Is it the project?
35:55
Do we have anything like that?
35:56
And then, look at this.
36:01
There is a PreToolUse hook that runs before; it shows an error, but it's not an error.
36:07
So it runs before a write action is done and launches a sub-agent. Or, sorry, first it checks: are we trying to use a restricted tool?
36:17
Yes, the Write tool, changing a file; that's restricted.
36:20
So let's see: what's the project, what's the main branch. Oops, no active issue.
36:23
So we're going to block the task, and instead we're going to launch a Dovetail sync agent.
36:30
Its job is to say: here's what the user wants, here is the project, here is everything about the project, and then figure out: do we need to create a new issue, do we need to create a new branch in GitHub, like, what do we do?
36:42
And it just gets everything done for you together.
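For orientation, the four hooks he describes map onto Claude Code's hook events roughly like this. The shape follows Claude Code's settings-file hooks format as I understand it, and every `dovetail` subcommand here is invented for illustration, not the real CLI:

```json
{
  "hooks": {
    "SessionStart": [
      { "hooks": [{ "type": "command", "command": "dovetail context" }] }
    ],
    "UserPromptSubmit": [
      { "hooks": [{ "type": "command", "command": "dovetail check-issue" }] }
    ],
    "PreToolUse": [
      { "matcher": "Write|Edit", "hooks": [{ "type": "command", "command": "dovetail gate" }] }
    ],
    "PostToolUse": [
      { "matcher": "Write|Edit", "hooks": [{ "type": "command", "command": "dovetail wrap-up" }] }
    ]
  }
}
```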
36:46
there is a current issue with the hooks.
36:48
That's what I'm working on right now.
36:50
It's not very stable at the moment, because I think the agents weren't installed for some reason.
36:57
But then the agent gets everything done for you and then creates a new branch on GitHub related to the issue.
37:03
So the two are synced together, and it moves to that branch.
37:07
So you're not going to change anything that's working.
37:10
And then once it's done working, it triggers the PostToolUse hook.
37:14
Which basically asks:
37:15
Okay, so we just did this.
37:17
Is this issue done?
37:18
If not then we keep working on this.
37:20
If it's done, I'm going to create a pull request, document everything in Linear, and update the issue or create new issues if needed.
37:27
And then, and only then I'm done.
37:29
Right?
37:30
And that means that I move forward, I keep building stuff, and then Dovetail sort of clears the pathway in front of me, gets me the context I need, and then it also cleans up after me.
37:43
And because of the branch protections on GitHub, the pull requests, the heavy documentation, and the automatic deployment, everything's pretty automatic.
37:53
And there is literally zero chance that Claude Code this way can actually, like, fuck stuff up for real during development.
38:02
And the interesting thing is, because this way the whole context is now externalized inside the commit messages in GitHub and the Linear documentation.
38:12
Now the agent always has really good context even if it doesn't have that in its memory.
38:17
So I can just say: hey, I want to build this AlfredOS cloud, tell me what stuff we should do.
38:25
And then it says hey, so we have five milestones.
38:28
Milestone one requires six steps.
38:30
And I actually said to Claude, fine, in that case launch six agents.
38:34
One for each of the milestone-one steps,
38:36
make them write whatever needed to be built, and then launch another agent that orchestrates it and tests everything.
38:44
And then because the hooks deterministically enforce what to do and how to go through that, it always works the same way.
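The deterministic gate that the PreToolUse hook enforces can be sketched in a few lines. This is a minimal sketch of the rule itself, not Dovetail's actual code; the tool names and the sync-agent hand-off are assumptions:

```typescript
// Hypothetical sketch of the PreToolUse gate (not Dovetail's real code).
// Restricted tools may only run while a Linear issue is active; otherwise
// the call is blocked and a sync agent sets up the issue and branch first.

type GateResult =
  | { decision: "allow" }
  | { decision: "block"; reason: string };

const RESTRICTED_TOOLS = new Set(["Write", "Edit", "Bash"]);

function gateToolUse(toolName: string, activeIssue: string | null): GateResult {
  // Reads and other non-destructive tools always pass.
  if (!RESTRICTED_TOOLS.has(toolName)) return { decision: "allow" };
  // Destructive tools require an active issue (and therefore a branch).
  if (activeIssue === null) {
    return {
      decision: "block",
      reason: "No active issue: launching the dovetail sync agent to create one.",
    };
  }
  return { decision: "allow" };
}
```

In the real hook this decision would be emitted by a small CLI that Claude Code invokes, using a blocking exit status; the sketch only captures the rule.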
38:51
So yeah, that's what I'm working on right now with Dovetail.
38:54
And I'm pretty sure that I can release it soon.
38:58
It's again, it's not stable, it's not working.
39:00
But anyway, I'm going to release it anyway.
39:03
yeah.
39:04
So where are we?
39:05
Okay, so that was Dovetail.
39:07
And again, so you go into here and then you can actually look at the issues and everything that's in here.
39:15
So everything that I'm building is documented in here.
39:20
This documentation is done through, like, an earlier version of Dovetail.
39:25
So this is fully automatically built.
39:27
The whole Linear, I never open Linear.
39:29
okay, so let's go back.
39:32
So the next thing is Joe.
39:33
Now Joe is interesting because it's a voice agent that I'm building for a construction company.
39:39
And we have a couple of constraints that we need to adhere to.
39:44
One, Joe needs to have.
39:46
There you go.
39:46
So Joe needs to have a less-than-10-cents-per-minute cost.
39:53
Has to have a less-than-500-milliseconds latency.
39:58
and also has to.
40:01
We have about 30 different tasks that the client said they want to have.
40:06
And we need to have at least 60% coverage.
40:09
We need to have at least 80% success rate.
40:11
Right?
40:12
So again, the problem is that because agents are doing stuff on their own, achieving this is actually not very easy.
40:21
So there was also an evolution to this.
40:26
Right.
40:27
So at first what I was doing is let's go here.
40:31
So at first I figured I would build.
40:34
I had an N8N chat agent before we said we would want to have a voice agent.
40:41
And I moved the N8N chat agent over to a custom-built app.
40:46
So I actually built a voice agent using the OpenAI Realtime API, which was expensive, but it did the job really well.
40:56
And there was one big problem that we couldn't really do anything about, which is: Joe has to be able to do stuff, but also has to be able to interrupt me.
41:08
So this is actually an interesting problem with voice agents, right?
41:12
Because it's usually one directional.
41:15
I call the voice agent, I say, hey, I want you to do this and that.
41:18
Okay, I'm on it.
41:18
Wait for it, wait a second.
41:20
And then it keeps doing it or it gives me a call and we start having a conversation.
41:24
But it's all happening synchronously.
41:26
And what I want to do is let's say I'm mid call, right?
41:30
And what I can do is I can say, okay, hey Joe, I want you to do this and that.
41:34
Let me know when you're done, Bam, hang up.
41:36
And then when Joe finishes, that could start another trigger that says, Joe gives me a call and says, hey David, I'm done.
41:43
Da, da, da.
41:43
But what happens if I want that to happen mid call, right?
41:47
What happens if what I want to do is say, hey Joe, I want you to do this and that, and then says, okay, I'm doing that in the background.
41:54
Is there anything else you want to discuss?
41:56
And while we're having a chat, the background task is completed, whatever, we're done.
42:02
And whenever I stop talking, in slides Joe:
42:04
Okay, thanks for that.
42:05
By the way.
42:06
I just got the message.
42:07
Here is the result.
42:08
Do you want to hear the result of the previous query?
42:10
That kind of interaction makes the whole thing really, really, realistic.
42:15
And it's very, very hard to make it work.
42:18
So I started working on a custom built app.
42:21
But then the whole logic was a bit, challenging.
42:24
And then I moved the whole thing to the ElevenLabs conversational agent.
42:30
Which does everything perfectly except for this.
42:35
This bit.
42:35
ElevenLabs agent cannot deal with the interruption stuff.
42:38
It cannot like have proactive mode for a voice agent.
42:42
And then I moved over to VAPI, where, through the VAPI API, it is possible, but you need some extra plumbing.
42:52
Right, but you need plumbing.
42:54
Because basically what we had in here was Joe.
42:59
Let me, let me just explain.
43:00
So Joe was a VAPI agent.
43:03
And then what I built first was: Joe had an N8N workflow, right?
43:09
And that N8N workflow was called through an MCP call.
43:16
And I also experimented with an MCP call or a direct function call.
43:20
it doesn't really matter, whatever works.
43:22
But what happens is that I'm having the conversation with Joe.
43:26
So the user says something, and then Joe says something back to the user, but also routes the task to this N8N workflow.
43:37
And then whenever the N8N workflow is done, that workflow calls the Joe agent, and Joe proactively starts talking to the user, like, mid-sentence.
43:48
Right?
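That callback loop can be sketched in a few lines. Everything below is hypothetical plumbing to illustrate the pattern, not the actual VAPI or N8N integration:

```typescript
// Sketch of the proactive mid-call pattern (hypothetical; not the real VAPI/N8N code).
// Joe acknowledges immediately, runs the task in the background, and injects a
// new utterance into the live call when the task resolves.

class ProactiveCall {
  private transcript: string[] = [];

  // Stand-in for speaking into the live call.
  say(text: string): void {
    this.transcript.push(`joe: ${text}`);
  }

  // Acknowledge now, keep the conversation open, speak again on completion.
  runInBackground(label: string, task: () => Promise<string>): Promise<void> {
    this.say(`I'm on ${label} in the background. Anything else?`);
    return task().then((result) => this.say(`By the way, ${label} is done: ${result}`));
  }

  log(): string[] {
    return this.transcript;
  }
}
```

The caller can ignore the returned promise (fire-and-forget), while the completion callback is what makes Joe speak unprompted mid-call.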
43:48
Now the challenge with this was, as always, with N8N.
43:52
So I was looking at different ways of making it work.
43:56
And the big problem we had with this is, even inside N8N, you know, it works really well, but it took a lot of debugging.
44:05
But the really, really big problem was Joe's task list here, right? We needed to have 60% coverage and an 80% success rate.
44:13
So what I would do is take task number one, and then actually take a look at all the tools Joe can have, which is just, you know, business data MCP connections.
44:29
There was an ERP system the company uses.
44:31
And basically a task needs a set of instructions to complete in sequence, like, for example: step one, tool call one; step two, tool call five; and so on.
44:45
And I want that to happen exactly like that.
44:47
And that's when we hit the same challenge I was dealing with in Alfred, which is: if I just give it as a prompt, then sometimes Joe will do it that way and sometimes Joe will not, which is a problem.
45:00
Right?
45:01
Especially because what if, this tool call one, tool call five, what if that's not actually how it works?
45:06
The first idea I had was a SQL database connection.
45:11
So every tool call would be an execute SQL tool call.
45:15
And the idea was that I would send a prompt, and then the Joe agent would translate the prompt to a SQL query, run the query, and return the answer.
45:32
Which seemed nice, but what if I want the financial projection based on the last two quarters of data and I want to know how much bonus I can get out of the company for Christmas?
45:42
That's not a simple solution.
45:43
There's a lot of very complicated SQL queries that needed to run.
45:46
And I was just kind of waiting for stuff to happen for minutes and that wasn't really useful.
45:52
That's when the client said that actually we should have sort of a less-than-two-minutes completion time in order to make it worth having a live conversation.
46:03
so then I started translating these tasks into tool calls.
46:07
But then the question is, how do I do that?
46:09
Do I create an N8N workflow per task?
46:14
That's not very scalable.
46:15
Right?
46:16
Every new task we would need to create a new workflow.
46:18
So that's not what we want.
46:20
If I just use the prompt, it's unreliable. So what can we do?
46:25
And then it turns out that we needed a new MCP connection for the ERP system.
46:35
And that was really interesting.
46:36
That was not a simple problem, because we had the old API for the ERP system, which was not built for agentic execution.
46:46
And also it was not built for human work.
46:49
Right.
46:49
It was a technical API.
46:51
So it was following a technical logic.
46:53
I needed to build a new API for operations.
46:56
So instead of having an API for different database operations, I would have an API for, I don't know, creating a status update or notifying stakeholders in a project, or something like that.
47:09
That.
47:09
Right.
47:10
Or creating a financial projection.
47:12
And then I ended up creating, I think, 48 different API endpoints.
47:17
And then turned every endpoint into an MCP tool.
47:21
And that means that our Joe MCP connection can now understand those 48 actions with schema.
47:29
So it doesn't have to always figure out what the query is, it just calls that API endpoint.
47:35
But in order for that to work, I would need to basically translate task number one into a sequence of tool calls from that 48-tool bucket.
47:47
Like how do you express task number one as a sequence of these 48 tool calls?
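One way to picture that translation, with the tool and task names invented for illustration (the real system has its own set of roughly 48 ERP endpoints exposed as MCP tools):

```typescript
// Sketch: a business task expressed as a fixed sequence of MCP tool calls
// (tool names are invented; not the real ERP endpoint names).

interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

interface Task {
  id: string;
  steps: ToolCall[];
}

const notifyStakeholders: Task = {
  id: "task-01-notify-stakeholders",
  steps: [
    { tool: "get_project", args: { projectId: "P-1" } },
    { tool: "create_status_update", args: { projectId: "P-1" } },
    { tool: "notify_stakeholders", args: { projectId: "P-1" } },
  ],
};

// Deterministic executor: the steps always run exactly in this order,
// instead of hoping the model picks the right tools from a prompt.
async function runTask(
  task: Task,
  callTool: (call: ToolCall) => Promise<unknown>
): Promise<unknown[]> {
  const results: unknown[] = [];
  for (const step of task.steps) {
    results.push(await callTool(step));
  }
  return results;
}
```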
47:53
And once we had that, the system now works; I just ran the evaluations yesterday, and we're almost exactly there.
48:02
So out of the 30 tasks, we have 18-task coverage.
48:06
If we add three extra MCP servers, stuff like Gmail, a weather API, that sort of thing, we actually go up to 27-task coverage.
48:18
So 60% goes up to, what, 90%.
48:23
I don't even know what the percentage of that is.
48:26
Anyway, so we're almost there.
48:28
And then we also had a median 80% success rate.
48:33
That's the success rate in the evals; I think it's actually higher.
48:38
But there were some logging issues, right?
48:41
Like not all the MCP tool calls were actually logged in the evals.
48:45
And the way it works:
48:48
So Joe has a VAPI custom prompt and custom system prompt.
48:52
And there are two more interesting aspects of it.
48:55
One is, it runs on Groq infrastructure, Groq inference.
48:59
It runs Llama 4 Maverick, 17 billion active parameters.
49:05
And this gives us exactly a $0.10-per-minute cost and a 475-millisecond latency.
49:15
So if I'm looking at these parameters: this works, this works, this works, this works.
49:21
This also kind of works.
49:23
I think I'm not really logging execution time yet.
49:26
So that's where I'm at right now.
49:29
So yeah, there is one more thing that makes things faster, and this is this part: the memory layer.
49:36
So looking at memory: I said, user prompt.
49:41
Joe calls the Joe MCP.
49:44
Done.
49:44
What if, after that, I save what happened to memory: both what the user asked, whether that was good, and the actual output?
49:56
And then I would change the process as well.
49:59
So I would go this and then also memory.
50:03
Right.
50:04
So instead of trying to do everything right away, I would want Joe to just get stuff from memory.
50:10
And for that I'm using Mem0, which is pretty cool.
50:14
And they just got a bunch of funding so I'm pretty sure they're going to be raising prices soon.
50:18
There is a startup package, and, yeah, basically you go to Smithery, and inside Smithery there are a bunch of Mem0 MCP servers, and there's also an official one.
50:31
But honestly it wasn't working.
50:33
at least it wasn't working for me.
50:34
And I don't know what the issue was, but all the Mem0 MCP servers had the same issue.
50:39
So I built one for myself, and you can find it as David AI.
50:43
that's what I built.
50:44
And it basically only has two tools.
50:46
It adds a memory and it searches memories, and that's it.
50:48
So it does exactly what we want it to do.
50:51
I say something; it searches my memories to see if there is a relevant bit, and if there's none, it calls the Joe MCP and then says what happened.
51:00
If there is an actual memory, then it just immediately answers without moving forward.
51:04
That's actually sort of a memory caching, and that makes things work a lot faster.
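The memory-first flow is roughly this. It is a sketch: the two tool names echo the add/search idea of the MCP server, the exact-match lookup stands in for Mem0's semantic search, and the rest of the plumbing is invented:

```typescript
// Sketch of the memory-first flow: search memories, answer from a hit,
// otherwise do the expensive Joe MCP call and store the result.
// (Hypothetical plumbing; not the actual Mem0 MCP server code.)

type Memory = { query: string; answer: string };

class MemoryCache {
  private memories: Memory[] = [];

  addMemory(query: string, answer: string): void {
    this.memories.push({ query, answer });
  }

  // Naive exact-match lookup standing in for Mem0's semantic search.
  searchMemories(query: string): string | null {
    const hit = this.memories.find((m) => m.query === query);
    return hit ? hit.answer : null;
  }
}

async function answer(
  query: string,
  cache: MemoryCache,
  callJoeMcp: (q: string) => Promise<string>
): Promise<string> {
  const remembered = cache.searchMemories(query);
  if (remembered !== null) return remembered; // fast path: no tool calls at all
  const fresh = await callJoeMcp(query); // slow path: run the real task
  cache.addMemory(query, fresh);
  return fresh;
}
```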
51:09
so that's the Joe project and I'm hoping to hand it over soon.
51:14
Because we are at the last stages of this work. And then there is the last project, which is Jig.
51:21
And that's the most interesting one right?
51:24
Because Jig is nothing else than a cognitive assembly line.
51:31
Now this is interesting because there is again the same problem, N8N versus agents, right?
51:37
I've been talking about it a lot, because with N8N you have a deterministic workflow but manual plumbing.
51:50
Which means that if I want to create a five-step process, I actually need to create a 50-node workflow, because of JSON, expressions, aggregate, merge, custom JavaScript, whatever.
52:02
And then if anything breaks, the whole thing breaks.
52:05
And then it also cannot just resume from that step unless you're saving executions properly.
52:11
And it's just very complicated and very brittle.
52:14
But if it works, right?
52:16
If it works, it's great.
52:17
If it doesn't, you're debugging at 2am and you don't know why.
52:22
And that's, that's the big problem.
52:23
And also it gets very, very complicated.
52:25
Like the graphs are so complicated.
52:28
And then you look at agents, which is, you know, ChatGPT Agent, Manus, whatever.
52:33
And they are probabilistic, so they drift off and hallucinate.
52:39
But they can be resourceful too, right?
52:44
So if, let's say, a JSON is broken and N8N breaks, Manus or a ChatGPT agent or Claude Code would just figure it out.
52:51
So, hey, looks like it's broken.
52:52
It looks like there is an extra comma there, and I need to remove it.
52:55
And that's it.
52:55
And you know, I don't want to be debugging at 2am because of an extra comma.
52:59
I want the agent to automatically figure it out.
53:02
So what happens here is, you have a bunch of nodes.
53:07
This is how N8N works, right?
53:09
You have a bunch of nodes, that's it.
53:11
and then you also have a trigger and that's it, that's how N8N works.
53:15
And then you have an output at the end.
53:18
Okay, so the agent works differently, right?
53:21
The agent just sits in the middle and does things as a black box.
53:27
You have zero control over what's happening.
53:29
Or you can build an N8N workflow in a way that it does stuff similarly like this.
53:36
It triggers an agent, then it triggers another agent, then it triggers another agent and then it generates the output.
53:42
That's actually not a bad idea.
53:44
But then the problem is basically the payload.
53:48
So let's just add a bag, right?
53:51
Here's our payload.
53:52
So all the information that we send over here, these are all custom designed, right?
53:59
Inside N8N, you need to specify what you're sending off to the next agent.
54:04
What's the context that it should be getting?
54:06
Right, so you need to be manually designing every single bit.
54:12
However, inside an agent it just gets an idea and then it figures everything out on its own.
54:17
But if you want to build AI agents inside N8N, you have to still do the same plumbing thing.
54:22
So that means that even though it might work, you still need to deal with jsons and commas and whatnot.
54:29
And my idea was that, what if we had this logic, right?
54:34
What if we had this logic, but whatever the payload is, it would be managed, I don't know, somehow differently.
54:43
It would be managed with purpose, with intent.
54:46
Like, how would that look?
54:50
Oops.
54:50
How would it look if I wanted to intelligently pass on context, so the agent can actually understand what the previous agent did, but in a very compressed way, so it doesn't get bloated?
55:06
And that's when I went to Claude Code, because I felt, okay, so what happens if I start Claude?
55:12
Let's say I would start Claude Code, and I would give the agent MCP tool number one.
55:18
And I also give it a simple prompt, a prompt for a single task, right?
55:23
And then whatever the output is, I don't give a shit what the output is, just send everything over to the other agent, figure out what the ideal output should be, and then the next agent would do MCP number two with another prompt, single task number two.
55:40
And that means that basically we are launching new agents for every single task, just like what we did with N8N.
55:47
But then how is the context decided?
55:51
It's just, everything is just shoved over from one agent to the other.
55:56
And here is the problem with this, right?
55:58
The problem with this is that basically it's like: you launch Claude Code, you turn on some connections, you let it run, then you turn off those connections, give another prompt with new connections, and then you run it again and again and again.
56:13
And because it's the same session, it's actually very hard: you either have to manually turn everything on and off all the time, or, if you don't want to do that, you need to figure out how to restrict tools per agent run.
56:29
To solve that, you either have to solve the N8N problem.
56:35
So instead of sending everything from one agent to another, or using the same agent again and again, you need to figure out manually what to send over, or you need to have a system that manages context between these agents with very effective, lossless compression, while also orchestrating the deterministic part of: do this task.
56:57
This is the tool you use, this is the input I'm expecting, this is the context, this is the output I'm expecting, and that's it.
57:05
And then as those things run, the agent itself basically does everything, which means that the macro bit is completely deterministic and the micro bit is agentic.
57:18
And that hybrid was really important for us, because the idea was: what happens if this is a human and not an agent?
57:27
Right, what happens if I want to replace one with the other?
57:29
Then how do I feed the right context to the human without overwhelming them?
57:34
And this was one of the key ideas of Jig: it has to be intelligence-agnostic.
57:41
It shouldn't matter who the human is or who the actor is.
57:44
Like, is this a human, is this an agent, is this a manager, is this an engineer?
57:48
Like everybody should be able to do it.
57:50
And the way we did it is we created the Jig DSL, which, let me start with another thing.
57:56
So we created a framework, the IKO framework, which really just builds on stuff like Jobs to Be Done and other frameworks.
58:06
The idea here is that in an AI-native world, every task you do inside a business can be described by these four things.
58:15
Let me explain.
58:16
The first big statement around Jig is that every operation is a CRUD operation.
58:22
Everything you do in your business is a CRUD operation.
58:26
You are creating, reading, updating or deleting some row, some value in some database, even if that database exists in your head.
58:35
So the question is: how do we get access to all the databases we need, and how do we actually describe these CRUD operations?
58:42
And that's when we came up with this IKO framework, which means that in a truly AI-native business, every CRUD operation has an intent, some relevant context, a specific action you do, and a specific output.
58:57
Right?
58:57
If you remember, the Jobs to Be Done framework says that you don't buy products, you hire them to get a job done.
59:05
Right?
59:05
Me wanting to get a job done is the intent.
59:09
Why I want to get the job done is also part of the intent.
59:13
And also once the job is done, how do I know it's done?
59:16
That's my output; getting it done is the action; and how to get it done and what you need to get it done is the context.
59:24
So ideally, every action inside the business can be described by this framework, which means that you know, we have a specific action.
59:34
Let's say you have a task that can be described with this IKO framework.
59:38
What happens if you create a workflow which has three tasks in it? The workflow itself can also be described with the IKO framework.
59:48
And then what happens if you have entire business units that have, I don't know, five workflows in it?
59:54
Then again that can be described.
59:56
So it's really like a fractal pattern and that's why we called it the IKO fractal.
01:00:01
Now in order for this to be translatable into agentic work, we really needed to have a very specific, very efficient compression of information that is understandable by humans.
01:00:14
And that is the Jig DSL.
01:00:16
It's a YAML-based language, which really just describes the intent, context, action, and output in a structured format.
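To make that concrete, a single task described in IKO terms might look something like this; the field names and the example are my guesses at what such a YAML DSL could contain, not the real Jig schema:

```yaml
# Hypothetical sketch of a Jig DSL task; not the real schema.
station: notify-stakeholders
intent: Keep project stakeholders informed after a milestone closes
context:
  - project record from the ERP
  - list of stakeholders and their preferred channels
action: create a status update and send it to every stakeholder
output: a logged status update with delivery confirmations
```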
01:00:26
And then once we created that (so we had the intelligence-agnostic idea, which led us to the IKO fractal, which led us to the Jig DSL), that led us to the idea that ideally I should be able to create a database, right?
01:00:44
A Jig database, where I can just take a workflow description written in this YAML, and every part of that YAML is translated into some database entry.
01:00:59
And once I have that done, I gain a really interesting thing because now I can orchestrate work through a database instead of through code.
01:01:09
N8N orchestrates through code, Make orchestrates through code, Zapier orchestrates through code.
01:01:16
It's not very durable.
01:01:17
So if I orchestrate everything through a database, which is: execute the task that has ID ABC in this database, and here is all its configuration inside this database, then it becomes very durable.
01:01:32
If we actually log every step of the way.
01:01:36
So we would have, like, a run log in the database that says: okay, step number three.
01:01:42
This was the input, this is what happened, this was the output.
01:01:46
Then we got an error.
01:01:47
So instead of redoing everything, I immediately have everything saved in the database that says it failed at step number three.
01:01:54
We just want to figure out how to make it work.
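Orchestrating through a database makes that resumption simple. A minimal sketch, with the table shape and names invented for illustration:

```typescript
// Sketch of database-backed orchestration (schema and names invented):
// each step run is logged, so a failed workflow resumes at the failed step
// instead of re-running everything from scratch.

type StepStatus = "ok" | "error";

interface RunLogEntry {
  step: number;
  status: StepStatus;
}

function resumeFrom(log: RunLogEntry[], totalSteps: number): number {
  // The first step with no successful entry is where we pick up again.
  for (let step = 1; step <= totalSteps; step++) {
    const ok = log.some((e) => e.step === step && e.status === "ok");
    if (!ok) return step;
  }
  return totalSteps + 1; // everything already succeeded
}
```

So a run that logged steps one and two as ok and step three as an error resumes at step three.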
01:01:56
So inside this JIG database, right?
01:01:58
I would have tasks somehow explained.
01:02:04
I would have workflows explained.
01:02:05
Oopsie.
01:02:07
Let me just.
01:02:07
Okay, so we would have tasks, which we call stations, and workflows.
01:02:12
And let me explain this.
01:02:14
So we call them stations because of what happened when Henry Ford created the assembly line: before Ford, before the assembly line, craftsmen owned production end to end, right?
01:02:30
And then after the assembly line, factory workers owned a specific task.
01:02:39
Now why did that happen?
01:02:40
It happened because Henry Ford ran a study on his own factory and realized that because a craftsman owned a longer process, they had to walk around the factory.
01:02:53
So if there was somebody working 10 hours a day, they were actually pedestrians.
01:02:58
Forty percent of the time, they weren't factory workers, they were pedestrians.
01:03:02
You walk around for four hours and do actual work for six hours.
01:03:07
So, instead of humans going to the work, he asked: how do I eliminate people being pedestrians?
01:03:13
And the answer: let's go the other way.
01:03:15
Let's not have people go to the work.
01:03:18
Let's have work come to the people.
01:03:20
And that was what the assembly line became.
01:03:23
Now, how does that translate, right, into knowledge work?
01:03:28
you have problems, right?
01:03:30
And that means that you own the problem end to end.
01:03:34
So as a knowledge worker, you have a series of problems that you need to keep solving all the time, and you own it.
01:03:40
You have full ownership of the problem.
01:03:42
You need to know when the problem arises.
01:03:44
You need to know when the problem needs solving.
01:03:46
When the problem needs escalating, you need to follow through.
01:03:49
You own the problems end to end.
01:03:51
And then the big problem is that problems have lots of smaller problems that require different context.
01:04:01
So the key learning here is that we had pedestrians instead of factory workers, right?
01:04:12
We know that they weren't as productive because they were being pedestrians.
01:04:16
They were walking around different parts of the factory.
01:04:20
What's the knowledge work equivalent of that?
01:04:23
And the knowledge work equivalent of that is context switching.
01:04:27
Because different problems require different context.
01:04:30
So if the solution for making people not pedestrians was to move work, not people, right?
01:04:41
Then the idea here is that we would move context, not people.
01:04:45
So, the cognitive version of an assembly line is that the context always comes to you, and the task itself never changes.
01:04:53
And that's a big problem because human work is messy, it's random.
01:04:58
And knowledge work is messy and random.
01:05:01
So we really need to figure out how we can manage context.
01:05:04
And that's the most interesting part of Jig: how we manage context.
01:05:08
I'm not going to go into the details.
01:05:10
There's a whole context layer and the context ledger that connects everything.
01:05:15
But the point is that you have this Jig database that creates the foundation for all this.
01:05:22
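To make the stations-and-workflows idea concrete, here's a minimal sketch in Python. The class names and fields are my own illustration, not Jig's actual schema:

```python
from dataclasses import dataclass, field


@dataclass
class Station:
    """A single fixed task; only the incoming context changes between runs."""
    name: str
    intent: str                    # what this station is supposed to achieve
    tools: list[str] = field(default_factory=list)


@dataclass
class Workflow:
    """An ordered chain of stations: context moves, stations stay put."""
    name: str
    stations: list[Station]


# A toy workflow in the spirit of the reconciliation example later on.
reconcile = Workflow(
    name="transaction-reconciliation",
    stations=[
        Station("fetch", "Pull yesterday's transactions", tools=["db.query"]),
        Station("match", "Match transactions against the bank statement"),
        Station("report", "Write a human-readable summary", tools=["fs.write"]),
    ],
)
```

The point of the sketch: each station's task is fixed, and only the context that flows between them varies from run to run.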
If I move here.
01:05:23
So here is a specific Jig run.
01:05:26
As you can see, there was an actual execution by an agent.
01:05:30
It cost 2.86 cents.
01:05:32
We had a specific type.
01:05:34
Here is the number of tokens that were generated.
01:05:36
And what you can see here is the full execution log of what the agent did.
01:05:40
And we had the intent.
01:05:42
What was the intent?
01:05:43
I was like, search my memories about David and report findings.
01:05:46
And here are the tools that were used and how many times.
01:05:49
And you can also see what connections were provided and what actual prompt we gave.
01:05:55
And then here's the context block in the context ledger.
01:05:58
Basically, after every execution, jig analyzes itself and says, okay, did we actually achieve the intent of this work?
01:06:06
And how confident am I that my answer is the right answer?
01:06:09
And then it also writes a report that's human-readable for later.
01:06:13
If I say let's create a file, it saves it to an artifact as well.
01:06:17
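As a rough sketch, a single ledger entry could look something like this. The field names and values are guesses based on what the dashboard shows, not Jig's real schema:

```python
# One hypothetical context-ledger entry for a single Jig run.
run_entry = {
    "type": "memory_search",
    "intent": "Search my memories about David and report findings",
    "cost_usd": 0.0286,                            # the 2.86 cents from the demo
    "tokens": {"input": 12_400, "output": 1_850},  # made-up counts
    "tools_used": {"search_memory": 3, "read_file": 1},
    # self-analysis written after execution:
    "achieved_intent": True,
    "confidence": 0.9,
    "report": "Found memories mentioning David; summary saved as artifact.",
    "artifact": "reports/david-findings.md",
}


def needs_review(entry: dict, threshold: float = 0.75) -> bool:
    """Flag runs where the agent wasn't confident it achieved the intent."""
    return not entry["achieved_intent"] or entry["confidence"] < threshold
```

The self-analysis fields are what make the ledger useful downstream: a low-confidence run can be routed to a human instead of silently feeding the next station.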
And then I can take a look at all the workflow history as well.
01:06:21
And then there's a bunch of evaluation stuff, because the idea here is that if you have this, you can use Jig to run evaluations.
01:06:30
You can use Jig to do a bunch of stuff.
01:06:33
So yeah, the interesting thing is now we have a Jig MCP server that creates these things.
01:06:38
And basically, the execution part is following this, right?
01:06:43
So the jig logic transports the context.
01:06:47
And every agent is a Claude Code agent.
01:06:50
So it's like a Claude agent built on the Claude Agent SDK.
01:06:53
It spins up a Claude agent, gives it the tools it needs, authenticates everything, and starts the workflow.
01:07:00
Then it generates the output as specified in the jig and calls the next agent.
01:07:06
The next agent forks the previous Claude agent.
01:07:10
So it has all its memories.
01:07:12
But now it has a different task, it has a different intent, a different prompt, different tools, different access.
01:07:18
It's a different one.
01:07:19
It's kind of the same same, but different.
01:07:22
And then it keeps forking Claude agents until it reaches the final step and produces the final output.
01:07:29
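The forking chain can be sketched roughly like this. `ForkedAgent`, `run`, and `fork` are stand-ins of mine, not the actual Claude Agent SDK API:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForkedAgent:
    intent: str
    tools: tuple[str, ...]
    memory: tuple[str, ...] = ()   # everything earlier agents in the chain produced


def run(agent: ForkedAgent) -> str:
    # Stand-in for a real Claude Code execution.
    return f"output of: {agent.intent}"


def fork(parent: ForkedAgent, intent: str, tools: tuple[str, ...]) -> ForkedAgent:
    # The child keeps the parent's memories plus the parent's output,
    # but gets a different intent and a different tool set.
    return ForkedAgent(intent=intent, tools=tools,
                       memory=parent.memory + (run(parent),))


a1 = ForkedAgent(intent="analyze transactions", tools=("db.query",))
a2 = fork(a1, intent="write report", tools=("fs.write",))
```

Same memories, different task: that's the "same same, but different" part.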
And creating jigs out of SOPs is a really powerful thing for us now. I got a five-minute Loom video from the client's operations manager on how they do reconciliation of transactions at an e-commerce company.
01:07:45
And I fed it into this jig system.
01:07:49
I have an architect that can transform these raw transcripts into jigs, and then it implemented it.
01:07:57
And with a single one-shot attempt,
01:07:59
we got 95-plus percent accuracy in the completion.
01:08:04
And we're not creating new agents, we're just using existing ones.
01:08:07
And if you have the Jig MCP, then you can actually run everything from Claude or ChatGPT or whatever.
01:08:13
So there's a lot of stuff that overlaps between different projects, because they're all trying to work in the same problem space.
01:08:21
Oh, and the last interesting thing about Jig.
01:08:25
The latest thing that I built is that I can create a station and run it as a temporary station, which really allows me to create short-lived Claude agents as part of a workflow and save all the context they generate.
01:08:41
It's getting very complicated very quickly.
01:08:44
So this is the other project that I'm working on.
01:08:46
I'm very happy about this project.
01:08:48
This is a really cool project.
01:08:50
Yeah, so these are most of the projects that I work on.
01:08:53
And then there's a couple of other things: the bootcamp is returning next week, and I also have a fun project, I think it's here, the No Code No Clue project, which translates n8n workflow JSON files into tutorials and radio plays.
01:09:08
And I started building an actual workflow to generate the content inside n8n.
01:09:16
And it was very, very complicated.
01:09:18
And ultimately I ended up using my Dovetail-powered Claude Code process and basically built the whole system in, I don't know, less than an hour.
01:09:27
So now what it does is this.
01:09:30
So basically it runs a simple dashboard, but it also works from the terminal.
01:09:34
So I launch it with a simple command.
01:09:38
It's npm run dashboard.
01:09:40
And then what I have here, I know I call everything Alfred.
01:09:43
I should really stop doing that.
01:09:45
So what happens is that when I start running it, it goes through a 13-step process, and if it encounters some errors, it can pick up and continue from that point later.
01:10:02
And what happens is that it takes a look at the workflow and it understands the workflow.
01:10:08
So it took a JSON file for Jackie, an AI assistant.
01:10:12
I got it from the n8n Template Marketplace.
01:10:14
And then it actually tries to understand what that workflow does.
01:10:18
It creates an analysis, and then it creates another analysis: where does this actually solve a problem?
01:10:24
What's the context?
01:10:26
Where is this overkill?
01:10:27
So it generates some extra context, and then as a next step, it creates a script.
01:10:33
So it's like, okay, let's try to generate a... hold on, there was an error there.
01:10:38
There you go.
01:10:39
So step number three is that it actually creates the script metadata.
01:10:44
There's a sitcom episode generator, which has a guide.
01:10:49
And I can walk you through that in a minute.
01:10:51
And then it generates the pack, and from that it generates a script.
01:10:55
And now we have the script here, which is a pretty long script for a sitcom.
01:10:59
and then I'm like, okay, next, turn that into a JSON.
01:11:02
And once you have that JSON, I want you to go through each and every one of those and then generate the actual radio play.
01:11:09
So, to turn the script:
01:11:10
We have voices assigned to persistent, permanent actors, and then there are random voices for episodic actors.
01:11:19
And then it generates something like this, which is pretty cool.
01:11:22
It's a full radio play, here.
01:11:24
And you can also check out the one on the Notion tripling workflow.
01:11:28
And it's fully automated.
01:11:30
ElevenLabs now has the V3 model available in the API.
01:11:34
So it uses FFmpeg to generate all the dialogue blobs.
01:11:40
And then it generates one big audio file for the radio play and saves it.
01:11:44
And then it generates some metadata, some SEO stuff.
01:11:47
And then once it has everything, it takes the actual workflow and then generates an actual tutorial.
01:11:53
And then the last step is that it posts it to Ghost as a blog article.
01:11:59
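The resumable step pipeline could be sketched like this. The step names and the checkpoint file are illustrative, not the real implementation (the actual pipeline has 13 steps):

```python
import json
import pathlib

# Illustrative subset of the pipeline's steps, in order.
STEPS = [
    "analyze_workflow", "analyze_context", "script_metadata",
    "write_script", "script_to_json", "generate_dialogue_audio",
    "concat_audio", "seo_metadata", "write_tutorial", "post_to_ghost",
]

STATE = pathlib.Path("pipeline_state.json")


def run_pipeline() -> list[str]:
    # Resume from the last completed step if a previous run failed.
    done = json.loads(STATE.read_text()) if STATE.exists() else []
    for step in STEPS:
        if step in done:
            continue                        # already completed in an earlier run
        # ... the real work for each step would go here ...
        done.append(step)
        STATE.write_text(json.dumps(done))  # checkpoint after every step
    return done
```

Checkpointing after every step is what makes the "pick up and continue after an error" behavior cheap: a crashed run just skips everything already in the state file.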
And I'm still working on it, but I will probably turn this into an automated, regular n8n tutorial article sequence, maybe once a day.
01:12:09
It will help me with SEO, and it will probably also give you guys some ideas on how to do stuff in n8n, if you still want to go down that road.
01:12:18
So you can see, you know, there's like, okay, here's the source.
01:12:20
What tools are used, setup time, difficulty, time saved per month, and then there's the tutorial.
01:12:27
So I'm still playing around with it.
01:12:28
we'll see how it goes.
01:12:30
But that's kind of a fun side project that I'm doing.
01:12:32
So.
01:12:33
Okay, the raw recording is almost 90 minutes now.
01:12:35
So I'm going to stop talking.
01:12:37
There's a lot of stuff that I'm working on right now, as you can see, and I just wanted to start talking a bit more about it.
01:12:43
So I'm going to share the Linear roadmap links, and I'm going to start uploading YouTube videos separately per project, and also sending out different emails on a weekly basis.
01:12:58
So thanks for watching.
01:13:00
If there are any questions, I'm always an email away.
01:13:03
And, yeah, which project are you most excited about?