Agents @ Work: Lindy.ai

Latent Space: The AI Engineer Podcast

1×

0:00

Current time: 0:00 / Total time: -1:09:53

-1:09:53

Agents @ Work: Lindy.ai

How Lindy.ai evolved beyond the LLM-pilled era, why agents need rails, and how they rickrolled customers by accident

Nov 15, 2024

Transcript

Alessio will be at AWS re:Invent next week and hosting a casual coffee meetup on Wednesday, RSVP here! And subscribe to our calendar for our Singapore, NeurIPS, and all upcoming meetups!

We are still taking questions for our next big recap episode! Submit questions and messages on Speakpipe here for a chance to appear on the show!

If you've been following the AI agents space, you have heard of Lindy AI; while founder Flo Crivello is hesitant to call it "blowing up," when folks like Andrew Wilkinson start obsessing over your product, you're definitely onto something.

In our latest episode, Flo walked us through Lindy's evolution from late 2022 to now, revealing some design choices about agent platform design that go against conventional wisdom in the space.

The Great Reset: From Text Fields to Rails

Remember late 2022? Everyone was "LLM-pilled," believing that if you just gave a language model enough context and tools, it could do anything. Lindy 1.0 followed this pattern:

Big prompt field ✅
Bunch of tools ✅
Prayer to the LLM gods ✅

Fast forward to today, and Lindy 2.0 looks radically different. As Flo put it (~17:00 in the episode): "The more you can put your agent on rails, one, the more reliable it's going to be, obviously, but two, it's also going to be easier to use for the user."

Instead of a giant, intimidating text field, users now build workflows visually:

Trigger (e.g., "Zendesk ticket received")
Required actions (e.g., "Check knowledge base")
Response generation

This isn't just a UI change - it's a fundamental rethinking of how to make AI agents reliable. As Swyx noted during our discussion: "Put Shoggoth in a box and make it a very small, minimal viable box. Everything else should be traditional if-this-then-that software."

The Surprising Truth About Model Limitations

Here's something that might shock folks building in the space: with Claude 3.5 Sonnet, the model is no longer the bottleneck. Flo's exact words (~31:00): "It is actually shocking the extent to which the model is no longer the limit. It was the limit a year ago. It was too expensive. The context window was too small."

Some context: Lindy started when context windows were 4K tokens. Today, their system prompt alone is larger than that. But what's really interesting is what this means for platform builders:

Raw capabilities aren't the constraint anymore
Integration quality matters more than model performance
User experience and workflow design are the new bottlenecks

The Search Engine Parallel: Why Horizontal Platforms Might Win

One of the spiciest takes from our conversation was Flo's thesis on horizontal vs. vertical agent platforms. He draws a fascinating parallel to search engines (~56:00):

"I find it surprising the extent to which a horizontal search engine has won... You go through Google to search Reddit. You go through Google to search Wikipedia... search in each vertical has more in common with search than it does with each vertical."

His argument: agent platforms might follow the same pattern because:

Agents across verticals share more commonalities than differences
There's value in having agents that can work together under one roof
The R&D cost of getting agents right is better amortized across use cases

This might explain why we're seeing early vertical AI companies starting to expand horizontally. The core agent capabilities - reliability, context management, tool integration - are universal needs.

What This Means for Builders

If you're building in the AI agents space, here are the key takeaways:

Constrain First: Rather than maximizing capabilities, focus on reliable execution within narrow bounds
Integration Quality Matters: With model capabilities plateauing, your competitive advantage lies in how well you integrate with existing tools
Memory Management is Key: Flo revealed they actively prune agent memories - even with larger context windows, not all memories are useful
Design for Discovery: Lindy's visual workflow builder shows how important interface design is for adoption

The Meta Layer

There's a broader lesson here about AI product development. Just as Lindy evolved from "give the LLM everything" to "constrain intelligently," we might see similar evolution across the AI tooling space. The winners might not be those with the most powerful models, but those who best understand how to package AI capabilities in ways that solve real problems reliably.

Full Video Podcast

Flo’s talk at AI Engineer Summit

Chapters

00:00:00 Introductions
00:04:05 AI engineering and deterministic software
00:08:36 Lindys demo
00:13:21 Memory management in AI agents
00:18:48 Hierarchy and collaboration between Lindys
00:21:19 Vertical vs. horizontal AI tools
00:24:03 Community and user engagement strategies
00:26:16 Rickrolling incident with Lindy
00:28:12 Evals and quality control in AI systems
00:31:52 Model capabilities and their impact on Lindy
00:39:27 Competition and market positioning
00:42:40 Relationship between Factorio and business strategy
00:44:05 Remote work vs. in-person collaboration
00:49:03 Europe vs US Tech
00:58:59 Testing the Overton window and free speech
01:04:20 Balancing AI safety concerns with business innovation

Show Notes

Transcript

Alessio [00:00:00]: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol.ai.

Swyx [00:00:12]: Hey, and today we're joined in the studio by Florent Crivello. Welcome.

Flo [00:00:15]: Hey, yeah, thanks for having me.

Swyx [00:00:17]: Also known as Altimore. I always wanted to ask, what is Altimore?

Flo [00:00:21]: It was the name of my character when I was playing Dungeons & Dragons. Always. I was like 11 years old.

Swyx [00:00:26]: What was your classes?

Flo [00:00:27]: I was an elf. I was a magician elf.

Swyx [00:00:30]: Well, you're still spinning magic. Right now, you're a solo founder and CEO of Lindy.ai. What is Lindy?

Flo [00:00:36]: Yeah, we are a no-code platform letting you build your own AI agents easily. So you can think of we are to LangChain as Airtable is to MySQL. Like you can just pin up AI agents super easily by clicking around and no code required. You don't have to be an engineer and you can automate business workflows that you simply could not automate before in a few minutes.

Swyx [00:00:55]: You've been in our orbit a few times. I think you spoke at our Latent Space anniversary. You spoke at my summit, the first summit, which was a really good keynote. And most recently, like we actually already scheduled this podcast before this happened. But Andrew Wilkinson was like, I'm obsessed by Lindy. He's just created a whole bunch of agents. So basically, why are you blowing up?

Flo [00:01:16]: Well, thank you. I think we are having a little bit of a moment. I think it's a bit premature to say we're blowing up. But why are things going well? We revamped the product majorly. We called it Lindy 2.0. I would say we started working on that six months ago. We've actually not really announced it yet. It's just, I guess, I guess that's what we're doing now. And so we've basically been cooking for the last six months, like really rebuilding the product from scratch. I think I'll list you, actually, the last time you tried the product, it was still Lindy 1.0. Oh, yeah. If you log in now, the platform looks very different. There's like a ton more features. And I think one realization that we made, and I think a lot of folks in the agent space made the same realization, is that there is such a thing as too much of a good thing. I think many people, when they started working on agents, they were very LLM peeled and chat GPT peeled, right? They got ahead of themselves in a way, and us included, and they thought that agents were actually, and LLMs were actually more advanced than they actually were. And so the first version of Lindy was like just a giant prompt and a bunch of tools. And then the realization we had was like, hey, actually, the more you can put your agent on Rails, one, the more reliable it's going to be, obviously, but two, it's also going to be easier to use for the user, because you can really, as a user, you get, instead of just getting this big, giant, intimidating text field, and you type words in there, and you have no idea if you're typing the right word or not, here you can really click and select step by step, and tell your agent what to do, and really give as narrow or as wide a guardrail as you want for your agent. We started working on that. We called it Lindy on Rails about six months ago, and we started putting it into the hands of users over the last, I would say, two months or so, and I think things really started going pretty well at that point. The agent is way more reliable, way easier to set up, and we're already seeing a ton of new use cases pop up.

Swyx [00:03:00]: Yeah, just a quick follow-up on that. You launched the first Lindy in November last year, and you were already talking about having a DSL, right? I remember having this discussion with you, and you were like, it's just much more reliable. Is this still the DSL under the hood? Is this a UI-level change, or is it a bigger rewrite?

Flo [00:03:17]: No, it is a much bigger rewrite. I'll give you a concrete example. Suppose you want to have an agent that observes your Zendesk tickets, and it's like, hey, every time you receive a Zendesk ticket, I want you to check my knowledge base, so it's like a RAG module and whatnot, and then answer the ticket. The way it used to work with Lindy before was, you would type the prompt asking it to do that. You check my knowledge base, and so on and so forth. The problem with doing that is that it can always go wrong. You're praying the LLM gods that they will actually invoke your knowledge base, but I don't want to ask it. I want it to always, 100% of the time, consult the knowledge base after it receives a Zendesk ticket. And so with Lindy, you can actually have the trigger, which is Zendesk ticket received, have the knowledge base consult, which is always there, and then have the agent. So you can really set up your agent any way you want like that.

Swyx [00:04:05]: This is something I think about for AI engineering as well, which is the big labs want you to hand over everything in the prompts, and only code of English, and then the smaller brains, the GPU pours, always want to write more code to make things more deterministic and reliable and controllable. One way I put it is put Shoggoth in a box and make it a very small, the minimal viable box. Everything else should be traditional, if this, then that software.

Flo [00:04:29]: I love that characterization, put the Shoggoth in the box. Yeah, we talk about using as much AI as necessary and as little as possible.

Alessio [00:04:37]: And what was the choosing between kind of like this drag and drop, low code, whatever, super code-driven, maybe like the Lang chains, auto-GPT of the world, and maybe the flip side of it, which you don't really do, it's like just text to agent, it's like build the workflow for me. Like what have you learned actually putting this in front of users and figuring out how much do they actually want to add it versus like how much, you know, kind of like Ruby on Rails instead of Lindy on Rails, it's kind of like, you know, defaults over configuration.

Flo [00:05:06]: I actually used to dislike when people said, oh, text is not a great interface. I was like, ah, this is such a mid-take, I think text is awesome. And I've actually come around, I actually sort of agree now that text is really not great. I think for people like you and me, because we sort of have a mental model, okay, when I type a prompt into this text box, this is what it's going to do, it's going to map it to this kind of data structure under the hood and so forth. I guess it's a little bit blackmailing towards humans. You jump on these calls with humans and you're like, here's a text box, this is going to set up an agent for you, do it. And then they type words like, I want you to help me put order in my inbox. Oh, actually, this is a good one. This is actually a good one. What's a bad one? I would say 60 or 70% of the prompts that people type don't mean anything. Me as a human, as AGI, I don't understand what they mean. I don't know what they mean. It is actually, I think whenever you can have a GUI, it is better than to have just a pure text interface.

Alessio [00:05:58]: And then how do you decide how much to expose? So even with the tools, you have Slack, you have Google Calendar, you have Gmail. Should people by default just turn over access to everything and then you help them figure out what to use? I think that's the question. When I tried to set up Slack, it was like, hey, give me access to all channels and everything, which for the average person probably makes sense because you don't want to re-prompt them every time you add new channels. But at the same time, for maybe the more sophisticated enterprise use cases, people are like, hey, I want to really limit what you have access to. How do you kind of thread that balance?

Flo [00:06:35]: The general philosophy is we ask for the least amount of permissions needed at any given moment. I don't think Slack, I could be mistaken, but I don't think Slack lets you request permissions for just one channel. But for example, for Google, obviously there are hundreds of scopes that you could require for Google. There's a lot of scopes. And sometimes it's actually painful to set up your Lindy because you're going to have to ask Google and add scopes five or six times. We've had sessions like this. But that's what we do because, for example, the Lindy email drafter, she's going to ask you for your authorization once for, I need to be able to read your email so I can draft a reply, and then another time for I need to be able to write a draft for them. We just try to do it very incrementally like that.

Alessio [00:07:15]: Do you think OAuth is just overall going to change? I think maybe before it was like, hey, we need to set up OAuth that humans only want to kind of do once. So we try to jam-pack things all at once versus what if you could on-demand get different permissions every time from different parts? Do you ever think about designing things knowing that maybe AI will use it instead of humans will use it? Yeah, for sure.

Flo [00:07:37]: One pattern we've started to see is people provisioning accounts for their AI agents. And so, in particular, Google Workspace accounts. So, for example, Lindy can be used as a scheduling assistant. So you can just CC her to your emails when you're trying to find time with someone. And just like a human assistant, she's going to go back and forth and offer other abilities and so forth. Very often, people don't want the other party to know that it's an AI. So it's actually funny. They introduce delays. They ask the agent to wait before replying, so it's not too obvious that it's an AI. And they provision an account on Google Suite, which costs them like $10 a month or something like that. So we're seeing that pattern more and more. I think that does the job for now. I'm not optimistic on us actually patching OAuth. Because I agree with you, ultimately, we would want to patch OAuth because the new account thing is kind of a clutch. It's really a hack. You would want to patch OAuth to have more granular access control and really be able to put your sugar in the box. I'm not optimistic on us doing that before AGI, I think. That's a very close timeline.

Swyx [00:08:36]: I'm mindful of talking about a thing without showing it. And we already have the setup to show it. Why don't we jump into a screen share? For listeners, you can jump on the YouTube and like and subscribe. But also, let's have a look at how you show off Lindy. Yeah, absolutely.

Flo [00:08:51]: I'll give an example of a very simple Lindy and then I'll graduate to a much more complicated one. A super simple Lindy that I have is, I unfortunately bought some investment properties in the south of France. It was a really, really bad idea. And I put them on a Holydew, which is like the French Airbnb, if you will. And so I received these emails from time to time telling me like, oh, hey, you made 200 bucks. Someone booked your place. When I receive these emails, I want to log this reservation in a spreadsheet. Doing this without an AI agent or without AI in general is a pain in the butt because you must write an HTML parser for this email. And so it's just hard. You may not be able to do it and it's going to break the moment the email changes. By contrast, the way it works with Lindy, it's really simple. It's two steps. It's like, okay, I receive an email. If it is a reservation confirmation, I have this filter here. Then I append a row to this spreadsheet. And so this is where you can see the AI part where the way this action is configured here, you see these purple fields on the right. Each of these fields is a prompt. And so I can say, okay, you extract from the email the day the reservation begins on. You extract the amount of the reservation. You extract the number of travelers of the reservation. And now you can see when I look at the task history of this Lindy, it's really simple. It's like, okay, you do this and boom, appending this row to this spreadsheet. And this is the information extracted. So effectively, this node here, this append row node is a mini agent. It can see everything that just happened. It has context over the task and it's appending the row. And then it's going to send a reply to the thread. That's a very simple example of an agent.

Swyx [00:10:34]: A quick follow-up question on this one while we're still on this page. Is that one call? Is that a structured output call? Yeah. Okay, nice. Yeah.

Flo [00:10:41]: And you can see here for every node, you can configure which model you want to power the node. Here I use cloud. For this, I use GPT-4 Turbo. Much more complex example, my meeting recorder. It looks very complex because I've added to it over time, but at a high level, it's really simple. It's like when a meeting begins, you record the meeting. And after the meeting, you send me a summary and you send me coaching notes. So I receive, like my Lindy is constantly coaching me. And so you can see here in the prompt of the coaching notes, I've told it, hey, you know, was I unnecessarily confrontational at any point? I'm French, so I have to watch out for that. Or not confrontational enough. Should I have double-clicked on any issue, right? So I can really give it exactly the kind of coaching that I'm expecting. And then the interesting thing here is, like, you can see the agent here, after it sent me these coaching notes, moves on. And it does a bunch of other stuff. So it goes on Slack. It disseminates the notes on Slack. It does a bunch of other stuff. But it's actually able to backtrack and resume the automation at the coaching notes email if I responded to that email. So I'll give a super concrete example. This is an actual coaching feedback that I received from Lindy. She was like, hey, this was a sales call I had with a customer. And she was like, I found your explanation of Lindy too technical. And I was able to follow up and just ask a follow-up question in the thread here. And I was like, why did you find too technical about my explanation? And Lindy restored the context. And so she basically picked up the automation back up here in the tree. And she has all of the context of everything that happened, including the meeting in which I was. So she was like, oh, you used the words deterministic and context window and agent state. And that concept exists at every level for every channel and every action that Lindy takes. So another example here is, I mentioned she also disseminates the notes on Slack. So this was a meeting where I was not, right? So this was a teammate. He's an indie meeting recorder, posts the meeting notes in this customer discovery channel on Slack. So you can see, okay, this is the onboarding call we had. This was the use case. Look at the questions. How do I make Lindy slower? How do I add delays to make Lindy slower? And I was able, in the Slack thread, to ask follow-up questions like, oh, what did we answer to these questions? And it's really handy because I know I can have this sort of interactive Q&A with these meetings. It means that very often now, I don't go to meetings anymore. I just send my Lindy. And instead of going to like a 60-minute meeting, I have like a five-minute chat with my Lindy afterwards. And she just replied. She was like, well, this is what we replied to this customer. And I can just be like, okay, good job, Jack. Like, no notes about your answers. So that's the kind of use cases people have with Lindy. It's a lot of like, there's a lot of sales automations, customer support automations, and a lot of this, which is basically personal assistance automations, like meeting scheduling and so forth.

Alessio [00:13:21]: Yeah, and I think the question that people might have is memory. So as you get coaching, how does it track whether or not you're improving? You know, if these are like mistakes you made in the past, like, how do you think about that?

Flo [00:13:31]: Yeah, we have a memory module. So I'll show you my meeting scheduler, Lindy, which has a lot of memories because by now I've used her for so long. And so every time I talk to her, she saves a memory. If I tell her, you screwed up, please don't do this. So you can see here, oh, it's got a double memory here. This is the meeting link I have, or this is the address of the office. If I tell someone to meet me at home, this is the address of my place. This is the code. I guess we'll have to edit that out. This is not the code of my place. No dogs. Yeah, so Lindy can just manage her own memory and decide when she's remembering things between executions. Okay.

Swyx [00:14:11]: I mean, I'm just going to take the opportunity to ask you, since you are the creator of this thing, how come there's so few memories, right? Like, if you've been using this for two years, there should be thousands of thousands of things. That is a good question.

Flo [00:14:22]: Agents still get confused if they have too many memories, to my point earlier about that. So I just am out of a call with a member of the Lama team at Meta, and we were chatting about Lindy, and we were going into the system prompt that we sent to Lindy, and all of that stuff. And he was amazed, and he was like, it's a miracle that it's working, guys. He was like, this kind of system prompt, this does not exist, either pre-training or post-training. These models were never trained to do this kind of stuff. It's a miracle that they can be agents at all. And so what I do, I actually prune the memories. You know, it's actually something I've gotten into the habit of doing from back when we had GPT 3.5, being Lindy agents. I suspect it's probably not as necessary in the Cloud 3.5 Sunette days, but I prune the memories. Yeah, okay.

Swyx [00:15:05]: The reason is because I have another assistant that also is recording and trying to come up with facts about me. It comes up with a lot of trivial, useless facts that I... So I spend most of my time pruning. Actually, it's not super useful. I'd much rather have high-quality facts that it accepts. Or maybe I was even thinking, were you ever tempted to add a wake word to only memorize this when I say memorize this? And otherwise, don't even bother.

Flo [00:15:30]: I have a Lindy that does this. So this is my inbox processor, Lindy. It's kind of beefy because there's a lot of different emails. But somewhere in here,

Swyx [00:15:38]: there is a rule where I'm like,

Flo [00:15:39]: aha, I can email my inbox processor, Lindy. It's really handy. So she has her own email address. And so when I process my email inbox, I sometimes forward an email to her. And it's a newsletter, or it's like a cold outreach from a recruiter that I don't care about, or anything like that. And I can give her a rule. And I can be like, hey, this email I want you to archive, moving forward. Or I want you to alert me on Slack when I have this kind of email. It's really important. And so you can see here, the prompt is, if I give you a rule about a kind of email, like archive emails from X, save it as a new memory. And I give it to the memory saving skill. And yeah.

Swyx [00:16:13]: One thing that just occurred to me, so I'm a big fan of virtual mailboxes. I recommend that everybody have a virtual mailbox. You could set up a physical mail receive thing for Lindy. And so then Lindy can process your physical mail.

Flo [00:16:26]: That's actually a good idea. I actually already have something like that. I use like health class mail. Yeah. So yeah, most likely, I can process my physical mail. Yeah.

Swyx [00:16:35]: And then the other product's idea I have, looking at this thing, is people want to brag about the complexity of their Lindys. So this would be like a 65 point Lindy, right?

Flo [00:16:43]: What's a 65 point?

Swyx [00:16:44]: Complexity counting. Like how many nodes, how many things, how many conditions, right? Yeah.

Flo [00:16:49]: This is not the most complex one. I have another one. This designer recruiter here is kind of beefy as well. Right, right, right. So I'm just saying,

Swyx [00:16:56]: let people brag. Let people be super users. Oh, right.

Flo [00:16:59]: Give them a score. Give them a score.

Swyx [00:17:01]: Then they'll just be like, okay, how high can you make this score?

Flo [00:17:04]: Yeah, that's a good point. And I think that's, again, the beauty of this on-rails phenomenon. It's like, think of the equivalent, the prompt equivalent of this Lindy here, for example, that we're looking at. It'd be monstrous. And the odds that it gets it right are so low. But here, because we're really holding the agent's hand step by step by step, it's actually super reliable. Yeah.

Swyx [00:17:22]: And is it all structured output-based? Yeah. As far as possible? Basically. Like, there's no non-structured output?

Flo [00:17:27]: There is. So, for example, here, this AI agent step, right, or this send message step, sometimes it gets to... That's just plain text.

Swyx [00:17:35]: That's right.

Flo [00:17:36]: Yeah. So I'll give you an example. Maybe it's TMI. I'm having blood pressure issues these days. And so this Lindy here, I give it my blood pressure readings, and it updates a log that I have of my blood pressure that it sends to my doctor.

Swyx [00:17:49]: Oh, so every Lindy comes with a to-do list?

Flo [00:17:52]: Yeah. Every Lindy has its own task history. Huh. Yeah. And so you can see here, this is my main Lindy, my personal assistant, and I've told it, where is this? There is a point where I'm like, if I am giving you a health-related fact, right here, I'm giving you health information, so then you update this log that I have in this Google Doc, and then you send me a message. And you can see, I've actually not configured this send message node. I haven't told it what to send me a message for. Right? And you can see, it's actually lecturing me. It's like, I'm giving it my blood pressure ratings. It's like, hey, it's a bit high. Here are some lifestyle changes you may want to consider.

Alessio [00:18:27]: I think maybe this is the most confusing or new thing for people. So even I use Lindy and I didn't even know you could have multiple workflows in one Lindy. I think the mental model is kind of like the Zapier workflows. It starts and it ends. It doesn't choose between. How do you think about what's a Lindy versus what's a sub-function of a Lindy? Like, what's the hierarchy?

Flo [00:18:48]: Yeah. Frankly, I think the line is a little arbitrary. It's kind of like when you code, like when do you start to create a new class versus when do you overload your current class. I think of it in terms of like jobs to be done and I think of it in terms of who is the Lindy serving. This Lindy is serving me personally. It's really my day-to-day Lindy. I give it a bunch of stuff, like very easy tasks. And so this is just the Lindy I go to. Sometimes when a task is really more specialized, so for example, I have this like summarizer Lindy or this designer recruiter Lindy. These tasks are really beefy. I wouldn't want to add this to my main Lindy, so I just created a separate Lindy for it. Or when it's a Lindy that serves another constituency, like our customer support Lindy, I don't want to add that to my personal assistant Lindy. These are two very different Lindys.

Alessio [00:19:31]: And you can call a Lindy from within another Lindy. That's right. You can kind of chain them together.

Flo [00:19:36]: Lindys can work together, absolutely.

Swyx [00:19:38]: A couple more things for the video portion. I noticed you have a podcast follower. We have to ask about that. What is that?

Flo [00:19:46]: So this one wakes me up every... So wakes herself up every week. And she sends me... So she woke up yesterday, actually. And she searches for Lenny's podcast. And she looks for like the latest episode on YouTube. And once she finds it, she transcribes the video and then she sends me the summary by email. I don't listen to podcasts as much anymore. I just like read these summaries. Yeah.

Alessio [00:20:09]: We should make a latent space Lindy. Marketplace.

Swyx [00:20:12]: Yeah. And then you have a whole bunch of connectors. I saw the list briefly. Any interesting one? Complicated one that you're proud of? Anything that you want to just share? Connector stories.

Flo [00:20:23]: So many of our workflows are about meeting scheduling. So we had to build some very open unity tools around meeting scheduling. So for example, one that is surprisingly hard is this find available times action. You would not believe... This is like a thousand lines of code or something. It's just a very beefy action. And you can pass it a bunch of parameters about how long is the meeting? When does it start? When does it end? What are the meetings? The weekdays in which I meet? How many time slots do you return? What's the buffer between my meetings? It's just a very, very, very complex action. I really like our GitHub action. So we have a Lindy PR reviewer. And it's really handy because anytime any bug happens... So the Lindy reads our guidelines on Google Docs. By now, the guidelines are like 40 pages long or something. And so every time any new kind of bug happens, we just go to the guideline and we add the lines. Like, hey, this has happened before. Please watch out for this category of bugs. And it's saving us so much time every day.

Alessio [00:21:19]: There's companies doing PR reviews. Where does a Lindy start? When does a company start? Or maybe how do you think about the complexity of these tasks when it's going to be worth having kind of like a vertical standalone company versus just like, hey, a Lindy is going to do a good job 99% of the time?

Flo [00:21:34]: That's a good question. We think about this one all the time. I can't say that we've really come up with a very crisp articulation of when do you want to use a vertical tool versus when do you want to use a horizontal tool. I think of it as very similar to the internet. I find it surprising the extent to which a horizontal search engine has won. But I think that Google, right? But I think the even more surprising fact is that the horizontal search engine has won in almost every vertical, right? You go through Google to search Reddit. You go through Google to search Wikipedia. I think maybe the biggest exception is e-commerce. Like you go to Amazon to search e-commerce, but otherwise you go through Google. And I think that the reason for that is because search in each vertical has more in common with search than it does with each vertical. And search is so expensive to get right. Like Google is a big company that it makes a lot of sense to aggregate all of these different use cases and to spread your R&D budget across all of these different use cases. I have a thesis, which is, it's a really cool thesis for Lindy, is that the same thing is true for agents. I think that by and large, in a lot of verticals, agents in each vertical have more in common with agents than they do with each vertical. I also think there are benefits in having a single agent platform because that way your agents can work together. They're all like under one roof. That way you only learn one platform and so you can create agents for everything that you want. And you don't have to like pay for like a bunch of different platforms and so forth. So I think ultimately, it is actually going to shake out in a way that is similar to search in that search is everywhere on the internet. Every website has a search box, right? So there's going to be a lot of vertical agents for everything. I think AI is going to completely penetrate every category of software. But then I also think there are going to be a few very, very, very big horizontal agents that serve a lot of functions for people.

Swyx [00:23:14]: That is actually one of the questions that we had about the agent stuff. So I guess we can transition away from the screen and I'll just ask the follow-up, which is, that is a hot topic. You're basically saying that the current VC obsession of the day, which is vertical AI enabled SaaS, is mostly not going to work out. And then there are going to be some super giant horizontal SaaS.

Flo [00:23:34]: Oh, no, I'm not saying it's either or. Like SaaS today, vertical SaaS is huge and there's also a lot of horizontal platforms. If you look at like Airtable or Notion, basically the entire no-code space is very horizontal. I mean, Loom and Zoom and Slack, there's a lot of very horizontal tools out there. Okay.

Swyx [00:23:49]: I was just trying to get a reaction out of you for hot takes. Trying to get a hot take.

Flo [00:23:54]: No, I also think it is natural for the vertical solutions to emerge first because it's just easier to build. It's just much, much, much harder to build something horizontal. Cool.

Swyx [00:24:03]: Some more Lindy-specific questions. So we covered most of the top use cases and you have an academy. That was nice to see. I also see some other people doing it for you for free. So like Ben Spites is doing it and then there's some other guy who's also doing like lessons. Yeah. Which is kind of nice, right? Yeah, absolutely. You don't have to do any of that.

Flo [00:24:20]: Oh, we've been seeing it more and more on like LinkedIn and Twitter, like people posting their Lindys and so forth.

Swyx [00:24:24]: I think that's the flywheel that you built the platform where creators see value in allying themselves to you. And so then, you know, your incentive is to make them successful so that they can make other people successful and then it just drives more and more engagement. Like it's earned media. Like you don't have to do anything.

Flo [00:24:39]: Yeah, yeah. I mean, community is everything.

Swyx [00:24:41]: Are you doing anything special there? Any big wins?

Flo [00:24:44]: We have a Slack community that's pretty active. I can't say we've invested much more than that so far.

Swyx [00:24:49]: I would say from having, so I have some involvement in the no-code community. I would say that Webflow going very hard after no-code as a category got them a lot more allies than just the people using Webflow. So it helps you to grow the community beyond just Lindy. And I don't know what this is called. Maybe it's just no-code again. Maybe you want to call it something different. But there's definitely an appetite for this and you are one of a broad category, right? Like just before you, we had Dust and, you know, they're also kind of going after a similar market. Zapier obviously is not going to try to also compete with you. Yeah. There's no question there. It's just like a reaction about community. Like I think a lot about community. Lanespace is growing the community of AI engineers. And I think you have a slightly different audience of, I don't know what.

Flo [00:25:33]: Yeah. I think the no-code tinkerers is the community. Yeah. It is going to be the same sort of community as what Webflow, Zapier, Airtable, Notion to some extent.

Swyx [00:25:43]: Yeah. The framing can be different if you were, so I think tinkerers has this connotation of not serious or like small. And if you framed it to like no-code EA, we're exclusively only for CEOs with a certain budget, then you just have, you tap into a different budget.

Flo [00:25:58]: That's true. The problem with EA is like, the CEO has no willingness to actually tinker and play with the platform.

Swyx [00:26:05]: Maybe Andrew's doing that. Like a lot of your biggest advocates are CEOs, right?

Flo [00:26:09]: A solopreneur, you know, small business owners, I think Andrew is an exception. Yeah. Yeah, yeah, he is.

Swyx [00:26:14]: He's an exception in many ways. Yep.

Alessio [00:26:16]: Just before we wrap on the use cases, is Rick rolling your customers? Like a officially supported use case or maybe tell that story?

Flo [00:26:24]: It's one of the main jobs to be done, really. Yeah, we woke up recently, so we have a Lindy obviously doing our customer support and we do check after the Lindy. And so we caught this email exchange where someone was asking Lindy for video tutorials. And at the time, actually, we did not have video tutorials. We do now on the Lindy Academy. And Lindy responded to the email. It's like, oh, absolutely, here's a link. And we were like, what? Like, what kind of link did you send? And so we clicked on the link and it was a recall. We actually reacted fast enough that the customer had not yet opened the email. And so we reacted immediately. Like, oh, hey, actually, sorry, this is the right link. And so the customer never reacted to the first link. And so, yeah, I tweeted about that. It went surprisingly viral. And I checked afterwards in the logs. We did like a database query and we found, I think, like three or four other instances of it having happened before.

Swyx [00:27:12]: That's surprisingly low.

Flo [00:27:13]: It is low. And we fixed it across the board by just adding a line to the system prompt that's like, hey, don't recall people, please don't recall.

Swyx [00:27:21]: Yeah, yeah, yeah. I mean, so, you know, you can explain it retroactively, right? Like, that YouTube slug has been pasted in so many different corpuses that obviously it learned to hallucinate that.

Alessio [00:27:31]: And it pretended to be so many things. That's the thing.

Swyx [00:27:34]: I wouldn't be surprised if that takes one token. Like, there's this one slug in the tokenizer and it's just one token.

Flo [00:27:41]: That's the idea of a YouTube video.

Swyx [00:27:43]: Because it's used so much, right? And you have to basically get it exactly correct. It's probably not. That's a long speech.

Flo [00:27:52]: It would have been so good.

Alessio [00:27:55]: So this is just a jump maybe into evals from here. How could you possibly come up for an eval that says, make sure my AI does not recall my customer? I feel like when people are writing evals, that's not something that they come up with. So how do you think about evals when it's such like an open-ended problem space?

Flo [00:28:12]: Yeah, it is tough. We built quite a bit of infrastructure for us to create evals in one click from any conversation history. So we can point to a conversation and we can be like, in one click we can turn it into effectively a unit test. It's like, this is a good conversation. This is how you're supposed to handle things like this. Or if it's a negative example, then we modify a little bit the conversation after generating the eval. So it's very easy for us to spin up this kind of eval.

Alessio [00:28:36]: Do you use an off-the-shelf tool which is like Brain Trust on the podcast? Or did you just build your own?

Flo [00:28:41]: We unfortunately built our own. We're most likely going to switch to Brain Trust. Well, when we built it, there was nothing. Like there was no eval tool, frankly. I mean, we started this project at the end of 2022. It was like, it was very, very, very early. I wouldn't recommend it to build your own eval tool. There's better solutions out there and our eval tool breaks all the time and it's a nightmare to maintain. And that's not something we want to be spending our time on.

Swyx [00:29:04]: I was going to ask that basically because I think my first conversations with you about Lindy was that you had a strong opinion that everyone should build their own tools. And you were very proud of your evals. You're kind of showing off to me like how many evals you were running, right?

Flo [00:29:16]: Yeah, I think that was before all of these tools came around. I think the ecosystem has matured a fair bit.

Swyx [00:29:21]: What is one thing that Brain Trust has nailed that you always struggled to do?

Flo [00:29:25]: We're not using them yet, so I couldn't tell. But from what I've gathered from the conversations I've had, like they're doing what we do with our eval tool, but better.

Swyx [00:29:33]: And like they do it, but also like 60 other companies do it, right? So I don't know how to shop apart from brand. Word of mouth.

Flo [00:29:41]: Same here.

Swyx [00:29:42]: Yeah, like evals or Lindys, there's two kinds of evals, right? Like in some way, you don't have to eval your system as much because you've constrained the language model so much. And you can rely on open AI to guarantee that the structured outputs are going to be good, right? We had Michelle sit where you sit and she explained exactly how they do constraint grammar sampling and all that good stuff. So actually, I think it's more important for your customers to eval their Lindys than you evaling your Lindy platform because you just built the platform. You don't actually need to eval that much.

Flo [00:30:14]: Yeah. In an ideal world, our customers don't need to care about this. And I think the bar is not like, look, it needs to be at 100%. I think the bar is it needs to be better than a human. And for most use cases we serve today, it is better than a human, especially if you put it on Rails.

Swyx [00:30:30]: Is there a limiting factor of Lindy at the business? Like, is it adding new connectors? Is it adding new node types? Like how do you prioritize what is the most impactful to your company?

Flo [00:30:41]: Yeah. The raw capabilities for sure are a big limit. It is actually shocking the extent to which the model is no longer the limit. It was the limit a year ago. It was too expensive. The context window was too small. It's kind of insane that we started building this when the context windows were like 4,000 tokens. Like today, our system prompt is more than 4,000 tokens. So yeah, the model is actually very much not a limit anymore. It almost gives me pause because I'm like, I want the model to be a limit. And so no, the integrations are ones, the core capabilities are ones. So for example, we are investing in a system that's basically, I call it like the, it's a J hack. Give me these names, like the poor man's RLHF. So you can turn on a toggle on any step of your Lindy workflow to be like, ask me for confirmation before you actually execute this step. So it's like, hey, I receive an email, you send a reply, ask me for confirmation before actually sending it. And so today you see the email that's about to get sent and you can either approve, deny, or change it and then approve. And we are making it so that when you make a change, we are then saving this change that you're making or embedding it in the vector database. And then we are retrieving these examples for future tasks and injecting them into the context window. So that's the kind of capability that makes a huge difference for users. That's the bottleneck today. It's really like good old engineering and product work.

Swyx [00:31:52]: I assume you're hiring. We'll do a call for hiring at the end.

Alessio [00:31:54]: Any other comments on the model side? When did you start feeling like the model was not a bottleneck anymore? Was it 4.0? Was it 3.5? 3.5.

Flo [00:32:04]: 3.5 Sonnet, definitely. I think 4.0 is overhyped, frankly. We don't use 4.0. I don't think it's good for agentic behavior. Yeah, 3.5 Sonnet is when I started feeling that. And then with prompt caching with 3.5 Sonnet, like that fills the cost, cut the cost again. Just cut it in half. Yeah.

Swyx [00:32:21]: Your prompts are... Some of the problems with agentic uses is that your prompts are kind of dynamic, right? Like from caching to work, you need the front prefix portion to be stable.

Flo [00:32:32]: Yes, but we have this append-only ledger paradigm. So every node keeps appending to that ledger and every filled node inherits all the context built up by all the previous nodes. And so we can just decide, like, hey, every X thousand nodes, we trigger prompt caching again.

Swyx [00:32:47]: Oh, so you do it like programmatically, not all the time.

Flo [00:32:50]: No, sorry. Anthropic manages that for us. But basically, it's like, because we keep appending to the prompt, the prompt caching works pretty well.

Alessio [00:32:57]: We have this small podcaster tool that I built for the podcast and I rewrote all of our prompts because I noticed, you know, I was inputting stuff early on. I wonder how much more money OpenAN and Anthropic are making just because people don't rewrite their prompts to be like static at the top and like dynamic at the bottom.

Flo [00:33:13]: I think that's the remarkable thing about what we're having right now. It's insane that these companies are routinely cutting their costs by two, four, five. Like, they basically just apply constraints. They want people to take advantage of these innovations. Very good.

Swyx [00:33:25]: Do you have any other competitive commentary? Commentary? Dust, WordWare, Gumloop, Zapier? If not, we can move on.

Flo [00:33:31]: No comment.

Alessio [00:33:32]: I think the market is,

Flo [00:33:33]: look, I mean, AGI is coming. All right, that's what I'm talking about.

Swyx [00:33:38]: I think you're helping. Like, you're paving the road to AGI.

Flo [00:33:41]: I'm playing my small role. I'm adding my small brick to this giant, giant, giant castle. Yeah, look, when it's here, we are going to, this entire category of software is going to create, it's going to sound like an exaggeration, but it is a fact it is going to create trillions of dollars of value in a few years, right? It's going to, for the first time, we're actually having software directly replace human labor. I see it every day in sales calls. It's like, Lindy is today replacing, like, we talk to even small teams. It's like, oh, like, stop, this is a 12-people team here. I guess we'll set up this Lindy for one or two days, and then we'll have to decide what to do with this 12-people team. And so, yeah. To me, there's this immense uncapped market opportunity. It's just such a huge ocean, and there's like three sharks in the ocean. I'm focused on the ocean more than on the sharks.

Swyx [00:34:25]: So we're moving on to hot topics, like, kind of broadening out from Lindy, but obviously informed by Lindy. What are the high-order bits of good agent design?

Flo [00:34:31]: The model, the model, the model, the model. I think people fail to truly, and me included, they fail to truly internalize the bitter lesson. So for the listeners out there who don't know about it, it's basically like, you just scale the model. Like, GPUs go brr, it's all that matters. I think it also holds for the cognitive architecture. I used to be very cognitive architecture-filled, and I was like, ah, and I was like a critic, and I was like a generator, and all this, and then it's just like, GPUs go brr, like, just like let the model do its job. I think we're seeing it a little bit right now with O1. I'm seeing some tweets that say that the new 3.5 SONNET is as good as O1, but with none of all the crazy...

Swyx [00:35:09]: It beats O1 on some measures. On some reasoning tasks. On AIME, it's still a lot lower. Like, it's like 14 on AIME versus O1, it's like 83.

Flo [00:35:17]: Got it. Right. But even O1 is still the model. Yeah.

Swyx [00:35:22]: Like, there's no cognitive architecture on top of it.

Flo [00:35:23]: You can just wait for O1 to get better.

Alessio [00:35:25]: And so, as a founder, how do you think about that, right? Because now, knowing this, wouldn't you just wait to start Lindy? You know, you start Lindy, it's like 4K context, the models are not that good. It's like, but you're still kind of like going along and building and just like waiting for the models to get better. How do you today decide, again, what to build next, knowing that, hey, the models are going to get better, so maybe we just shouldn't focus on improving our prompt design and all that stuff and just build the connectors instead or whatever? Yeah.

Flo [00:35:51]: I mean, that's exactly what we do. Like, all day, we always ask ourselves, oh, when we have a feature idea or a feature request, we ask ourselves, like, is this the kind of thing that just gets better while we sleep because models get better? I'm reminded, again, when we started this in 2022, we spent a lot of time because we had to around context pruning because 4,000 tokens is really nothing. You really can't do anything with 4,000 tokens. All that work was throwaway work. Like, now it's like it was for nothing, right? Now we just assume that infinite context windows are going to be here in a year or something, a year and a half, and infinitely cheap as well, and dynamic compute is going to be here. Like, we just assume all of these things are going to happen, and so we really focus, our job to be done in the industry is to provide the input and output to the model. I really compare it all the time to the PC and the CPU, right? Apple is busy all day. They're not like a CPU wrapper. They have a lot to build, but they don't, well, now actually they do build the CPU as well, but leaving that aside, they're busy building a laptop. It's just a lot of work to build these things. It's interesting because, like,

Swyx [00:36:45]: for example, another person that we're close to, Mihaly from Repl.it, he often says that the biggest jump for him was having a multi-agent approach, like the critique thing that you just said that you don't need, and I wonder when, in what situations you do need that and what situations you don't. Obviously, the simple answer is for coding, it helps, and you're not coding, except for, are you still generating code? In Indy? Yeah.

Flo [00:37:09]: No, we do. Oh, right. No, no, no, the cognitive architecture changed. We don't, yeah.

Swyx [00:37:13]: Yeah, okay. For you, you're one shot, and you chain tools together, and that's it. And if the user really wants

Flo [00:37:18]: to have this kind of critique thing, you can also edit the prompt, you're welcome to. I have some of my Lindys, I've told them, like, hey, be careful, think step by step about what you're about to do, but that gives you a little bump for some use cases, but, yeah.

Alessio [00:37:30]: What about unexpected model releases? So, Anthropic released computer use today. Yeah. I don't know if many people were expecting computer use to come out today. Do these things make you rethink how to design, like, your roadmap and things like that, or are you just like, hey, look, whatever, that's just, like, a small thing in their, like, AGI pursuit, that, like, maybe they're not even going to support, and, like, it's still better for us to build our own integrations into systems and things like that. Because maybe people will say, hey, look, why am I building all these API integrations

Flo [00:38:02]: when I can just do computer use and never go to the product? Yeah. No, I mean, we did take into account computer use. We were talking about this a year ago or something, like, we've been talking about it as part of our roadmap. It's been clear to us that it was coming, My philosophy about it is anything that can be done with an API must be done by an API or should be done by an API for a very long time. I think it is dangerous to be overly cavalier about improvements of model capabilities. I'm reminded of iOS versus Android. Android was built on the JVM. There was a garbage collector, and I can only assume that the conversation that went down in the engineering meeting room was, oh, who cares about the garbage collector? Anyway, Moore's law is here, and so that's all going to go to zero eventually. Sure, but in the meantime, you are operating on a 400 MHz CPU. It was like the first CPU on the iPhone 1, and it's really slow, and the garbage collector is introducing a tremendous overhead on top of that, especially a memory overhead. For the longest time, and it's really only been recently that Android caught up to iOS in terms of how smooth the interactions were, but for the longest time, Android phones were significantly slower

Swyx [00:39:07]: and laggier

Flo [00:39:08]: and just not feeling as good as iOS devices. Look, when you're talking about modules and magnitude of differences in terms of performance and reliability, which is what we are talking about when we're talking about API use versus computer use, then you can't ignore that, right? And so I think we're going to be in an API use world for a while.

Swyx [00:39:27]: O1 doesn't have API use today. It will have it at some point, and it's on the roadmap. There is a future in which OpenAI goes much harder after your business, your market, than it is today. Like, ChatGPT, it's its own business. All they need to do is add tools to the ChatGPT, and now they're suddenly competing with you. And by the way, they have a GPT store where a bunch of people have already configured their tools to fit with them. Is that a concern?

Flo [00:39:56]: I think even the GPT store, in a way, like the way they architect it, for example, their plug-in systems are actually grateful because we can also use the plug-ins. It's very open. Now, again, I think it's going to be such a huge market. I think there's going to be a lot of different jobs to be done. I know they have a huge enterprise offering and stuff, but today, ChatGPT is a consumer app. And so, the sort of flow detail I showed you, this sort of workflow, this sort of use cases that we're going after, which is like, we're doing a lot of lead generation and lead outreach and all of that stuff. That's not something like meeting recording, like Lindy Today right now joins your Zoom meetings and takes notes, all of that stuff.

Swyx [00:40:34]: I don't see that so far

Flo [00:40:35]: on the OpenAI roadmap.

Swyx [00:40:36]: Yeah, but they do have an enterprise team that we talk to You're hiring GMs?

Flo [00:40:42]: We did.

Swyx [00:40:43]: It's a fascinating way to build a business, right? Like, what should you, as CEO, be in charge of? And what should you basically hire

Flo [00:40:52]: a mini CEO to do? Yeah, that's a good question. I think that's also something we're figuring out. The GM thing was inspired from my days at Uber, where we hired one GM per city or per major geo area. We had like all GMs, regional GMs and so forth. And yeah, Lindy is so horizontal that we thought it made sense to hire GMs to own each vertical and the go-to market of the vertical and the customization of the Lindy templates for these verticals and so forth. What should I own as a CEO? I mean, the canonical reply here is always going to be, you know, you own the fundraising, you own the culture, you own the... What's the rest of the canonical reply? The culture, the fundraising.

Swyx [00:41:29]: I don't know,

Flo [00:41:30]: products. Even that, eventually, you do have to hand out. Yes, the vision, the culture, and the foundation. Well, you've done your job as a CEO. In practice, obviously, yeah, I mean, all day, I do a lot of product work still and I want to keep doing product work for as long as possible.

Swyx [00:41:48]: Obviously, like you're recording and managing the team. Yeah.

Flo [00:41:52]: That one feels like the most automatable part of the job, the recruiting stuff.

Swyx [00:41:56]: Well, yeah. You saw my

Flo [00:41:59]: design your recruiter here. Relationship between Factorio and building Lindy. We actually very often talk about how the business of the future is like a game of Factorio. Yeah. So, in the instance, it's like Slack and you've got like 5,000 Lindys in the sidebar and your job is to somehow manage your 5,000 Lindys. And it's going to be very similar to company building because you're going to look for like the highest leverage way to understand what's going on in your AI company and understand what levels do you have to make impact in that company. So, I think it's going to be very similar to like a human company except it's going to go infinitely faster. Today, in a human company, you could have a meeting with your team and you're like, oh, I'm going to build a facility and, you know, now it's like, okay,

Swyx [00:42:40]: boom, I'm going to spin up 50 designers. Yeah. Like, actually, it's more important that you can clone an existing designer that you know works because the hiring process, you cannot clone someone because every new person you bring in is going to have their own tweaks

Flo [00:42:54]: and you don't want that. Yeah.

Swyx [00:42:56]: That's true. You want an army of mindless drones

Flo [00:42:59]: that all work the same way.

Swyx [00:43:00]: The reason I bring this, bring Factorio up as well is one, Factorio Space just came out. Apparently, a whole bunch of people stopped working. I tried out Factorio. I never really got that much into it. But the other thing was, you had a tweet recently about how the sort of intentional top-down design was not as effective as just build. Yeah. Just ship.

Flo [00:43:21]: I think people read a little bit too much into that tweet. It went weirdly viral. I was like, I did not intend it as a giant statement online.

Swyx [00:43:28]: I mean, you notice you have a pattern with this, right? Like, you've done this for eight years now.

Flo [00:43:33]: You should know. I legit was just hearing an interesting story about the Factorio game I had. And everybody was like, oh my God, so deep. I guess this explains everything about life and companies. There is something to be said, certainly, about focusing on the constraint. And I think it is Patrick Collison who said, people underestimate the extent to which moonshots are just one pragmatic step taken after the other. And I think as long as you have some inductive bias about, like, some loose idea about where you want to go, I think it makes sense to follow a sort of greedy search along that path. I think planning and organizing is important. And having older is important.

Swyx [00:44:05]: I'm wrestling with that. There's two ways I encountered it recently. One with Lindy. When I tried out one of your automation templates and one of them was quite big and I just didn't understand it, right? So, like, it was not as useful to me as a small one that I can just plug in and see all of. And then the other one was me using Cursor. I was very excited about O1 and I just up front

Flo [00:44:27]: stuffed everything

Swyx [00:44:28]: I wanted to do into my prompt and expected O1 to do everything. And it got itself into a huge jumbled mess and it was stuck. It was really... There was no amount... I wasted, like, two hours on just, like, trying to get out of that hole. So I threw away the code base, started small, switched to Clouds on it and build up something working and just add it over time and it just worked. And to me, that was the factorial sentiment, right? Maybe I'm one of those fanboys that's just, like, obsessing over the depth of something that you just randomly tweeted out. But I think it's true for company building, for Lindy building, for coding.

Flo [00:45:02]: I don't know. I think it's fair and I think, like, you and I talked about there's the Tuft & Metal principle and there's this other... Yes, I love that. There's the... I forgot the name of this other blog post but it's basically about this book Seeing Like a State that talks about the need for legibility and people who optimize the system for its legibility and anytime you make a system... So legible is basically more understandable. Anytime you make a system more understandable from the top down, it performs less well from the bottom up. And it's fine but you should at least make this trade-off with your eyes wide open. You should know, I am sacrificing performance for understandability, for legibility. And in this case, for you, it makes sense. It's like you are actually optimizing for legibility. You do want to understand your code base but in some other cases it may not make sense. Sometimes it's better to leave the system alone and let it be its glorious, chaotic, organic self and just trust that it's going to perform well even though you don't understand it completely.

Swyx [00:45:55]: It does remind me of a common managerial issue or dilemma which you experienced in the small scale of Lindy where, you know, do you want to organize your company by functional sections or by products or, you know, whatever the opposite of functional is. And you tried it one way and it was more legible to you as CEO but actually it stopped working at the small level. Yeah.

Flo [00:46:17]: I mean, one very small example, again, at a small scale is we used to have everything on Notion. And for me, as founder, it was awesome because everything was there. The roadmap was there. The tasks were there. The postmortems were there. And so, the postmortem was linked

Swyx [00:46:31]: to its task.

Flo [00:46:32]: It was optimized for you. Exactly. And so, I had this, like, one pane of glass and everything was on Notion. And then the team, one day,

Swyx [00:46:39]: came to me with pitchforks

Flo [00:46:40]: and they really wanted to implement Linear. And I had to bite my fist so hard. I was like, fine, do it. Implement Linear. Because I was like, at the end of the day, the team needs to be able to self-organize and pick their own tools.

Alessio [00:46:51]: Yeah. But it did make the company slightly less legible for me. Another big change you had was going away from remote work, every other month. The discussion comes up again. What was that discussion like? How did your feelings change? Was there kind of like a threshold of employees and team size where you felt like, okay, maybe that worked. Now it doesn't work anymore. And how are you thinking about the future

Flo [00:47:12]: as you scale the team? Yeah. So, for context, I used to have a business called TeamFlow. The business was about building a virtual office for remote teams. And so, being remote was not merely something we did. It was, I was banging the remote drum super hard and helping companies to go remote. And so, frankly, in a way, it's a bit embarrassing for me to do a 180 like that. But I guess, when the facts changed, I changed my mind. What happened? Well, I think at first, like everyone else, we went remote by necessity. It was like COVID and you've got to go remote. And on paper, the gains of remote are enormous. In particular, from a founder's standpoint, being able to hire from anywhere is huge. Saving on rent is huge. Saving on commute is huge for everyone and so forth. But then, look, we're all here. It's like, it is really making it much harder to work together. And I spent three years of my youth trying to build a solution for this. And my conclusion is, at least we couldn't figure it out and no one else could. Zoom didn't figure it out. We had like a bunch of competitors. Like, Gathertown was one of the bigger ones. We had dozens and dozens of competitors. No one figured it out. I don't know that software can actually solve this problem. The reality of it is, everyone just wants to get off the darn Zoom call. And it's not a good feeling to be in your home office if you're even going to have a home office all day. It's harder to build culture. It's harder to get in sync. I think software is peculiar because it's like an iceberg. It's like the vast majority of it is submerged underwater. And so, the quality of the software that you ship is a function of the alignment of your mental models about what is below that waterline. Can you actually get in sync about what it is exactly fundamentally that we're building? What is the soul of our product? And it is so much harder to get in sync about that when you're remote. And then you waste time in a thousand ways because people are offline and you can't get a hold of them or you can't share your screen. It's just like you feel like you're walking in molasses all day. And eventually, I was like, okay, this is it. We're not going to do this anymore.

Swyx [00:49:03]: Yeah. I think that is the current builder San Francisco consensus here. Yeah. But I still have a big... One of my big heroes as a CEO is Sid Subban from GitLab.

Flo [00:49:14]: Mm-hmm.

Swyx [00:49:15]: Matt Mullenweg

Flo [00:49:16]: used to be a hero.

Swyx [00:49:17]: But these people run thousand-person remote businesses. The main idea is that at some company size, your company is remote anyway. Yeah. Because if you go from one building to two buildings, congrats, you're now remote from the other building. If you want to go from one city office to two city offices, they're remote from each other.

Flo [00:49:35]: But the teams are co-located. Every time anyone talks about remote success stories, they always talk about this real force. Yeah. It's always GitLab and WordPress and Zapier. Zapier. It used to be Envision. And I will point out that in every one of these examples, you have a co-located counterfactual that is sometimes orders of magnitude bigger. Look, I like Matt Mullenweg a lot, but WordPress is a commercial failure. They run 60% of the internet and they're like a fraction of the size of even Substack. Right?

Swyx [00:50:05]: They're trying to get more money.

Flo [00:50:07]: Yeah, that's my point, right? Look, GitLab is much smaller than GitHub. Envision, you know, is no more. And Figma, like, completely took off. And Figma was like very in-person. So, I think if you're optimizing for productivity, if you really know, hey, this is a support ticket, right, and I want to have my support ticket for a buck 50 per support ticket and next year I want it for a buck 20, then sure, send your support ticket team to offshore, like the Philippines or whatever, and just optimize for cost. If you're optimizing for cost, absolutely be remote. If you're optimizing for creativity, which I think that software and product building is a creative endeavor, if you're optimizing for creativity, it's kind of like you have to be in person and hear the music to do that.

Swyx [00:50:52]: Yeah. Maybe the line is that all jobs that can be remote should be AI or Lindy's and all jobs that are not remote are in person. Like, there's a very,

Flo [00:51:04]: very clear separation of jobs. Sure. Well, I think over the long term,

Swyx [00:51:09]: every job is going to be AI anyway. It would be curious to break down what you think is creativity in coding and in product defining and how to express that for sure. You're definitely what I call a temperature zero use case of LLMs. You want it to be reliable, predictable, small. And then there's other use cases of LLMs that are more for creativity and engines. Right? I haven't checked, but I'm pretty sure no one uses Lindy for brainstorming. Actually,

Flo [00:51:36]: probably they do. I use Lindy for brainstorming

Swyx [00:51:38]: a lot, actually. Yeah, yeah. But you want to have something that's anti-fragile to hallucination. Hallucinations are good.

Flo [00:51:45]: By creativity, I mean, is it about direction or magnitude? If it is about direction, like decide what to do, then it's a creative endeavor. If it is about magnitude and just do it as fast as possible, as cheap as possible, then it's magnitude. And so sometimes, you know, software companies are not necessarily creative. Sometimes you know what you're doing. And I'll say that it's going to come across the wrong way, but linear. I look up to a huge amount, like such amazing product builders, but they know what they're building. They're building a I don't mean to throw shade at them. Like, good for them.

Swyx [00:52:20]: I think they're aware that they're not like They recently got shit for saying that they have work-life balance on their job description.

Flo [00:52:26]: They're like, what do you mean by this? We're building a new kind of product that no one's ever built before. And so we're just scratching our heads all day trying to get in sync about like, what exactly is it

Swyx [00:52:37]: that we're building? What does it consist of? Inherently creative struggle. Yeah. Dare we ask about San Francisco? And there's a whole bunch of tough stuff in here. Probably the biggest one I would just congratulate you on is becoming American, right? Very French, but your heart was sort of in the U.S. You eventually found your way here. What are your takes for founders? A few years ago, you wrote this post on Go West, young man. And now you've basically completed that journey, right? You're now here and up to the point where you're kind of mystified by how Europe has been so decel.

Flo [00:53:11]: In a way, though, I feel vindicated because I was making the prediction that Europe was over 14 years ago or something like that. I think it's been a walking corpse for a long time. I think it is only now becoming obvious that it is paying the consequences of its policies from 10, 20, 30 years ago. I think at this point, I wish I could rewrite the Go West, young man article but really even more extreme. I think at this point, if you are in tech, especially in AI, but if you're in tech and you're not in San Francisco, you either lack judgment or you lack ambition. It's funny, I recently told that to someone and they were like, oh, not everyone wants to be like a unicorn founder. And I was like, like I said, judgment or ambition. It's fine to not have ambition. It's fine to want to prioritize other things than your company in life or your career in life. That's perfectly okay. But know that that's the trade-off you're making. If you prioritize your career, you've got to be here.

Alessio [00:54:03]: As a fellow European escapist, I grew up in Rome.

Flo [00:54:05]: Yeah, how do you feel?

Swyx [00:54:06]: We never talk about your feelings about Europe.

Alessio [00:54:08]: Yeah, I've been in the U.S. now six years. Well, I started my first company in Europe 10 years ago, something like that. Yeah, you can tell nobody really wants to do much. And then you're like, okay. It's funny, I was looking back through some old tweets and I was sending all these tweets to Marc Andreessen like 15 years ago like trying to like learn more about why are you guys putting money in these things that most people here would say you're like crazy to like even back. And eventually, you know, I started doing venture six, five years ago. And I think just like so many people in Europe reach out and ask, hey, can you like talk to our team and they just cannot comprehend like the risk appetite that people have here. It's just like so foreign to people, at least in Italy and like in some parts of Europe. I'm sure there's some great founders in Europe, but like the average European founders, like why would I leave my job at the post office to go work on the startup that could change everything and become very successful but might go out of business instead in the U.S. You have like, you know, we host a hackathon and it's like 400 people and it's like, where can I go work that it's like no job security, you know? It's just like completely different and there's no incentives from the government to change that. There's no way you can like change such a deep-rooted culture of like, you know, going and wine and April spritz

Flo [00:55:27]: and all of that

Alessio [00:55:28]: early in the afternoon.

Flo [00:55:29]: So, I don't really know how it's going to change.

Alessio [00:55:32]: It's quality of life. Yeah, totally. That's why I left. The quality is so high that I left. But again, I think it's better to move here and just, if you want to do this job and do this, you should be here. If you don't want to, that's fine.

Flo [00:55:47]: But like,

Alessio [00:55:48]: don't copium. Don't be like, oh no, you can also be successful doing this and knees or like whatever. No, probably not, you know? So,

Flo [00:55:59]: yeah,

Alessio [00:56:00]: I've already done my N400

Flo [00:56:01]: so I should get my U.S. citizenship interview soon. Yeah. And I think to be fair, I think what's happening right now to Europe and they've said no to capitalism. They've decided to say no to capitalism a long time ago. They've like completely over-regulated. Taxation is much too high and so forth. But I also think some of this is a little bit of a self-fulfilling prophecy or it's a self-perpetuating phenomenon because, look, to your point, like once there is a network effect that's just so incredibly powerful, they can't be broken, really. And we tried with San Francisco. I tried with San Francisco. Like during COVID,

Swyx [00:56:35]: there was a movement of people moving to Miami.

Flo [00:56:38]: How did that pan out? You can't break the network effect,

Swyx [00:56:41]: you know? It's so annoying because first principles wise, tech should not be here. Like tech should be in Miami because it's just a better city.

Flo [00:56:48]: San Francisco does not want tech to be here.

Swyx [00:56:50]: San Francisco hates tech.

Flo [00:56:51]: 100%.

Swyx [00:56:52]: This is the thing I actually wrote down.

Alessio [00:56:54]: San Francisco hates tech. It is true. I think the people that are in San Francisco that were here before, tech hated it and then there's kind of like this passed down thing. But I would say people in Miami would hate it too if there were too much of it. You know? The Mickey Beach crowd would also not gel.

Swyx [00:57:08]: They're just rich enough and chill enough to not care.

Flo [00:57:10]: Yeah, I think so too.

Swyx [00:57:11]: They're like, oh, crypto kids.

Flo [00:57:13]: Okay, cool. Yeah. Miami celebrates success which is one thing

Swyx [00:57:17]: I loved about it.

Flo [00:57:18]: A little bit too much.

Swyx [00:57:19]: Maybe the last thing I'll mention, I just wanted a little bit of EUAC talk. I think that's good. I'll maybe carve out that I think the UK has done really well. That's an argument for the UK not being part of Europe is that, you know, the AI institutions there at least have done very well. Right?

Flo [00:57:34]: Sure. I think a lot of Britain is in the gutter. Yeah, exactly.

Swyx [00:57:38]: They've been stagnating at best. And then France has a few wins.

Flo [00:57:41]: Who?

Swyx [00:57:42]: Mistral.

Flo [00:57:43]: Who uses Mistral?

Swyx [00:57:44]: Hugging face.

Flo [00:57:45]: A few wins.

Swyx [00:57:46]: I'm just saying. They disappointed their first AI minister. You know the meme with the guy

Flo [00:57:51]: who's celebrating with his trophy and then he's like, no, that's France. Right? To me, that's France. It's like, aha, look, we've got Mistral! It's like champagne! It's like maybe 1% of market share. And by the way, and it's not a critic of them, it's a critic of France and of Europe. And by the way, I think I've heard that the Mistral guys were moving to the US. They're opening an office here. They're opening an office here. But, I mean,

Swyx [00:58:15]: they're very French, right?

Flo [00:58:16]: Right.

Swyx [00:58:17]: You can't really avoid it. There's one interesting counter move which is Jason Warner and ISOCAT moving to Paris for poolside. I don't know. It remains to be seen how that move is going. Maybe the last thing I'll say, you know, that's the Europe talk. We try not to do politics so much, but you're here. One thing that you do a lot is you test your overturned windows. Right? Like far more than any founder I know. You know it's not your job. Someone, for sure, you're just indulging. But also, I think you consciously test. And I just want to see what drives you there and why do you keep doing it? Because you treat very spicy stuff, especially for like the San Francisco sort of liberal dynasty.

Flo [00:58:59]: I don't know because I assume you're referring to I posted something about pronouns and how nonsense...

Swyx [00:59:05]: Just in general. I don't want you to focus on any particular thing unless you want to.

Flo [00:59:09]: You know, well, that tweet in particular, when I was tweeting it, I was like, oh, this is kind of spicy. Should I do this? And then I just did it. And I received zero pushback.

Swyx [00:59:20]: And the tweet was actually

Flo [00:59:21]: pretty successful and I received a lot of people reaching out like, oh my God, so true. I think it's coming from a few different places. One, life is more fun this way. Like I don't feel like if everyone always self-censors, you never know what everyone, what anyone thinks. And so it's becoming like a self-perpetuating thing. It's like a public lies, private truth sort of phenomenon. Or like, you know, there's this phenomenon called the preference cascade. It's like, there's this joke. It's like, oh, there's only one communist left in USSR. The problem is no one knows which one it is. So everyone pretends to be communist because everyone else pretends to be communist. And so I think there's a role to be played when you have a boss who's going to fire me. It's like, look, if I don't speak up and if founders don't speak up, I'm like, why? What are you afraid of? Right? Like, there's really not that much downside. And I think there's

Swyx [01:00:14]: something to be said about standing up for what you think is right and being real and owning your opinions. I think there's a correlation there between having that level of independence for your political beliefs and free speech or whatever and the way

Flo [01:00:27]: that you think about business too. But I think there's such a powerful insight at its core, which is groupthink is real and pervasive and really problematic. Like, your brain constantly shuts down because you're not even thinking in your other way or you're not thinking. You just look around you and you decide to adopt the same beliefs as people around you. And everyone thinks

Swyx [01:00:48]: they're immune

Flo [01:00:49]: and everyone else

Swyx [01:00:50]: is doing it

Flo [01:00:51]: except themselves. I'm a special snowflake. I have free will. That's right. And so I actually make it a point to look for, and then I think about it and I'm like, do I believe this thing? And very often the answer is yes. And then I just say it. And so I think the AI safety is an example of that. Like, at some point, Marc Andreessen blocked me on Twitter and it hurt, frankly. I really look up to Marc Andreessen

Swyx [01:01:13]: and I knew he would block me. It means you're successful on Twitter.

Flo [01:01:17]: It's just the right message. Marc Andreessen was really my booster initially on Twitter. He really made my account. And I was like, look, I'm really concerned about AI safety. It is an unpopular view

Swyx [01:01:27]: among my peers. I remember, you were one of the few that actually came out in support of the bill.

Flo [01:01:32]: I came out in support of SB1047 a year and a half ago. I put like some tweet storms about how I was really concerned. And yeah, I was blocked by a bunch of AI safety people and I don't like it, but you know, it's funny, maybe it's my French education. But look, in France, World War II is very present in people's minds and the phenomenon of people collaborating with the Nazis and there's always this sort of debate that people have like at dinner and it's like, ah, would you really have resisted during World War II? And everybody is always saying, oh yeah, we totally have resisted. It's like, yeah, but no. The reality of it is 95% of the country did not resist and most of it actually collaborated actively with the Nazis. And so 95% of y'all are wrong. You would actually have collaborated, right? I've always told myself I will stand up for what I think is right because some people got attacked and the way I was brought up is if someone gets attacked before you, you get involved. It doesn't matter, you get involved and you help the person, right? And so, look, I'm not pretending we're nowhere near a World War II phenomenon but I'm like, exactly because we are nowhere near

Alessio [01:02:45]: this kind of phenomenon. The stakes are so low and if you're not going to stand up

Flo [01:02:49]: for what you think is right when the stakes are so low,

Swyx [01:02:52]: are you going to stand up when it matters? There's an inconsistency in your statements because you simultaneously believe that AGI is very soon and you also say stakes are low. You can't believe both are real.

Flo [01:03:03]: Well, why does AGI make the stakes of speaking up higher?

Swyx [01:03:06]: Sorry, the stakes of safety.

Flo [01:03:08]: Oh yeah, no, the stakes of AI

Swyx [01:03:11]: are like physical safety?

Flo [01:03:12]: No, AI safety. Oh no, the stakes of AI safety couldn't be higher.

Swyx [01:03:17]: I meant the stakes

Flo [01:03:18]: of speaking up about

Alessio [01:03:19]: pronouns or whatever. How do you figure out who's real and who isn't? Because there was a manifesto for responsible AI that hundreds of VCs and people signed and I don't think anybody actually thinks about it anymore.

Flo [01:03:30]: Was that the pause letter?

Swyx [01:03:31]: The six-month pause?

Flo [01:03:32]: No,

Alessio [01:03:33]: there was something else that I think general catalyst and some fun sign. And then there's maybe the anthropic case which is like, hey, we're leaving open AI because you guys don't take security seriously and then it's like, hey, what if we gave AI access to a whole computer

Flo [01:03:49]: to just go do things?

Alessio [01:03:50]: How do you reconcile like, okay, I mean, you could say the same thing about Lindy. It's like, if you're worried about AI safety, why are you building AI? Right? That's kind of like the extreme thinking. How do you internally decide between participation and talking about it and saying, hey, I think this is important but I'm still going to build towards that and building actually makes it safer because I'm involved versus just being like anti. I think this is unsafe but then not do anything about it and just kind of remove yourself

Flo [01:04:20]: from the whole thing. What I think about our own involvement here is I'm acutely concerned about the risks at the model layer and I'm simultaneously very excited about the upside. Like, for the record, my PDoom, insofar as I can quantify it, which I cannot, but if I had to, like my vibe is like 10% or something like that and so there's like a 90% chance that we live in like a pure utopia. Right? And that's awesome. Right? So like, let's go after utopia. Right? Let's talk about the 10% chance that we live in a utopia where there's no disease and it's like a post-scarcity world. I think that utopia is going to happen through, like again, I'm bringing my little contribution to the movement. I think it would be silly to say no to the upside because you're concerned about the downside. At the same time, we want to be concerned about the downside. I know that it's very self-serving to say, oh, you know, like the downside doesn't exist at my layer, it exists at like the model layer. But truly, look at Lindy, look at the Apple building. I struggle to see exactly how it would like get up if I'm concerned about the model layer.

Swyx [01:05:21]: Okay. Well, this kind of discussion can go on for hours. It is still daylight, so not the best time for it. But I really appreciate you spending the time. Any other last calls to actions or thoughts that you feel like you want to get off your chest?

Flo [01:05:33]: AGI is coming.

Flo [01:05:37]: Are you hiring

Alessio [01:05:38]: for any roles? We are.

Flo [01:05:40]: Oh yeah, I guess that should be the...

Swyx [01:05:43]: Don't bother.

Flo [01:05:44]: No, can you stop saying AGI is coming and just talk about it? We are also hiring yeah, we are hiring designers and engineers right now. Yeah. So hit me up at flo.lindy.ai

Alessio [01:05:55]: And then go talk to my Lindy. You're not actually going to read it.

Flo [01:05:58]: Actually, I have wondered

Swyx [01:05:59]: how many times when I talk to you, I'm talking to a bot. Part of that is I don't have to know, right?

Flo [01:06:05]: That's right. Well, it's actually doubly confusing because we also have a teammate

Swyx [01:06:09]: whose name is Lindy. Yes, I was wondering when I met her, I was like, wait, did you hire her first?

Flo [01:06:14]: Marketing is fun. No, she was an inspiration after we named the company both after her. Oh, okay.

Swyx [01:06:19]: Interesting. Yeah, wonderful. I'll comment on the design piece just because I think that there are a lot of AI companies that very much focus on the functionality and the models and the capabilities and the benchmark. But I think that increasingly I'm seeing people differentiate with design and people want to use beautiful products and people who can figure that out and integrate the AI into their human lives. You know, design at the limit. One, at the lowest level is to make this look pretty, make this look like Stripe or Linear's homepage. That's design. But at the highest level of design it is make this integrate seamlessly into my life. Intuitive, beautiful, inspirational maybe even. And I think that companies that, you know, this is kind of like a blog post I've been thinking about, companies that emphasize design actually are going to win more than companies that don't. Yeah,

Flo [01:07:06]: I love Steve Jobs' quote and I'm going to butcher it. It's something like, design is the expression of the soul of a man-made product through successive layers of design. Jesus. Right? He was good. He was cooking. He was cooking on that one. He was cooking. It starts with the soul of the product which is why I was saying it is so important to reach alignment about that soul of the product, right? It's like an onion, like you peel the onion in those layers, right? And you design an entire journey just like the user experiencing your product chronologically all the way from the beginning of like the awareness stage I think it is also the job of the designer to design that part of the experience. It's like, okay, design is immensely important. Okay.

Alessio [01:07:46]: Lovely. Yeah.

Flo [01:07:48]: Thanks for coming on, Flo. Yeah, absolutely. Thanks for having me.

Discussion about this episode

Latent Space: The AI Engineer Podcast

The podcast by and for AI Engineers! In 2024, over 2 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0.

We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. Striving to give you both the definitive take on the Current Thing down to the first introduction to the tech you'll be using in the next 3 months! We break news and exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al.

Full show notes always on https://latent.space

Listen on