We are happy to announce that there will be a dedicated MCP track at the 2025 AI Engineer World's Fair, taking place Jun 3rd to 5th in San Francisco, where the MCP core team and major contributors and builders will be meeting. Join us and apply to speak or sponsor!
When we first wrote Why MCP Won, we had no idea how quickly it was about to win.
In the past 4 weeks, OpenAI and now Google have now announced the MCP support, effectively confirming our prediction that MCP was the presumptive winner of the agent standard wars. MCP has now overtaken OpenAPI, the incumbent option and most direct alternative, in GitHub stars (3 months ahead of conservative trendline):
We have explored the state of MCP at AIE (now the first ever >100k views workshop):
And since then, we’ve added a 7th reason why MCP won - this team acts very quickly on feedback, with the 2025-03-26 spec update adding support for stateless/resumable/streamable HTTP transports, and comprehensive authz capabilities based on OAuth 2.1.
This bodes very well for the future of the community and project.
For protocol and history nerds, we also asked David and Justin to tell the origin story of MCP, which we leave to the reader to enjoy (you can also skim the transcripts, or, the changelogs of a certain favored IDE). It’s incredible the impact that individual engineers solving their own problems can have on an entire industry.
Full video episode
Like and subscribe on YouTube!
Show Links
Timestamps
00:00 Introduction and Guest Welcome
00:37 What is MCP?
02:00 The Origin Story of MCP
05:18 Development Challenges and Solutions
08:06 Technical Details and Inspirations
29:45 MCP vs Open API
32:48 Building MCP Servers
40:39 Exploring Model Independence in LLMs
41:36 Building Richer Systems with MCP
43:13 Understanding Agents in MCP
45:45 Nesting and Tool Confusion in MCP
49:11 Client Control and Tool Invocation
52:08 Authorization and Trust in MCP Servers
01:01:34 Future Roadmap and Stateless Servers
01:10:07 Open Source Governance and Community Involvement
01:18:12 Wishlist and Closing Remarks
Transcript
Alessio [00:00:02]: Hey, everyone. Welcome back to Latent Space. This is Alessio, partner and CTO at Decibel, and I'm joined by my co-host Swyx, founder of Small AI.
swyx [00:00:10]: Hey, morning. And today we have a remote recording, I guess, with David and Justin from Anthropic over in London. Welcome. Hey, good You guys have created a storm of hype because of MCP, and I'm really glad to have you on. Thanks for making the time. What is MCP? Let's start with a crisp what definition from the horse's mouth, and then we'll go into the origin story. But let's start off right off the bat. What is MCP?
Justin/David [00:00:43]: Yeah, sure. So Model Context Protocol, or MCP for short, is basically something we've designed to help AI applications extend themselves or integrate with an ecosystem of plugins, basically. The terminology is a bit different. We use this client-server terminology, and we can talk about why that is and where that came from. But at the end of the day, it really is that. It's like extending and enhancing the functionality of AI application.
swyx [00:01:05]: David, would you add anything?
Justin/David [00:01:07]: Yeah, I think that's actually a good description. I think there's like a lot of different ways for how people are trying to explain it. But at the core, I think what Justin said is like extending AI applications is really what this is about. And I think the interesting bit here that I want to highlight, it's AI applications and not models themselves that this is focused on. That's a common misconception that we can talk about a bit later. But yeah. Another version that we've used and gotten to like is like MCP is kind of like the USB-C port of AI applications and that it's meant to be this universal connector to a whole ecosystem of things.
swyx [00:01:44]: Yeah. Specifically, an interesting feature is, like you said, the client and server. And it's a sort of two-way, right? Like in the same way that said a USB-C is two-way, which could be super interesting. Yeah, let's go into a little bit of the origin story. There's many people who've tried to make statistics. There's many people who've tried to build open source. I think there's an overall, also, my sense is that Anthropic is going hard after developers in the way that other labs are not. And so I'm also curious if there was any external influence or was it just you two guys just in a room somewhere riffing?
Justin/David [00:02:18]: It is actually mostly like us two guys in a room riffing. So this is not part of a big strategy. You know, if you roll back time a little bit and go into like July 2024. I was like, started. I started at Anthropic like three months earlier or two months earlier. And I was mostly working on internal developer tooling, which is what I've been doing for like years and years before. And as part of that, I think there was an effort of like, how do I empower more like employees at Anthropic to use, you know, to integrate really deeply with the models we have? Because we've seen these, like, how good it is, how amazing it will become even in the future. And of course, you know, just dogfoot your own model as much as you can. And as part of that. From my development tooling background, I quickly got frustrated by the idea that, you know, on one hand side, I have Cloud Desktop, which is this amazing tool with artifacts, which I really enjoyed. But it was very limited to exactly that feature set. And it was there was no way to extend it. And on the other hand side, I like work in IDEs, which could greatly like act on like the file system and a bunch of other things. But then they don't have artifacts or something like that. And so what I constantly did was just copy. Things back and forth on between Cloud Desktop and the IDE, and that quickly got me, honestly, just very frustrated. And part of that frustration wasn't like, how do I go and fix this? What, what do we need? And back to like this development developer, like focus that I have, I really thought about like, well, I know how to build all these integrations, but what do I need to do to let these applications let me do this? And so it's very quickly that you see that this is clearly like an M times N problem. Like you have multiple like applications. And multiple integrations you want to build and like, what that is better there to fix this than using a protocol. And at the same time, I was actually working on an LSP related thing internally that didn't go anywhere. But you put these things together in someone's brain and let them wait for like a few weeks. And out of that comes like the idea of like, let's build some, some protocol. And so back to like this little room, like it was literally just me going to a room with Justin and go like, I think we should build something like this. Uh, this is a good idea. And Justin. Lucky for me, just really took an interest in the idea, um, and, and took it from there to like, to, to build something, together with me, that's really the inception story is like, it's us to, from then on, just going and building it over, over the course of like, like a month and a half of like building the protocol, building the first integration, like Justin did a lot of the, like the heavy lifting of the first integrations in cloud desktop. I did a lot of the first, um, proof of concept of how this can look like in an IDE. And if you, we could talk about like some of. All the tidbits you can find way before the inception of like before the official release, if you were looking at the right repositories at the right time, but there you go. That's like some of the, the rough story.
Alessio [00:05:12]: Uh, what was the timeline when, I know November 25th was like the official announcement date. When did you guys start working on it?
Justin/David [00:05:19]: Justin, when did we start working on that? I think it, I think it was around July. I think, yeah, I, as soon as David pitched this initial idea, I got excited pretty quickly and we started working on it, I think. I think almost immediately after that conversation and then, I don't know, it was a couple, maybe a few months of, uh, building the really unrewarding bits, if we're being honest, because for, for establishing something that's like this communication protocol has clients and servers and like SDKs everywhere, there's just like a lot of like laying the groundwork that you have to do. So it was a pretty, uh, that was a pretty slow couple of months. But then afterward, once you get some things talking over that wire, it really starts to get exciting and you can start building. All sorts of crazy things. And I think this really came to a head. And I don't remember exactly when it was, maybe like approximately a month before release, there was an internal hackathon where some folks really got excited about MCP and started building all sorts of crazy applications. I think the coolest one of which was like an MCP server that can control a 3d printer or something. And so like, suddenly people are feeling this power of like cloud connecting to the outside world in a really tangible way. And that, that really added some, uh, some juice to us and to the release.
Alessio [00:06:32]: Yeah. And we'll go into the technical details, but I just want to wrap up here. You mentioned you could have seen some things coming if you were looking in the right places. We always want to know what are the places to get alpha, how, how, how to find MCP early.
Justin/David [00:06:44]: I'm a big Zed user. I liked the Zed editor. The first MCP implementation on an IDE was in Zed. It was written by me and it was there like a month and a half before the official release. Just because we needed to do it in the open because it's an open source project. Um, and so it was, it was not, it was named slightly differently because we. We were not set on the name yet, but it was there.
swyx [00:07:05]: I'm happy to go a little bit. Anthropic also had some preview of a model with Zed, right? Some kind of fast editing, uh, model. Um, uh, I, I'm con I confess, you know, I'm a cursor windsurf user. Haven't tried Zed. Uh, what's, what's your, you know, unrelated or, you know, unsolicited two second pitch for, for Zed. That's a good question.
Justin/David [00:07:28]: I, it really depends what you value in editors. For me. I, I wouldn't even say I like, I love Zed more than others. I like them all like complimentary in, in a way or another, like I do use windsurf. I do use Zed. Um, but I think my, my main pitch for Zed is low latency, super smooth experience editor with a decent enough AI integration.
swyx [00:07:51]: I mean, and maybe, you know, I think that's, that's all it is for a lot of people. Uh, I think a lot of people obviously very tied to the VS code paradigm and the extensions that come along with it. Okay. So I wanted to go back a little bit. You know, on, on, on some of the things that you mentioned, Justin, uh, which was building MCP on paper, you know, obviously we only see the end result. It just seems inspired by LSP. And I, I think both of you have acknowledged that. So how much is there to build? And when you say build, is it a lot of code or a lot of design? Cause I felt like it's a lot of design, right? Like you're picking JSON RPC, like how much did you base off of LSP and, and, you know, what, what, what was the sort of hard, hard parts?
Justin/David [00:08:29]: Yeah, absolutely. I mean, uh, we, we definitely did take heavy inspiration from LSP. David had much more prior experience with it than I did working on developer tools. So, you know, I've mostly worked on products or, or sort of infrastructural things. LSP was new to me. But as a, as a, like, or from design principles, it really makes a ton of sense because it does solve this M times N problem that David referred to where, you know, in the world before LSP, you had all these different IDEs and editors, and then all these different languages that each wants to support or that their users want them to support. And then everyone's just building like one. And so, like, you use Vim and you might have really great support for, like, honestly, I don't know, C or something, and then, like, you switch over to JetBrains and you have the Java support, but then, like, you don't get to use the great JetBrains Java support in Vim and you don't get to use the great C support in JetBrains or something like that. So LSP largely, I think, solved this problem by creating this common language that they could all speak and that, you know, you can have some people focus on really robust language server implementations, and then the IDE developers can really focus on that side. And they both benefit. So that was, like, our key takeaway for MCP is, like, that same principle and that same problem in the space of AI applications and extensions to AI applications. But in terms of, like, concrete particulars, I mean, we did take JSON RPC and we took this idea of bidirectionality, but I think we quickly took it down a different route after that. I guess there is one other principle from LSP that we try to stick to today, which is, like, this focus on how features manifest. More than. The semantics of things, if that makes sense. David refers to it as being presentation focused, where, like, basically thinking and, like, offering different primitives, not because necessarily the semantics of them are very different, but because you want them to show up in the application differently. Like, that was a key sort of insight about how LSP was developed. And that's also something we try to apply to MCP. But like I said, then from there, like, yeah, we spent a lot of time, really a lot of time, and we could go into this more separately, like, thinking about each of the primitives that we want to offer in MCP. And why they should be different, like, why we want to have all these different concepts. That was a significant amount of work. That was the design work, as you allude to. But then also already out of the gate, we had three different languages that we wanted to at least support to some degree. That was TypeScript, Python, and then for the Z integration, it was Rust. So there was some SDK building work in those languages, a mixture of clients and servers to build out to try to create this, like, internal ecosystem that we could start playing with. And then, yeah, I guess just trying to make everything, like, robust over, like, I don't know, this whole, like, concept that we have for local MCP, where you, like, launch subprocesses and stuff and making that robust took some time as well. Yeah, maybe adding to that, I think the LSP inference goes even a little bit further. Like, we did take actually quite a look at criticisms on LSP, like, things that LSP didn't do right and things that people felt they would love to have different and really took that to heart to, like, see, you know, what are some of the things. that we wish, you know, we should do better. We took a, you know, like, a lengthy, like, look at, like, their very unique approach to JSON RPC, I may say, and then we decided that this is not what we do. And so there's, like, these differences, but it's clearly very, very inspired. Because I think when you're trying to build and focus, if you're trying to build something like MCP, you kind of want to pick the areas you want to innovate in, but you kind of want to be boring about the other parts in pattern matching LSP. So the problem allows you to be boring in a lot of the core pieces that you want to be boring in. Like, the choice of JSON RPC is very non-controversial to us because it's just, like, it doesn't matter at all, like, what the action, like, bites on the bar that you're speaking. It makes no difference to us. The innovation is on the primitives you choose and these type of things. And so there's way more focus on that that we wanted to do. So having some prior art is good there, basically.
swyx [00:12:26]: It does. I wanted to double click. I mean, there's so many things you can go into. Obviously, I am passionate about protocol design. I wanted to show you guys this. I mean, I think you guys know, but, you know, you already referred to the M times N problem. And I can just share my screen here about anyone working in developer tools has faced this exact issue where you see the God box, basically. Like, the fundamental problem and solution of all infrastructure engineering is you have things going to N things, and then you put the God box and they'll all be better, right? So here is one problem for Uber. One problem for... GraphQL, one problem for Temporal, where I used to work at, and this is from React. And I was just kind of curious, like, you know, did you solve N times N problems at Facebook? Like, it sounds like, David, you did that for a living, right? Like, this is just N times N for a living.
Justin/David [00:13:16]: David Pérez- Yeah, yeah. To some degree, for sure. I did. God, what a good example of this, but like, I did a bunch of this kind of work on like source control systems and these type of things. And so there were a bunch of these type of problems. And so you just shove them into something that everyone can read from and everyone can write to, and you build your God box somewhere, and it works. But yeah, it's just in developer tooling, you're absolutely right. In developer tooling, this is everywhere, right?
swyx [00:13:47]: And that, you know, it shows up everywhere. And what was interesting is I think everyone who makes the God box then has the same set of problems, which is also you now have like composability off and remotes versus local. So, you know, there's this very common set of problems. So I kind of want to take a meta lesson on how to do the God box, but, you know, we can talk about the sort of development stuff later. I wanted to double click on, again, the presentation that Justin mentioned of like how features manifest and how you said some things are the same, but you just want to reify some concepts so they show up differently. And I had that sense, you know, when I was looking at the MCP docs, I'm like, why do these two things need to be the difference in other? I think a lot of people treat tool calling as the solution to everything, right? And sometimes you can actually sort of view kinds of different kinds of tool calls as different things. And sometimes they're resources. Sometimes they're actually taking actions. Sometimes they're something else that I don't really know yet. But I just want to see, like, what are some things that you sort of mentally group as adjacent concepts and why were they important to you to emphasize?
Justin/David [00:14:58]: Yeah, I can chat about this a bit. I think fundamentally we every sort of primitive that we thought through, we thought from the perspective of the application developer first, like if I'm building an application, whether it is an IDE or, you know, call a desktop or some agent interface or whatever the case may be, what are the different things that I would want to receive from like an integration? And I think once you take that lens, it becomes quite clear that that tool calling is necessary, but very insufficient. Like there are many other things you would want to do besides just get tools. And plug them into the model and you want to have some way of differentiating what those different things are. So the kind of core primitives that we started MCP with, we've since added a couple more, but the core ones are really tools, which we've already talked about. It's like adding, adding tools directly to the model or function calling is sometimes called resources, which is basically like bits of data or context that you might want to add to the context. So excuse me, to the, to the model context. And this, this is the first primitive where it's like, we, we. Decided this could be like application controlled, like maybe you want a model to automatically search through and, and find relevant resources and bring them into context. But maybe you also want that to be an explicit UI affordance in the application where the user can like, you know, pick through a dropdown or like a paperclip menu or whatever, and find specific things and tag them in. And then that becomes part of like their message to the LLM. Like those are both use cases for resources. And then the third one is prompts. Which are deliberately meant to be like user initiated or. Like. User substituted. Text or messages. So like the analogy here would be like, if you're an editor, like a slash command or something like that, or like an at, you know, auto completion type thing where it's like, I have this kind of macro effectively that I want to drop in and use. And we have sort of expressed opinions through MCP about the different ways that these things could manifest, but ultimately it is for application developers to decide, okay, you, you get these different concepts expressed differently. Um, and it's very useful as an application developer because you can decide. The appropriate experience for each, and actually this can be a point of differentiation to, like, we were also thinking, you know, from the application developer perspective, they, you know, application developers don't want to be commoditized. They don't want the application to end up the same as every other AI application. So like, what are the unique things that they could do to like create the best user experience even while connecting up to this big open ecosystem of integration? I, yeah. And I think to add to that, the, I think there are two, two aspects to that, that I want to. I want to mention the first one is that interestingly enough, like while nowadays tool calling is obviously like probably like 95% plus of the integrations, and I wish there would be, you know, more clients doing tool resources, doing prompts. The, the very first implementation in that is actually a prompt implementation. It doesn't deal with tools. And, and it, we found this actually quite useful because what it allows you to do is, for example, build an MCP server that takes like a backtrack. So it's, it's not necessarily like a tool that literally just like rawizes from Sentry or any other like online platform that, that tracks your, your crashes. And just lets you pull this into the context window beforehand. And so it's quite nice that way that it's like a user driven interaction that you does the user decide when to pull this in and don't have to wait for the model to do it. And so it's a great way to craft the prompt in a way. And I think similarly, you know, I wish, you know, more MCP servers today would bring prompts as examples of, like how to even use the tools. Yeah. at the same time. The resources bits are quite interesting as well. And I wish we would see more usage there because it's very easy to envision, but yet nobody has really implemented it. A system where like an MCP server exposes, you know, a set of documents that you have, your database, whatever you might want to as a set of resources. And then like a client application would build a full rack index around this, right? This is definitely an application use case we had in mind as to why these are exposed in such a way that they're not model driven, because you might want to have way more resource content than is, you know, realistically usable in a context window. And so I think, you know, I wish applications and I hope applications will do this in the next few months, use these primitives, you know, way better, because I think there's way more rich experiences to be created that way. Yeah, completely agree with that. And I would also add that I would go into it if I haven't.
Alessio [00:19:30]: I think that's a great point. And everybody just, you know, has a hammer and wants to do tool calling on everything. I think a lot of people do tool calling to do a database query. They don't use resources for it. What are like the, I guess, maybe like pros and cons or like when people should use a tool versus a resource, especially when it comes to like things that do have an API interface, like for a database, you can do a tool that does a SQL query versus when should you do that or a resource instead with the data? Yeah.
Justin/David [00:20:00]: The way we separate these is like tools are always meant to be initiated by the model. It's sort of like at the model's discretion that it will like find the right tool and apply it. So if that's the interaction you want as a server developer, where it's like, okay, this, you know, suddenly I've given the LLM the ability to run a SQL queries, for example, that makes sense as a tool. But resources are more flexible, basically. And I think, to be completely honest, the story here is practically a bit complicated today. Because many clients don't support resources yet. But like, I think in an ideal world where all these concepts are fully realized, and there's like full ecosystem support, you would do resources for things like the schemas of your database tables and stuff like that, as a way to like either allow the user to say like, okay, now, you know, cloud, I want to talk to you about this database table. Here it is. Let's have this conversation. Or maybe the particular AI application that you're using, like, you know, could be something agentic, like cloud code. is able to just like agentically look up resources and find the right schema of the database table you're talking about, like both those interactions are possible. But I think like, anytime you have this sort of like, you want to list a bunch of entities, and then read any of them, that makes sense to model as resources. Resources are also, they're uniquely identified by a URI, always. And so you can also think of them as like, you know, sort of general purpose transformers, even like, if you want to support an interaction where a user just like drops a URI in, and then you like automatically figure out how to interpret that, you could use MCP servers to do that interpretation. One of the interesting side notes here, back to the Z example of resources, is that has like a prompt library that you can do, that people can interact with. And we just exposed a set of default prompts that we want everyone to have as part of that prompt library. Yeah, resources for a while so that like, you boot up Zed and Zed will just populate the prompt library from an MCP server, which was quite a cool interaction. And that was, again, a very specific, like, both sides needed to agree upon the URI format and the underlying data format. And but that was a nice and kind of like neat little application of resources. There's also going back to that perspective of like, as an application developer, what are the things that I would want? Yeah. We also applied this thinking to like, you know, like, we can do this, we can do this, we can do this, we can do this. Like what existing features of applications could conceivably be kind of like factored out into MCP servers if you were to take that approach today. And so like basically any IDE where you have like an attachment menu that I think naturally models as resources. It's just, you know, those implementations already existed.
swyx [00:22:49]: Yeah, I think the immediate like, you know, when you introduced it for cloud desktop and I saw the at sign there, I was like, oh, yeah, that's what Cursor has. But this is for everyone else. And, you know, I think like that that is a really good design target because it's something that already exists and people can map on pretty neatly. I was actually featuring this chart from Mahesh's workshop that presumably you guys agreed on. I think this is so useful that it should be on the front page of the docs. Like probably should be. I think that's a good suggestion.
Justin/David [00:23:19]: Do you want to do you want to do a PR for this? I love it.
swyx [00:23:21]: Yeah, do a PR. I've done a PR for just Mahesh's workshop in general, just because I'm like, you know. I know.
SPEAKER_03 [00:23:28]: I approve. Yeah.
swyx [00:23:30]: Thank you. Yeah. I mean, like, but, you know, I think for me as a developer relations person, I always insist on having a map for people. Here are all the main things you have to understand. We'll spend the next two hours going through this. So some one image that kind of covers all this, I think is pretty helpful. And I like your emphasis on prompts. I would say that it's interesting that like I think, you know, in the earliest early days of like chat GPT and cloud, people. Often came up with, oh, you can't really follow my screen, can you? In the early days of chat of, of chat, GPT and all that, like a lot, a lot of people started like, you know, GitHub for prompts, like we'll do prop manager libraries and, and like those never really took off. And I think something like this is helpful and important. I would say like, I've also seen prompt file from human loop, I think, as, as other ways to standardize how people share prompts. But yeah, I agree that like, there should be. There should be more innovation here. And I think probably people want some dynamicism, which I think you, you afford, you allow for. And I like that you have multi-step that this was, this is the main thing that got me like, like these guys really get it. You know, I think you, you maybe have a published some research that says like, actually sometimes to get, to get the model working the right way, you have to do multi-step prompting or jailbreaking to, to, to behave the way that you want. And so I think prompts are not just single conversations. They're sometimes chains of conversations. Yeah.
Alessio [00:25:05]: Another question that I had when I was looking at some server implementations, the server builders kind of decide what data gets eventually returned, especially for tool calls. For example, the Google maps one, right? If you just look through it, they decide what, you know, attributes kind of get returned and the user can not override that if there's a missing one. That has always been my gripe with like SDKs in general, when people build like API wrapper SDKs. And then they miss one parameter that maybe it's new and then I can not use it. How do you guys think about that? And like, yeah, how much should the user be able to intervene in that versus just letting the server designer do all the work?
Justin/David [00:25:41]: I think we probably bear responsibility for the Google maps one, because I think that's one of the reference servers we've released. I mean, in general, for things like for tool results in particular, we've actually made the deliberate decision, at least thus far, for tool results to be not like sort of structured JSON data, not matching a schema, really, but as like a text or images or basically like messages that you would pass into the LLM directly. And so I guess the correlation that is, you really should just return a whole jumble of data and trust the LLM to like sort through it and see. I mean, I think we've clearly done a lot of work. But I think we really need to be able to shift and like, you know, extract the information it cares about, because that's what that's exactly what they excel at. And we really try to think about like, yeah, how to, you know, use LLMs to their full potential and not maybe over specify and then end up with something that doesn't scale as LLMs themselves get better and better. So really, yeah, I suppose what should be happening in this example server, which again, will request welcome. It would be great. It's like if all these result types were literally just passed through from the API that it's calling, and then the API would be able to pass through automatically.
Alessio [00:26:50]: Thank you for joining us.
Alessio [00:27:19]: It's a hard to sign decisions on where to draw the line.
Justin/David [00:27:22]: I'll maybe throw AI under the bus a little bit here and just say that Claude wrote a lot of these example servers. No surprise at all. But I do think, sorry, I do think there's an interesting point in this that I do think people at the moment still to mostly still just apply their normal software engineering API approaches to this. And I think we're still need a little bit more relearning of how to build something for LLMs and trust them, particularly, you know, as they are getting significantly better year to year. Right. And I think, you know, two years ago, maybe that approach would have been very valid. But nowadays, just like just throw data at that thing that is really good at dealing with data is a good approach to this problem. And I think it's just like unlearning like 20, 30, 40 years of software engineering practices that go a little bit into this to some degree. If I could add to that real quickly, just one framing as well for MCP is thinking in terms of like how crazily fast AI is advancing. I mean, it's exciting. It's also scary. Like thinking, us thinking that like the biggest bottleneck to, you know, the next wave of capabilities for models might actually be their ability to like interact with the outside world to like, you know, read data from outside data sources or like take stateful actions. Working at Anthropic, we absolutely care about doing that. Safely and with the right control and alignment measures in place and everything. But also as AI gets better, people will want that. That'll be key to like becoming productive with AI is like being able to connect them up to all those things. So MCP is also sort of like a bet on the future and where this is all going and how important that will be.
Alessio [00:29:05]: Yeah. Yeah, I would say any API attribute that says formatted underscore should kind of be gone and we should just get the raw data from all of them. Because why, you know, why are you formatting? For me, the, the model is definitely smart enough to format an address. So I think that should go to the end user.
swyx [00:29:23]: Yeah. I have, I think Alessio is about to move on to like server implementation. I wanted to, I think we were talking, we're still talking about sort of MCP design and goals and intentions. And we've, I think we've indirectly identified like some problems that MCP is really trying to address. But I wanted to give you the spot to directly take on MCP versus open API, because I think obviously there's a, this is a top question. I wanted to sort of recap everything we just talked about and give people a nice little segment that, that people can say, say, like, this is a definitive answer on MCP versus open API.
Justin/David [00:29:56]: Yeah, I think fundamentally, I mean, open API specifications are a very great tool. And like I've used them a lot in developing APIs and consumers of APIs. I think fundamentally, or we think that they're just like too granular for what you want to do with LLMs. Like they don't express higher level AI specific concepts like this whole mental model. Yeah. But we've talked about with the primitives of MCP and thinking from the perspective of the application developer, like you don't get any of that when you encode this information into an open API specification. So we believe that models will benefit more from like the purpose built or purpose design tools, resources, prompts, and the other primitives than just kind of like, here's our REST API, go wild. I do think there, there's another aspect. I think that I'm not an open API expert, so I might, everything might not be perfectly accurate. But I do think that we're... Like there's been, and we can talk about this a bit more later. There's a deliberate design decision to make the protocol somewhat stateful because we do really believe that AI applications and AI like interactions will become inherently more stateful and that we're the current state of like, like need for statelessness is more a temporary point in time that will, you know, to some degree that will always exist. But I think like more statefulness will become increasingly more popular, particularly when you think about additional modalities that go beyond just pure text-based, you know, interactions with models, like it might be like video, audio, whatever other modalities exist and out there already. And so I do think that like having something a bit more stateful is just inherently useful in this interaction pattern. I do think they're actually more complimentary open API and MCP than if people wanted to make it out. Like people look. For these, like, you know, A versus B and like, you know, have, have all the, all the developers of these things go in a room and fist fight it out. But that's rarely what's going on. I think it's actually, they're very complimentary and they have their little space where they're very, very strong. And I think, you know, just use the best tool for the job. And if you want to have a rich interaction between an AI application, it's probably like, it's probably MCP. That's the right choice. And if, if you want to have like an API spec somewhere that is very easy and like a model can read. And to interpret, and that's what, what worked for you, then open API is the way to go. One more thing to add here is that we've already seen people, I mean, this happened very early. People in the community built like bridges between the two as well. So like, if what you have is an open API specification and no one's, you know, building a custom MCP server for it, there are already like translators that will take that and re-expose it as MCP. And you could do the other direction too. Awesome.
Alessio [00:32:43]: Yeah. I think there's the other side of MCPs that people don't talk as much. Okay. I think there's the other side of MCPs that people don't talk as much about because it doesn't go viral, which is building the servers. So I think everybody does the tweets about like connect the cloud desktop to XMCP. It's amazing. How would you guys suggest people start with building servers? I think the spec is like, so there's so many things you can do that. It's almost like, how do you draw the line between being very descriptive as a server developer versus like going back to our discussion before, like just take the data and then let them auto manipulate it later. Do you have any suggestions for people?
Justin/David [00:33:16]: I. I think there, I have a few suggestions. I think that one of the best things I think about MCP and something that we got right very early is that it's just very, very easy to build like something very simple that might not be amazing, but it's pretty, it's good enough because models are very good and get this going within like half an hour, you know? And so I think that the best part is just like pick the language of, you know, of your choice that you love the most, pick the SDK for it, if there's an SDK for it, and then just go build a tool of the thing that matters to you personally. And that you want to use. You want to see the model like interact with, build the server, throw the tool in, don't even worry too much about the description just yet, like do a bit of like, write your little description as you think about it and just give it to the model and just throw it to standard IO protocol transport wise into like an application that you like and see it do things. And I think that's part of the magic that, or like, you know, empowerment and magic for developers to get so quickly to something that the model does. Or something that you care about. That I think really gets you going and gets you into this flow of like, okay, I see this thing can do cool things. Now I go and, and can expand on this and now I can go and like really think about like, which are the different tools I want, which are the different raw resources and prompts I want. Okay. Now that I have that. Okay. Now do I, what do my evals look like for how I want this to go? How do I optimize my prompts for the evals using like tools like that? This is infinite depth so that you can do. But. Okay. Just start. As simple as possible and just go build a server in like half an hour in the language of your choice and how the model interacts with the things that matter to you. And I think that's where the fun is at. And I think people, I think a lot of what MCP makes great is it just adds a lot of fun to the development piece to just go and have models do things quickly. I also, I'm quite partial, again, to using AI to help me do the coding. Like, I think even during the initial development process, we realized it was quite easy to basically just take all the SDK code. Again, you know, what David suggested, like, you know, pick the language you care about, and then pick the SDK. And once you have that, you can literally just drop the whole SDK code into an LLM's context window and say, okay, now that you know MCP, build me a server that does that. This, this, this. And like, the results, I think, are astounding. Like, I mean, it might not be perfect around every single corner or whatever. And you can refine it over time. But like, it's a great way to kind of like one shot something that basically does what you want, and then you can iterate from there. And like David said, there has been a big emphasis from the beginning on like making servers as easy and simple to build as possible, which certainly helps with LLMs doing it too. We often find that like, getting started is like, you know, 100, 200 lines of code in the last couple of years. It's really quite easy. Yeah. And if you don't have an SDK, again, give the like, give the subset of the spec that you care about to the model, and like another SDK and just have it build you an SDK. And it usually works for like, that subset. Building a full SDK is a different story. But like, to get a model to tool call in Haskell or whatever, like language you like, it's probably pretty straightforward.
swyx [00:36:32]: Yeah. Sorry.
Alessio [00:36:34]: No, I was gonna say, I co-hosted a hackathon at the AGI house. I'm a personal agent, and one of the personal agents somebody built was like an MCP server builder agent, where they will basically put the URL of the API spec, and it will build an MCP server for them. Do you see that today as kind of like, yeah, most servers are just kind of like a layer on top of an existing API without too much opinion? And how, yeah, do you think that's kind of like how it's going to be going forward? Just like AI generated, exposed to API that already exists? Or are we going to see kind of like net new MCP experiences that you... You couldn't do before?
Justin/David [00:37:10]: I think, go for it. I think both, like, I, I think there, there will always be value in like, oh, I have, you know, I have my data over here, and I want to use some connector to bring it into my application over here. That use case will certainly remain. I think, you know, this, this kind of goes back to like, I think a lot of things today are maybe defaulting to tool use when some of the other primitives would be maybe more appropriate over time. And so it could still be that connector. It could still just be that sort of adapter layer, but could like actually adapt it onto different primitives, which is one, one way to add more value. But then I also think there's plenty of opportunity for use cases, which like do, you know, or for MCP servers that kind of do interesting things in and out themselves and aren't just adapters. Some of the earliest examples of this were like, you know, the memory MCP server, which gives the LLM the ability to remember things across conversations or like someone who's a close coworker built the... I shouldn't have said that, not a close coworker. Someone. Yeah. Built the sequential thinking MCP server, which gives a model the ability to like really think step-by-step and get better at its reasoning capabilities. This is something where it's like, it really isn't integrating with anything external. It's just providing this sort of like way of thinking for a model.
Justin/David [00:38:27]: I guess either way though, I think AI authorship of the servers is totally possible. Like I've had a lot of success in prompting, just being like, Hey, I want to build an MCP server that like does this thing. And even if this thing is not. Adapting some other API, but it's doing something completely original. It's usually able to figure that out too. Yeah. I do. I do think that the, to add to that, I do think that a good part of, of what MCP servers will be, will be these like just API wrapper to some degree. Um, and that's good to be valid because that works and it gets you very, very far. But I think we're just very early, like in, in exploring what you can do. Um, and I think as client support for like certain primitives get better, like we can talk about sampling. I'm playing with my favorite topic and greatest frustration at the same time. Um, I think you can just see it very easily see like way, way, way richer experiences and we have, we have built them internally for as prototyping aspects. And I think you see some of that in the community already, but there's just, you know, things like, Hey, summarize my, you know, my, my, my, my favorite subreddits for the morning MCP server that nobody has built yet, but it's very easy to envision. And the protocol can totally do this. And these are like slightly richer experiences. And I think as people like go away from like the, oh, I just want to like, I'm just in this new world where I can hook up the things that matter to me, to the LLM, to like actually want a real workflow, a real, like, like more richer experience that I, I really want exposed to the model. I think then you will see these things pop up, but again, that's a, there's a little bit of a chicken and egg problem at the moment with like what a client supported versus, you know, what servers like authors want to do. Yeah.
Alessio [00:40:10]: That, that, that was. That's kind of my next question on composability. Like how, how do you guys see that? Do you have plans for that? What's kind of like the import of MCPs, so to speak, into another MCP? Like if I want to build like the subreddit one, there's probably going to be like the Reddit API, uh, MCP, and then the summarization MCP. And then how do I, how do I do a super MCP?
Justin/David [00:40:33]: Yeah. So, so this is an interesting topic and I think there, um, so there, there are two aspects to it. I think that the one aspect is like, how can I build something? I think agentically that you requires an LLM call and like a one form of fashion, like for summarization or so, but I'm staying model independent and for that, that's where like part of this by directionality comes in, in this more rich experience where we do have this facility for servers to ask the client again, who owns the LLM interaction, right? Like we talk about cursor, who like runs the, the, the loop with the LLM for you there that for the server author to ask the client for a completion. Um, and basically have it like summarize something for the server and return it back. And so now what model summarizes this depends on which one you have selected in cursor and not depends on what the author brings. The author doesn't bring an SDK. It doesn't have, you had an API key. It's completely model independent, how you can build this. There's just one aspect to that. The second aspect to building richer, richer systems with MCP is that you can easily envision an MCP server that serves something to like something like cursor or win server. For a cloud desktop, but at the same time, also is an MCP client at the same time and itself can use MCP servers to create a rich experience. And now you have a recursive property, which we actually quite carefully in the design principles, try to retain. You, you know, you see it all over the place and authorization and other aspects, um, to the spec that we retain this like recursive pattern. And now you can think about like, okay, I have, I have this little bundle of applications, both a server and a client. And I can add. Add these in chains and build basically graphs like, uh, DAGs out of MCP servers, um, uh, that can just richly interact with each other. A agentic MCP server can also use the whole ecosystem of MCP servers available to themselves. And I think that's a really cool environment, cool thing you can do. And people have experimented with this. And I think you see hopefully more of this, particularly when you think about like auto-selecting, auto-installing, there's a bunch of these things you can do that make, uh, make a really fun experience. I, I think practically there are some niceties we still need to add to the SDKs to make this really simple and like easy to execute on like this kind of recursive MCP server that is also a client or like kind of multiplexing together the behaviors of multiple MCP servers into one host, as we call it. These are things we definitely want to add. We haven't been able to yet, but like, uh, I think that would go some way to showcasing these things that we know are already possible, but not necessarily taken up that much yet. Okay.
swyx [00:43:08]: This is, uh, very exciting. And very, I'm sure, I'm sure a lot of people get very, very, uh, a lot of ideas and inspiration from this. Is an MCP server that is also a client, is that an agent?
Justin/David [00:43:19]: What's an agent? There's a lot of definitions of agents.
swyx [00:43:22]: Because like you're, in some ways you're, you're requesting something and it's going off and doing stuff that you don't necessarily know. There's like a layer of abstraction between you and the ultimate raw source of the data. You could dispute that. Yeah. I just, I don't know if you have a hot take on agents.
Justin/David [00:43:35]: I do think, I do think that you can build an agent that way. For me, I think you need to define the difference between. An MCP server plus client that is just a proxy versus an agent. I think there's a difference. And I think that difference might be in, um, you know, for example, using a sample loop to create a more richer experience to, uh, to, to have a model call tools while like inside that MCP server through these clients. I think then you have a, an actual like agent. Yeah. I do think it's very simple to build agents that way. Yeah. I think there are maybe a few paths here. Like it definitely feels like there's some relationship. Between MCP and agents. One possible version is like, maybe MCP is a great way to represent agents. Maybe there are some like, you know, features or specific things that are missing that would make the ergonomics of it better. And we should make that part of MCP. That's one possibility. Another is like, maybe MCP makes sense as kind of like a foundational communication layer for agents to like compose with other agents or something like that. Or there could be other possibilities entirely. Maybe MCP should specialize and narrowly focus on kind of the AI application side. And not as much on the agent side. I think it's a very live question and I think there are sort of trade-offs in every direction going back to the analogy of the God box. I think one thing that we have to be very careful about in designing a protocol and kind of curating or shepherding an ecosystem is like trying to do too much. I think it's, it's a very big, yeah, you know, you don't want a protocol that tries to do absolutely everything under the sun because then it'll be bad at everything too. And so I think the key question, which is still unresolved is like, to what degree are agents. Really? Really naturally fitting in to this existing model and paradigm or to what degree is it basically just like orthogonal? It should be something.
swyx [00:45:17]: I think once you enable two way and once you enable client server to be the same and delegation of work to another MCP server, it's definitely more agentic than not. But I appreciate that you keep in mind simplicity and not trying to solve every problem under the sun. Cool. I'm happy to move on there. I mean, I'm going to double click on a couple of things that I marked out because they coincide with things that we wanted to ask you. Anyway, so the first one is, it's just a simple, how many MCP things can one implementation support, you know, so this is the, the, the sort of wide versus deep question. And, and this, this is direct relevance to the nesting of MCPs that we just talked about in April, 2024, when, when Claude was launching one of its first contexts, the first million token context example, they said you can support 250 tools. And in a lot of cases, you can't do that. You know, so to me, that's wide in, in the sense that you, you don't have tools that call tools. You just have the model and a flat hierarchy of tools, but then obviously you have tool confusion. It's going to happen when the tools are adjacent, you call the wrong tool. You're going to get the bad results, right? Do you have a recommendation of like a maximum number of MCP servers that are enabled at any given time?
Justin/David [00:46:32]: I think be honest, like, I think there's not one answer to this because to some extent, it depends on the model that you're using. To some extent, it depends on like how well the tools are named and described for the model and stuff like that to avoid confusion. I mean, I think that the dream is certainly like you just furnish all this information to the LLM and it can make sense of everything. This, this kind of goes back to like the, the future we envision with MCP is like all this information is just brought to the model and it decides what to do with it. But today the reality or the practicalities might mean that like, yeah, maybe you, maybe in your client application, like the AI application, you do some fill in the blanks. Maybe you do some filtering over the tool set or like maybe you, you run like a faster, smaller LLM to like filter to what's most relevant and then only pass those tools to the bigger model. Or you could use an MCP server, which is a proxy to other MCP servers and does some filtering at that level or something like that. I think hundreds, as you referenced, is still a fairly safe bet, at least for Claude. I can't speak to the other models, but yeah, I don't know. I think over time we should just expect this to get better. So we're wary of like constraining anything and preventing that. Sort of long. Yeah, and obviously it highly, it highly depends on the overlap of the description, right? Like if you, if you have like very separate servers that do very separate things and the tools have very clear unique names, very clear, well-written descriptions, you know, your mileage might be more higher than if you have a GitLab and a GitHub server at the same time in your context. And, and then the overlap is quite significant because they look very similar to the model and confusion becomes easier. There's different considerations too. Depending on the AI application, if you're, if you're trying to build something very agentic, maybe you are trying to minimize the amount of times you need to go back to the user with a question or, you know, minimize the amount of like configurability in your interface or something. But if you're building other applications, you're building an IDE or you're building a chat application or whatever, like, I think it's totally reasonable to have affordances that allow the user to say like, at this moment, I want this feature set or at this different moment, I want this different feature set or something like that. And maybe not treat it as like always on. The full list always on all the time. Yeah.
swyx [00:48:42]: That's where I think the concepts of resources and tools get to blend a little bit, right? Because now you're saying you want some degree of user control, right? Or application control. And other times you want the model to control it, right? So now we're choosing just subsets of tools. I don't know.
Justin/David [00:49:00]: Yeah, I think it's a fair point or a fair concern. I guess the way I think about this is still like at the end of the day, and this is a core MCP design principle is like, ultimately, the concept of a tool is not a tool. It's a client application, and by extension, the user. Ultimately, they should be in full control of absolutely everything that's happening via MCP. When we say that tools are model controlled, what we really mean is like, tools should only be invoked by the model. Like there really shouldn't be an application interaction or a user interaction where it's like, okay, as a user, I now want you to use this tool. I mean, occasionally you might do that for prompting reasons, but like, I think that shouldn't be like a UI affordance. But I think the client application or the user deciding to like filter out the user, it's not a tool. I think the client application or the user deciding to like filter out things that MCP servers are offering, totally reasonable, or even like transform them. Like you could imagine a client application that takes tool descriptions from an MCP server and like enriches them, makes them better. We really want the client applications to have full control in the MCP paradigm. That in addition, though, like I think there, one thing that's very, very early in my thinking is there might be a addition to the protocol where you want to give the server author the ability to like logically group certain primitives together, potentially. Yeah. To inform that, because they might know some of these logical groupings better, and that could like encompasses prompts, resources, and tools at the same time. I mean, personally, we can have a design discussion on there. I mean, personally, my take would be that those should be separate MCP servers, and then the user should be able to compose them together. But we can figure it out.
Alessio [00:50:31]: Is there going to be like a MCP standard library, so to speak, of like, hey, these are like the canonical servers, do not build this. We're just going to take care of those. And those can be maybe the building blocks that people can compose. Or do you expect people to just rebuild their own MCP servers for like a lot of things?
Justin/David [00:50:49]: I think we will not be prescriptive in that sense. I think there will be inherently, you know, there's a lot of power. Well, let me rephrase it. Like, I have a long history in open source, and I feel the bizarre approach to this problem is somewhat useful, right? And I think so that the best and most interesting option wins. And I don't think we want to be very prescriptive. I will definitely foresee, and this already exists, that there will be like 25 GitHub servers and like 25, you know, Postgres servers and whatnot. And that's all cool. And that's good. And I think they all add in their own way. But effectively, eventually, over months or years, the ecosystem will converge to like a set of very widely used ones who basically, I don't know if you call it winning, but like that will be the most used ones. And I think that's completely fine. Because being prescriptive about this, I don't think it's any useful, any use. I do think, of course, that there will be like MCP servers, and you see them already that are driven by companies for their products. And, you know, they will inherently be probably the canonical implementation. Like if you want to work with Cloudflow workers and use an MCP server for that, you'll probably want to use the one developed by Cloudflare. Yeah. I think there's maybe a related thing here, too, just about like one big thing worth thinking about. We don't have any like solutions completely ready to go. It's this question of like trust or like, you know, vetting is maybe a better word. Like, how do you determine which MCP servers are like the kind of good and safe ones to use? Regardless of if there are any implementations of GitHub MCP servers, that could be totally fine. But you want to make sure that you're not using ones that are really like sus, right? And so trying to think about like how to kind of endow reputation or like, you know, if hypothetically. Anthropic is like, we've vetted this. It meets our criteria for secure coding or something. How can that be reflected in kind of this open model where everyone in the ecosystem can benefit? Don't really know the answer yet, but that's very much top of mind.
Alessio [00:52:49]: But I think that's like a great design choice of MCPs, which is like language agnostic. Like already, and there's not, to my knowledge, an Anthropic official Ruby SDK, nor an OpenAI SDK. And Alex Roudal does a great job building those. But now with MCPs is like. You don't actually have to translate an SDK to all these languages. You just do one, one interface and kind of bless that interface as, as Anthropic. So yeah, that was, that was nice.
swyx [00:53:18]: I have a quick answer to this thing. So like, obviously there's like five or six different registries already popped up. You guys announced your official registry that's gone away. And a registry is very tempting to offer download counts, likes, reviews, and some kind of trust thing. I think it's kind of brittle. Like no matter what kind of social proof or other thing you can, you can offer, the next update can compromise a trusted package. And actually that's the one that does the most damage, right? So abusing the trust system is like setting up a trust system creates the damage from the trust system. And so I actually want to encourage people to try out MCP Inspector because all you got to do is actually just look at the traffic. And like, I think that's, that goes for a lot of security issues.
Justin/David [00:54:03]: Yeah, absolutely. Cool. And then I think like that's very classic, just supply chain problem that like all registries effectively have. And the, you know, there are different approaches to this problem. Like you can take the Apple approach and like vet things and like have like an army of, of both automated system and review teams to do this. And then you effectively build an app store, right? That's, that's one approach to this type of problem. It kind of works in, you know, in a very set, certain set of ways. But I don't think it works in an open source kind of ecosystem for which you always have a registry kind of approach, like similar to MPM and packages and PiPi.
swyx [00:54:36]: And they all have inherently these, like these, these supply chain attack problems, right? Yeah, yeah, totally. Quick time check. I think we're going to go for another like 20, 25 minutes. Is that okay for you guys? Okay, awesome. Cool. I wanted to double click, take the time. So I'm going to sort of, we previewed a little bit on like the future coming stuff. So I want to leave the future coming stuff to the end, like registry, the, the, the stateless servers and remote servers, all the other stuff. But I wanted to double click a little bit. A little bit more on the launch, the core servers that are part of the official repo. And some of them are special ones, like the, like the ones we already talked about. So let me just pull them up already. So for example, you mentioned memory, you mentioned sequential thinking. And I think I really, really encourage people should look at these, what I call special servers. Like they're, they're not normal servers in the, in the sense that they, they wrap some API and it's just easier to interact with those than to work at the APIs. And so I'll, I'll highlight the, the memory one first, just because like, I think there are, there are a few memory startups, but actually you don't need them if you just use this one. It's also like 200 lines of code. It's super simple. And, and obviously then if you need to scale it up, you should probably do some, some more battle tested thing. But if you're interested, if you're just introducing memory, I think this is a really good implementation. I don't know if there's like special stories that you want to highlight with, with some of these.
Justin/David [00:56:00]: I think, no, I don't, I don't think there's special stories. I think a lot of these, not all of them, but a lot of them originated from that hackathon that I mentioned before, where folks got excited about the idea of MCP. People internally inside Anthropik who wanted to have memory or like wanted to play around with the idea could quickly now prototype something using MCP in a way that wasn't possible before. Someone who's not like, you know, you don't have to become the, the end to end expert. You don't have access. You don't have to have access to this. Like, you know. You don't have to have this private, you know, proprietary code base. You can just now extend cloud with this memory capability. So that's how a lot of these came about. And then also just thinking about like, you know, what is the breadth of functionality that we want to demonstrate at launch?
swyx [00:56:47]: Totally. And I think that is partially why it made your launch successful because you launch with a sufficiently spanning set of here's examples and then people just copy paste and expand from there. I would also highlight the file system MCP server only because it has edit file. And basically, I think people were very excited when we had Eric who built your sort of sweet bench projects on the podcast as well. And people were very interested in this sort of like file editing tool that is basically open source via this project. And I think a lot of there's some libraries out there. There's some other implementations that like, you know, this is core IP for them. And now it's just you guys just put it out there. It's just really cool.
Justin/David [00:57:32]: Yeah. I really, I mean, honestly, the file system server is one of my favorites because I think it really speaks to like a limitation that I was feeling. You know, I was like hacking on a game as a side project and really wanted to connect it to like cloud and artifacts like David talked about before. Just giving cloud or like suddenly being able to give cloud the ability to like actually interact with my local machine was huge. I really love that sort of capability. Yeah. I mean, this is the classic example of like this server directly comes out of the frustration that both created MCP and that server. There was a very clear direct path of like, here's the frustration we're currently having to MCP plus the server that we both felt and Justin in particular. So that regard is close to our heart as like as a spiritual inception point of the protocol itself.
swyx [00:58:28]: Absolutely. Okay. And then I think the last thing I'll highlight is sequential thinking, which you already talked about. This gives like branching, which is kind of interesting. It gives a sort of, you know, I need more space to write, which is kind of super interesting. And I think one thing I also wanted to clarify was Anthropic this week, well, this past week, put out a new engineering blog with a think tool. And there's a bit of community confusion how sequential thinking overlaps with the think tool. I just think that it's just different teams doing similar things in different ways. And I think there's a lot of different parts of the world. But I just wanted to let you guys clarify.
SPEAKER_03 [00:59:03]: I think, mean, there's definitely like, sorry, let me start over.
Justin/David [00:59:11]: As far as I know, there is no common lineage between these two things. But I think it just speaks to a larger thing that like there are many different strategies to get an LLM to be more thoughtful or hallucinate less or whatever it might be. To kind of like express these different dimensions more fully or more reliably. And I don't know. I think this is like the power of MCP that like you could build different servers that do these different things or have like, you know, different products or different tools within the same server that do these different things. And like ask the LLM to apply a particular like mental model or thinking pattern or whatever for different results. So I don't know. I think I guess don't know that there will be like one ideal prescribed method like LLM. How you should think all the time.
SPEAKER_03 [01:00:05]: I think there will be different applications for different purposes. And MCP allows you to do that, right?
Justin/David [01:00:11]: Yeah, I think in addition, there's also like the way that the approach to this that some of the MCP servers, they're filling a gap that existed at the at a point in time that the models later catch up to by themselves. Because, you know, they have this training time and preparation research that goes into making models.
Justin/David [01:00:36]: And you can get a lot of mileage of something as simple as a sequential thinking tool like server. It's not simple, but it's like it's doable within a few days, which is definitely not the time frame you look at adding thinking to a model natively. I guess to come up with an example on the fly, like I could imagine building, you know, if I'm working with a model that is not particularly reliable or, you know, maybe someone considers the generation today overall not particularly reliable. Like I could imagine building an MCP server that gives me like best of three, you know, tries like three times to answer a query with the model and then picks the best one or something like that. Like you could get this kind of like recursive and composable LLM interactions with MCP.
swyx [01:01:18]: Awesome. Okay, cool. I think so, you know, we are sorry. Thanks for indulging on like some of the servers. We just wanted to double click on these. I think we have time for just like future roadmap things. People were most excited about this recent update. Moving from state to state. Stateful to stateless servers. You guys picked SSE as your sort of launch protocol and transport. And obviously transport is pluggable. The behind the scenes of that, like was it Jared Palmer's tweet that caused it or were you already working on it?
Justin/David [01:01:49]: No, we have GitHub discussions going back, like, you know, in public going back months, really talking about this, this dilemma and the trade-offs involved. You know, we do believe that like. The future of AI applications and ecosystem and agents, all of these things I think will be stateful or will be more in the direction of statefulness. So we had a lot of. I think honestly, this is one of the most contentious topics we've discussed as like the core MCP team and like gone through multiple iterations on and back and forth. But ultimately just came back to this conclusion that like if the future looks more stateful, we don't want to move away from that paradigm. Completely. Now we have to balance that against it's it's been operationally complex or like it's hard to deploy an MCP server if it requires this like long lived persistent connection. This this is the original like SSE transport design is basically you deploy an MCP server and then a client can come in and connect. And then basically you should remain connected indefinitely, which is that's like a tall order for anyone operating at scale. It's just like not a deployment or operational model. You really want to support it. So we were trying to think, like, how can we balance the belief that statefulness is important with sort of simpler operation and maintenance and stuff like that? And the news sort of we're calling it the streamable HTTP transport that we came up with still has SSE in there. But it has a more like a gradual approach where like a server could be just plain HTTP, like, you know, have one endpoint that you send HTTP posts to and then, you know, get a result back. But do you think that's it? Yeah. And then you can like gradually enhance it with like, OK, now I want the results to be streaming or like now I want the server to be able to issue its own requests. And as long as the server and client both support the ability to like resume sessions, like, you know, to disconnect and come back later and pick up where you left off, then you get kind of the best of both worlds where it could still be this stateful interaction and stateful server, but allows you to like horizontally scale more easily or like deal with spotty network connections or whatever the case may be.
Alessio [01:04:00]: Yeah. Yeah. And you had, as you mentioned, session ID. How do you think about auth going forward? For some MCPs, I just need to like paste my API key in the command. Is there kind of like a, yeah, what do you see as the future of that? Is there going to be like the dot M equivalent of like for MCPs or? Yeah.
Justin/David [01:04:18]: We do have authorization as a specification in the current draft of the next revision of the protocol. It's mostly at the moment focused on user to server authorization. Using like OAuth 2.1 or like, you know, a subset of, of modern OAuth basically. And I think that has, that has seems to be working well for people and people building on top of that. And that will solve a lot of these issues because you don't really want to have people bring API keys, particularly when you have like, when you think about a world, which, which I truly believe will happen where the majority of servers will be remote servers. So you need some sort of authorization with that server. Now for the local case, because the authorization is defined on, on the transport layer. And so requires framing, which means like headers effectively. This does not work in standard IO, but in standard IO you run locally and you can do whatever you want anyway. And you might just pop open a browser and deal with it that way. And then there's also like some thinking that is somewhat, somewhat not fully decided on about, you know, even using. Yeah. HTTP locally, which would solve that problem. And Justin is laughing because he's very much in favor of this where I'm very much not in favor of this.
Justin/David [01:05:38]: So there's some debate going on there, but like authorization, I think, you know, we have something, I think it's like, it's as everything in the protocol is like fairly minimal, like trying to solve a very practical problem. It tries to be very minimal in what it, what it does. And then we go from there and add based on practical pain points people have. On top of the protocol and don't try to over design it from the beginning. So we'll just see how far our current aspect gets us basically. Yeah. I want to build on that a bit because I think that last point is really important. And like, you know, when, when you're designing a protocol, you have to be extremely conservative because if you make a mistake, you basically can't undo that mistake or you break backwards compatibility. So it's far easier to like only accept things or like only add things that you're extremely certain about and let people kind of do ad hoc things. And then you can't do ad hoc extensions until maybe there's more like consensus that something is worth adding to the core thing and like supporting indefinitely going forward. And with auth in particular, and this example of API keys, I think this is really illustrative because we did a lot of this sort of like brainstorming, like, okay, if I have this use case, could I accomplish that with this version of auth? And I think the answer is yes for like the API key example. Like you can have an MCP server, which is an OAuth authorization server. And add a lot of that stuff. But if you look at the like slash authorize webpage, it just has like a text box for you to put in an API key. Like that would be a totally valid OAuth flow for the MCP server. Maybe not the most ergonomic or not what people would ideally like, but because it does fit into the existing paradigm and is possible today, we're worried about like adding too much other, too many other options that both then clients and servers need to think about.
Alessio [01:07:19]: Yeah. Have you guys gave scopes any thought? If it's like we had an episode with Dharmesh Shah yesterday from AJ. And he was giving the example of like email. Like he has all of his emails and, you know, he would like to have more granular scope for, hey, you can only access these types of emails or like emails to this person. Today, most scopes are like REST driven, basically. It's like what endpoints can you access? Do you see a future in which the model kind of access like the scope layer, so to speak, and kind of dynamically limits the data that passes through?
Justin/David [01:07:54]: I think the. I think there is a potential need for scopes. That goes back to like we have discussions around this, but what we're currently trying to do is just like routing them in very specific example and like have a good set of like these are actual problems that you cannot currently solve with the current implementations. And that's like the bar we set to add to the protocol. And I think that and then, you know, based on that prototype using that extensibility that we have at the moment where every structure that's returned is extensible. And then build on top of that. And prove that you that this will have a good user experience. And then we put it at the protocol. That's usually been for the most part the case. It's actually not quite the case for authorization in general. That's been a bit more top down. But I can totally see why people want it. It's just a matter of like showcasing the specific examples and like what the what the potential solutions would be so that we don't accidentally run into this like, yeah, the stuff that does this approach where like it sounds roughly right. And we put it in and it was actually not really right. And now you're back to this like adding is easy. Removing is hard in protocol design. And so we're just a little bit. We're just a little bit, you know, careful around this, so to speak. That being said, you know, every time I hear it like in the rough description, it makes sense. I would love to have a very practical end-to-end user example of this and where it falls apart the current implementation. Then we can have a discussion. There's a little bit of wariness from my perspective, too. Maybe not with Gov specifically. I think those could make a lot of sense as long as we have the use cases in mind. But I do think. You know, thinking about composability and logical groupings of things, I think it does often make sense for MCP servers to be quite small things. And if you want lots of collections of functionality for those to be discrete servers that you kind of combine together as a user or in the application layer. And so some of the pushback about auth has been like, well, if I need to authorize with like 20 different things on the other side, how can I do that? It's like, well, maybe maybe that's not what the server should be doing. And maybe it shouldn't be connecting to 20 different things. Maybe those should be separate servers that combine up somehow. Awesome.
swyx [01:10:00]: Lots of discussion there. Where should people go if they want to get involved in these debates? Is it just the specification repo discussion page? That's a good start.
Justin/David [01:10:09]: I want to caveat it slightly that on the Internet, it's very easy to be part of a discussion and having an opinion without then actually doing the work. And so I think there. We were. Both Jansen and I are very old school. Open source people that like it's it's marriage driven in the sense that if you have done work and if you if you showcase this with like practical examples and work in SDKs towards the ex extensions you want to make, you have a good chance that it gets in. If you're just there to have an opinion, you're very likely just being ignored, to be frank, because there's a limit to how much discussion points we can read. Of course, we value the discussion. And we want to have the discussion. But we also need to manage our time and our engagement. And we obviously select for the people who are doing the most work. We're trying to figure out, you know, honestly, like, I think. Even compared to open source work I've done in the past, just the sheer volume of conversation and notifications are an MCP stuff is extraordinary, which is great on one hand. But I think we do need to figure out more scalable structures. And I think that's a good start. Just to both engage with the community, but also keep conversations high signal and like effective. And I guess there's something else to be aware of related to David's point is like, I do believe that a big part of running a successful open source project is sometimes making hard decisions that people will be unhappy about. And you kind of just have to like, you know, learn to like, figure out like, what are the things? What is like the actual vision for the project? Where do we as the kind of like maintainers or like shepherds or whatever believes that it's going? And just commit to that and like understand that some people won't agree with that vision. And that's totally fine. But then maybe there will be other projects that are more in line with what they're hoping for or something like that. I think that's a very interesting and quite good point. It's like they're like a like a product like MCP is an entry into like into a solution space of the problems in that in the general like space. Yeah. I mean, it's it is one of many entries in a way. And and if you do not like the direction, you know, that we and like people that are very close, you know, in the development of the protocol choose then and then we cannot then then there's always place for more. Right. That's the beauty of open source. Right. The good old, you know, fork it approach. We do always want to hear the feedback and we need to make it scalable, I think. But also just the recognition that like sometimes we will be going with our intuition about what is the right choice. There might be a lot of like flame in the open source discussions about it. But that's just the nature of projects like this sometimes.
swyx [01:13:04]: Yeah. Fortunately, neither of you are new to that. I would also say there's a lot of history to be drawn from Facebook open source. Right. And both of you, if you weren't directly involved, you know, everyone who was directly involved, I would say reacts. We eventually started because I was obviously deeply part of the React ecosystem. We eventually started working groups where it was it was open. It was conducted on in discussions. And each member of the working group had a voice that represented a significant part of the community, but also showed that they did the work. They had a significance. They weren't sort of drive by people with no skin in the game. And I think that was helpful for a while. I'm not sure it's like an actively managed thing because of React's own issues with the multi-company situation that they're in. The other thing that actually is to me is more interesting is GraphQL. Because MCP currently has the hype that GraphQL had. And I lived through that one. And eventually, you know. Facebook donated GraphQL to an open source foundation. And I think that there's a question of like, do we want to do that? There's trade-offs, right? It's not a clear yes or no. Because I think GraphQL. Okay. So first of all, I would say that obviously I'm happy. Oh, sorry. Am I? Did I disconnect? Okay. I had a little.
Justin/David [01:14:16]: Okay. I had the same warning.
swyx [01:14:19]: Yeah, I had the same warning. Okay. I would say that most people are happy with Anthropic. And you guys, obviously, because you created it. You guys being the stewards. But at some point, at some scale, you're going to hit some ceiling there where you're like, okay, like, you know, this is owned by one company. And, you know, eventually people want to like, you know, the truly open standard is a nonprofit. There's multiple stakeholders. There's a good governance process. All of which is governed by like Linux Foundation, Apache, whatever. So I want to ask, like, any thoughts there? I personally would say it's too early. You know, like, what are your thoughts?
Justin/David [01:14:51]: Yeah. I think governance in general is a super interesting problem in the open source space. I think there are two things. On one side, we really, really, really want to make this and have this be an open standard and open protocol and open project with, you know, participation from everyone who wants to partake. And I think that actually is working quite well so far. If you look at the pull request, if you look, for example, a lot of the inputs on the streamable HTTP thing came from companies. Like, you know, Shopify and others that had discussed and worked on this and brought proposals to the table. And I think that works really well. The thing that we are a bit wary about is any type of official standardization, particularly going through an actual standardization body or any type of, like, foundational work that starts having processes as part of this to stay somewhat fair to everyone. That can add processes that in a fast moving field, like AI, can be detrimental to the project. And that's what we worry about. We worry about processes that are slowing us down. And so we're trying to find this nice middle ground of, like, how can we have participation that we luckily do have from everyone, work towards everyone's, you know, everyone's, like, problems that they have potentially with the governance model and figure the right path forward out without, you know, having to go back and forth.
Justin/David [01:16:23]: I think that's what we're trying to do. But yeah, we genuinely, we are very genuine in our desire to have this be an open project. And like, yes, it was initiated by Anthropic and David and I work at Anthropic. But like, we don't want it to be seen as like, this is Anthropic's protocol. I think it's very important for the whole ecosystem that this is something that, like, any AI lab could have a stake in or contribute to or make use of. But yeah, it's just, it's boundless. And we're balancing that against avoiding death by committee, basically. And so, like, I think there are a lot of models for doing this successfully in open source. I think most of the delicacies are really around, like, you know, sort of corporate sponsorship and corporate say. And we'll kind of navigate that as it comes up. But we absolutely want this to be like a community project. That being said, I want to highlight this, that at the moment, as we speak, there's plenty of people that are not Anthropic employees who have commit access and admin access, who's the repositories right there. You know, some of the people from Pydentic have commit access to the Python SDK because they did a lot of really good work there. And we had a lot of contributions from Block and others to the specifications. SDKs like the Java SDK and the C-Sharp SDKs, they're completely done by different companies. Like the C-Sharp one is done by Microsoft. It's a very recent addition last week. And they do everything there. They have full admin rights over that. The same goes with JetBranch doing the Kotlin one and Spring AI doing. The Java one. So it is actually, if you really look at it, it's already like a multi-company big project with everyone. It was a lot of people beyond just us two having commit access to and rights to the project as is. Yeah.
Alessio [01:18:06]: Awesome, guys. This was great. Just to wrap up, do you have any MCP server wish list? What do you want people to build you that is not there yet? Or client.
Justin/David [01:18:15]: Yeah. Client or server. I want more sampling clients. That's all I want. I want cool. I want someone to build a client that is sampling and someone else that builds me a server that does summarize my Reddit threads or summarize. Like I'm an old EVE Online player. Summarize what happened in EVE Online in the last week for me. I wish that someone would do that. But for that, I want a sampling client. I want this model independent. Not because I wanted to use any other model than Cloud because Cloud is by far the best. But I just want to have a sampling client for the sake of having a sampling client. Just a little bit of you. Yeah. Well, I'll echo that and just even broadly say, like, I think just more clients that support the full breadth of the spec would be amazing. I mean, we kind of designed things so that things can be adopted incrementally anyway. But, like, still, it would be great if, you know, all these primitives that we put this thought into do get manifested somehow. That would be amazing. But going back to, you know, some of my initial motivation for working on MCP and, like, excitement about the file system server. You know, like, I like hacking on a game as a side project. So I would really love to have an MCP client and or MCP server with, like, the Godot engine, which I was using to build the game. And just, like, have really easy, like, AI integration with that or, like, have, you know, Cloud run and playtest my game or something. Like, Cloud plays Pokemon. Who knows? Hey, at least you have them already built. Have Cloud already built your 3D model from now on with Blender, right? Yeah.
swyx [01:19:41]: I mean, honestly, even, like, shader code and stuff already. I was just like, this is not my wheelhouse. It's amazing what you can do when you enable builders. Yeah. We're actually working on a Cloud plays Pokemon hackathon with David Hershey. So to bring MCP into that, I had no plans.
Alessio [01:19:57]: But if he wants to, he can. Awesome, guys. Well, thank you for the time. Yeah. Keep up the good work.
Justin/David [01:20:03]: Thank you both. This was fun. Yeah. Thank you. Really appreciate it. Cheers.
Share this post