Latent Space: The AI Engineer Podcast

Latent.Space
Latest episode

272 episodes

  • Latent Space: The AI Engineer Podcast

    The Next War Is Already Here. The West Isn't Ready. — Yaroslav Azhnyuk, The Fourth Law & Guest Host Noah Smith, Noahpinion

    18/05/2026 | 1h 59min
    The future of war has been evolving before our eyes in Ukraine, yet the West still plans to fight the last war. In this special episode, guest host Noah Smith (@noahpinion) and Brandon Anderson sit down with Yaroslav Azhnyuk (@YaroslavAzhnyuk), a serial tech founder who went from building Petcube to founding The Fourth Law, one of the world’s most advanced AI-guided drone companies. Over two hours we cover the technology, tactics, and geopolitics of drone warfare, and why the modern battlefield has already left the West behind:
    * Yaroslav’s personal history and the Ukraine war [00:01:04 – 00:14:01]
    * The modern drone tech stack: why FPV drones are the new god of war, the future of the rifleman, fiber optic vs. AI, five levels of autonomy, and the eight dimensions of the autonomous battlefield [00:14:01 – 01:05:13]
    * The geopolitics and economics of drones: China’s manufacturing advantage, the drone race, Western defense readiness, countermeasures, and why the gap is widening [01:05:13 – 01:58:57]
    For those looking for Noah Smith’s commentary, it really gets going around the 00:51:31 mark.
    Yaroslav Azhnyuk / The Fourth Law:
    * X: https://x.com/YaroslavAzhnyuk
    * LinkedIn: https://www.linkedin.com/in/yaroslavazhnyuk/
    * The Fourth Law: https://thefourthlaw.ai
    Noah Smith:
    * Substack: Noah Smith
    * X: https://x.com/noahpinion
    Timestamps
    00:00:00 Cold Open: China’s 4 Billion Drones and the Cameras-to-Explosives Pipeline
    00:01:04 Introduction: Brandon, Noah Smith, and Yaroslav Azhnyuk
    00:05:41 From Tech Entrepreneur to Defense: Petcube, Brave1, and the D3 Fund
    00:10:42 The Ethics of Building Weapons: Dual-Use Technology and the Wolf at the Door
    00:14:01 The Tech Stack: Cameras, Autonomy Modules, Interceptors, and a Semiconductor Fab
    00:18:47 Fiber Optic vs. AI: The Radio Horizon Problem and $32/km Cable
    00:25:32 FPV Drones: The New God of War — 70–80% of Frontline Casualties
    00:28:28 The Five Levels of Drone Autonomy: From Terminal Guidance to Full Autonomy
    00:41:37 The Eight Dimensions of the Autonomous Battlefield
    00:45:32 AI Safety and the Morality of Autonomous Weapons
    00:51:31 The End of the Rifleman? Noah’s 2013 Prediction vs. Battlefield Reality
    01:05:13 China’s Manufacturing Advantage and Western Vulnerabilities
    01:24:21 Policy Advice for Western Defense: Defense Valley and the Widening Gap
    01:32:54 The Drone Race: Who’s Ahead, Category by Category
    01:41:57 Countermeasures: Shotguns, Jammers, Lasers, and Fishnets
    01:58:19 The Wedding and Final Takeaway: Be Prepared for War
    Transcript
    Cold Open: China, FPV Drones, and the New Warning Sign
    Yaroslav [00:00:00]: Think about this. Last year, Ukraine produced 4 million FPV drones. Ukraine is not the most industrious nation in the world. China can produce 4 billion of these FPV drones.
    Noah [00:00:10]: Would you say that right now China is now the supreme conventional military power on Earth, given its ability to manufacture and deploy drones in the quantity and quality that you just described?
    Yaroslav [00:00:20]: I don’t think we have all the information to claim that, but we cannot count it out, and that alone should be a big warning sign. As I say, at some point in my life I went from making cameras that fling treats to pets to cameras that fling explosives to the occupiers. So that’s the short story. And when you think about what your nation, what your compatriots are going through, you realize that the only morally right thing to do is to fight back, and it is immoral not to fight back, and then the choice becomes very clear.
    Introduction: Yaroslav Azhnyuk, Petcube, and the Last Flight into Kyiv
    Brandon [00:01:04]: Welcome to Latent Space. I’m Brandon. I normally do science podcasts, but today we’re going to do something a little bit different. I’m joined by Noah Smith of Noahpinion on Substack and Twitter, and he has lots of interesting things to say about drones. And as a guest, we have Yaroslav Azhnyuk, founder of The Fourth Law and several other drone-related startups. To get started: it is February 23rd, 2022. You are running a pet startup. You’re connecting pets with their owners. Let’s go into just a little bit of background. How did you get started in tech, and what were you working on before the Ukrainian war started?
    Yaroslav [00:01:50]: Good to be here. Thank you. On February 23rd, late in the evening, 11:00 PM Kyiv time, my wife and I landed in Kyiv. Actually, she was my fiancée then. We came from Lviv, where we were looking at a church where our wedding should have taken place. And we got into this cab ride from the airport to our home, and the driver was like, “You crazy. Like, everyone’s leaving Kyiv. Why do you come?” We’re like, “What? Nothing’s going to happen. Dude, chill.” And then obviously, eight minutes later, or eight hours later, the bombs fell on the city. It was quite surreal. We probably landed on the last flight that landed in Kyiv, or one of the last flights. My background: I’m a tech guy. I studied applied mathematics at Kyiv Polytechnic, born and raised in Kyiv. My parents are old PhDs from academia, and grandparents too. Like, everything from linguistics to nuclear physics. And I’m an entrepreneur, so I’ve built a bunch of companies. Petcube is the one you were referencing. So I lived in San Francisco 2014 to 2020, building Petcube, which is one of the leading pet device companies in the world, selling lots of pet cameras. And then, yeah, as I say, at some point in my life I went from making cameras that fling treats to pets to cameras that fling explosives to the occupiers. So that’s the short story.
    February 24th: Leaving Kyiv as the Invasion Begins
    Noah [00:03:28]: February 24th, I guess a few hours after you go to check out your wedding chapel, what do you do?
    Yaroslav [00:03:37]: We had a plan for this situation. So my parents and family live in Kyiv, and we’re like, “Okay, this has actually started. The worst has come true.” And so we basically packed our belongings and got in the car and spent 17 hours driving west. I’m pretty sure most people in our audience have watched at least one apocalyptic movie in their life, so it was exactly like that. Like, felt exactly like that. Missiles are falling. There was smoke in Kyiv. My dad and I went to the central part of the city. It’s probably, like,
    Yaroslav [00:04:20]: 800 meters from the presidential office, to pick some stuff up at his workplace. Because he’s the head of an academic institution, so he had to get some of the things with him. And super surreal. Like, the streets are empty. The gas stations are out of gas. We found some gas station. We didn’t have spare canisters with us, so we figured out, the car was diesel, and if it’s diesel, you can actually store it in plastic canisters, so we bought some window wash for the cars, poured it out of the canisters, and poured the diesel into them. Yeah, so it was like that. And then helping friends get out, like my friend and his dog. My brother was also riding in a separate car, and we found a place for my friend who didn’t have a car. It was, yeah, totally surreal. And we didn’t know, of course, that this would last for so long. You didn’t know whether Ukraine would be able to defend Kyiv. And there was very little information and very little insight into the future.
    From Pet Cameras to Defense Tech: Building for Ukraine and the Free World
    Noah [00:05:42]: What are your thoughts with regard to how you defend Ukraine? So you eventually start building drones. What is the process to get there from where you were, building devices that connect owners with pets, to building drones, and what other things did you do to help the war effort in the process?
    Yaroslav [00:06:07]: It’s definitely non-trivial, right? I didn’t get any military education when I was a student. Normally, in Ukraine, you would go to this military school even if you’re getting higher education in any other sphere. I decided to skip that, which is an unusual way to go. And I never thought that I would be somehow engaged in a war effort. Like, what is war? Of course, wars are over. It’s the end of history. So one thing you got to understand about many Ukrainians, and I guess it’s also true about most of the people I met here in the US, is that who you are in terms of your nationality is a big part of your identity. So when that gets under attack, it’s something deeper than just the country you live in getting under attack, right? On day one, I figured I’m going to fight back with everything I can, right? But I didn’t think on day one that I’m actually going to do weapons. And we did a bunch of things. We were reaching out to a number of American congresspeople and senators, and basically advocating for support of Ukraine, for voting for Lend-Lease, which happened in May 2022 but didn’t actually work as expected. We helped start Brave1, which is now a very important defense innovation cluster, sort of like a DIU here in the US. We helped start a fund called D3. It was started or co-started by Eric Schmidt, former CEO of Google. So a bunch of these odd things. But then eventually, by 2023, it was obvious that this thing, A, is going to last a lot more time, and B, that the whole world is shifting and that there’s going to be a new arms race, that warfare is being redefined by drones as platforms. And for the first time in history, you have a platform that is software-defined, that can increase your battlefield capabilities in a step change, just overnight. So it’s like if you were able to push a software update and get all of your Roman legionnaires a new helmet. That has never been possible before. It’s the first time in the history of war this is possible. So all of that, and many other things like supply chain fragilization and the impact that AI is going to have on all of this, all these things became evident to me in 2023, and it’s like, “Okay, I should do what I do best, or what I know how to do best: start a tech company, and sort of leverage the global techno-capitalist machine to provide defensibility to Ukraine and the free world.” So that’s literally the mission of the company: increase defensibility of Ukraine and the free world. And then there was some sort of soul-searching, asking yourself, “Okay, I actually know nothing about weapons. Am I actually ready to make things that other people use to kill other bad people?”
    Yaroslav [00:09:36]: When you think about what your nation, what your compatriots are going through, and think about all the terror of places like Bucha, the occupied cities in the east and south, the abducted children, the raped women, all the economic damage that’s being done, and the intention to destroy a whole nation, to genocide the people of Ukraine, you realize that the only morally right thing to do is to fight back, and it is immoral not to fight back. And then the choice becomes very clear. And look, we’re just passing the ammunition. We’re not doing the actual job. The actual fighters and defenders and heroes are people in the armed forces. We’re just support.
    The Moral Question: Weapons, Responsibility, and Fighting Back
    Noah [00:10:33]: I have so many questions. Actually, I know you seem to have a question. Do you want to ask anything?
    Yaroslav [00:10:38]: No, I’m just listening. Go ahead.
    Noah [00:10:40]: I do want to talk about some of, let’s say, the moral issues, like you just said. You end-
    Yaroslav [00:10:50]: I think there are no issues there.
    Yaroslav [00:10:52]: What would an example of a moral question be in this case?
    Noah [00:10:55]: No, I mean... Okay. As you just said, you are creating the tools, but others are using them.
    Noah [00:11:05]: I was maybe thinking of having this conversation later, but one of the questions is: you are building them for your homeland, which is, I think, a very strong, morally defensible position, but this technology is not going to stay with you, right?
    Noah [00:11:26]: You will probably be selling these to other people. So the future is really where the moral issues may come into play.
    Yaroslav [00:11:38]: This question becomes easier and more complete if we ask it not about a particular technology or particular weapon, if we think that this question actually applies to any kind of technology, right? So, knife or fire. You can use a knife to do surgery and save people’s lives, or you can use it as a weapon to take people’s lives.
    Noah [00:12:06]: Cut tomatoes, too.
    Yaroslav [00:12:08]: Cut tomatoes too.
    Noah [00:12:09]: Yes, knife.
    Yaroslav [00:12:09]: That’s helpful.
    Noah [00:12:10]: In Japan, sword and knife, they call by the same word.
    Yaroslav [00:12:14]: It’s like that with any technology. Large language models, right? Look at how powerful they are, and yet they’re available to anyone in North Korea or in Russia.
    Yaroslav [00:12:29]: That’s one side of the argument. The other side is: as a maker, what is your responsibility for how the tools you’re creating will be used? There’s definitely some responsibility, right? Then, what should the decision process look like? Should you try to calculate all the possible scenarios before starting to work on something? Or do you create something that is needed now to save people’s lives, and then think about addressing the unwanted edge cases later? In an ideal world, or okay, it’s not an ideal world. In a mythical world where there is one governing party that gets to decide everything, and there is no other country that can decide on its own, you could say, “Well, we need to calculate for all the consequences, and only then maybe build this building by replacing this park, because maybe we need this park in the city,” right? So that kind of situation. But when you’re in a forest, in front of a wolf, you’re first going to deal with the wolf that wants to eat you, and then you’re going to go consult Greenpeace. So that’s the kind of situation that Ukraine is in.
    The Fourth Law, Odd Systems, and Ukraine’s Drone Stack
    Noah [00:13:59]: Enough. Because this is a tech podcast, I did want to spend some time talking about the tech that you’ve developed and what you’ve been working on. So can you explain, I guess, first of all, the problem that you were trying to solve from a technical standpoint? And then maybe go into some of the solutions and some of the design process that led you from guiding little lasers with an iPhone to having drones.
    Yaroslav [00:14:34]: It so happened that my partners and I... So I started one company called The Fourth Law, and its goal was and is to make massively scalable on-drone autonomy. And then, in parallel with that, together with my Petcube co-founders, partners, and friends, we started another company called Odd Systems, which was focused on making thermal cameras. Thermal cameras see thermal radiation and are used to see at night. And now those companies are getting closer and closer together, and we’re probably going to merge them. And this group of companies is currently the leading team in on-drone AI and thermal imaging on the Ukrainian battlefield, and likely one of the leading, if not the leading, in the world. So we have these three business units, which are cameras, drone autonomy, and drones. The cameras and drone autonomy units sell daytime and nighttime cameras and different types of drone autonomy modules to other drone manufacturers, over 200 drone manufacturers in Ukraine. And then the UAV business unit sells the drones themselves to the armed forces of Ukraine, the Ukrainian government. And there are different types of drones. Those are front strike, as we call them, so FPV strike drones and the bombers, and then interceptors. And there are different kinds of interceptors. We do Shahed interceptors and we do ISR interceptors. We don’t do the deep strike-
    FPV Drones, Interceptors, and Battery-Powered Warfare
    Noah [00:16:32]: What’s an ISR interceptor?
    Yaroslav [00:16:33]: ISR stands for intelligence, surveillance, reconnaissance, and those are basically drones which the Russians are using to watch over positions and then communicate where the targets are.
    Noah [00:16:48]: It’s a reconnaissance.
    Yaroslav [00:16:48]: ISR is sort of a classical term for a reconnaissance drone.
    Noah [00:16:53]: Are all of these battery-powered drones that you just described? ‘Cause I know that the deep strike drones still have, like, some sort of-
    Yaroslav [00:17:01]: Internal combustion engine?
    Noah [00:17:02]: Internal combustion engine. Are all the things you’re talking about battery-powered?
    Yaroslav [00:17:06]: What we’re working on is all battery-powered, right? We don’t do the deep strikes, right? And then in terms of autonomy-
    Noah [00:17:12]: You can catch a Shahed with a battery-powered thing? It’s not too fast to catch?
    Yaroslav [00:17:17]: No, absolutely. Look, Shahed interceptor, like ours, it’s called Zero, it goes up to 326 kilometers per hour.
    Noah [00:17:26]: For reference, how fast is a Shahed?
    Yaroslav [00:17:28]: Like, in the terminal phase it could be 280, but in cruise phase it’s, like, 220-ish.
    Yaroslav [00:17:36]: Yeah. And sorry, you can convert that into miles if you’re interested.
    Noah [00:17:41]: No, that’s fine.
    Noah [00:17:41]: Multiply by two thirds or point six or something.
    Yaroslav [00:17:44]: That’s easy. Yeah, I was saying that for autonomy modules, right, we make autonomous systems for the frontline, for interceptors, and some for deep strikes as well, and then different levels of autonomy. So from terminal guidance, which is the last 500 meters, give or take, to autonomous bombing, to autonomous target detection, to autonomous navigation, and all of that across day and night, different terrains, different times of the year, different platforms like quadcopters and fixed wing, and maybe some other platforms. So it’s quite a wide variety of products. We also have our own simulation. We have our own training school for the war fighters. And we’re about to start construction of two semiconductor plants to make sensors for thermal cameras. So what’s super exciting for me as a computer science guy is doing semiconductors. Super cool.
    Noah [00:18:49]: In terms of core drone technologies, basically one is an FPV replacement without fiber optics, and the other is-
    Yaroslav [00:18:59]: You-
    Noah [00:18:59]: Signal tracking with interceptors.
    Yaroslav [00:19:00]: With or without fiber optics. Fiber optics is just sort of a communication module.
    Yaroslav [00:19:05]: You can use a classical analog video link and radio link. Those would be two separate radios. You can do digital, or you can do fiber optic, and then fiber optic has its own advantages but also adds weight and decreases the distance and decreases how fast you can turn with a drone. Yeah.
    Noah [00:19:33]: Do you need AI for fiber optic drones?
    Yaroslav [00:19:36]: You can use AI for fiber optic drones. AI replaces a human, right? Fiber optic is making your communication link more resilient. So those are slightly different goals. If you want, you can have AI controlling hundreds of fiber optic drones instead of having hundreds of operators, one for each.
    Fiber Optics, Radio Horizons, and Terminal Guidance
    Noah [00:20:03]: I guess I thought that the key reason that people moved to fiber optic drones was electronic countermeasures. Or I guess to counter those.
    Yaroslav [00:20:13]: I think that’s a correct assessment from a public awareness standpoint. In practice it’s somewhat more complicated, because besides electronic countermeasures, you have this issue of a radio horizon for FPV drones, which means that, as
    Yaroslav [00:20:36]: I believe, Earth is round. Some people disagree. But basically, if you fly a drone, you have a ground station over here and a drone flying over here.
    Yaroslav [00:20:49]: If your drone is flying high, you have good direct radio visibility. But usually Russian infantry and vehicles are on the ground, and you want to hit them, so you need to go low. The lower you go, maybe you’ll get behind a hill or behind a forest, and if you’re far enough, you’ll just get behind the curvature of the earth. You get into what’s called a radio shadow. And that is a real bummer, because for the last, be it 60 or 20 meters, you won’t be able to see anything and it will be very difficult to hit the target. And the distances that these FPV drones act on can be quite large. So for example, here in the US there was this Drone Dominance program competition, and in Drone Dominance the furthest distance was about 10 kilometers.
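The radio horizon described above can be put into rough numbers. The sketch below is not from the episode: it uses the standard smooth-Earth line-of-sight approximation with the common 4/3 effective-Earth-radius correction for atmospheric refraction, and it ignores hills, forests, and antenna details. The function name, the 5 m ground-station antenna height, and the sample altitudes are illustrative assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0   # mean Earth radius
K_FACTOR = 4.0 / 3.0           # standard refraction correction (assumption)


def radio_horizon_km(drone_alt_m: float, station_alt_m: float = 5.0) -> float:
    """Maximum line-of-sight distance over a smooth Earth, in kilometers.

    Each antenna sees out to sqrt(2 * k * R * h); the link holds while the
    drone is within the sum of both horizons.
    """
    def horizon_m(height_m: float) -> float:
        return math.sqrt(2.0 * K_FACTOR * EARTH_RADIUS_M * height_m)

    return (horizon_m(drone_alt_m) + horizon_m(station_alt_m)) / 1000.0


if __name__ == "__main__":
    # Illustrative altitudes only: lower flight means a shorter link.
    for alt in (100.0, 60.0, 20.0, 5.0):
        print(f"drone at {alt:5.0f} m -> link out to ~{radio_horizon_km(alt):.0f} km")
```

Under these assumptions the estimate falls from roughly 50 km at 100 m altitude to under 20 km at 5 m, and real terrain can only shorten it further.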
    Noah [00:21:44]: What was drone dominance? What was that competition?
    Yaroslav [00:21:47]: Drone Dominance is a program started by the US government to accelerate the development of drone technology here in the US.
    Noah [00:21:57]: Got it. And the longest range thing they were using was 10 kilometers.
    Yaroslav [00:22:00]: Was 10 kilometers, right. In Ukraine, if your drone doesn’t fly at least 20, 25, no one’s interested in it. Many hits are happening between 30 and 40 kilometers, and that’s what’s expected from a regular 10-inch FPV drone. So at that distance, even at altitudes of like 60 to 100 meters, you might start losing the link. So some of the earlier AI technology that was fielded in FPV drones was this terminal guidance technology. That was the first product that we ever launched. It helps you as an operator: once you see the target from two, three, five hundred meters, you lock onto the target, and then it just drives the drone towards the target no matter what, even after you’ve lost the visual connection. So optic fiber solves that. However, if you want to go like 20 kilometers with optic fiber, that will add an extra three kilos to your drone, taking away from its useful weight. So-
    Noah [00:23:12]: ‘Cause the cable that you have to unspool as you go weighs.
    Noah [00:23:15]: It is heavy.
    Yaroslav [00:23:15]: The spool is about 800 grams, so a bit less than a kilo, and then think about it: 10 kilometers of optic fiber is another kilo, something like that. That takes away from your useful mass, and now you need a 15-inch drone, and it can only carry maybe one or two kilos of explosives if you want to go 20 kilometers. If you want to go to 30 or 40, like, 30 is probably max. 40 is very problematic on optic fiber. And then the problem with optic fiber is it’s actually getting super expensive. And why? Because of all the data centers for AI. That’s literally the same optic fiber-
    Noah [00:24:01]: We’re running out of centers
    Yaroslav [00:24:02]: That’s being used there.
    Yaroslav [00:24:02]: When Ukrainians and Russians come to Chinese factories to buy the optic fiber, they’re like, “We’re out. We sold it out to the Americans.” That’s the craziest thing. So optic fiber went up in price from like $4 per kilometer to like $32 per kilometer in a few months in the beginning of this year. And I’ve-
    Brandon [00:24:26]: Claude Code is stopping the Russian drone effort here.
    Yaroslav [00:24:30]: Ukrainian as well. Yeah.
    Brandon [00:24:31]: Ukrainian. But I read somewhere that the Russians had grown more dependent on fiber optic drones relative to the Ukrainians, and that’s one reason why the Ukrainians have sort of regained the initiative in drones recently.
    Brandon [00:24:42]: How accurate’s that?
    Yaroslav [00:24:43]: The Russians were the first ones to scale that. I think as of now, Ukraine has caught up. I think, like, as of maybe three months ago, Ukraine has mostly caught up on fiber optic. Yeah.
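The fiber-optic figures quoted in this stretch of the conversation (a spool of about 800 grams, roughly one kilogram of fiber per 10 kilometers, and a price jump from about $4 to about $32 per kilometer) can be combined into a back-of-the-envelope estimate. This is only a sketch using the speaker's rounded numbers; the function names and the linear mass model are assumptions.

```python
SPOOL_MASS_KG = 0.8          # "about 800 grams" for the spool itself
FIBER_MASS_KG_PER_KM = 0.1   # "10 kilometer optic fiber is another kilo"


def fiber_link_mass_kg(range_km: float) -> float:
    """Approximate extra mass a drone carries for a fiber link of this length."""
    return SPOOL_MASS_KG + FIBER_MASS_KG_PER_KM * range_km


def fiber_cost_usd(range_km: float, price_per_km_usd: float) -> float:
    """Cost of the fiber for one flight at a given price per kilometer."""
    return range_km * price_per_km_usd


if __name__ == "__main__":
    for km in (10, 20, 30):
        mass = fiber_link_mass_kg(km)
        old, new = fiber_cost_usd(km, 4.0), fiber_cost_usd(km, 32.0)
        print(f"{km} km: ~{mass:.1f} kg of link hardware, fiber ${old:.0f} -> ${new:.0f}")
```

For a 20 km link this gives about 2.8 kg, in line with the "extra three kilos" mentioned earlier, and the fiber alone goes from about $80 to about $640 per flight, an eightfold increase.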
    Brandon [00:24:57]: In terms of FPV drone damage, what percent would you say is now fiber optic versus, like, autonomous?
    FPVs as the New God of War: Tanks, Artillery, and Cost per Kill
    Yaroslav [00:25:07]: I actually cannot answer that question. I know the answer, but I would not disclose that. But for our audience, I think another interesting fact is that out of all the casualties on the front line, between 70 and 80% are caused by FPV drones.
    Brandon [00:25:30]: FPV drones are the new universal weapon of warfare.
    Yaroslav [00:25:34]: It’s
    Brandon [00:25:35]: Land warfare, anyway
    Yaroslav [00:25:35]: They used to say that artillery is the god of war, because artillery used to cause, like, 80% of casualties, and now, on that ranking-
    Brandon [00:25:46]: FPV
    Yaroslav [00:25:47]: FPV drones rule.
    Brandon [00:25:48]: FPV drones are the god of war.
    Yaroslav [00:25:51]: Sort of. Dethroned artillery. But it’s not to say that artillery is not useful, is not needed. Like, all of these systems are needed. Maybe except cavalry, although Russians still use it. I know, have you seen the videos of Russians using mules and horses?
    Brandon [00:26:09]: What is the usefulness-
    Yaroslav [00:26:10]: It’s-
    Brandon [00:26:10]: Of a tank in the modern-
    Yaroslav [00:26:11]: That’s where we need Greenpeace to say a word, but they’re silent. Yeah.
    Brandon [00:26:15]: What’s the use of a tank on the modern battlefield?
    Yaroslav [00:26:21]: It’s diminishing.
    Brandon [00:26:22]: Diminishing.
    Yaroslav [00:26:22]: However, I think there might be technologies which will revive the tank. Look, a tank still provides you armor, and armor is important. You still need armor and firepower, right? Or it can be an armored personnel carrier that provides you armor. The challenge that currently exists is that armor is not very well protected against incoming drones. However, there are ways to protect it. We were talking about this before the podcast. The CEO of Rheinmetall recently sort of ridiculed the Ukrainian drone industry, saying that there is nothing interesting there, no real innovation, nothing that stands up compared to, like, Rheinmetall or Boeing, and it’s all made by housewives. There was obviously a ton of memes about this, people ridiculing the CEO of Rheinmetall. And one of the best quotes I heard on this topic is from my friend, Alexey Babenko, who’s the head and founder of Vyriy Drone, which is one of the largest manufacturers of FPV drones. They’re our partner. They’re using our autonomy. So he said that the drones we manufacture in one day will be more than enough to destroy all the tanks Rheinmetall manufactures in a year.
    Yaroslav [00:27:52]: Then, yeah, cost-wise, of course, a drone is like $500 and a Rheinmetall tank is, what, probably 5 million-ish or maybe more.
    Brandon [00:28:00]: Don’t mess with those housewives.
    Yaroslav [00:28:03]: Drone wives.
    Brandon [00:28:04]: Drone wives.
    Yaroslav [00:28:06]: That’s it.
    Noah [00:28:06]: There’s a classic saying that everyone always fights the last war.
    Noah [00:28:12]: So from your standpoint, how did we get to the point where tanks became irrelevant, at least for now, in a matter of just a few years?
    Yaroslav [00:28:24]: Look, I think it’s the same way. How did we get to the point that calculators became irrelevant?
    Yaroslav [00:28:31]: Now we have iPhones. Why would you need a calculator? Technology progresses and its influence grows non-linearly. It’s all exponential. So I can tell you that full autonomy, when you put it on a drone... Look, if you think about a tank and a drone, it’s not a direct comparison, but even a drone and an artillery shell, in terms of cost per kill: an artillery shell of 155 mm caliber, which is a standard NATO caliber, has a current market price of about $4,000 per piece. Compare that to, say, $400 per drone. The shell is 10 times more expensive. Account for the amortization of the artillery gun, for how vulnerable it is, and for the tactical capabilities it gives you as compared to a drone. You’ll figure out that an FPV drone is maybe three orders of magnitude more versatile, more useful, more capable than classic artillery. There are different types of artillery, not just the 155. You have mortars, you have all that. But give or take, roughly three orders of magnitude, maybe. Again, it doesn’t have that firepower. It’s still not a one-to-one comparison.
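The price side of this comparison is simple arithmetic, sketched below with the rounded figures quoted in the conversation (about $4,000 per 155 mm shell, about $400 to $500 per FPV drone, about $5 million per tank). The names are illustrative, and the script covers purchase price only; it says nothing about the "three orders of magnitude" utility estimate, which also weighs gun amortization, vulnerability, and tactical flexibility.

```python
import math

# Rounded unit prices as quoted in the conversation (USD).
PRICES_USD = {
    "fpv_drone": 400.0,
    "155mm_shell": 4_000.0,
    "tank": 5_000_000.0,
}


def price_ratio(expensive: str, cheap: str) -> float:
    """How many units of `cheap` one unit of `expensive` would buy."""
    return PRICES_USD[expensive] / PRICES_USD[cheap]


if __name__ == "__main__":
    for item in ("155mm_shell", "tank"):
        ratio = price_ratio(item, "fpv_drone")
        print(f"one {item} ~ {ratio:,.0f} drones "
              f"({math.log10(ratio):.1f} orders of magnitude)")
```

On purchase price alone, a shell is one order of magnitude above a drone and a tank about four, so the larger utility gap claimed here has to come from the non-price factors.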
    Yaroslav [00:29:53]: Now, take that FPV drone. When you put full autonomy on that FPV drone, which can be not very expensive, like, the systems that we’re producing are in the hundreds of dollars of pure BOM-
    Full Autonomy: From Human Pilots to Smartphone-Directed Drone Missions
    Noah [00:30:06]: Just to interrupt: you said full autonomy. Just a second ago you were saying that the autonomy here is guidance, right? It’s not decision-making.
    Yaroslav [00:30:14]: No, I was saying that’s the first and sort of easiest piece of autonomy that was fielded by us. But if you add full autonomy to a drone-
    Brandon [00:30:24]: I think he’s asking, for the listeners, can you explain what the term full autonomy means?
    Yaroslav [00:30:29]: Basically, I think a good way to think about an FPV drone is as an iPhone of warfare. It’s very inexpensive, very mass-producible, very versatile. You don’t need a bunch of other things when you have an iPhone in your pocket. You don’t need an MP3 player, you don’t need a calculator, you don’t need other things. All right? So an FPV drone is an iPhone. Or, okay, Apple, please don’t sue me, it’s a smartphone. And then, when you add autonomy, it sort of becomes like Uber or ride sharing. Okay? So what it means is, instead of being a trained pilot who has this complex remote controller device, which requires a couple of months of training to actually pilot the drone, and then having to pilot it for 30 minutes, flying towards the target, et cetera, et cetera, now you basically have your smartphone and you have a drone. You pick up your smartphone, you say, “We are here. The bad guys are here. Go and get them.” And the drone goes up, flies in a given direction, localizes itself on the map, finds the designated area where the bad guys are supposed to be, sees the bad guys, bombs them, watches, so it does a damage assessment, returns back, and sits down, and then you can pick it up and watch the video if you didn’t have the radio link, right?
    Noah [00:31:59]: That’s a bomber drone.
    Yaroslav [00:32:00]: That’s full autonomy for a bomber drone, right?
    Noah [00:32:03]: You’re saying that no human decision is made in this entire process?
    Brandon [00:32:06]: That’s not, that’s not what he’s saying.
    Yaroslav [00:32:07]: A human decision was made at the beginning of the process-
    Noah [00:32:09]: I get it. I get it
    Yaroslav [00:32:09]: The same way as you would fire artillery.
    Yaroslav [00:32:12]: When you fire an artillery shell, you don’t stop it 500 meters away from the target and ask whether you want to strike or not. Exactly: a human decision is always made at some point. So that’s full autonomy, and such full autonomy is happening as we speak. And it increases the capabilities of an FPV drone, which is already, like, three orders of magnitude more powerful than an artillery shell. Full autonomy increases its capabilities by four orders of magnitude, because now you can have 100 times as many people who can use it, because you don’t need to train those people, and this is important. You can have 10 times the mission success rate, and you can have 10 times the utility per drone, because now, instead of being a one-way kamikaze, it can be a bomber.
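The "four orders of magnitude" figure is the product of the three multipliers listed in that answer. A minimal sketch of the arithmetic, assuming the three factors are independent and simply multiply (the variable names are illustrative, and the numbers are the speaker's claims rather than measured data):

```python
import math

# Multipliers as stated in the conversation (claims, not measured data).
operator_pool_multiplier = 100     # "100 times as many people who can use it"
mission_success_multiplier = 10    # "10 times mission success rate"
utility_per_drone_multiplier = 10  # reusable bomber instead of one-way kamikaze

# Assumption: the three effects are independent, so they multiply.
combined = (
    operator_pool_multiplier
    * mission_success_multiplier
    * utility_per_drone_multiplier
)
orders_of_magnitude = math.log10(combined)

print(combined, orders_of_magnitude)
```

If the factors overlap in practice (for example, if untrained operators achieve a lower success rate), the combined figure would be smaller than this product.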
    Brandon [00:33:05]: Now wait, you said 10 times mission success rate, which means that fully autonomous bomber drones succeed in their missions 10 times more often than human-piloted bomber drones do. That’s an important thing to know.
    Noah [00:33:17]: Maybe, to push back on
    Brandon [00:33:19]: They’re superhuman. They’re 10X superhuman.
    Yaroslav [00:33:22]: They’re not vulnerable to electronic warfare. They don’t care about the radio horizon. They don’t lose track during navigation. They are not susceptible to human error, like when an artillery shell or another drone blows up beside you and you’re like, “Hell no, I’m getting out of here.” Right? That doesn’t happen to an autonomous drone. All of those things. Like, one of the brigades that’s using our drones with just first-level autonomy, they literally said that their success rates-
    Brandon [00:33:53]: What’s first level autonomy?
    Yaroslav [00:33:54]: First level autonomy is just the terminal guidance.
    Yaroslav [00:33:57]: By the way, we have video of that. We can watch that.
    Brandon [00:33:59]: Terminal guidance means a human gets it nearby and then the AI takes over.
    Yaroslav [00:34:03]: The human flies it all the way, like 30 kilometers, towards the target, and obviously the target was probably given to that human by someone who’s flying some ISR drone, some reconnaissance drone, right? So all the way to the target, and once you see the target from a distance of 500 meters, you do target lock, and from there the drone flies autonomously. So just that feature alone, for one guy, his call sign is Grom, has increased his mission success rate from 20% to 71%, and it also increased his kill zone from three kilometers to 10 kilometers. That means there’s a certain area around the front line which is the designated kill zone. Whenever the enemy goes into that area, it’s almost guaranteed to be destroyed by a drone. And then obviously the drones are not launched from, like, the zero line. They’re usually launched from, like, minus 10 kilometers-
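To put the quoted numbers side by side, here is a small worked calculation. The 20%, 71%, 3 km and 10 km figures come from the conversation; the "drones per successful mission" line rests on an added assumption that each sortie is an independent attempt with the same success probability, which the source does not state.

```python
# Figures quoted in the conversation for one operator (call sign Grom).
success_before, success_after = 0.20, 0.71
kill_zone_before_km, kill_zone_after_km = 3.0, 10.0

# Relative improvement on each parameter.
success_ratio = success_after / success_before
kill_zone_ratio = kill_zone_after_km / kill_zone_before_km

# Assumption (not from the source): sorties are independent attempts,
# so the expected number of drones per successful mission is 1 / p.
drones_per_success_before = 1 / success_before
drones_per_success_after = 1 / success_after

print(f"success rate: x{success_ratio:.2f}")
print(f"kill zone depth: x{kill_zone_ratio:.2f}")
print(f"drones per success: {drones_per_success_before:.1f} -> {drones_per_success_after:.2f}")
```

Under that assumption, terminal guidance alone cuts the expected drone expenditure per successful mission from five to fewer than two.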
    Mission Success, Failure Modes, and the Five Levels of Autonomy
    Brandon [00:35:03]: What is a zero line?
    Yaroslav [00:35:05]: Zero line is sort of an imaginary line of control between two conflicting forces.
    Brandon [00:35:14]: It’s important to explain these things to a lot of the listeners who aren’t-
    Yaroslav [00:35:17]: Thank you for asking.
    Brandon [00:35:18]: -familiar with warfare.
    Noah [00:35:20]: Myself.
    Noah [00:35:20]: I’m one of those listeners.
    Brandon [00:35:20]: You said that level one autonomy, in other words just terminal guidance, just, like, human gets it to the finish line and then it goes over the finish line, increases mission success from 20 something percent to 71%, or something like that.
    Yaroslav [00:35:33]: Increases the kill zone
    Brandon [00:35:34]: Increases the kill zone
    Yaroslav [00:35:34]: Three kilometers to 10 kilometers.
    Brandon [00:35:36]: Got it.
    Yaroslav [00:35:36]: On both parameters-
    Brandon [00:35:37]: What is full autonomy, dude? And
    Noah [00:35:38]: Actually, real quick, can we define mission success, and maybe, in a way, what the failure modes of missions are?
    Brandon [00:35:44]: I have a guess what mission success is.
    Noah [00:35:46]: But I could
    Brandon [00:35:47]: Get ‘em.
    Yaroslav [00:35:49]: No, but that’s a very good question, in fact, because even if you fly into the target, well, first, the target can be damaged or destroyed. Those are two different modes. Then there can be different targets. A sole infantryman is one kind of target. A dugout where there are supposedly some enemies is another kind of target, and some mechanical equipment is another type of target. Then there is radio-emitting equipment: often, the target that the military wants to get more than anything else is some enemy radio tower or some small radio dish that really makes life difficult in that combat area. So those are different targets, right? It can be destroyed, it can be damaged. Then sometimes the drone hits but doesn’t explode. That happens. And then there are other failure modes. You didn’t even reach the target because, A, you were jammed by electronic warfare; B, you lost control over the drone because of the radio horizon; C, you were jammed by a different type of electronic warfare that happens way before you hit the target area and impacts your video receiver. So jamming on video and jamming on control are two different types of jamming. Then something malfunctioned on the drone, just a mechanical malfunction, maybe a motor broke or whatever. So all of those are different failure modes. Or maybe you got lost while navigating to your target. That happens, too.
    Noah [00:37:41]: So with level one autonomy, basically you manage to point it in a direction.
    Noah [00:37:49]: You go there, and then for the last mile the drone takes over.
    Yaroslav [00:37:52]: I defined this, but it sort of got picked up by the industry. We define five levels of autonomy. So level one is terminal guidance. It’s what we just discussed. Level two is bombing. Level three is autonomous target detection and engagement decision. Level four is autonomous navigation. And level five is autonomous takeoff and landing.
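The five levels just listed can be written down as a small enumeration. This is only an illustrative sketch: the names are paraphrased from the conversation, and the idea that each level includes everything below it is an assumption suggested by the phrase "each level of autonomy that we're adding" later in the episode, not something the speaker states outright.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    """The five levels as listed in the conversation (names paraphrased)."""
    TERMINAL_GUIDANCE = 1
    BOMBING = 2
    TARGET_DETECTION_AND_ENGAGEMENT = 3
    NAVIGATION = 4
    TAKEOFF_AND_LANDING = 5


def capabilities(level: AutonomyLevel) -> list[str]:
    """Assumption: a level includes the capability of every lower level."""
    return [lvl.name for lvl in AutonomyLevel if lvl <= level]


print(capabilities(AutonomyLevel.BOMBING))
```

An ordered integer enum is used so that levels can be compared directly, for example to check whether a given airframe meets a minimum level for a mission.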
    Noah [00:38:15]: Those are good things to know
    Yaroslav [00:38:16]: Those are five levels of autonomy. Now, if you
    Noah [00:38:19]: I have a question for you.
    Yaroslav [00:38:19]: Sorry. Like, let me finish with-
    Noah [00:38:21]: Sorry.
    Yaroslav [00:38:21]: -the theoretical part.
    Noah [00:38:23]: What is Tesla running at right now?
    Yaroslav [00:38:25]: Tesla?
    Noah [00:38:25]: No, sorry.
    Yaroslav [00:38:26]: That’s a very good point. Like, exactly, it was inspired by the levels of self-driving autonomy.
    Noah [00:38:32]: Waymo’s level five, right?
    Noah [00:38:35]: You just tell it where you want to go, it picks you up, and then you go there.
    Yaroslav [00:38:36]: I think, if you look at the classic definitions for self-driving cars, Waymo is still, like, level four, because it still requires human control, even if remote. If a Waymo gets in trouble, there is an operator who takes over and resolves this. So that would still be a level four. It doesn’t map directly, but it’s also five levels.
    Brandon [00:38:58]: Can I interject a question here? In terms of an FPV drone that’s like a suicide drone that’ll just blow itself up killing something, how do you know what it hit? Like, does it just transmit back, or do you sort of lose track of it and hope it hit? Like, what happens with that?
    Yaroslav [00:39:16]: That’s a great question. So
    Brandon [00:39:18]: You need another drone
    Yaroslav [00:39:19]: Like, the current battlefield in Ukraine is saturated with different types of drones. So obviously you have all the FPV drones, and last year alone Ukraine manufactured about 4 million of these, and Russia maybe, like, 20% less than that. And for this year, the publicly voiced target was 7 million on the Ukrainian side. So we’re getting into serious numbers here. And then besides those, there are different reconnaissance drones, ISR as we call them. There is sort of tactical-level ISR, where both Ukrainians and Russians usually use the Mavic, a drone by DJI. And then there are a bunch of locally produced drones, which are sort of fixed-wing drones that can stay in the air for much longer than a Mavic, maybe, like, half an hour. And then there are drones that can stay up for many hours or even up to a day. Those drones are more expensive, have more expensive cameras, et cetera. We hunt those drones that the Russians launch. The Russians hunt our drones, and so on. But ideally, when you are a group of soldiers operating an FPV, you’ll have someone in your company or someone in your platoon who has an ISR asset that will do target designation for you. They’ll say, “Oh, there’s a Russian vehicle over there. Go and get him.” And you go there, you get it, and they’re like, “Okay, confirmed.”
    Battlefield Surveillance and the Eight Dimensions of Autonomy
    Brandon [00:40:57]: Those guys are watching. They have their own drones in the sky.
    Yaroslav [00:40:59]: Target destroyed. They have, like, a carousel of drones, because one Mavic cannot stay up more than 30 minutes. It-
    Brandon [00:41:06]: They’re constantly surveilling the battlefield.
    Yaroslav [00:41:07]: Almost every spot on the battlefield.
    Yaroslav [00:41:11]: It’s not always the case. Sometimes you will not have a surveillance asset, so then you would launch another FPV just to confirm that there was a hit. Then if you see there was a hit and you’re not sure if it completely destroyed, you maybe hit again for good measure.
    Brandon [00:41:26]: You double tap.
    Yaroslav [00:41:28]: That’s how it works. But I was about to give you another piece of taxonomy. So you have five levels of autonomy, right? Then you have sort of eight dimensions of the autonomous battlefield. So what are the eight dimensions? It’s crucial to understand how autonomy evolves in a modern battlefield environment. So dimension number one is level of autonomy. What are the capabilities that your asset has? Dimension number two is the platform you’re operating on. So it can be a quadcopter or a fixed-wing drone, maybe a long-range drone or a short-range drone, but it can also be a missile. You can have autonomy even on an artillery shell or a ground vehicle or a sea vehicle. So all of those are different platforms. Level three would be domain. So it’s ground to ground, or ground to air as an intersection, or ground to sea, or sea to air. There are all the nuances with different domains. Then level four would be higher levels of autonomy, such as swarming, drone carriers, drone nests, et cetera.
    Brandon [00:42:39]: Now when you’re saying level, you’re talking about dimensions, not about-
    Yaroslav [00:42:42]: Sorry. Yeah
    Brandon [00:42:43]: Autonomy levels. So dimension four.
    Yaroslav [00:42:43]: The dimension. Yeah, I was supposed to say dimension. I say dimension because each of them works with the others, right? So you might have, like, a third-level-autonomy fixed-wing drone operating in land to air, and stuff like that, right? And then operating in a swarm or operating from a nest. Right? Then dimension number five is environment. So is it day or night? Is it summer or winter? Is it humid, cold, dry? What kind of target is it? Is your target hiding in a forest, or is it behind a hill or within buildings? So all of that is environment. Then dimension number six is command and control. How are you dealing with, like, tens of thousands of those assets around the battlefield? How are you coordinating that on the higher levels of command? How are you collecting data? All that.
    Yaroslav [00:43:44]: Dimension number seven would be infrastructure, so things like simulation, data collection tools, security, deployment mechanisms, et cetera. So all those systems have to be developed separately and integrate with all the others. And finally, dimension number eight is sort of distribution. Have you deployed 100 of these systems or 100,000 of these systems? Because those are two very different ballgames. So that now gives you a more broad overview of how autonomy propagates across the battle space.
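One way to keep the eight dimensions straight is to treat a deployed system as a single point in that space. The sketch below is a hypothetical record type, not anything from The Fourth Law: the field names and example values are illustrative, with the example loosely following the "third level autonomy, fixed wing drone operating in land to air ... in a swarm" case mentioned above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DeploymentProfile:
    """One point in the eight-dimensional space described in the conversation."""
    autonomy_level: int       # 1. level of autonomy (1 to 5)
    platform: str             # 2. quadcopter, fixed wing, missile, ground vehicle, ...
    domain: str               # 3. ground-to-ground, ground-to-air, sea-to-air, ...
    collective_mode: str      # 4. swarm, drone carrier, drone nest, or single
    environment: str          # 5. day/night, season, weather, target cover
    command_and_control: str  # 6. coordination across levels of command
    infrastructure: str       # 7. simulation, data tools, security, deployment
    units_deployed: int       # 8. distribution: 100 systems vs 100,000


# Hypothetical example values, chosen only to illustrate the structure.
example = DeploymentProfile(
    autonomy_level=3,
    platform="fixed wing",
    domain="ground-to-air",
    collective_mode="swarm",
    environment="night, winter, forest",
    command_and_control="hypothetical brigade-level coordination",
    infrastructure="hypothetical simulation and data pipeline",
    units_deployed=100,
)

print([f.name for f in fields(DeploymentProfile)])
```

Because the dimensions combine, the number of distinct configurations grows multiplicatively, which is one way to read the remark that 100 systems and 100,000 systems are "two very different ballgames."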
    Targeting, Human Responsibility, and Rules of Engagement
    Noah [00:44:23]: As someone who has done machine learning, and has gone out of distribution and had things go horribly wrong: several of these axes of thinking about drone warfare seem like they could be very susceptible to some sort of distribution shift if you start making things autonomous.
    Yaroslav [00:44:41]: Like what?
    Noah [00:44:41]: I mean Well, first of
    Yaroslav [00:44:43]: I’m very interested in the sort of scenarios that you’re thinking about.
    Noah [00:44:48]: Like, the most obvious one is, if I assume these are computer-vision-guided systems for at least the last mile, how do you ensure that, well, you now have some fog roll in or something, and the drones just attack the wrong thing? Or maybe, it probably will not turn around and fly back and attack you, but you-
    Yaroslav [00:45:10]: The same question: how do you ensure that your mortar fire hits the right thing? Well, with mortar fire, it could be plus or minus half a kilometer, give or take. So maybe you fire one, and then you fire another. So drones are actually much better at being precise in those scenarios. And to your point, I think five to 10 years from now it will be immoral to use weapons without AI.
    Yaroslav [00:45:44]: ’Cause weapons without AI will be more likely to cause collateral damage or unwanted damage. The same way, it will be immoral to drive your own car manually on a public road, because it’s more likely to cause unwanted damage.
    Noah [00:46:02]: Wow, I never considered that might
    Brandon [00:46:04]: Really? That’s definitely coming.
    Yaroslav [00:46:07]: Anyway.
    Brandon [00:46:07]: No, but that’s, I don’t know, it’s an obvious thought. I agree with you.
    Brandon [00:46:12]: No, obviously they’re not going to let you drive once most of the cars on the road are autonomous.
    Noah [00:46:17]: No, that one, don’t I believe.
    Yaroslav [00:46:19]: No, I think you were talking about drones, right?
    Brandon [00:46:21]: The drones, right. Cool.
    Yaroslav [00:46:22]: The weapons, right?
    Brandon [00:46:23]: Friendly fire and collateral damage and stuff like that is all minimized with AI.
    Brandon [00:46:27]: Here’s my question. Let’s go to level six autonomy. Let’s take all of the target selection. Let’s take all the battlefield data, integrate it into one big AI, and have that big AI basically be in command of the battlefield and agentically do target selection.
    Yaroslav [00:46:44]: Be the general, right?
    Brandon [00:46:44]: It’s a general. It’s, you’ve cut humans out of the loop except maybe as dexterous robots, repairing drones and fastening things to drones or maybe something like that because you don’t have those robots yet. How soon are we there? AI general.
    Yaroslav [00:46:58]: The most important thing to ask ourselves is who will be faster to that: us or our adversaries?
    Brandon [00:47:07]: I assume us, but how fast will we be to that? I hope us.
    Yaroslav [00:47:11]: I hope so too.
    Brandon [00:47:12]: How fast can we... Like, when are we looking at that, in terms of horizons, years?
    Yaroslav [00:47:18]: Technically, it could be done now. Of course, there’s some engineering work to be done. The bigger challenge is deployment. Right? So, okay, technically... Like the operation in Iran, right? Publicly, it was claimed that, I think, a Palantir system was used for target designation, et cetera. So it is not exactly as you say, that the AI makes all the decisions, but basically the AI goes through all the data you have, gives you these 1,027 different targets, and says, “To confirm, please press Okay.” And you look at the targets and you’re like, “Yeah, sounds right. Press Okay.” So I think that’s where we are now already, or we were a couple of weeks ago, as we’re recording this on April 10th. Another question is how massively deployable it is. Is every decision being made like that, or just some of the decisions? And then there are different levels of command and control: the platoon, the company level, the battalion, et cetera. But the tricky thing when we get into that territory is: if your enemy is getting the advantage of being a thousand times faster than you by deploying such systems, what do you do?
    Yaroslav [00:49:10]: You got to-
    Brandon [00:49:12]: If the enemy is a thousand times faster than you at deploying those systems?
    Yaroslav [00:49:16]: Like, if the enemy starts deploying level six autonomy, as you call it, and you have not started doing-
    Brandon [00:49:22]: You’re in trouble
    Yaroslav [00:49:23]: Yes, exactly. So you have to catch up. So my point is that it is very important to think about the safety of these systems, but that thinking should not slow you down in developing them, because they are critical for your existential survival, right? And the one person who doesn’t get to think about the ethics of the war is a dead person. That person surely doesn’t get to think about that.
    Brandon [00:49:52]: What would be the safety risk of such a system?
    Yaroslav [00:49:55]: Of course-
    Brandon [00:49:56]: Friendly fire?
    Yaroslav [00:49:56]: Just wrong decisions, right?
    Brandon [00:49:59]: I see.
    Yaroslav [00:49:59]: Maybe, these decisions-
    AI Command Decisions, Dead Zones, and Complex Battlefields
    Brandon [00:50:06]: Skynet AI decides it’s going to use
    Yaroslav [00:50:08]: No, these-
    Brandon [00:50:08]: Drone army to kill us
    Yaroslav [00:50:09]: Decisions will not only be made about drones. They are likely to be made about what the humans should do on your side as well. Then obviously some environments are more like the Ukrainian-Russian war, where you have-
    Brandon [00:50:26]: It will have to choose to risk lives. It will have to choose to sacrifice human lives-
    Yaroslav [00:50:28]: Of course
    Brandon [00:50:29]: On your side.
    Yaroslav [00:50:29]: Of course. And then some environments are just, like, dead zones, and there are no civilians there, or virtually no civilians close to the front line, because it’s super dangerous. Everyone has evacuated from there. But there are other environments which are more like, okay, there’s a counterterrorist operation. There’s a group of terrorists or a group of civilians. Or, like the recent operations in Iran, I imagine that the US and Israeli forces do not want to harm civilians. They only targeted the military targets there, right? So in those situations, it’s a different level of responsibility for that decision-making as well. And then there is just such a big variety of those military missions, and I’m not even well-informed or well-educated enough in military science to tell you about all those scenarios. We would need to put some general beside me, and maybe a Ukrainian general and an American general would tell you very different stories about these things.
    Brandon [00:51:34]: Got it. Can I ask a few more questions? All right. So in 2013, I wrote one of my first paid articles ever, which was about how the era of drones will change human society. I was just sitting around bored, thinking about things.
    Yaroslav [00:51:54]: You were way ahead of your time.
    Brandon [00:51:55]: I said, I said, “The following will happen.”
    Yaroslav [00:51:57]: It’s, this article is real. I’ve read it.
    Yaroslav [00:51:58]: It’s actually-
    Brandon [00:51:59]: I said small autonomous suicide drones will cleanse the battlefield of human infantry. Human infantry will not be able to stand against swarms of AI-powered suicide drones. I didn’t even know about, like, AlexNet at the time, I think.
    Yaroslav [00:52:19]: You’re just an avid sci-fi reader.
    Brandon [00:52:23]: I’m an avid sci-fi reader, but also, like, there will be a way to do that. It’s a nonlinear multidimensional search problem, and if you get enough compute, you’ll find some search algorithm that will get you there. And so-
    Brandon [00:52:38]: Yeah, I think that one sentence describes the bitter lesson right there.
    Brandon [00:52:41]: It’s just, like, a multidimensional search space. You search it somehow. I don’t know. Get a grad student-
    Yaroslav [00:52:47]: Sooner or later
    Brandon [00:52:47]: To make a search algorithm.
    Brandon [00:52:48]: It’s not that hard. Anyway, I guess the point is that human infantry on the battlefield will be gone at the end. I wrote that in 2013. Many people on social media laughed at me for that, called me hysterical, said things like, “Electronic warfare will knock all the drones out of the sky,” like, “You need humans to hold ground.” That’s something you still hear from a lot of people on social media today. I feel that this article that I’ve written has never been directionally wrong. It has gotten more and more right steadily over time, and we’re reading the battlefield reports from Ukraine, where human infantry are basically, like, a few guys hiding in dugouts for months, and I’m not sure what they’re doing.
    Yaroslav [00:53:35]: That’s on Ukraine’s side. On the Russian side, that’s just like a zerg rush.
    Brandon [00:53:38]: The zerg rush, and then they just die. Then, but they have some guys in dugouts too, right? Like hiding in dugouts for months.
    Yaroslav [00:53:45]: They have. Yeah.
    Brandon [00:53:45]: But, like, what are those guys doing in the dugouts? Are they providing, like, frontline reconnaissance? Like, what are they doing?
    Yaroslav [00:53:54]: If there is a guy in a dugout with some bullets and an automatic weapon, the other guy cannot come and take that dugout. That’s-
    Brandon [00:54:07]: I see.
    Yaroslav [00:54:08]: They’re establishing control over territory.
    Brandon [00:54:10]: I see. So there still is a use for human infantry on the battlefield as of today.
    Yaroslav [00:54:15]: Like
    Brandon [00:54:15]: How long will that last?
    Yaroslav [00:54:17]: I think it will last for a while. This is funny. There’s this whole layer of modern culture, modern Ukrainian culture, built around war-related stuff. So there is this punk rock band that is called SZC, I guess that would be in English, which is short for, like, a deserter or something like that. So anyhow, this band has a song titled “2030.” It’s basically about the year 2030, and the war still goes on as, like, the third world war or whatever. And they basically sing about AI and cyborgs and everything, but the simple infantry is still needed, and we’re still getting cold in those dugouts, and we’re still doing our job. That’s sort of the theme of the song. And it seems like that’s actually what’s going to happen. There are-
    Ground Robots, Simulation, and the Limits of World Models
    Brandon [00:55:30]: Ground robots will not replace humans in the dugouts soon.
    Yaroslav [00:55:34]: I’m very much interested in following the whole humanoid robot theme and
    Brandon [00:55:39]: What about like a dog robot?
    Noah [00:55:41]: Or just mobile controlled platforms or something.
    Brandon [00:55:44]: Spider robot, yeah.
    Brandon [00:55:45]: Everything evolves into a crab.
    Brandon [00:55:46]: You build a crab robot.
    Yaroslav [00:55:47]: A humanoid-
    Noah [00:55:48]: The carcinization of warfare.
    Yaroslav [00:55:51]: There is a lot of utility in humanoid robots because the world is designed around humanoids. So I would not 100% rule out the possibility that sometime 10 years in the future, humanoid robots will actually be fighting. So that’s an actual Terminator kind of scenario.
    Brandon [00:56:14]: Yeah, in the first Terminator movie, you look at what they’ve got on the battlefield, they’ve got flying bomber drones and humanoid robots.
    Yaroslav [00:56:20]: Look, the cost of running large language models is getting so low, you can basically have an inexpensive computer running what was a state-of-the-art model a year and a half ago, running it locally on a device with an open source model, which also means that the Chinese can have it, the Russians can have it, the North Koreans can have it, et cetera. So that is already possible. And if not for the acceleration of the large language models, I would’ve said that I don’t think humanoid robots will be useful on the battlefield earlier than 10 years from now. But if you account for the exponential, it might be five years or so. The problem with all autonomous systems, and it starts with self-driving cars and goes on to modern-day AI agents, is that to make them really useful, you have to solve such a long tail of edge cases that it’s really difficult. Like, we were promised self-driving cars, what, like 2007, Sebastian Thrun and Google, and even before that, all the challenges, everything. And Elon of course told us it’s going to be one year from 2014, and now we still don’t have self-driving Teslas everywhere. We have Waymos in SF and some other places, but they’re still, like, not perfect. So I expect something similar from self-flying drones and fully autonomous drones, and we saw that firsthand: with each level of autonomy that we’re adding, there is a very wide distance between a prototype and something that is ready to be scaled to millions of units, and then something that has been scaled to millions of units. But the race with, like, AI coding tools is just insane. So things might accelerate very fast, faster than we can imagine.
    Noah [00:58:46]: I think your point is that, due to this long-tail behavior, level one autonomy as you’ve defined it is actually very natural. Like, you basically are just solving an image recognition and tracking problem.
    Yaroslav [00:59:02]: It’s actually interesting that you say it that way, and I thought about this the very same way, and we have this joke that there are like 200 companies in Ukraine which are trying to solve last-mile targeting or terminal guidance. It seems like we’re the only company that actually solved that, because even that problem-
    Noah [00:59:22]: I’m not saying it’s, I’m not saying it’s trivial, but it’s at least something that you imagine given our current state.
    Yaroslav [00:59:26]: Like us and Eric Schmidt, like Eric Schmidt’s companies are pretty good.
    Yaroslav [00:59:29]: Like, I actually have lots of respect for what they’re doing, and they have been practically influential and helpful on the battlefield, and they have good engineering.
    Noah [00:59:38]: I wasn’t saying it’s trivial. I’m just saying this is something naturally adaptive, based upon things that we know work well. But some of the other domains, where you do have to make decisions and you have a long tail, become much harder, and you worry about edge cases more.
    Yaroslav [00:59:57]: The more complex the behavior you’re trying to simulate, the more edge cases there are, right? The more ways there are to do it wrong. And then there are different approaches. If you read academic papers about robotics, the robot is represented as something that has sensor input, then three levels of logic or decision-making, which are perception, planning, and control, and then actuators as output. So pre-neural nets, you would do perception, planning, and control all with classic logic, right? Then, with AlexNet and computer vision, you could do perception with neural nets and the rest with logic. You can currently do each of those separately with neural nets, or each of those separately with logic, or you can just have one huge neural net that takes lots of sensory data (it’s not just pixels; it could be sound, could be accelerometer, could be everything) as input and just outputs the controls. And some of the self-driving car companies are doing that, or experimenting with different ways of doing that. So the way you implement those features also influences how many degrees of freedom the system would have, right? Like control: you can do classical algorithmic control with Kalman filters and PID controllers, et cetera, or you can do a neural net that was trained in a gym with reinforcement learning, et cetera. And those would be two different behaviors of a system.
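The sensor, perception, planning, control, actuator chain described above can be made concrete with a toy example of the "classical" variant. This is a generic textbook sketch, not the speaker's system: it holds a one-dimensional altitude with a PID controller, the gains and the plant model (control output treated as a commanded vertical velocity) are illustrative assumptions, and the perception step is a placeholder where a filter or a neural net would sit.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook PID controller; gains are illustrative, not tuned values."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def perceive(sensor_reading: float) -> float:
    """Perception: raw sensor input to state estimate (identity placeholder)."""
    return sensor_reading


def plan(target: float, estimate: float) -> float:
    """Planning: decide what to correct; here simply the altitude error."""
    return target - estimate


def simulate(target: float = 10.0, steps: int = 2000, dt: float = 0.01) -> float:
    """Toy 1-D plant: the control output is treated as a commanded velocity."""
    altitude = 0.0
    controller = PID(kp=2.0, ki=1.0, kd=0.05)
    for _ in range(steps):
        estimate = perceive(altitude)         # perception
        error = plan(target, estimate)        # planning
        command = controller.step(error, dt)  # control
        altitude += command * dt              # actuator acting on the plant
    return altitude


print(round(simulate(), 2))
```

In the end-to-end alternative mentioned in the conversation, `perceive`, `plan` and the controller would collapse into a single learned function from sensor data to commands, which is why the two approaches can behave differently even on the same airframe.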
    Noah [01:01:53]: Maybe my point was just much more high level. It’s-
    Yaroslav [01:01:56]: Or, if you go even more high level, you can train, like, whatever Fei-Fei Li and the folks who are doing, like, physical, sort of-
    Brandon [01:02:08]: World models
    Yaroslav [01:02:08]: World models, right, physical intelligence. They’re trying to make these big models that sort of understand the world, and then supposedly you have such a model and you can tell a drone, “Okay, go over that hill and find the bad guys and then get them,” or “Make me a video, make me a photo of the guy smiling, and get back to me.” Right? That’s one way. Another way, you have these subsystems: one is navigation, another is finding the person, another is getting to them to take a photo. And those are, again, very different behaviors. And it’s not that one is necessarily better than the other, and we might have more technological ability to do one or the other. But all of those systems will exist. And then again, you should always keep in mind that it’s not only the good guys that are developing these systems; the bad guys are developing these systems as well.
    China’s Drone Supply Chain and the West’s Manufacturing Gap
    Noah [01:03:00]: I guess where I’m going with this is back to Noah’s original thought about the end of the soldier. And so in order to replace-
    Brandon [01:03:10]: Or at least the end of the rifleman.
    Noah [01:03:11]: Or the end of the rifleman, yeah.
    Yaroslav [01:03:13]: I’m not seeing that as very close. As much as I’m a lover of sci-fi and all of that, and a technologist, the more I try to be-
    Yaroslav [01:03:27]: I try to have a certain humility about these things. In the military domain, there has been just so much human history and blood and tears dedicated to understanding this art of war and perfecting it and so on. There is so much knowledge in there that I don’t feel like I’ve even started to comprehend a lot of it. But one thing that I really understood is that even though drones are now causing eighty percent of the casualties, you go to the actual officers, you talk to the actual brigade commanders, corps commanders, and they explain to you how all of it fits together: how, when you’re thinking about an operation that involves a couple thousand people to get this piece of land out of the enemy’s hands, to de-occupy it, it is so complex. It involves dozens of different types of drones, and then land operations and reconnaissance operations, psychological operations, and then aviation and tanks and logistics and all kinds of different assets. So modern warfare is really very complex, and the fact that drones are the latest, coolest thing, and then AI is the latest, coolest thing, doesn’t mean that now it’s that and only that, right? So yeah. Whoever’s looking into that, I think, should realize that it’s not just what the press talks about; the reality is much more difficult, much more complex.
    Brandon [01:05:17]: Let’s talk about China and China’s manufacturing capabilities. So suppose that someone, like suppose the United States went to war with China. And
    Yaroslav [01:05:26]: I hope not.
    Brandon [01:05:27]: I hope not as well. But suppose that drones were very essential to that war, all the types of drones that we’re talking about here, and suppose that China said, “All right, well, you need X and Y and Z to make those drones to fight us, and we control the production of X and Y and Z, so we’re just going to cut you right off, and now you have no drones.”
    Brandon [01:05:47]: I know that a number of countries, including Ukraine and Taiwan, have been making moves to China-proof their drone production so that China couldn’t do that. Examples of things they might be able to cut off might include rare earths, the fiber optic cable that you were talking about before, and various other things where, even if they don’t control one hundred percent of the production, they control enough of the production that it would be extremely expensive to produce it without relying on Chinese sources. Or the market’s fragmented enough, et cetera. What do you see as China’s key bottlenecks, and how easy are those to overcome in terms of China-proofing drone production in case of a war against China?
    Yaroslav [01:06:30]: Let me start by saying that although China does not sell directly to Ukraine, and it does sell directly to Russia, a lot of Ukrainian supply chains start in China, right?
    Yaroslav [01:06:49]: We’re not in a conflict with China, and we would not want to be in a conflict with China. And we’d hope that China stays a neutral power between Ukraine and Russia and the US as well. That said, the scenario that you’re describing, everything is much worse.
    Yaroslav [01:07:11]: Think about this. Last year, Ukraine produced four million FPV drones. Ukraine is not the most industrious nation in the world.
    Yaroslav [01:07:19]: China can produce four billion of these FPV drones.
    Yaroslav [01:07:23]: China can make them not drones with propellers, but fixed-wing drones, which go not forty kilometers far, but maybe two to three hundred kilometers inland. Slightly more expensive.
    Brandon [01:07:34]: With internal combustion
    Yaroslav [01:07:36]: No. With
    Brandon [01:07:36]: Battery-powered fixed-wing drones.
    Yaroslav [01:07:38]: Battery, yeah.
    Brandon [01:07:39]: What’s the propulsion system on those propellers?
    Brandon [01:07:43]: I don’t-- I just don’t know how that works.
    Yaroslav [01:07:44]: You have that. They can also make them all fully autonomous. They have DJI, the world’s most advanced drone company. They can make them fully autonomous without GPS, without anything. Then they can put those drones on maybe tens of thousands of fully autonomous submarines, or maybe not even that, just on shipping containers and barges that ship goods, or freight ships. And then they show up with millions of drones packed onto those sea vessels. They show up at any coastline in the world, be it Taiwan or be it California, and they have millions of long-range impactors targeted at a piece of land.
    Yaroslav [01:08:38]: What do you do with that? There are not enough hunter submarines. There are not enough anti
    Brandon [01:08:46]: Ship missiles.
    Yaroslav [01:08:47]: Anti-ship missiles, anti-ship planes. They can produce these assets in tens of thousands of factories, because they’re so simple to produce that even if the FBI director picks up the phone, calls the President of the United States, and says, “Hey, the scenario Yaroslav was warning us about is beginning to unfold. We need to do a preemptive strike,” you wouldn’t have enough assets to do preemptive strikes, because there can be tens of thousands of places where these things are being manufactured. And so to counteract a scenario like that, we would need to have a similar amount of mass
    Brandon [01:09:39]: You mean a similar number of drones.
    Yaroslav [01:09:41]: Yes, to intercept that, either at sea or in the air, et cetera, at a similar cost, right? So the economics should work out. I’ll tell you that currently, we in the West and we in the United States, we don’t have the technology to do that. We don’t
    Four Layers Behind China: Technology, Manufacturing, Components, and Rare Earths
    Brandon [01:10:01]: What technologies, key technologies do we lack?
    Yaroslav [01:10:03]: Like autonomy, mass drone manufacturing, stuff like that.
    Brandon [01:10:06]: We lack autonomy technology?
    Yaroslav [01:10:09]: I think so.
    Brandon [01:10:10]: Because our computer vision algorithms are not as good?
    Yaroslav [01:10:12]: It’s not only about the computer vision algorithms. It’s like, if a group of companies founded by Eric Schmidt two, three years ago and my small startup, maybe not that small, but also founded three years ago, are two of the leading companies in the world, and there are maybe a couple of others who are capable of something like that, but not really on small drones, then I do think we’re behind China in technology. So we lack technology, we lack mass manufacturing capacity, we lack the components, and we lack the rare earth materials. So there are four layers in which we’re behind in this challenge. And that’s why it is my point that we in the West, and especially in the United States, there should be far more smart people working in defense, and there should be more funding, if we want to keep the semblance of our good past life.
    Brandon [01:11:14]: That’s really important. Would you say that right now, as things stand, in conventional terms, abstracting from strategic nuclear weapons, would you say that China is now the supreme conventional military power on Earth, given its ability to manufacture and deploy drones in the quantity and quality that you just described?
    Yaroslav [01:11:35]: Look, I don’t, I don’t think we have all the information to claim that but
    Yaroslav [01:11:41]: We cannot count it out, and that alone should be a big warning sign. We have not seen Chinese drones in action. We’ve seen some of the Iranian drones in action, and Russian drones in action. Not Chinese, really. We’ve not seen Chinese forces in action. Obviously, hopefully, this never happens, but in a conflict on the scale of US and China, there are many sort of classical assets that we should not discount. As we just discussed, we should not discount artillery in the land war, and we should not discount aircraft carrier groups and the air force, and long-range missiles and electronic warfare and satellites, et cetera. But then there are also things that we, at least we as the general public, don’t really know about China. I’m sure there’s a lot of information that US intelligence has about the Chinese capabilities. I think if you get back to the scenario that I just described, and if you take that sort of to the maximum, you basically see that whoever has the bigger manufacturing capacity, that side wins.
    Brandon [01:13:03]: That’s just a typical law of conventional warfare. Has been forever.
    Yaroslav [01:13:07]: Sort of.
    Noah [01:13:07]: Do you read Noah’s blog?
    Yaroslav [01:13:09]: Not as often as I would like. But I read Noah’s X.
    Brandon [01:13:15]: It’s not necessary.
    Noah [01:13:15]: It’s a theme where
    Brandon [01:13:16]: Don’t read my X.
    Brandon [01:13:19]: It’s just for
    Noah [01:13:19]: He doesn’t, he has no opinion about certain things. Yeah
    Brandon [01:13:22]: It’s just jokes.
    Yaroslav [01:13:22]: No opinion. Okay.
    Brandon [01:13:22]: Okay, so I guess there’s two questions here. The first question is, could the United States and other countries allied with the United States even develop supply chains that are independent of China to make any of these drones? And the second question is, could they do it in sufficient mass? And so I think the answer to the question of can they do it in sufficient mass is, today, no. But in an extended, prolonged war situation, things change a lot. All the development restrictions that we put on new factories go out the window, and there’s a sense of urgency. Ukraine obviously wasn’t making all these drones before the war.
    Yaroslav [01:14:04]: Of course.
    Brandon [01:14:04]: So if America had the same kind of urgency that Ukraine has now, things would happen. Things would move, and of course, America has allies too, or had allies until recently, and may have them again in the future. But America has or had allies that would also scale up very quickly, like Japan and European countries, if we ever ally with them again, et cetera. And so a lot of things could then change in terms of the actual mass. So in terms of looking at China and saying they have all these factories today, and looking at the history of conventional warfare, America had very little defense production capability on the eve of World War II, and ended up easily outproducing everyone else, even the Soviet Union.
    Yaroslav [01:14:47]: Maybe not easily. Yeah.
    Brandon [01:14:49]: Not easily, but by a long shot.
    Yaroslav [01:14:51]: Also the added benefit of not being attacked.
    Brandon [01:14:54]: That’s right. That’s right.
    Yaroslav [01:14:54]: That helps.
    Brandon [01:14:55]: Who knows how secure they are now, or where cyber influence
    Yaroslav [01:15:03]: No, look, I totally agree with your sentiment. I’m even less doomerish than you are. Or, as it seems to me, you’re a little bit doomerish, but, like, in the long term, you’re bullish.
    Choke Points, Europe’s Wake-Up Call, and Defense Industrial Policy
    Brandon [01:15:17]: I’m not, I’m not doomerish. I’m thinking about what we need to do.
    Brandon [01:15:21]: I’m not, I’m not thinking like, “Oh, we’re doomed.” That’s not my point. It’s never useful saying that. If you’re doomed, then just don’t go on podcasts.
    Brandon [01:15:28]: Go pet a rabbit and play a video game or something. Anyway, no, we’re not doomed, but I’m saying, step one: besides rare earths, which we already know about, what are the other key choke points that the West needs to free itself from in order to manufacture even one drone free of Chinese supply chains?
    Yaroslav [01:15:54]: There are companies here who are doing that. Like, we have good friends, a company called Neros. I know they’re down in El Segundo or whatever, somewhere in Southern California.
    Brandon [01:16:05]: What are the most pressing choke points besides rare earths that everyone talks about?
    Yaroslav [01:16:09]: That’s one of the pieces that we do, thermal cameras. That’s like actually a big one.
    Brandon [01:16:16]: Thermal cameras.
    Yaroslav [01:16:17]: Then, like, the motors. Like, you need the special-
    Brandon [01:16:25]: Even after you have the magnets, then you turn them into a really good motor.
    Yaroslav [01:16:28]: You have, you need these special magnets, and then that’s sort of your rare earth component.
    Brandon [01:16:34]: That’s, that’
    Yaroslav [01:16:34]: Like, rare earth is not that, oh, there are these metals that for some reason God only put under the Chinese territory and not under any others. No, they’re distributed. There are plenty of them around Earth. It’s about the refining capabilities and, like, investing into that and so on. And then, frankly, at some point, we don’t have that many humans. Like, that’s where the humanoid robots help. China is a big, populous country. The population of, like, the united West is comparable to that, but the population of the US is much lower than that. And I definitely think that the whole West should get their act together, because, ubi semper victoria, ibi concordia. There’s always victory where there is union.
    Brandon [01:17:27]: Agreement.
    Yaroslav [01:17:27]: Agreement, yes.
    Yaroslav [01:17:31]: I think we, as the free nations of the world, should get our act together, because freedom is what unites us. And I’m also, like, pretty mad at what’s happening in the European Union. And I think that the current US administration is the best thing that has ever happened to Europe, since World War II probably. Or since post-World War II, because World War II wasn’t the best thing.
    Brandon [01:17:59]: Trump withdrawing the image of omnipotent American support forced the Europeans to get their butts in gear, unite, and develop their defense industries.
    Yaroslav [01:18:07]: Also, like, doing that not in a nice way, right? Like when JD Vance came to the Munich forum one year ago, he wasn’t, like, super nice, like, “Oh, please, our European friends, could you please increase your defense spending?” He was somewhat pushy. Let’s put it that way. And I think that was a necessary measure. Like, I’ve been thinking about that. Could he maybe have been nicer? I was like, no, because the voters of European leaders, the European countries, would not have understood this. They would not get the message. And now I think the message has gotten across, but Europe is still sort of slow to wake up, I would put it that way. Things are getting better, but I’m not happy about the speed of how they’re getting better. So when I would go to some of the European capitals, I would get back pretty depressed from, like, talking to their military officials and their entrepreneurs, et cetera. Here, I’ve been in the US for the last month or so. I’m not depressed. I’m actually excited. I still think you should, like, 10X the effort in making sure that you remain the strongest power in the world and you can defend your values, et cetera. But I’m very optimistic, and definitely once we are in danger, I think there are just lots of very smart people in the West who can figure these things out. But people in China are also extremely smart. It’s very different from even the Cold War sort of situation. Like, the Soviet Union was economically a very declining power. China’s not like that. And then if we look at the electric car race, I think they’re ahead of the US and ahead of the whole world, definitely ahead of Europe, which used to be sort of a car superpower. When you look at AI, I think they’re almost where we are, maybe slightly behind. When you look at humanoid robotics, I would argue they’re ahead. And in many other areas, like in medicine and biosciences, there are lots of interesting things there, and in the consumer space, there are lots of interesting things there. I don’t know if you heard this podcast called 996. I don’t know if it’s still airing or not. There used to be a fantastic podcast by some American Chinese businessmen, maybe venture funds.
    Humility About China, Taiwan, and Deterrence
    Brandon [01:20:55]: About the Chinese economy?
    Yaroslav [01:20:56]: About China from a sort of tech venture point of view. And I lived in China for maybe four months, and I visited a couple of times. Like, even WeChat is such a more advanced app than anything we have in the West. So it’s very important not to be too arrogant, and I think we’re guilty of that, like, definitely in the US. Sometimes we tend to be too arrogant. I think humility helps always, at least to me personally. And then I think we obviously don’t have to be enemies. Like with Ukraine and Russia, it’s like Russia came to kill all of these people and get all this territory. With China and the US, it’s not like that, and thank God it’s not like that, right?
    Brandon [01:21:54]: It might be with China and Taiwan. Maybe.
    Yaroslav [01:21:57]: Hopefully not. Yeah. It’s
    Brandon [01:21:59]: Hopefully not
    Yaroslav [01:22:00]: It’s like China has their own, problems probably with human rights, et cetera. But hopefully, it’s still not beyond the fixing point.
    Brandon [01:22:13]: Hopefully. Hopefully.
    Yaroslav [01:22:14]: We should be armed, right? We should be ready for whatever, and that alone decreases the probability of any conflict. If you’re weak, you’re basically provoking the conflict. The problem with Europe these days is that, last year, Ukraine and Russia went from the drone technology of 2025 to the drone technology of 2026. Europe went from winter of 2022 to spring of 2022. So Europe didn’t even make one year of progress. And the US, I would argue, made less than a year of progress as well in the last year. So the technological gap is getting wider and wider and wider. And at some point, like, I’m looking at Poles, who are, like, very close to us and close to Russia.
    Brandon [01:23:06]: Polish people-
    Yaroslav [01:23:07]: Polish people
    Brandon [01:23:08]: Not surveys.
    Yaroslav [01:23:09]: Not, yeah. Oh, yeah, sorry. Yeah. That’s what I meant. Sorry, not my first language.
    Brandon [01:23:12]: When I’m looking at the polls, what do they, what do they say?
    Yaroslav [01:23:15]: Polish people. Poles.
    Brandon [01:23:16]: No, it’s the right word.
    Brandon [01:23:18]: You’re just thinking about-
    Yaroslav [01:23:20]: No, we.
    Yaroslav [01:23:20]: I’m looking at them, and they bought like 100 tanks and four submarines. It’s like, dudes, you don’t have, like, 1,000 people who know how to operate an FPV. What the hell you’re doing?
    Brandon [01:23:30]: Poland is not preparing for war correctly.
    Yaroslav [01:23:33]: From what I can
    Brandon [01:23:36]: They’re doing a very bad job
    Yaroslav [01:23:36]: They’re not doing it right. And the problem is they’ll be in a situation where they’re so proud of their winged hussars and, like, their cavalry, and the enemy is attacking with airplanes and tanks. Literally, the gap is getting wider between Russia and Poland.
    Brandon [01:23:57]: That happened in 1939.
    Yaroslav [01:24:01]: I don’t want that to happen again.
    What America Should Learn from Ukraine’s Defense Valley
    Brandon [01:24:03]: All right, so the Europeans need to wake up more. If you were advising America’s defense establishment, which you might be doing in real life, or if you were saying things on a podcast that might be heard by some people connected to that defense establishment, which you may or may not be, then besides more funding, which will be necessary for anything, literally anything, what are the top priorities policy-wise for America to increase its readiness right now? Let’s say three to five priorities.
    Yaroslav [01:24:38]: Look, I really like this quote, I think it’s by Arthur C. Clarke, that “the future is already here - it’s just not evenly distributed yet.” And just the same way as Silicon Valley is this sort of future location for all things tech, Kyiv and Ukraine is sort of the defense valley. It’s the point where the future of defense has already arrived, and there is a ton of things to learn from that, starting with hundreds of companies in very particular fields, to the battlefield experience from battlefield commanders of every level, starting from soldiers, sergeants, to platoon-level commanders, to brigade-level commanders, special forces and intelligence, all of that, to how the government organizes the infrastructure and the playing ground for all these businesses to flourish, et cetera. So I would definitely look into much tighter integration and exchanging the experience and so on. That would be one thing.
    Yaroslav [01:26:03]: I think reform in procurement would be another thing, and I think that’s what is currently being done with drone dominance. I think Pete Hegseth is leading that, and maybe some other people in the administration. I think that’s an extremely powerful and right thing to do, and they should scale that big time.
    Yaroslav [01:26:26]: Obviously, any sort of military person would say, “Well, yes, okay, Yar, you’re fine, cool, but Ukraine and its war theater is very much different from potential scenarios that the U.S. might have to fight,” and yes, I agree, but there is still so much to learn, even from the sea warfare that Ukraine is doing, and then long-range drones like these Shaheds that unfortunately damaged some of the American equipment in the Middle East. They can fly up to two thousand kilometers. So if you think about the Pacific region, two thousand kilometers, that covers a lot of land with all the islands and aircraft carriers, et cetera.
    Brandon [01:27:16]: I think America is learning that lesson right now in Iran, in the Middle East.
    Yaroslav [01:27:20]: You would think so but then, I’m not sure. It’s like there was so many chances to learn that lesson from Ukraine before, and I don’t think it was like, fully learned, so I’m not sure how fully learned the Middle East lessons were.
    Brandon [01:27:34]: Perhaps losing a war to a minor power will teach America.
    Yaroslav [01:27:38]: You can, you
    Brandon [01:27:39]: Although their economic weapon will be the most important and decisive by far, but still, some of our bases were supposedly, allegedly rendered unusable by their Shahed-type drones.
    Yaroslav [01:27:51]: Look, I think there are so many lessons to be taken from this. Like Russia, a much bigger power, attacking Ukraine. Given the same logic that we discussed, whoever has more production capacity should win. But then Russia didn’t achieve victory in Ukraine, and the US didn’t get, like, full victory in Iran. Probably achieved some of the goals, but probably not all of them. So you can flip that. Like when you say, “Okay, what if China has so much more capacity than the US? What if they attack us for whatever reason? How can we hold them back if we don’t have the rare earths?” Well, as the Ukrainian and Iranian examples show, you actually can hold back something like that even if you’re the less capable party.
    Brandon [01:28:42]: Well, those examples did rely on Chinese supply chains, though.
    Yaroslav [01:28:47]: Partially, yes. But then if you think about Ukraine in February twenty-two, for the first half a year or a year, there wasn’t much reliance on Chinese supply chains. We were just relying on whatever we’d got. So that’s one side of things. Another side of things is basically how much suffering you can withstand along multiple axes. It’s not just the military axis, it’s also the economic axis and the political axis, I would argue. So one of the reasons why wars stop or start is because the political pressure on the leadership internally in the country is so high that you just have to stop, right? So I think that differs big time depending on whether you are the one who’s seen by the population as the party which started the conflict or the one who was attacked. That’s one part. Another is just the overall state of the society. And one thing I’m worried about in Europe now is that people are not ready to fight even if they’re attacked. When people are asked about that, they’re like, “Oh, I’m just going to move to somewhere where there’s no war.” So that’s a challenge, and that’s what makes Europe weaker right now. And the US didn’t really ever have to, I think, fight a foreign war on its own turf. I hope that never happens, but in case that would happen, I don’t know how the rich cities of the East or West Coast, how people would behave. Like, would all the Wall Street bankers and Silicon Valley VCs mobilize and really start working on defense stuff? I would love to think so. That’s the way I think about the American spirit.
    The Nuclear Lesson: Budapest, Deterrence, and the World After 2022
    Brandon [01:30:49]: The way we did in World War II.
    Yaroslav [01:30:53]: In a way, but look, it wasn’t that clear in World War II, and Churchill famously said, “America will always make the right decision after trying all the wrong ones,” right? And one could argue that there is this USA that lives in popular culture and was sort of created by Hollywood as, like, cool dudes that will always come and do the right thing, right? And then if you look at, like, international politics
    Yaroslav [01:31:21]: It doesn’t necessarily always look like that. Like the Budapest Memorandum: Ukraine gave up all of its nuclear weapons, the second, or rather third-largest, nuclear arsenal, because the US and Russia and the others were very persuasive, and they’re like, “Yeah, just give it away. We guarantee you security.” And then they’re like, “Oh, it’s not guarantees, it’s assurances. We used the word assurances, so therefore we didn’t promise you much. You just gave it away for free.” And then Russia attacks and, like, no reaction. So in 2022, the whole world looks at it and is like, “Oh, okay, so maybe we should get nukes.” So my prediction: in the next couple of decades, a lot more countries will be working on their own nukes.
    Brandon [01:32:02]: They really should. I’ve consistently advocated for specifically Japan, South Korea, and Poland to get nukes. But obviously Ukraine should as well, but can’t
    Yaroslav [01:32:11]: Someone could argue that if a country currently doesn’t work on its own nuclear program, they’re doing a disservice to their country and the government should be fired. Because it seems from recent world history that that is the only way to actually provide credible deterrence, all right? So I think in Europe, people are not quite sure how America will behave. Will it behave as the Hollywood hero, or will it behave pragmatically, as it did at the beginning of World War II, or as it did when Ukraine was attacked by Russia and the US just decided to sort of push the Budapest Memorandum aside, because of course Russia’s a nuclear power and, like, we don’t want to mess with it.
    The Drone Race: Where Ukraine, Russia, and the West Stand
    Brandon [01:32:59]: Everyone says Russia’s behind right now in the drone war.
    Yaroslav [01:33:04]: True. Okay.
    Brandon [01:33:04]: But that wasn’t true a year ago. So a year ago people were saying either Russia was ahead or they’re at parity, or maybe a year and a half ago.
    Brandon [01:33:12]: Russia has more people, four times as many people about, or more.
    Yaroslav [01:33:17]: I think give or take, yeah. 30 versus like 120-ish. Yeah.
    Brandon [01:33:21]: Four times as many people.
    Brandon [01:33:27]: More help from China.
    Yaroslav [01:33:28]: Like, the economy is like 10 to 20 times bigger, I don’t know. A lot bigger.
    Brandon [01:33:33]: A lot of oil money, a lot of oil money, that Ukraine just doesn’t have. More direct help from China than Ukraine is getting.
    Brandon [01:33:41]: Russia just has this massive advantage in scaling against Ukraine itself. Ukraine has financial assistance from the EU, but right now Ukraine is ahead in the drone race
    Yaroslav [01:33:54]: I’m not sure about that by the way.
    Brandon [01:33:56]: Well, that was going to be my next question. Is that true? And if it is true, how long before Russia manages to pivot, course correct, and regain the lead?
    Noah [01:34:05]: Sorry. For my own curiosity, can we define drone race?
    Yaroslav [01:34:09]: Look, I think for our listeners it’s also helpful to understand that there are
    Yaroslav [01:34:17]: At least 30 different types, categories of drones, right? First, you have different domains. You have flying drones, ground vehicles, and you have sea vehicles, and you have undersea vehicles, right? Then for each of those domains, you have multiple use cases. Like for ground vehicles, you have logistics, evacuation, mining, de-mining
    Yaroslav [01:34:48]: Like, maybe something else. For aerial, you have reconnaissance, front strike, mid strike, deep strike, mining, de-mining, radio repeating, kamikaze and bombing, ISR, different types of surveillance, so tactical surveillance, operational-level surveillance, maybe strategic-level surveillance at some point.
    Yaroslav [01:35:17]: Logistics also with aerial drones. For sea drones, same thing. So in each of those categories, you have dozens, sometimes over 100 companies and products which compete. So that’s the current Ukrainian battlefield. From the Russian side, it’s less of a zoo, as we say. So in each category, they usually have one to maybe three products, and then they scale it in a sort of centralized fashion. And so when you talk about whether we are behind, or who’s behind or ahead in drone warfare, you’ve got to analyze
    Brandon [01:36:04]: It’s asymmetric, so it’s hard to compare
    Yaroslav [01:36:05]: Sort of area by area, right? So if you’re talking about front strike, I would argue that Ukraine has gotten ahead recently, after scaling fiber optic. Before that, Russia was slightly ahead. So Ukraine got ahead. With mid strike, so say something like 40 to 200 kilometers
    Yaroslav [01:36:35]: It’s hard for me to judge. At some point Russia was ahead. I think maybe we’re getting ahead as well, and in deep strike we recently got ahead, so we’re doing more damage to Russia with deep strike drones than they’re doing to us. In sea drones, we’re consistently ahead, always were ahead. In ground drones, I think we’re ahead. Yeah, I think like on
    Brandon [01:37:00]: Where are they still ahead?
    Yaroslav [01:37:01]: In general, I think we’re ahead. Where are they still ahead? I think in certain parts of the components, like GPS-free navigation. Like, these CRPA antennas are pretty good. They have these winged bombs that they drop from their bomber planes.
    Yaroslav [01:37:33]: I forgot the English name for it.
    Brandon [01:37:34]: Glide bomb?
    Yaroslav [01:37:35]: Sort of. Yeah. So they’re ahead on that side, and it’s like it’s difficult to protect from those.
    Brandon [01:37:42]: What’s the range of that?
    Yaroslav [01:37:45]: It can be pretty big. I think it’s like, can be up to 80 kilometers. Then obviously the range-
    Brandon [01:37:52]: From like a fighter plane, like a strike?
    Yaroslav [01:37:54]: The range is a very iffy subject here because the range is
    Yaroslav [01:38:01]: Basically the distance from where you drop the bomb to where it lands, but also you drop it from a fighter plane, and fighter planes are susceptible to aerial interceptor missiles. So on our side, we have our own fighter planes, and we have the ground anti-air systems. And then those two assets, they have their radars and radar fields. And then, depending on the enemy tactics, you can calculate how big the aerial area is that you cover with those assets. And look, I’m not a professional military guy, so I’m covering these topics in layman’s terms. Don’t quote me on this. I’m just trying to make this as understandable to an average listener as possible.
    Brandon [01:38:50]: Helicopters. I’ve recently seen reports of drones taking out helicopters in the air, and that this is new.
    Brandon [01:39:00]: Is that new? Is that going to be a big deal? Like, is that going to eventually get rid of helicopters the way drones are getting rid of tanks on the battlefield?
    Helicopters, Drone Carriers, and Future Air Defense
    Yaroslav [01:39:10]: Look, helicopters are also versatile assets. Front strike helicopters, I think we’re going to be seeing fewer and fewer of them. These few Russian helicopters that Ukraine’s intercepted with drones were more like edge cases than a systematic, sort of helicopter hunting campaign. I think it is possible to turn it into a systematic, countermeasure against helicopters.
    Brandon [01:39:38]: Will those be battery-powered drones themselves, do you think?
    Yaroslav [01:39:41]: Potentially. And there are like so many different scenarios. Like you can have large aerial drone carriers carrying interceptor drones.
    Brandon [01:39:54]: That then go hit the helicopters.
    Yaroslav [01:39:56]: For example. Or you can have battery-powered interceptor drones, but not of the missile-with-a-propeller type, like many of these well-known drones such as Stinger or P-One Sun. They look basically like a missile with a quadcopter behind it. But you can also have plane-like, fixed-wing aerial interceptors.
    Brandon [01:40:25]: Does anyone, does anyone have like a little like, drone that flies super low under the helicopter and like shoots it from underneath?
    Yaroslav [01:40:33]: Like in theory you can imagine that but it’s just
    Brandon [01:40:37]: Or like surface, a drone that carries surface-to-air missiles somehow.
    Yaroslav [01:40:40]: I don’t think that’s very practical because whatever you have going on land will be just super slow and not fast enough to be able to hunt down a helicopter.
    Brandon [01:40:50]: I mean, like, in the air. Is there a drone capable of carrying a small surface-to-air missile that can skim low and then launch its little missile, like a flying missile platform or something?
    Yaroslav [01:41:00]: In theory, but like a big part of a mission like that is not just kinetically getting to a helicopter, but also identifying it, either by means of first radar and then visually, and placing the asset you have, the interception asset you have in the right place in the right time. So the combination of those things is much more complex than just, how can we strike it like from behind or from below. But then helicopters are not, that does not mean they’re becoming like completely useless. Like for example, helicopters are used to intercept, deep strike drones. Like Ukraine uses a lot of helicopters to shoot down Shaheds.
    Yaroslav [01:41:44]: Russia uses helicopters to shoot down our deep strike drones.
    Counter-Drone Systems: Shotguns, EW, and Surviving FPVs
    Brandon [01:41:50]: So, some ideas about drone countermeasures, things people do technologically to try to shoot down FPV drones or bomber drones or whatever.
    Brandon [01:42:03]: Dumb question that I probably already know the answer to, but for the listeners: why can’t you use a shotgun to shoot down drones that are coming after you? Why can’t you just shoot the thing?
    Yaroslav [01:42:11]: That’s the main, weapon that people use against them.
    Brandon [01:42:15]: Why aren’t they very good?
    Yaroslav [01:42:17]: They’re pretty good. There are hundreds, maybe thousands, definitely thousands of cases of drones being shot down with shotguns, both by Ukrainians and Russians. There are even, like, statistics of
    Brandon [01:42:29]: Got it
    Yaroslav [01:42:29]: What is the percentage of Ukrainian FPV drones that didn’t accomplish the mission because they were shot down by a shotgun.
    Brandon [01:42:35]: Got it. So if I’m a guy with a shotgun, I’m walking around, FPV drone comes for me
    Yaroslav [01:42:40]: I don’t recommend that.
    Brandon [01:42:42]: No. I don’t plan on it.
    Brandon [01:42:44]: I’m saying suppose that were the case. Or suppose there is a guy, he’s not me.
    Brandon [01:42:50]: He’s dumber than me, okay? He’s got a shotgun, he’s walking around. FPV drone is sent. Someone says, “Okay, there’s a guy walking around. Kill him. FPV drone go.”
    Brandon [01:43:00]: FPV drone goes after him. And he has a shotgun.
    Brandon [01:43:03]: What are his chances of using that shotgun to shoot down the drone before the drone gets him? Can Is Are you allowed to say that?
    Yaroslav [01:43:08]: Depending how good you are with a shotgun. I’ll tell
    Brandon [01:43:11]: Random dude
    Yaroslav [01:43:11]: Like, I was talking to some Ukrainian pilot group, and they told me there was this Russian guy. He was just like Rambo.
    Yaroslav [01:43:20]: He’s like, he like, he shot down like seven FPV drones. They couldn’t, they couldn’t get him. They finally got him, but it was like nothing they’ve seen before, right?
    Brandon [01:43:30]: Got it.
    Brandon [01:43:30]: Your average non-Rambo.
    Yaroslav [01:43:32]: Average non-Rambo will just die.
    Brandon [01:43:34]: Will just die. So there’s like very low chance that they’ll be able to use a shotgun to shoot down the drones.
    Yaroslav [01:43:38]: Rather low chance. Yeah.
    Brandon [01:43:39]: Got it. Well, that was the kind of question I was getting at and there’s no, there’s no sort of portable electronic countermeasure that can get FPV drones if you’re just holding it, very effectively.
    Yaroslav [01:43:50]: There are plenty of them. It just depends. Electronic countermeasures are used all across the front line. The tricky thing is that electronic countermeasures cover only certain bands of radio frequencies.
    Brandon [01:44:06]: Let me simplify my question. Sorry.
    Yaroslav [01:44:07]: Like, each side tries to find a frequency that will not be covered.
    Brandon [01:44:10]: Let me simplify my question. Is there a man portable system that will give me a greater than 50% chance of living if an FPV drone specifically targets me to come kill me right now?
    Yaroslav [01:44:21]: Look, if your system jams the frequency the drone works on and the drone doesn’t have fiber optic or last-mile autonomy, then you have a 100% chance that it will not fly towards you. But then what is the chance of not facing a drone that can use either a different frequency, autonomy, or fiber optic? Well, that depends on the area you’re in and who your adversary is in that zone.
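Editor's sketch: the conditional in Yaroslav's answer can be written out as a toy calculation. This is a minimal illustration, not a model anyone in the conversation describes. The `DroneType` class, the frequency bands, and the shares in the example mix are all hypothetical assumptions; the only rule taken from the conversation is that jamming works when the control frequency is covered and the drone has neither fiber optic nor last-mile autonomy.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class DroneType:
    """Hypothetical class of attacking FPV drone (all values are assumptions)."""
    share: float                     # fraction of drones of this type in the sector
    control_band_mhz: Optional[int]  # None means there is no radio link to jam
    fiber_optic: bool                # guided over a fiber-optic cable
    terminal_autonomy: bool          # can finish the attack without a link


def jammer_defeats(drone: DroneType, jammed_bands: Set[int]) -> bool:
    """Rule from the conversation: jamming stops the drone only if its control
    frequency is covered and it has no fiber optic or last-mile autonomy."""
    if drone.fiber_optic or drone.terminal_autonomy:
        return False
    return drone.control_band_mhz in jammed_bands


def share_defeated(mix: Iterable[DroneType], jammed_bands: Set[int]) -> float:
    """Fraction of the adversary's drone mix that the jammer would stop."""
    return sum(d.share for d in mix if jammer_defeats(d, jammed_bands))


if __name__ == "__main__":
    # Made-up sector mix, for illustration only.
    example_mix = [
        DroneType(0.5, 900, False, False),   # radio link on a covered band
        DroneType(0.2, 433, False, False),   # radio link on an uncovered band
        DroneType(0.2, None, True, False),   # fiber-optic drone
        DroneType(0.1, 900, False, True),    # covered band, but autonomous
    ]
    print(share_defeated(example_mix, jammed_bands={900}))
```

The point of the sketch is that the answer to "what are my chances?" is a property of the adversary's mix in that zone, not of the jammer alone.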
    Brandon [01:44:51]: Let’s I guess this question was maybe too dumb that I was trying to ask.
    Yaroslav [01:44:57]: No, it’s a great question. There are no dumb questions here, and it is just like my answers, if you feel the common theme here, is that things in practice, in war, things are way more complex than they seem.
    Brandon [01:45:11]: I’ve read tons of things that say that basically if you’re walking around in the open and drones come for you, you’re not 100% dead, but you’re probably dead. I want listeners to understand why. People who are paying a tiny bit of attention to this issue from far away, intermittently, in America, I think don’t understand the weakness of our military against this kind of attack, against drone attack.
    Yaroslav [01:45:48]: I think there was I
    Brandon [01:45:49]: They have a lot of mechanisms, psychological mechanisms, by which they cope with the idea of drones. I would like to bust those mechanisms by explaining why drones defeat human infantry on the battlefield.
    Yaroslav [01:46:01]: It’s just a guided bomb flying at you, and it knows exactly where you are, right? It’s not that it’s the ultimate weapon, but I think one of the things that went viral in the Ukrainian defense tech bubble, even before the words of the CEO of Rheinmetall, was some American battle tank pilot who was interviewed and asked whether he’s afraid of FPV drones, and he’s like, “No, our tanks are strong.” And that went viral among Ukrainians because they’re like, “Dude, you have no idea what you’re talking about. Don’t mess with those drones.” Like, Abrams tank, great tank, but against an FPV drone, sorry, dude, but it’s
    Brandon [01:46:54]: Not just deadly
    Yaroslav [01:46:54]: Not going to work.
    Brandon [01:46:55]: Deadly.
    Yaroslav [01:46:55]: No, I was like, maybe not from one drone, but like a dozen drones will take it out. So yeah. But there is hope. So you just have to have kinetic countermeasures. Interesting thing-
    Brandon [01:47:10]: Kinetic countermeasure means a thing that shoots down the drone.
    Yaroslav [01:47:13]: Can mean many things. So if you, if you go to Ukrainian east and sort of territories close to the front lines, I think like about 50 kilometers in from the front line, all the roads are covered by fish nets.
    Yaroslav [01:47:31]: You literally, you ride in a corridor of fish nets, and that’s the mechanical countermeasure against the drone.
    Brandon [01:47:39]: You count that as a kinetic countermeasure?
    Yaroslav [01:47:41]: Mechanical. It says mechanical. Yeah.
    Brandon [01:47:42]: Got it. Got it.
    Brandon [01:47:43]: I don’t know all the jargon, so it’s, I’m, I’
    Yaroslav [01:47:45]: Whatever.
    Brandon [01:47:45]: What I’m talking about.
    Yaroslav [01:47:46]: Whatever. Then the tanks, if you look at Russian tanks and sometimes Ukrainian tanks or equipment, they all look like porcupines. They have these long sticking, I don’t know, poles? We talked about poles already on this podcast.
    Brandon [01:48:05]: Different kind of poles.
    Yaroslav [01:48:05]: Different kind of poles.
    Brandon [01:48:06]: A third kind of poles.
    Yaroslav [01:48:06]: That’s the way to protect from drones. That’s the way to make the drone detonate maybe half a meter or a meter away from the actual shell of the tank. Or yeah, sometimes there are nets on top of these tanks, just welded on as some extra sort of equipment. Then of course, there are guns.
    Yaroslav [01:48:35]: What both Russians and Ukrainians are beginning to experiment with is a kind of interceptor drone, an anti-FPV interceptor drone, which you put on top of something like a gun, a harpoon sort of thing. When you see a drone coming at you, maybe you can notice or hear it from 200 meters or 100 meters. So you have a couple of seconds, and you grab that thing, you point it, and you fire it, and onboard it has certain AI that helps guide the small drone towards the attacking drone and intercept it that way. So those are the things that are being developed, and we’re working on some of these things as well. And then you can imagine an armor with hundreds of drones on top of it, which are protector drones. They’re sort of like active armor. Whenever they see a drone-
    Brandon [01:49:27]: Huh
    Yaroslav [01:49:27]: Coming at you, they, like, take off.
    Lasers, Skynex, and the Cost-to-Effect Problem
    Brandon [01:49:29]: That’s cool. What about the kind of things that the Germans are building, which is basically a big truck with some sort of automated shotgun on it?
    Yaroslav [01:49:40]: Like they have Skynex. It’s, by Rheinmetall, by the guy whom we mentioned today. Skynex is considered to be an okay weapon. Their shots are quite expensive though. So I’ll tell you this different story, about
    Brandon [01:50:00]: It’s about cost to fire each shot really and stuff.
    Yaroslav [01:50:03]: Cost to effect, in a sort of more abstract way. So last year I was speaking at Land Europe Conference. It’s the biggest USAA, USA Army, conference in Europe, called Land Europe. And there was an expo there, and there was a Raytheon, an RTX booth there. And Raytheon is an amazing company. Gosh, we love Raytheon. They’re making Patriots. Patriots are the best. And they make a bunch of other things. And they had this laser gun project there, basically.
    Brandon [01:50:44]: That’s what I was going to ask about next is laser.
    Yaroslav [01:50:46]: The laser thing, they have it in two variations, two kilowatt, sorry, 10 kilowatt laser and 20 kilowatt laser. I’m like, “Okay, 10 kilowatt laser, tell me about it. Can it take down an FPV drone?” He’s like, “Yes, of course it can.” I’m like, “Okay, cool. How much time does it take to take down an FPV drone?” And they’re like, “Well, maybe three seconds.” I’m like, “Three seconds. That’s a lot of time. But okay, maybe fine. And what if the FPV drone tries to evade, right?” And he’s like, “Well, we will retarget it again.” And it’s like, “And then three seconds start again?” “Yeah.” “Okay. Well, can it take down like a dozen FPV drones?” They’re like, “Yeah, for sure.” I’m like, “Okay, a dozen FPV drones, 30 seconds? Maybe, yes. Two kilometers? Maybe yes, maybe no.” And I’m like, “Okay, how much does it cost?” And he said something like $3 million or something like that.
    Yaroslav [01:51:44]: I’m like, “Okay, $3 million. So that is 6,000 FPV drones.
    Yaroslav [01:51:51]: I doubt this thing will be able to handle 6,000 FPV drones or even 600 FPV drones coming at it at the same time.” So you have this kind of economics. And this product may not necessarily be a product against an FPV drone, or against an FPV drone in an active battlefield environment. It might be guarding a stadium in a peaceful country. And then some random dudes launch a couple of drones above a stadium, you shoot them down. Okay, everyone’s happy, although the drone will fall down, maybe fall on someone’s head. That wouldn’t be cool. So you would want something like catching bad drones with a net above a stadium or something like that. But whatever.
    Yaroslav [01:52:33]: My point is the economics matters
    Brandon [01:52:35]: You’re talking about the 6,000 drones. If you sent them one by one, it wouldn’t, it would just be pew.
    Yaroslav [01:52:40]: But who would send them one by one?
    Brandon [01:52:40]: If you sent a mass of 6,000, it wouldn’t
    Yaroslav [01:52:42]: Of course, yeah.
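Editor's sketch: the cost-to-effect arithmetic in this exchange can be checked in a few lines. This is a rough illustration using only the round numbers quoted in the conversation (about $3 million for the laser system, the equivalent of 6,000 FPV drones, about three seconds per engagement). The function names and the 60-second arrival window in the example are hypothetical assumptions, not figures from the speakers.

```python
def implied_drone_cost(system_cost_usd: float, equivalent_drones: int) -> float:
    """Unit cost implied by 'this system costs as much as N drones'."""
    return system_cost_usd / equivalent_drones


def sequential_engagement_seconds(n_drones: int, dwell_seconds: float,
                                  retargets: int = 0) -> float:
    """Time to defeat drones one at a time. Each retarget after an evasion
    restarts the dwell, as described in the conversation."""
    return (n_drones + retargets) * dwell_seconds


def drones_engaged_before_arrival(n_drones: int, dwell_seconds: float,
                                  time_to_impact_seconds: float) -> int:
    """How many drones of a simultaneous mass can be engaged before it arrives.
    The arrival window is an assumption supplied by the caller."""
    return min(n_drones, int(time_to_impact_seconds // dwell_seconds))


if __name__ == "__main__":
    # Quoted figures: $3M system, equated to 6,000 FPV drones.
    print(implied_drone_cost(3_000_000, 6_000))
    # A dozen drones at about three seconds each, with no evasion.
    print(sequential_engagement_seconds(12, 3.0))
    # Hypothetical: 600 drones arriving within an assumed 60-second window.
    engaged = drones_engaged_before_arrival(600, 3.0, 60.0)
    print(engaged, 600 - engaged)
```

With the quoted numbers, the implied price is $500 per drone, and a dozen drones take roughly 36 seconds if none of them evades, close to the "30 seconds" mentioned. Sent one by one, the drones are each handled in turn. Sent as a mass, the number engaged is capped by how many three-second dwells fit into the arrival window, which is the point about why the economics matter.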
    Brandon [01:52:46]: What about just a more powerful laser, like a 100-kilowatt laser or something that wouldn’t need to spend, that would
    Yaroslav [01:52:51]: No, that’s worse. You need less powerful laser that achieves the same effect.
    Brandon [01:52:56]: For cost of the system.
    Yaroslav [01:52:56]: A more powerful, yeah, a more powerful laser would be more expensive, heavier, more difficult to transport. It will be more difficult to make many of them. And therefore you wouldn’t be able to cover a long front line, and it would be super expensive to replace if it gets damaged, all of those issues. So the reason why FPV drones or iPhones became so popular is because they’re small and everyone can have one. And so it is with the countermeasures. You were asking me about policy advice, so that’s another mental shift that you’ve got to go through. It’s no longer about an aircraft carrier that costs whatever, $14 billion, and takes forever to build. It’s about mass that you can iterate on very quickly. You can upgrade it. Everyone can operate it. And then that mass, when it is combined, or the technologies, when they’re extrapolated from one domain to another domain, they add up, right, as it happens with software. So I think that’s important.
    Noah [01:54:14]: Can I ask a follow-up question? So Russia is not necessarily the smartest army you could be fighting. What would happen if your adversary was smarter? Do you think things would change meaningfully?
    Yaroslav [01:54:31]: Look, I don’t know if I fully agree with not the smartest army. Who is the smartest army?
    Brandon [01:54:37]: Ukraine?
    Noah [01:54:38]: That’s a great question.
    Yaroslav [01:54:40]: I don’t know. I don’t know.
    Yaroslav [01:54:43]: I think those are like, very dangerous assumptions to make.
    Brandon [01:54:48]: Who was the smartest army in World War I?
    Yaroslav [01:54:51]: Like, well, define smart.
    Russia’s Strategy, Western Assumptions, and Preparing for War
    Brandon [01:54:53]: The United States. Yeah.
    Yaroslav [01:54:53]: Why do you think so?
    Yaroslav [01:54:55]: Why do you think Russia is not the smartest army?
    Noah [01:54:56]: Maybe this is just my own, information bubble.
    Yaroslav [01:55:00]: I’m just like, maybe I agree with you. But I’m just like, I’m naturally wired to challenge those assumptions.
    Noah [01:55:06]: No, that’s a that’s a really good point. I guess, when I, from my information bubble, it seems like Russia’s strategy has largely been to just throw resources, people-
    Yaroslav [01:55:17]: You are living in a Western propaganda information bubble, of course.
    Yaroslav [01:55:21]: Like, as am I.
    Yaroslav [01:55:22]: Like, because we’re all rooting for Ukraine to win, right? Sorry, go on.
    Noah [01:55:26]: In but going back to this granted there’s a history of large powers failing to take over smaller, -Strategically, you
    Yaroslav [01:55:38]: David and Goliath
    Noah [01:55:40]: They, this
    Brandon [01:55:40]: They fail a lot more now than they used to. The success rate of taking-
    Noah [01:55:44]: That’s true
    Brandon [01:55:44]: Places over has gone way down.
    Noah [01:55:46]: Certainly, yeah. But regardless, it does, I do wonder, like, if Russia had not essentially assumed victory early It may have different, yeah
    Yaroslav [01:55:56]: I, like, they’re super stupid, of course.
    Yaroslav [01:55:58]: Like, they were marching in with their parade costumes, and they were thinking they’re going to have a parade in Kyiv in a few days. Like, that was super stupid. And there were lots of stupid things. They have no regard, no care for human life. They’re sending those Russian folks just, like, without armor, without anything, like folks on crutches, sending them to storm Ukrainian positions. And it’s
    Brandon [01:56:23]: They’re the Zerg.
    Noah [01:56:23]: You think at this point there’s
    Yaroslav [01:56:24]: I have, like, I have actually a good friend. He’s American. He’s from Seattle. He’s, served, had been in the Special Forces here in the US, had been in maybe three deployments, and then went to Ukraine, volunteered.
    Yaroslav [01:56:39]: He’s been fighting since, like, 2022. He’s a very good friend of mine. So at some point he’s been texting me, and he’s like, “Okay, I’m near Pokrovsk,” and sorry, not Pokrovsk. It was, gosh, the other city, Chasiv Yar.
    Yaroslav [01:56:55]: And he’s like, “Okay, so what Russians are doing, they’re just creating so much work for all the psychologists who are going to heal those Ukrainian, whatever, riflemen or machine gunners, who are just, like, shooting at the Russians who are, like, going nonstop,” right? So Russians are causing psychological trauma to Ukrainians because they’re dying in such a stupid way.
    Noah [01:57:26]: Jeez
    Yaroslav [01:57:26]: That is indeed stupid of sort of Russian higher command, et cetera, et cetera, et cetera. But then that’s the resource they have. And
    Brandon [01:57:38]: If you’ve got, if you’ve got Zerglings, you use your Zerglings.
    Yaroslav [01:57:40]: That’s the way. That’s their strategy. That’s their way of strategy, right?
    Brandon [01:57:43]: If you’re going to play Back in the That’s what you do.
    Yaroslav [01:57:46]: If you play StarCraft, that’s how Zergs win.
    Brandon [01:57:48]: Are Ukrainians the Terrans?
    Yaroslav [01:57:52]: I don’t know. I hope we will become Protoss soon.
    Yaroslav [01:57:57]: I’m working on that. I’m working on that.
    Brandon [01:58:02]: Protoss had fairly bad political management at the top
    Yaroslav [01:58:04]: I wish Protoss with a speed closer to like, humans or Terrans, whatever it is. Hopefully we can do Protoss technology with a Zerg speed. That would be the best. I think that’s what the housewives are working on in fact.
    Brandon [01:58:20]: You cannot beat those housewives. Do not oppose Ukrainian housewives.
    Yaroslav [01:58:23]: Do not mess with Ukrainian housewives, for sure. Yeah.
    Noah [01:58:26]: Two final questions. First one, you started out by telling us a story about going to a chapel on February 23rd.
    Noah [01:58:34]: Were you able to get married there? Can you finish that story?
    Yaroslav [01:58:40]: We actually, we did get married, but we postponed the wedding as a social event, until the war is over.
    Noah [01:58:49]: Then last question, what do you want our audience to take away? If you have one point you want them to walk away with what would it be?
    Yaroslav [01:58:58]: You want peace, be prepared for war. Got to invest in defense and security.
    Noah [01:59:04]: All right. Thanks. Thank you for talking with us.
    Yaroslav [01:59:06]: Thank you.
    Noah [01:59:07]: Thank you, Noah, for all the great questions.
    Yaroslav [01:59:11]: No, it was fantastic.
    Yaroslav [01:59:12]: Thanks so much.
    Brandon [01:59:13]: Really fun.
    Noah [01:59:13]: Awesome. Thanks.


    This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.latent.space/subscribe
  • Latent Space: The AI Engineer Podcast

    AI-Native Healthcare: 100M Doctor Visits, 10–20 Hours Saved, Prior Auth in Minutes — Janie Lee & Chai Asawa, Abridge

    14/05/2026 | 1h 5min
    Special discounts up for AIE Melbourne (LS discount) and AIE World’s Fair (group discounts up to 25% - CFPs still open for Autoresearch and Vertical AI) Cya there!
    Abridge did not start as a “GPT wrapper”. It was founded in 2018, years before the Cambrian explosion of AI application layer companies. OpenAI launched ChatGPT publicly on November 30, 2022, and by then Abridge had already spent years doing the unglamorous work of building trust for one of the highest-context, most important workflows in healthcare: the conversation between a patient and a clinician.
    Abridge’s original wedge was clinical documentation. Listen to the visit, generate the note, reduce the clerical burden, and let clinicians spend more time with patients instead of the EHR. By focusing on how doctors actually document, how health systems actually buy, how EHR integration actually works, how clinicians verify outputs, and how missing context during a visit turns into downstream friction across billing, prior authorization, quality, and follow-up, the adoption of LLMs became a force multiplier on a workflow already optimized for sensitive context gathering.
    The company has scaled fast: Abridge says it is projected to support 80M+ patient-clinician conversations this year across 250 large and complex U.S. health systems, with support for 28+ languages and 50+ specialties. It raised $300M at a $5.3B valuation in June 2025, after a $250M round earlier that year.
    Today, Janie Lee and Chaitanya “Chai” Asawa of Abridge join us for another crossover pod with Redpoint’s Jacob Effron (who is on the board of Abridge) to dive into how Abridge is building the clinical intelligence layer for healthcare starting with ambient documentation, then expanding into clinical decision support, prior authorization, payer/provider/pharma workflows, and eventually real-time agents that act before, during, and after the patient conversation.
    We go inside the product, data, infra, evals, workflow, privacy, and org design choices behind bringing AI into one of the highest-stakes enterprise environments from 100M+ medical conversations and specialty-specific evals to real-time alerts, EHR integration, de-identification, clinician-scientist teams, and why healthcare may solve some of the hardest AI problems first.
    We discuss:
    * Why Abridge started with clinical documentation, “pajama time,” and saving clinicians 10–20 hours a week
    * The transition from ambient scribe to clinical intelligence layer: save time, save money, and save lives
    * Why conversations between patients and clinicians may be the most important workflow in healthcare (patient visit summary feature)
    * Chai’s “healthcare-coded Glean” framing: context is king, but healthcare raises the stakes on safety, evals, and rollout
    * Why Abridge wants AI to feel like “air conditioning”: always in the background, but only interrupting when it truly matters
    * The prior authorization example: turning a denied MRI weeks later into real-time guidance while the patient is still in the room
    * Why payer policies, EHR data, medical literature, and hospital-specific guidelines make the problem hard, and also create the moat
    * How Abridge thinks about ambient form factors: mobile, desktop, in-room devices, nursing workflows, multimodality, and future AR
    * The multi-sided healthcare customer: CMIOs, CFOs, CIOs, clinicians, patients, payers, and pharma
    * The hardest AI problem at Abridge: high-quality, low-latency, low-cost real-time support in a high-stakes clinical setting
    * When Abridge uses frontier models vs proprietary models, and why its unique data from medical conversations matters
    * Why “every agent is a coding agent underneath,” and how the EHR can be thought of as a filesystem for healthcare agents
    * How Abridge approaches personalization across individual doctors, specialties, and health systems
    * Why “AI slop” is AI without context, and how edits, memories, and clinician preferences create a data flywheel
    * Abridge’s eval stack: LFDs, LLM judges, in-house clinicians, third-party evaluators, specialty-specific evals, and progressive rollout
    * HIPAA, PHI, de-identification, one-way anonymization, customer contracts, and learning from healthcare data safely
    * What changes when you operate at 100M+ conversations: reliability, cost, post-training, model routing, and infrastructure optimization
    * Why the same clinical conversation can serve doctors, patients, payers, pharma, and future clinical-trial workflows
    * How Abridge works with EHRs, and why deep interoperability is table stakes for clinician adoption
    * Why healthcare AI has regulatory tailwinds, why 80/20 does not work here, and why high-stakes domains may drive AI forward
    * Why Abridge embeds “clinician scientists” into product and eval teams
    * What Chai learned from Glean about search, quality, and durable AI infrastructure
    * Why the future of AI infra may look like context layers, event-driven systems, Kafka, Temporal, sockets, CRDTs, and tools built for humans
    * Why Janie changed her mind on “PRDs are dead,” and why crisp written clarity matters more in complex AI products
    * How Abridge uses Claude Code, Cursor, and coding agents internally
    Abridge:
    * Website: https://www.abridge.com/
    * X: https://x.com/AbridgeHQ
    Janie Lee:
    * LinkedIn: https://www.linkedin.com/in/janiejlee
    Chaitanya “Chai” Asawa:
    * LinkedIn: https://www.linkedin.com/in/casawa
    Timestamps
    00:00:00 Introduction and what Abridge does
    00:02:05 From ambient documentation to clinical intelligence
    00:04:04 Clinical decision support and context as king
    00:06:57 Alert fatigue, proactive intelligence, and prior authorization
    00:12:36 Ambient AI form factors and healthcare customers
    00:16:59 The hardest AI problems in healthcare
    00:18:26 Frontier models, proprietary data, and model strategy
    00:21:07 The EHR as a filesystem for agents
    00:24:03 Personalization, memory, and clinician preferences
    00:30:40 Evals, LLM judges, and progressive rollout
    00:36:47 HIPAA, de-identification, and privacy
    00:39:21 100M conversations and operating at scale
    00:44:10 EHR integration and the clinical intelligence layer
    00:46:39 Healthcare regulation, latency, and high-stakes AI
    00:50:11 Clinician scientists and long-tail quality
    00:53:04 Lessons from Glean and durable AI infrastructure
    00:57:03 The future of agentic healthcare workflows
    00:57:34 PRDs, product clarity, and building serious AI products
    01:03:11 AI coding tools at Abridge
    01:04:06 Outro
    Transcript
    Introduction: Abridge, Clinical Intelligence, and the Latent Space x Unsupervised Learning Crossover
    Swyx [00:00:00]: Okay. This is a special crossover Latent Space Unsupervised Learning pod.
    Jacob [00:00:07]: Very excited to do this.
    Jacob [00:00:08]: At this point, we get together once a year.
    Swyx [00:00:10]: Once a year
    Jacob [00:00:11]: And this is a fun occasion to get to do it on.
    Swyx [00:00:13]: I really wanted to talk to Abridge, but I felt very underqualified because healthcare is not something we cover very intensely. It just so happens that Redpoint are big investors in and supporters of Abridge.
    Jacob [00:00:27]: Anytime you want to have a portfolio company on your podcast
    Jacob [00:00:29]: Please, by all means.
    Swyx [00:00:31]: So we’ll introduce our guests. Chai and Janie, welcome to the pod.
    Janie [00:00:34]: Thanks for having us.
    Chai [00:00:35]: Thank you.
    Janie [00:00:35]: We’re excited to be here.
    Chai [00:00:36]: Thank you.
    Swyx [00:00:36]: So for listeners, what do you guys do, just to situate you guys in the company?
    Janie [00:00:42]: Abridge is a clinical intelligence layer for health systems. We really started with documentation and building for clinicians and as we think about reducing the burden that clinicians have, they’re spending 10 to 20 hours a week on documentation. There’s a massive doctor shortage in the country. We also think that conversations between patients and clinicians are probably the most important workflow in healthcare. It’s where care is given and received but if you think about the 20% of our GDP that goes towards healthcare, almost everything is a derivative of that conversation, whether it’s the claim, the payment, the actual diagnosis given, the treatment. And we’ve started with a conversation to reduce the burden for doctors on documentation but we’re really excited about the path ahead as we become this broader clinical intelligence layer.
    Chai [00:01:34]: I’m Chai. I work on clinical decision support at Abridge.
    Swyx [00:01:37]: Yes.
    Chai [00:01:37]: And so as Janie said, we’re uniquely situated where we started off with the clinical note. What I’m really excited about and where we’re expanding towards is what are all the things you can do before the conversation, during the conversation and after the conversation if you did have access to all the context about patients, payer guidelines, medical literature and put that together and to serve, how healthcare could look fundamentally different.
    Swyx [00:02:01]: And that’s the context engine that you guys have?
    Chai [00:02:04]: Yes.
    Swyx [00:02:04]: Is that what it’s called? Okay.
    Swyx [00:02:05]: So historically, as I understand it, the company started in 2018. A lot of people would be familiar with the AI voice notes form factor that doctors would be “Well, do you consent to being recorded?” It replaces handwriting and what have you. But it sounds like more recently there’s been a big transition in the company. Tell me about the broader transition.
    From Documentation to Clinical Intelligence: Save Time, Save Money, Save Lives
    Janie [00:02:26]: So from a transition perspective, we really think about our journey as The first act was: how do we help save time? And that’s where a lot of that original product was.
    Swyx [00:02:37]: By the way, one of those interesting stats
    Swyx [00:02:39]: On your landing page was, doctors spend time after hours.
    Janie [00:02:43]: They call it pajama time.
    Swyx [00:02:44]: Why is that pajama time?
    Janie [00:02:46]: Doctors after work in their pajamas
    Swyx [00:02:48]: In their pajamas. Oh
    Janie [00:02:49]: At home are just writing and catching up on their notes every day.
    Janie [00:02:53]: Some of our favorite customer love stories, we have a Slack channel called Love Stories. We have clinicians telling us, “Abridge has kept us from retiring early, or we’re now finally able to
    Janie [00:03:06]: go home and eat dinner with our kids for the first time.”
    Chai [00:03:08]: Save the marriage in some cases.
    Swyx [00:03:10]: One of the quotes was “We’re not divorcing anymore.”
    Swyx [00:03:12]: I’m asking, “Why?”
    Swyx [00:03:14]: Because they’re working too much.
    Janie [00:03:16]: But in terms of where we’re going and where we’re expanding, we really think about our second and third acts. The second is: how do we help health systems save and make more money? Health systems are operating with record-low operating margins. It’s getting harder and harder to serve patients, and they have some regulatory tailwinds but also a lot of headwinds coming their way. AI is ripe for helping on the save-money and make-more-money piece. And then ultimately, how do we help save lives? The fact that our software is opened millions of times a week, before, during and after a patient walks in the room, gives us a massive opportunity to improve patient outcomes with products like clinical decision support, which Chai is building, and so many others. That’s probably one of the most important workflows and problems to be going after right now.
    From Glean to Healthcare: Context Is King
    Jacob [00:04:04]: One thing that’s interesting, Chai, is you came over to Abridge from Glean and clinical decision support, which for our listeners is, in the context of a visit, helping a doctor figure out the right type of care. It’s really a search problem in many ways, going through lots of different data sources. Very analogous to your previous role as one of the earliest engineers over at Glean. I’m sure a lot of our listeners are curious what’s similar about the problems that you’re going after now and what feels different, now that you’re in healthcare.
    Chai [00:04:33]: Very similar. Taking a step back, with every wave, there are a lot of very similar patterns across different products. A lot of social networking products look the same. A lot of credit-based products look the same. And we’re seeing something very similar in the agent era with many companies, of course, in Redpoint’s portfolio and so forth. The key insight shared by both companies is that you have amazing models but context is king. Context is what puts them to work. So in a lot of ways I see similarities: this is a healthcare-coded version of Glean. But the differences are really interesting. A couple of things come to mind. First and foremost, the rigor of the setting we’re in. The downside risk is extremely high here in healthcare. It can be fatal in some cases, for example if you prescribe something that the patient is allergic to. Whereas at Glean, it’s “Oh, you got the question wrong.” It wasn’t the end of the world in most cases. What does that mean? It shapes our evaluation strategy, both offline evaluation and progressive rollout, and there’s a lot more we could go into there. The second thing that comes to mind is vertical versus horizontal. In both cases there’s a large variance. Glean is a much more horizontal company, so there’s a variance of personas and companies that you’re working with. We also have a variance of personas: different types of specialties, different hospital systems. But the variance is a little narrower. So from a product perspective, you’re able to focus far more, especially when you have a maturing technology and you’re building new products that never existed before. It lets you go after them much more easily, especially in healthcare, where so many problems were solved with labor and process that it’s extremely ripe for AI to keep helping augment and enable.
And the final thing that’s really interesting about Abridge specifically, compared to many other companies in the AI area, is the modality we started with: we’re ambient and we’re always listening in the background. Many more AI products will go that way, but it’s how we started. And that’s the greatest form of AI we can create, AI that’s seamless. You’re not looking at your screen. It’s always there. It’s always helping you out and being proactive. It’s the Jarvis vision; at every hackathon I went to over the past decade, there was always a Jarvis competitor. But Abridge very much started from that opportunity and continues to go that way.
    Ambient AI and Alert Fatigue: When Should the Product Interrupt?
    Jacob [00:06:57]: One thing that is super interesting then from a product perspective is you have this always-on seamless in the background and then you have to decide when you break the wall almost and say, “Hey, clinician, you might not have thought about X,” or whatever it is that you want to do. And in healthcare traditionally there’s been this idea of alert fatigue and a million pop-ups and then a doctor just ignores all of them. It’s probably a pattern that a lot of builders are thinking through now. How do you think about the right way to intervene or to pop up in a doctor visit?
    Janie [00:07:26]: It’s such a good question. Alerts are notorious in healthcare specifically. Over 90% of alerts are ignored. The first and most important thing is context is everything, as Chai alluded to and I also think about how do we go from being reactive alerting to really proactive intelligence at the point at which it matters most. One thing we like to say is we want our product to feel like air conditioning. It should be in the background just making things better and if there is something that has great clinical risk and we’re acutely aware that intervening now and not later is incredibly important, we should decide to act. But if you think about proactive versus reactive, instead of alerting a clinician during a visit when they’re with their patient having a pretty serious and sensitive conversation, how do we prep a clinician before they walk into the room with that patient? And so historically, clinicians might have to manually go through charts with a patient that they’ve had over the course of months or years and they’ll try to suss out what are the things they should be doing. You can imagine a world with Abridge. We’ll summarize all of the most recent context for you, tell you based on the reason for a visit the patient is coming in for the types of things you should be discussing. And so you’re going into that conversation prepped rather than walking in cold to that patient visit and then having this product interrupt you five or 10 times throughout the visit. And there might be times where it’s really important to interrupt. We have a product called Prior Authorization and so this is when you may go into a doctor’s office with knee pain. They’ll prescribe you an MRI and so many of us have had this experience before, where in four weeks you’ll get a call saying, “Hey, Sean, that MRI that you were prescribed wasn’t approved and why don’t you come back in? 
We’ll figure it out.” In a world with Abridge, we might choose to quietly but still alert a doctor in that visit. And alert is probably not even the word we would want to use. Before a patient leaves, we would want to tell the doctor, “Hey, Doctor, before Sean leaves, you should ask him, has he had physical therapy and has his pain lasted for more than six weeks? Because the Aetna plan that he’s on in California requires six things. We’ve already confirmed four of them have been met ‘cause we have all the context. But these two last criteria, if you can address with Sean before he leaves the room, we could guarantee that your MRI is approved before you leave.” And so when you think about clinical usefulness, impact to the patient, there are instances in which if we can catch a doctor while the patient is still in the room, as we think about save time, save money, save lives, we get to check all of those boxes. But when doctors have 15 minutes between visits, we have to be really thoughtful about when it matters.
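The prior authorization example above (six requirements, four already confirmed from context, two left to ask about before the patient leaves) can be sketched as a simple checklist. This is a minimal Python illustration under stated assumptions, not Abridge’s implementation: the `Criterion` type, the `outstanding_criteria` helper, and every criterion name below are hypothetical, and the policy is invented for the example rather than taken from any real payer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    key: str       # hypothetical identifier for one payer requirement
    question: str  # what the clinician could ask to settle it


def outstanding_criteria(policy, known_facts):
    """Return the criteria that the available context has not confirmed."""
    return [c for c in policy if not known_facts.get(c.key, False)]


# Illustrative policy with six requirements; not a real payer policy.
MRI_POLICY = [
    Criterion("knee_pain_documented", "Is knee pain documented?"),
    Criterion("exam_performed", "Was a physical exam performed?"),
    Criterion("xray_on_file", "Is there a prior X-ray on file?"),
    Criterion("plan_active", "Is the plan active?"),
    Criterion("physical_therapy_tried", "Has he had physical therapy?"),
    Criterion("pain_over_six_weeks", "Has the pain lasted more than six weeks?"),
]

# Facts assumed to be already established from the chart (four of six).
KNOWN_FACTS = {
    "knee_pain_documented": True,
    "exam_performed": True,
    "xray_on_file": True,
    "plan_active": True,
}

# Print the questions still worth raising while the patient is in the room.
for criterion in outstanding_criteria(MRI_POLICY, KNOWN_FACTS):
    print(criterion.question)
```

The point of the structure is that only the unmet criteria surface to the clinician, which is one way to keep an in-visit prompt short.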
    Prior Authorization: Reducing Latency in Care
    Chai [00:10:23]: There’s this interesting product opportunity for AI, which is reducing latency in the world. Prior authorization is an example of where care gets delayed, and great AI can reduce that. The problem with alerts before was partially a technical problem: the quality of your alerts really matters. They’re going to get ignored if they’re noisy alerts that you can’t act on, similarly to engineering. But if you can make really high-quality alerts with both the context, as Janie said, and really high-quality models, then you can create a whole other game.
    Janie [00:10:53]: And I really like that experience because it starts to tease apart, what makes this so hard and unique. One, to make that prior authorization example possible, think about all the data that you need to have. You need to integrate with the electronic health record to know all of the patient context. Do we have access to your previous labs, previous imaging? And then to match you and to know that you’re on Aetna, we have to collect all of the different payer policies and they vary by state. Some of these payer policies live on websites. Some of them live in unstructured 50-page PDF files.
    Jacob [00:11:31]: I thought this episode was to make sure we didn’t scare people away from healthcare.
    Janie [00:11:34]: But when you think about the things that make it hard, it also gives you the moat.
    Janie [00:11:39]: And then the second is the AI and the model quality we need to be able to hang our hat on. When I worked at Opendoor, I worked on pricing models, where every outlier wiped out the margins of 30. Similarly here in healthcare, the bar for accuracy is so high. And then I’d say the last is that workflow is everything. If insurance companies deploy AI, it typically happens too late, and this is when you have the notorious comical examples of AIs just fighting each other when it’s too late. But if we can pull forward both the use of the AI and the ability to solve problems while the patient’s in the room, you can start to collapse what typically takes weeks or months after your visit, ideally down to minutes or real time. And it’s where healthcare is both very difficult but also extremely rewarding if you can crack it.
    Product Form Factors: Mobile, Desktop, In-Room Devices, and AR
    Swyx [00:12:36]: Just to get some baseline on the form factors, because I’ve seen some videos on your website and stuff. You guys talk a lot about ambient AI. Is it primarily on the phone? Is there any other form factor that people get Abridge in? Is there an Abridge room setup where it’s always on? I don’t know.
    Jacob [00:12:55]: An Abridge podcast studio.
    Janie [00:12:58]: The primary form factor is mobile and desktop. Usually clinicians are walking in and out of rooms with mobile, but at the end of the day, when they’re closing out their notes or wanting to prep for the day ahead, they might use desktop. We have been having a lot of really interesting partnership conversations with in-room device companies, as you think about the power of multimodality and even more data, and about everything that is not captured today. It is fascinating to think about, especially as we go into building and scaling our nursing product. Nurses are constantly walking in to check on a patient for two minutes or maybe even 30 seconds, and starting an Abridge experience is probably going to take longer than the visit. So what we can do with in-room devices that are always on starts to raise really interesting and fun product questions.
    Swyx [00:13:54]: I was thinking, the way in tech companies we have all these Google Meet and other things, we might as well set up entire rooms with just Abridge tech.
    Chai [00:14:02]: Very much. AR glasses and related form factors are also relevant: how do we bring the information to the clinician in real-time without a screen, while still letting them focus on the patient?
    Swyx [00:14:18]: Do you think they want that? I’m skeptical of AR, but I’m curious what you’ve tried.
    Chai [00:14:26]: Admittedly, it’s not on the near-term product roadmap by any means. I’m being far-fetched.
    Jacob [00:14:31]: There’s some sick AR stuff for surgeries.
    Swyx [00:14:33]: Really?
    Jacob [00:14:33]: When people are trying to visualize, you’re about to make an incision but you want to see what the cut might look like or what the body might look like inside, and they can layer in imaging.
    Swyx [00:14:43]: That’s cool.
    Chai [00:14:45]: At some point in the future.
    Janie [00:14:46]: But a lot of our largest customers, at the largest health systems, are integrating already, and so building into it unlocks a lot of product capabilities.
    Health Systems, Buyers, Clinicians, Patients, and Payers
    Swyx [00:14:57]: And just to establish the terminology. Sorry, I know I’m asking basic questions somewhat for myself but also for the audience who might be less integrated. When you say health systems, it’s like the Johns Hopkins, the Kaiser Permanentes.
    Janie [00:15:09]: Mayos, the Kaisers of the world.
    Swyx [00:15:10]: These are your customers, right? And the outcome that you deliver for them is happier doctors, reduced cost of processing, reduced mistakes. It’s weird in the sense that I feel like there’s also a secondary customer, the customer of the customer. Do you think about it that way?
    Janie [00:15:28]: The other interesting and complex part of building product is that we have our buyers, who are the chief medical information officers, the chief financial officers, the CIOs of these large health systems. Our users today are clinicians, but if you think about who is impacted downstream, it’s patients. And so with every product we build, we think about who we’re building for, who the secondary user is, and what that means in terms of experience, security, compliance, and the ROI that we have to make tangible. Like you said, time savings is one of them. But CFOs care about a lot more than just time savings. We have to show that for every dollar you put into Abridge, because you have more compliant documentation or because you have fewer queries coming from your billing team, we save or add real dollars to your bottom line or top line. Those are things we’re constantly thinking about because of the dynamic across all three sets of users.
    Chai [00:16:32]: There’s a whole other axis too with the payers and pharma as well. Connecting all these three big stakeholders in healthcare is...
    Swyx [00:16:39]: Do the payers ever see your data? Sorry, the payers meaning the insurers, right?
    Chai [00:16:44]: Yes.
    Swyx [00:16:44]: They also see Abridge data?
    Chai [00:16:47]: No
    Swyx [00:16:47]: Like the direct integration to you guys
    Chai [00:16:48]: They wouldn’t see the raw Abridge data but when you’re working together on something like prior authorization, whatever information they need, we’d communicate to them.
    Jacob [00:16:59]: That’s cool. I would love to dig into the AI side. You still have a lot of problems on the AI side. And so maybe to start at the highest level, what’s one of the hardest problems you have to solve in AI at Abridge today?
    The Hardest AI Problems: Quality, Latency, and Cost
    Chai [00:17:11]: To make things simple, let’s build off the prior auth example. One thing Janie talked about is that this data is all over the place, and there’s a combinatorial explosion of procedures, payer policies and sometimes even different health systems. There can be a cross-product of all of these different considerations you have to take into account. But what’s really hard about this problem is doing it in real time in the conversation. In any AI product, usually the three KPIs you care about are quality, latency and cost. Now what we’re saying is we want to do this in real time in the conversation, guiding the clinician. How do we do it in a way that does not break the bank? But we also need very intelligent models, because you’re working with this cross-product of data and all this context layer as well. So you need high intelligence and high quality, because you don’t want the alert fatigue, but you also need to be fast and cost-effective. And that’s where a lot of clever engineering goes. Without getting into all the details here: can you model these policies in some intermediate representation, or are there other things you can do that make this problem tractable? And of course, the Pareto frontier is always changing, but we are also trying to do this now.
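To make the quality, latency and cost trade-off concrete, here is a small Python sketch of picking out the Pareto frontier among candidate setups: a candidate stays on the frontier if no other candidate is at least as good on all three KPIs and strictly better on one. The candidate names and all numbers are invented for illustration; nothing here reflects measured figures from the conversation.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    quality: float     # higher is better
    latency_ms: float  # lower is better
    cost_usd: float    # lower is better (per request)


def dominates(a, b):
    """True if a is no worse than b on every KPI and better on at least one."""
    no_worse = (a.quality >= b.quality
                and a.latency_ms <= b.latency_ms
                and a.cost_usd <= b.cost_usd)
    strictly_better = (a.quality > b.quality
                       or a.latency_ms < b.latency_ms
                       or a.cost_usd < b.cost_usd)
    return no_worse and strictly_better


def pareto_frontier(candidates):
    """Keep only candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Made-up numbers, purely to exercise the functions above.
CANDIDATES = [
    Candidate("large_model", 0.95, 4000, 0.050),
    Candidate("small_model", 0.80, 300, 0.002),
    Candidate("policy_ir_plus_small_model", 0.92, 400, 0.003),
    Candidate("slow_small_model", 0.78, 900, 0.004),
]

for candidate in pareto_frontier(CANDIDATES):
    print(candidate.name)
```

In this toy data, the hypothetical setup that precomputes an intermediate representation of the policy sits on the frontier next to the large and small models, while the strictly worse option drops out.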
    Model Strategy: Third-Party Models, Proprietary Data, and Medical Conversations
    Jacob [00:18:26]: What implications has that had for what you take off-the-shelf and say, “You know what? We don’t need to be world-class at X. We’ll just take this from the model providers or from some infrastructure player,” and where you say, “No, this is where we spend most of our time focused”?
    Chai [00:18:38]: This is the fun challenge in AI.
    Jacob [00:18:42]: It changes every three months? So
    Chai [00:18:42]: Of course, with the shifting landscape, we try to be extremely thoughtful about predicting the trends of where third-party models are going and where we can uniquely go. Sometimes when you talk about AI models, people say the models are just going to get infinitely better. Maybe in the grandness of time you could say that, but within every month, every quarter, there are specific ways they’re getting better. They’re training on a lot more coding data to be better coding agents, for example. And so we have to think about where there is unique data that we’re uniquely training on. Or, to step back a little, a proprietary model brings an advantage to us if it can give higher quality, or lower cost and latency for similar quality, very similar to many other companies. And we can do that when we have proprietary data. For example, we have on the order of eighty million medical conversations, getting close to hundreds of millions now.
    Jacob [00:19:44]: It’s insane.
    Chai [00:19:45]: This is a unique data set. And it’s very interesting because this data set is effectively a large part of the trace between the patient and the provider. That’s where the quote-unquote debugging happens in healthcare. We have these traces at scale, as, our CEO has even called it, an exhaust that comes out of our product. And when you have these traces, that’s how you can train better agents on certain use cases, whether it’s transcription and diarization or note generation models, and we can do that much cheaper and faster. But we’re also always working with the third-party model providers. We closely collaborate with them, and that’s how we predict where the trends are going. The thing I think about a lot is that the model providers are going to train much more on agentic workflows and so forth, which is great, because you get a better agentic harness. The other interesting thing is that, because a large class of queries to the consumer model providers is healthcare queries, they might optimize by training on a lot of healthcare data to encode that knowledge in the weights. And this is just a great thing for us as well: the off-the-shelf models can keep getting better at general healthcare information. Our strategy is that we have a constellation of models, we can use something for this and something for that, and at the end of the day we only care about the best product experience.
    EHR as File System: Agentic Workflows and Real-Time Interfaces
    Jacob [00:21:07]: And you have overall capabilities improving. I’m curious, as these models get better, is there something you look at where you say, “Three months ago, we really couldn’t do that, but God, the latest models really allow us to do it”?
    Chai [00:21:19]: So here’s something interesting that I’ve been toying with. This wasn’t super obvious a year ago, but now it’s become clearer and clearer that almost every agent is a coding agent under the hood. You give it whatever file system, it can write its own code and so forth. So when you think about healthcare and the use case that we have, you can think of the EHR effectively like a file system. It’s a store of all this information. There’s a lot of information there that cannot fit into the context window, at least of today’s models, and you want to use that context effectively for all these product use cases we’re talking about. So if you have better agents that can manipulate data, read that data and treat it as a file system, which is where we see they’re going and we know model companies are investing this way, then that very directly benefits us.
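One way to picture the “EHR as a file system” idea is a pair of tools an agent could call: list a directory and read a file, so it pulls in only the parts of a record it needs instead of loading everything into the context window. The Python below is a hypothetical sketch; the nested-dictionary layout, the paths, the `list_dir` and `read_file` names, and the placeholder contents are all assumptions made for illustration, not a real EHR schema or API.

```python
# Hypothetical record tree; contents are placeholders, not clinical data.
EHR = {
    "patients": {
        "p001": {
            "labs": {"2024-01-10.txt": "placeholder lab report text"},
            "imaging": {"knee-xray.txt": "placeholder imaging report text"},
            "notes": {"visit-01.txt": "placeholder visit note text"},
        }
    }
}


def _resolve(tree, path):
    """Walk a slash-separated path through the nested dictionaries."""
    node = tree
    for part in [p for p in path.split("/") if p]:
        if not isinstance(node, dict) or part not in node:
            raise FileNotFoundError(path)
        node = node[part]
    return node


def list_dir(tree, path=""):
    """Tool 1: list the entries under a directory-like node."""
    node = _resolve(tree, path)
    if not isinstance(node, dict):
        raise NotADirectoryError(path)
    return sorted(node)


def read_file(tree, path, max_chars=2000):
    """Tool 2: read one document, truncated to protect the context budget."""
    node = _resolve(tree, path)
    if isinstance(node, dict):
        raise IsADirectoryError(path)
    return node[:max_chars]


print(list_dir(EHR, "patients/p001"))
print(read_file(EHR, "patients/p001/labs/2024-01-10.txt"))
```

The `max_chars` cap is the design choice worth noting: the agent decides what to open, and each read is bounded, so a large chart never has to fit into a single prompt.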
    Swyx [00:22:09]: Yeah. Okay, cool. Again, just establishing basic things. But going back to the model stuff, I’m really interested in double-clicking more on the real-time element, which is pretty important for both of you. Is real-time just batches every one minute, every five minutes? Is that how we do it? Or is there something more native, genuinely real-time in the sense that OpenAI has a real-time API or Gemini has a real-time API?
    Chai [00:22:35]: Yeah. So today it is more on a batch basis, but there are interesting prototypes that we have. We’re still not fully voice in, text out in that sense. But can you trigger your models, your agents or agentic workflows, at the right times in the conversation? You can imagine different techniques to bring this latency down, and you want to bring the feedback loop down as much as you can. So there’s a lot of clever engineering there without fully... Maybe one day we’ll do full voice in and text out, train a model to do something like that.
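The difference between fixed batching and triggering at the right moment in the conversation can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: the cue words, the `should_trigger` name and the 60-second fallback are all made up, and crude substring matching stands in for whatever signal a production system would actually use.

```python
# Hypothetical cue words; substring matching is deliberately crude here.
TRIGGER_TERMS = ("mri", "prescribe", "refer")


def should_trigger(segment, seconds_since_last_run, max_wait_s=60.0):
    """Decide whether to run the workflow on the latest transcript segment."""
    text = segment.lower()
    if any(term in text for term in TRIGGER_TERMS):
        return True  # event-driven: a cue appeared, so react right away
    # Fallback: plain batching, run at least once every max_wait_s seconds.
    return seconds_since_last_run >= max_wait_s


print(should_trigger("Let's order an MRI for that knee.", 5.0))
print(should_trigger("How was your weekend?", 5.0))
print(should_trigger("How was your weekend?", 61.0))
```

The event path shortens the feedback loop when something actionable is said, while the timer keeps the batch behavior as a floor.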
    Swyx [00:23:15]: People don’t want voice in, voice out?
    Chai [00:23:18]: Right now we aren’t creating experiences that are, during the conversation, inter... It’s almost like
    Swyx [00:23:25]: Might be too disruptive
    Chai [00:23:26]: Too disruptive until, who knows, maybe eventually you could have full voice agents once we improve the quality and the comfort with the technology. But right now that change is much more gradual, and it’s more text-focused, text out.
    Janie [00:23:42]: So much of what our product is currently trying to do is allow a clinician to focus on their patient. Maybe at some point, but right now patients and clinicians don’t want a third voice, at least a literal voice, in that room. And so how can we be there with all the context and information ready at hand when there’s the right moment?
    Personalization: Individual Doctors, Specialties, and Health Systems
    Jacob [00:24:03]: Janie, one thing I’m curious about is how you think about personalization in the product. I imagine every doctor is a special snowflake in their own way and has their own way they like to do things. There are probably a bunch of different approaches you could take to doing that, both within the model layer itself and also just with clever prompting or engineering. How do you deliver on that?
    Janie [00:24:21]: It’s such a good question. Personalization is massive for us. We think about personalization at three levels. The first is at the individual level, the second is at the specialty level and the third is at the health system or organization level. To your point, there are a lot of individual preferences. When a note is produced, it is almost a reflection, so deeply personal, of a doctor’s work and how they give care. So do they have preferences on things like style? They might want bullets versus paragraphs, really concise versus comprehensive. They might also have phrases that they really like to use, or templates that they want every note to follow. We see it in our feedback all the time: “We want two spaces in between sentences or I refuse to use this tool.” So that’s something that we’ve had to build in. The tricky part is how you make sure that stylistic preferences don’t interfere with accuracy and quality, and that’s something we’ve really had to refine and hone over time. Second is the specialty level. A cardiologist’s note or workflow is going to look very different from a dermatologist’s.
    Jacob [00:25:32]: I assume cardiology notes are the highest stakes for you guys, given your CEO is a cardiologist.
    Jacob [00:25:36]: It’s “Oh my God, make sure we get this one.”
    Janie [00:25:37]: Shiv, our CEO, is still a practicing cardiologist. He rounds once a month. And so, first call when we want just quick and easy user feedback too.
    Janie [00:25:46]: But specialties require a lot of personalization, both in terms of what the product looks like, so we make sure that as new users onboard, we catch that and the product reflects it, and on the back end. Evals at the specialty level are hard-earned to calibrate. What does a really great dermatology note look like? What makes it complete? What makes it compliant and billable is very different than for a primary care doctor. So it’s not just about what the product experience looks like, but about tuning on the back end and really deepening our understanding for the specialists. What does great output look like? That’s a problem we need to calibrate internally, externally, online and offline. It takes lots of cycles but is necessary in a high-stakes environment. And then at the health system level, for products like clinical decision support, you have health systems who’ve spent years or decades refining their best practices, and they want to know, “Hey, we love your clinical decision support product, but how do we embed our own hospital guidelines into it to inform clinicians before, during or after a visit what best practices should look like?” And as you think about deepening moats as well, when health systems trust us with that data and allow us to productize it directly into the clinical workflow, that makes us a really great partner to health systems who want to build something that truly meets their needs and their practice guidelines.
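The three levels of personalization described here (health system, specialty, individual) resemble layered configuration, where more specific layers override more general ones. The Python sketch below is one hypothetical way to express that, including a guard so that stylistic preferences cannot override a setting the health system has locked. The key names, the `LOCKED_KEYS` set and the `resolve_preferences` function are illustrative assumptions, not something described in the conversation.

```python
# Hypothetical settings that individual style choices must not override.
LOCKED_KEYS = {"include_allergies"}


def resolve_preferences(system, specialty, individual):
    """Merge three preference layers; later layers win unless a key is locked."""
    merged = dict(system)
    for layer in (specialty, individual):
        for key, value in layer.items():
            if key in LOCKED_KEYS and key in system:
                continue  # keep the health system's value for locked settings
            merged[key] = value
    return merged


# Illustrative inputs only.
SYSTEM = {"include_allergies": True, "format": "paragraphs"}
SPECIALTY = {"template": "cardiology"}
INDIVIDUAL = {"format": "bullets", "include_allergies": False, "sentence_spacing": 2}

print(resolve_preferences(SYSTEM, SPECIALTY, INDIVIDUAL))
```

Here the clinician’s bullet and spacing preferences apply, while the attempt to switch off the locked setting is ignored.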
    AI Slop, Memory, and Product Data Flywheels
    Chai [00:27:23]: And I want to add onto that. The clinical documentation problem is very similar to AI writing that doesn’t feel like your own, which we call slop. One framing of slop is AI without context. But we have all that context, and the clinicians can both have it and guide it. And so part of the other interesting exhaust for us is memory, which is almost one of these new systems of record.
    Janie [00:27:50]: And we also have all the edits people make in our product. When you think about a data flywheel and how we get better over time, that becomes really powerful as a mechanism for going deeper in personalization.
    Jacob [00:28:04]: It’s interesting. I love this idea of working with systems on the guidelines they’ve built up over a long time. I feel like for so many of the best AI app companies today the question is: how do you take the expertise that a law firm or a bank has built up over many years and then add that as context, and also as a special sauce, on top of an AI tool? And it seems like y’all are really doing that very effectively.
    Janie [00:28:24]: We’re now starting to have our customers ask, “What are other customers doing?”
    Janie [00:28:28]: “And how are they doing it?”
    Janie [00:28:30]: And as we think about having visibility across such a large set of care being delivered right now, that’s a really interesting place we could also partner.
    Swyx [00:28:40]: I’m just curious. This may be a nothing question, but how different are health system guidelines from each other? Don’t they all converge to the same thing? And if not, where do they differ?
    Chai [00:28:52]: At a really high level, they’re going to talk about very similar things, but the difference is probably more in the details. “Oh, you should refer to specialists only when XYZ conditions are met,” and so forth, and maybe different organizations have different practices and guidelines around that. At a high level they’re talking about similar things, but the details are, of course, what shapes the context and the decisions you make.
    Swyx [00:29:15]: And this all goes into the context engine and it might affect the notes but maybe not.
    Chai [00:29:21]: For these local pathways, we’re definitely thinking about it a little more for our clinical decision support product.
    Chai [00:29:26]: So yeah.
    Swyx [00:29:27]: Which is your stuff, yeah.
    Swyx [00:29:28]: And then the memory, which you raised, tell us more about that. What have you tried in memory? What’s the structure of the memory? What works? What doesn’t work?
    Chai [00:29:38]: There are, of course, many different ways you could do memory: can you bake it into the model weights, or can you do it in some external store? For us, what’s interesting is that when the models are rapidly changing, whether in-house or third-party, you worry that baking it into the model weights could be a little throwaway. So you need to find a way to decouple the problem, the preferences, from the underlying models. The thing that’s easiest to start with, and that we’re most excited about right now, is having a separate store for memory, where you have, for example, a memory sub-agent working in the background, figuring out which parts of the clinician’s actions are important to remember for the long term. And then you can also imagine background jobs that run and collate these memories, similar to sleep, of course, and to patterns other products use as well. That means learning over all the action data we have: again, the note edits, the conversations and the actual transcripts.
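A separate memory store with a background consolidation pass, kept independent of any model’s weights, can be sketched as follows. This Python example is a deliberately simple stand-in: the `MemoryStore` class, its method names, the edit labels and the “seen at least three times” rule are all assumptions for illustration, and a counting threshold takes the place of whatever a memory sub-agent would really decide.

```python
from collections import Counter


class MemoryStore:
    """External memory: raw events in, consolidated preferences out."""

    def __init__(self):
        self._events = []    # raw (clinician_id, edit_kind) observations
        self._memories = {}  # clinician_id -> set of remembered edit kinds

    def record_edit(self, clinician_id, edit_kind):
        """Called during normal use whenever a clinician edits a note."""
        self._events.append((clinician_id, edit_kind))

    def consolidate(self, min_count=3):
        """Background job: promote edits repeated at least min_count times."""
        for (clinician_id, edit_kind), n in Counter(self._events).items():
            if n >= min_count:
                self._memories.setdefault(clinician_id, set()).add(edit_kind)
        # Pending events are dropped, including ones below the threshold.
        self._events.clear()

    def recall(self, clinician_id):
        """Return remembered preferences to feed into the next generation."""
        return sorted(self._memories.get(clinician_id, set()))


store = MemoryStore()
for _ in range(3):
    store.record_edit("dr_a", "bullets_to_paragraphs")  # hypothetical label
store.record_edit("dr_a", "remove_greeting")            # seen only once
store.consolidate()
print(store.recall("dr_a"))
```

Because the remembered preferences live outside the model, swapping the underlying model does not discard them, which is the concern raised above about baking memory into weights.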
    Evals: LFD, LLM Judges, and Clinical Safety
    Jacob [00:30:40]: What about evals? How in the world do you... It is such a complex product surface area. We would love to hear you riff on that and also how has that evolved? I’m sure you’ve gotten better at it, so any learnings along the way.
    Janie [00:30:50]: From an evals perspective, from day one when we build any new product or feature, we think about what good looks like. There are table-stakes things like clinical safety, but then you start to get deeper into what good quality looks like. When you go into something like our core product, there’s stuff like style and completeness, and there are things like whether this note can become something billable, which is very high stakes for a health system. We have a number of ways in which we get confidence in this. We have internal, in-house clinicians who do what we call an LFD process to give us our very first pass at whether this is or isn’t a good enough output. Look at the effing data.
    Jacob [00:31:41]: LFD?
    Chai [00:31:42]: That’s why I was smiling. I was “Is Janie going to mention what it stands for?”
    Jacob [00:31:46]: I was not... There are like a million acronyms. How am I supposed to know? So I’m like, “Oh yeah, of course, an LFD.”
    Swyx [00:31:51]: I’ve never heard of LFDs.
    Chai [00:31:53]: It’s Abridge for sure.
    Janie [00:31:55]: I got through three days and then I had to ask someone.
    Janie [00:31:58]: I thought it was just me that didn’t know
    Janie [00:32:01]: It’s our internal process.
    Swyx [00:32:02]: But “look at the data” is a meme in ML, ‘cause you tend to not look at it. You just want to look at number go up.
    Chai [00:32:06]: Exactly.
    Swyx [00:32:07]: But yes.
    Janie [00:32:08]: So we make sure we look at the data, and then, as we think about all of the components of good output, we create LLM judges across all of these, and we make sure, with annotated data and either internal or external evaluators, that these judges are calibrated. Then, depending on the stakes, we also work with in-house and third-party evaluators across all of these before we ship any big change. In terms of evolution, the goal is: how do you take this process from months down to weeks, down to days? Some of it is a true science and ML problem. A lot of it is also just hard operational work. Have you planned ahead in terms of what you need? Have you really optimized the capacity that you need across all of the different specialties? Have you gotten a really good sense of which third parties are great to work with for which use cases? This takes a lot of domain expertise and lots of mistakes in figuring that out. So as much as it is an ML problem, so much of it has also been operational gains that are hugely important, where domain-specific expertise is everything.
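One common way to check whether an LLM judge is "calibrated" against annotated data is to measure its agreement with human labels while correcting for chance. The sketch below uses Cohen's kappa on binary pass/fail labels. The choice of metric and the toy labels are assumptions for illustration; the conversation does not say which statistic the team uses.

```python
def cohens_kappa(judge: list, human: list) -> float:
    """Chance-corrected agreement between two equal-length lists of 0/1 labels."""
    if len(judge) != len(human) or not judge:
        raise ValueError("label lists must be non-empty and the same length")
    n = len(judge)
    observed = sum(j == h for j, h in zip(judge, human)) / n
    p_judge = sum(judge) / n  # share of items the judge marked as passing
    p_human = sum(human) / n  # share of items the clinician marked as passing
    expected = p_judge * p_human + (1 - p_judge) * (1 - p_human)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (observed - expected) / (1 - expected)


# Toy data: 1 = "good enough output", 0 = "not good enough".
judge_labels = [1, 1, 0, 1, 0, 0, 1, 1]
human_labels = [1, 1, 0, 0, 0, 0, 1, 1]
print(cohens_kappa(judge_labels, human_labels))  # 0.75
```

A kappa near 1 suggests the judge can stand in for the annotators on that criterion; a low value is a signal to revise the judge prompt or gather more annotated examples before trusting it.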
    Specialty-Level Evaluation and Progressive Rollouts
    Jacob [00:33:23]: But it’s funny, ‘cause I feel like people talk about healthcare like it’s one giant market and the reality is
    Jacob [00:33:26]: It’s dozens and dozens of sub-markets. And so it feels like in your evals you have to build that up across the board, probably.
    Swyx [00:33:34]: And is specialization the primary cardinality at... That’s the word that comes to mind.
    Janie [00:33:40]: Sometimes, depending on the product or the use case. If we’re making a note improvement or feature for a particular specialty, definitely, but we also have products that are for nurses. We have products that are really aimed at making the document or the output a lot more billable. And so we’ll want to work with coding teams and not necessarily clinicians. And so like
    Jacob [00:34:05]: Coding meaning healthcare coding.
    Janie [00:34:06]: Yes. Yes.
    Jacob [00:34:07]: Not
    Chai [00:34:07]: Yes. I see you.
    Swyx [00:34:07]: Other kinds.
    Janie [00:34:09]: But is this output proportional to the work that was delivered? Is there sufficient documentation to justify the amount that a health system may end up charging? So it’s specialty sometimes, but also domain, and that is very different across all of the different products that we’re working on. Building out that network is not easy, and it is where a lot of our operational investment has gone.
    Chai [00:34:35]: And I see a lot of analogies to self-driving cars here, where part of it is that we really want a progressive rollout of features to test in the real world: is this useful? Is this going to work? One big difference compared to past lives is that before, I’d build a product, maybe I’d alpha it, and then I’d GA it the next week, ‘cause I’m like, “Go, move fast, ship,” and whatnot. Here the mentality is: I want to make contact with reality as quickly as possible, but I want a progressive rollout. Because however large an offline eval set I get, I want its distribution to match the real-life distribution. And over time, by rolling out early, similar to how Waymo has the tagline “The world’s most experienced driver,” another thing that can at least linearly increase for us is the size of our evaluation sets, both offline and online, and it all feeds back.
    Janie [00:35:25]: Something that’s been earned over time, speaking of evolution, is the trust we’ve gotten with customers. Historically, when a lot of these health systems bring on new vendors, their release cycles are quarterly, sometimes twice a year. We’ve gotten our customers onto monthly release cycles, which is pretty fast for health systems. But what’s been more exciting over the last, call it, few quarters is that a subset of our customers have said, “We want to innovate with you. We trust you,” and we have a pretty decent chunk of our customers who say, “We’ll develop with you outside of these monthly release cycles. We have a higher tolerance. We know that the stakes are very high, but we want to be the first ones using these products and giving you feedback.” So for a pretty substantial set of our customers, we’ve been able to ship in this gradual way before GA. Something we talk about a lot internally is that trust is earned in drops and lost in buckets, so we still can’t do what I used to do when I worked at Loom. We had 30 million users. I’d just be rolling out experiments left and right. The bar is still quite high for iterative rollout, but because of the trust we’ve earned, we’re able to learn at pretty high volume very quickly.
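A progressive rollout like the one described here is often implemented with deterministic percentage bucketing, so the set of customers seeing a feature only grows as the percentage is raised. The sketch below is a generic version of that technique, not the team's actual mechanism; `in_rollout`, the feature name, and the customer IDs are hypothetical.

```python
import hashlib


def in_rollout(customer_id: str, feature: str, percent: int) -> bool:
    """Deterministically place a customer in one of 100 buckets per feature."""
    digest = hashlib.sha256(f"{feature}:{customer_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    # Buckets 0..percent-1 are enabled, so raising `percent` never removes anyone.
    return bucket < percent


customers = [f"health-system-{i}" for i in range(200)]
early = {c for c in customers if in_rollout(c, "new-note-feature", 10)}
wider = {c for c in customers if in_rollout(c, "new-note-feature", 50)}
print(len(early), len(wider), early <= wider)
```

In practice the opted-in early adopters mentioned above would more likely be an explicit allowlist layered on top of something like this; hashing simply keeps exposure stable between releases.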
    Privacy, HIPAA, and De-Identification
    Swyx [00:36:45]: Your scale is still pretty huge.
    Swyx [00:36:47]: We were going to go into scale in a sec. One thing I wanted to follow up on with evals, which, again, is just coming from a generalist engineer’s point of view, thinking through what people would be scared of in doing this: the privacy and HIPAA
    Jacob [00:37:00]: Elements of this. I have zero experience in that. What do you have to do? What is surprisingly not that bad?
    Chai [00:37:06]: One thing that’s really important here from a compliance perspective is that any of the data we use needs to be de-identified, meaning any real-world data we use as the basis of the online eval sets we’re learning from. There are very clear government guidelines on what counts as PHI. So we’ve even built models that can take, for example, a clinical transcript and remove all the key PHI indicators, so you have a scrubbed, de-identified version. One thing that’s important is that first you’ve got to get confidence in that model and prove it out, because now you have multiple probabilistic systems on top of each other.
    Chai [00:37:46]: But once you have that, then you can train on it, use it for evaluation and so forth, provided you also have the right data contracting with your partners, which is one of the cool things you can do from the business side.
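To make the scrubbing step concrete, here is a deliberately simplified sketch. The speakers describe trained models for this; the version below swaps in plain regular expressions for three obviously formatted identifier types, purely as an illustration. The patterns, placeholders and sample sentence are assumptions, and a real system would need to cover far more (names, addresses, record numbers and so on).

```python
import re

# Hypothetical patterns for three easily formatted identifier types only.
PATTERNS = [
    ("[SSN]", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("[PHONE]", re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")),
    ("[DATE]", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
]


def scrub(text: str) -> str:
    """Replace matched identifiers with fixed placeholders.

    No mapping back to the original values is kept, so the step is one-way.
    """
    for placeholder, pattern in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


sample = "Seen on 3/14/2024, call back at 555-867-5309."
print(scrub(sample))  # Seen on [DATE], call back at [PHONE].
```

The point about stacked probabilistic systems applies here too: whatever does the scrubbing needs its own evaluation, for example by measuring how many seeded identifiers survive, before anything downstream trains on its output.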
    Jacob [00:37:57]: Is the anonymization one way? Once it’s done, you cannot undo it? Or is there someone
    Chai [00:38:01]: Yes
    Jacob [00:38:02]: Who holds the master key that can... Yeah, okay. So it’s one way.
    Chai [00:38:05]: It’s one way. Yeah.
    Jacob [00:38:06]: That’s how it works. I just wanted to... Because there’s a lot of this learning from feedback and everything that you would want to debug more, but you can’t, because you just physically don’t allow yourself to.
    Janie [00:38:17]: Some of it’s also written in our customer contracts in terms of who can or can’t access PHI data, how long do we retain it,
    Jacob [00:38:27]: Very good
    Janie [00:38:27]: Before it gets de-identified. And so we have a pretty high bar for who can access that PHI data, just to make sure that we always respect our customer data and privacy. But that’s something that we partner with our customers on too, to make sure that, as we want as close to full precision as possible in that quality
    Janie [00:38:48]: We can still use it.
    Jacob [00:38:50]: But it’ll be fascinating to see how that space evolves. Because, you think about it, I used to work at a company that did a lot of healthcare data in the cancer space, and if you asked the average cancer patient, “Hey, do you want other patients to be able to learn-”
    Chai [00:39:03]: Take it.
    Jacob [00:39:03]: “... Learn from your experience?”
    Chai [00:39:04]: Take it all.
    Jacob [00:39:05]: They’re “Please.”
    Jacob [00:39:06]: “I’d love nothing more than for other people to be able to learn from
    Jacob [00:39:10]: The experience that I had.” And so in the past it was a lot harder to do that learning. But with this technology, that might really be practical and so it’ll be fascinating to see how that continues to evolve.
    Chai [00:39:21]: There’s so much in our data set of 100 million conversations.
    Chai [00:39:26]: You can imagine things like insights that you can give to the clinician. How could you have reacted to this? Coaching, or insights around which treatments are effective. Because you have, again, this data source that was never captured before, but that’s where intuition or experience is created from, going back to this idea that the conversation is the source of truth.
    Operating at Scale: Reliability, Cost, and Token Efficiency
    Jacob [00:39:46]: Back to the 100 million conversations, I feel like you have this insane scale that maybe only a few other AI app companies have and everyone else dreams of. Not everyone has had to confront this yet, but maybe just talk about some of the challenges of operating at that scale and what our listeners have to look forward to if they ever get to this level of scale.
    Chai [00:40:05]: At larger and larger scale, of course there’s general infrastructure reliability. In any given startup, you’re building the plane while it’s flying, so there’s some notion of that. But what gets interesting on the AI and ML side is that as you get to more and more scale, one, you have the data to do this in the first place. But you also start thinking about costs and infrastructure in a whole different way at scale versus in a prototype.
    Chai [00:40:34]: You can use the most expensive model, you can burn as many tokens as you want but when you’re doing 100 million conversations
    Jacob [00:40:41]: Token max on leaderboards are less upsetting than that context.
    Chai [00:40:45]: When you’re doing that, it comes down to this: we have the data, and we also have the team that’s able to post-train on it, so you can optimize for efficiency, especially in areas where you believe there’s less quality headroom left and you don’t expect the off-the-shelf models to go that way, such that you want to maximize efficiency in terms of compute and tokens.
    Jacob [00:41:08]: I feel like you guys live in the future in some way where most use cases today are really just in use case discovery mode, where it’s “God, I really hope I can find something that can get to scale,” and so you’re always going to use the most powerful model. And then the few things that do get to this level of scale, you start to do those optimizations.
    Chai [00:41:22]: It’s a natural trajectory where it’s like zero-to-one, we’re not talking about any of these optimizations.
    Chai [00:41:26]: But when maybe we’re in the one-to-100 or so forth, then we’re in optimization mode and, what works out really well is you’ve got all this data from zero-to-one that lets you do this.
    What Comes Next: The Conversation as the Shared Healthcare Platform
    Jacob [00:41:36]: That’s fascinating. I feel like one thing that’s so interesting about the Abridge footprint is that you’re in the doctor-patient visit in real time. I always like to say there’s probably 50 years’ worth of product you could build on top of that. What are each of you most excited about building, in the short term, the medium term or even far down the line?
    Janie [00:41:53]: Something that I get really excited about is that the same conversation can serve so many stakeholders. If you think about the conversation, a doctor needs to know: what is the documentation, and how do I make sure that this fully represents the care I gave? A patient needs to know, “What the heck just happened? This was really overwhelming. What are my next steps?” A payer needs to know: was this the proper and appropriate care? A pharma company might want to know why a drug isn’t being properly used, or whether there is a good candidate for a clinical trial that it’s about to run. Where I get excited is that our product, our platform and our infrastructure can be the same across all of those things. What are today separate, very expensive, complex systems that serve each of these stakeholders in very different ways can start to collapse into a singular platform that enables not just more efficiency across the board but also better outcomes for everyone. All of us experience healthcare in probably very painful ways, and knowing that there is a world in which we can simplify a lot is really exciting to me, and it all starts with the conversation.
    Chai [00:43:15]: It’s interesting. I think of it as very similar to the KPIs that any AI product cares about. How do you increase quality of care? How do you reduce latency to care? And how do you reduce costs? Which is huge in healthcare
    Jacob [00:43:28]: They call it the triple aim in healthcare.
    Chai [00:43:30]: But it’s very similar to building AI products, and the thing that really excites me is the latency piece. We talked about one example earlier, prior authorization: can you reduce the latency to care? But you can imagine so much more. As soon as a lab value gets updated, do you have a background agent that kicks off and uses all the context to say, “Oh, hey, the patient should do this next,” for example? And then flagging that to the clinician, who’s always in the loop, but reducing that latency to care. And then you can imagine, and this is much further down the road, even connecting that directly to the patient and the consumer. So how can you build a bridge to all of these things?
    EHR Partnerships and the Clinical Intelligence Layer
    Jacob [00:44:10]: Very cool. The connections piece is just an ever-growing thing. One of the key partners is the EHR, and I wonder what that relationship is like. Will they look at this as something that is valuable enough that they want to own it someday?
    Janie [00:44:29]: On our partnerships with the EHRs, we know that we have to be extremely close partners with all the EHRs we work with. First, being able to pull and push all of the data into the right places is table stakes; if we can’t do that, health systems don’t want to use us. The second, and the reality of today, is that clinicians spend a lot of their days in the EHR. So much of what allowed us to win in the largest health systems was pretty direct and very close partnerships with some of the largest electronic health records, which allowed us to pull and push data with APIs that weren’t ready out of the box. And clinicians want to save clicks. Anytime we introduce a new product that adds two clicks to their day, they’re like, “We’re not going to use it.”
    Janie [00:45:21]: They have 15-minute back-to-back appointments with their patients. They’re spending hours during pajama time doing documentation. Every second and every minute counts, so we really think about being deeply integrated into the EHR as also table stakes for getting real usage and adoption. And with anything that we build or introduce, we talk a lot internally about “earn the right,” which is that we have to provide so much value or save so much time that people will use us. Those are the two things that are close to us: we know that the product won’t be used unless it is deeply interoperable.
    Chai [00:46:01]: And strategically, to your point, it’s like, what does the EHR want to own versus us? EHRs are really focused on the clinical workflows and so forth, but some of the things that we’re talking about here, I think, are traditionally outside of that domain, like connecting payers and providers together with provider policies, or the clinical trial matching that Janie brought up. So we position ourselves as building this entirely new clinical intelligence layer across, again, providers, pharma and payers.
    Chai [00:46:33]: And so that’s a whole different ballgame that we try to play
    Chai [00:46:36]: In combination with them.
    Jacob [00:46:37]: But it’s like a different layer of scope.
    Healthcare AI Regulation, Technical Depth, and What Changed Their Minds
    Jacob [00:46:39]: I’m curious, you are both relative newcomers to healthcare. There are lots of futuristic healthcare AI takes like, “Oh, everything will look different.” Now that you’ve been in healthcare for a bit and you live at the edge of AI, what have you changed your mind on as you think about what healthcare looks like in 10 or 20 years? Any updates to your mental model from your time being close to the problems?
    Chai [00:47:02]: One thing that I
    Chai [00:47:04]: Was hesitant about before, and it’s a common thing that people ask me about when I’m trying to recruit engineers, is definitely, oh, healthcare is a heavily regulated space. And it is, rightfully so. You want to keep the patients safe at the end of the day. But one of the interesting things that surprised me coming to the company is that there are a lot of really favorable regulatory tailwinds as well. The government really wants interoperability between all these systems that we talked about, so that agents can access this information. And just in January, the FDA released updated guidance on clinical decision support, which is what I work on. They used to have guidance from around 2022 that required you to mention all these options and do all these other things, but the update is written in a very forward-looking way. So for me, what’s been really cool is that there’s this very special moment in AI in general, which we all know, but there’s also a special moment in healthcare regulation.
    Janie [00:48:05]: One thing I would call out is that, for the very reasons things are higher stakes or potentially considered more difficult in healthcare, it’s where some of the hardest AI problems will get solved first, just because the bar is so high. When I first joined, I was like, “Oh, this is where we’ll be on the tail end of where all of the AI innovation gets applied.” But when you think about zero-error evals or multi-step workflows that have really low tolerance, a lot of the innovation will happen here, just because we have to or else we can’t ship.
    Jacob [00:48:42]: ‘Cause like in other domains, you’d much rather just solve the 80%-is-good-enough problems first
    Janie [00:48:46]: 80/20 doesn’t work here
    Chai [00:48:48]: And building off that, traditionally, there was a bit of stigma that, oh, healthcare companies are not that interesting from a technical perspective or I’ve seen that or faced that myself. But these are really hard and fun problems from a pure technical perspective beyond just the impact. How do you bring the latency of this thing down and make it really high-quality?
    Reducing Latency: Clinical Workflows, Agents, and Implementation Reality
    Jacob [00:49:07]: How do you bring the latency of things down?
    Chai [00:49:10]: Yeah. So okay, let’s answer the latency question, hopefully without being too redundant with some of the things I’ve said earlier. With any latency problem, you have to ask what your bottleneck really is. In a lot of workflows, sometimes it’s the model itself. That’s where our data flywheel, our post-training team and so forth come in, so that you can make the models far more efficient. So that’s one aspect of latency. But there are whole other aspects of latency where, on top of that, if you use a constellation of different models, it’s like thinking fast and slow. Can you use a cheap, fast model that triages and hands off to a larger model where you get more intelligence and so forth? So all these
    Chai [00:49:56]: Clever tricks to make it work.
    Chai [00:49:58]: And by the way, we also realize that the Pareto frontier is changing, so these tricks may not get us to where we want to be in five years, but we need them if we want to build a useful product right now.
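The "thinking fast and slow" pattern mentioned above, where a cheap model triages and a larger one handles what it cannot, can be sketched as a simple cascade. The two models below are toy stand-in functions, and the confidence threshold and names (`cascade`, `toy_fast`, `toy_slow`) are assumptions for illustration only.

```python
from typing import Callable, Tuple


def cascade(request: str,
            fast_model: Callable[[str], Tuple[str, float]],
            slow_model: Callable[[str], str],
            threshold: float = 0.8) -> Tuple[str, str]:
    """Try the cheap model first; escalate only when its confidence is low."""
    answer, confidence = fast_model(request)
    if confidence >= threshold:
        return answer, "fast"
    return slow_model(request), "slow"


def toy_fast(request: str) -> Tuple[str, float]:
    # Stand-in for a small model: confident only on short, simple requests.
    if len(request.split()) <= 4:
        return f"fast:{request}", 0.95
    return "unsure", 0.3


def toy_slow(request: str) -> str:
    # Stand-in for a larger, slower, more capable model.
    return f"slow:{request}"


print(cascade("refill request", toy_fast, toy_slow))
print(cascade("patient reports chest pain radiating to left arm", toy_fast, toy_slow))
```

Average latency and cost then depend on how often the fast path is taken, which is why the threshold is something to tune against an eval set rather than guess.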
    Jacob [00:50:11]: Should we go to the quick-fire or you want to ask more about Abridge? We can stuff everything that’s not Abridge into the quick-fire
    Swyx [00:50:16]: I don’t mind. I feel like Janie was on the topic of more long tail stuff, which is
    Swyx [00:50:21]: Not the eighty/twenty thing and that really matters. And if you have any tips or cool stories or just general approaches that have worked for you, that’s interesting to dig into.
    Janie [00:50:32]: One of them is even just how we staff our teams looks different than a traditional software engineering team, I’d say.
    Swyx [00:50:40]: Let’s go.
    Clinician Scientists, Edge Cases, and Evals at Scale
    Janie [00:50:41]: We have a bunch of folks in different roles who are clinicians, and we have this role called the clinician scientist. I heard one of our leaders refer to them as mutants recently. They are people who’ve had clinical backgrounds, so MDs typically, who are also deeply technical, somewhere on the spectrum from a full-stack engineer all the way to an extremely scrappy prompter. Having each of these people embedded within our teams instantly raises the bar for everything that we build, because not only are they determining whether this product is clinically useful, they’re also deeply embedded in our whole evals process. So when we talk about LFDs, when we talk about what our actual evaluation criteria are, you don’t want Chai or me creating those, because we don’t have a clinical background. That is probably unique to Abridge, but it has been game changing. And when you think about where the puck is going, with people who have clinical backgrounds and are technical, and where AI tools are going, they just become
    Janie [00:51:53]: More and more critical, and like the killers of the team. So that’s one. The second is the scale at which we do evals to catch that long tail up front, before anything ever gets into production. That’s something that we’ve really started to fine-tune, both in terms of scale and in terms of when we know we need several hundred versus several thousand offline responses, what helps us make that decision quickly, and how to make this less of an art and as much of a science as possible. But that’s also been something we’ve had to tune over time.
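One textbook way to reason about "several hundred versus several thousand" responses is the normal-approximation sample size for estimating a pass rate, n = z² · p(1 − p) / e², where e is the acceptable margin of error. This is a standard statistics formula offered as a sketch, not a description of how the team actually sizes its evals; the function name and default values are assumptions.

```python
import math


def sample_size(margin: float, p: float = 0.5, z: float = 1.96) -> int:
    """Responses needed to estimate a pass rate within +/- `margin`.

    p=0.5 is the most conservative guess for the true rate;
    z=1.96 corresponds to roughly 95% confidence.
    """
    return math.ceil(z * z * p * (1 - p) / (margin * margin))


print(sample_size(0.05))   # 385: a few hundred for +/- 5 points
print(sample_size(0.015))  # 4269: a few thousand for +/- 1.5 points
```

Tightening the margin roughly threefold costs about ten times as many graded responses, which is one reason the choice gets tied to how high the stakes of a given change are.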
    Swyx [00:52:27]: And you have partners who opted in to give you those evals.
    Janie [00:52:31]: So we work either internally or with third parties for offline evals, and then we have customers who also agree to give us a lot of data, whether it’s thumbs up, thumbs down or “choose this or that,” to get us as close to fully confident as possible.
    Swyx [00:52:51]: The term that comes to mind is
    Swyx [00:52:53]: Like active learning on things where you’re weak. I feel like it’s a lost art
    Swyx [00:52:58]: Is a lot of the polish that comes into doing something like this.
    Janie [00:53:02]: Really.
    Chai [00:53:03]: Hundred percent.
    Lessons from Glean: Technical Foundations and AI App Infrastructure
    Jacob [00:53:04]: Maybe, on a totally unrelated note, Chai, you had a very storied run at Glean before heading over to Abridge. It was one of the early AI app success stories. Reflecting back on that experience, what do you think Glean got most right, and maybe most wrong? Curious for your reflections.
    Chai [00:53:24]: I attribute Glean’s success really to very strong technical foundations that have stood the test of time. It started with a known problem: finding information at work is hard. The best technology at the time was to build really high-quality search. A lot of times enterprise search startups failed because the quality wasn’t great enough, but the learning that people took away from that was, oh, enterprise search is not good enough. Quality really changes the game of whether something can be useful or not. Similarly, people may have concluded, “Oh, Alexa voice assistants are not that useful.” But when you have quality, things can change the game. So Glean’s early foundations, bringing in people who had built search at Google, the best place to have ever built search, being really creative and having a very concrete problem to solve with the right technical backgrounds, laid the foundation for all of its success for the many years to come. And what’s always interesting is figuring out how a company adapts in this changing landscape, as we all know and have talked about many times. For Glean, how you put this context layer to use has been the fun and the challenge of the last few years. You could say that’s been the opportunity for the company as well as the challenge.
    Jacob [00:54:46]: Definitely a competitive market. It feels like one at the epicenter of the foundation models and the hyperscalers, so it’ll be interesting to see how it all plays out.
    Chai [00:54:55]: When you think about it, building something that helps everyone doing knowledge work is a massive opportunity.
    Jacob [00:55:02]: My mental model is always that there are a few markets that the foundation model companies have to win, or that are big enough to go after, and it’s probably like consumer, code and that.
    Jacob [00:55:11]: And so it would definitely be interesting to see how it plays out. One thing we often think about on the investing side is that the pace of progress in models changes so fast, and so the building patterns adjust so fast. It’s always hard to figure out which pieces of the way people are building today, the infrastructure tools they use, are going to prove persistent, versus, okay, six months later we’re doing something completely different because
    Jacob [00:55:31]: Models have improved. I’m curious of the stuff you use today, how do you think about the pieces of AI infrastructure software that feel a little bit more persistent?
    Chai [00:55:40]: Generally, take the thesis that the models are going to be more and more agentic. Before, we had to build a lot of scaffolding around them. In previous gigs, we effectively made our own DSL, because the models were not capable enough, so you needed to simplify things. You can view it as similar to other agent frameworks. But over time, if the models become more and more agentic and can use the same kinds of tools that we already have, like computer use or writing code themselves in a sandbox, it becomes far more about what the right context layers and tools are to give agents. The other thing I think about is how you build truly event-driven, real-time systems, especially at Abridge, where you’re doing something in real time in the conversation. So there’s a lot of event-driven technology. And by the way, the stuff that we’ve always used in the past, whether it’s Kafka, Temporal, sockets and so forth, and how you bring that together, is also durable. Or think about the patterns by which humans collaborate with each other in Google Docs. How do you think about CRDTs and so forth when you have conflicts in multi-agent systems? So the things we’ve built for humans are the things that are going to continue to be durable.
    Jacob [00:56:55]: Just with like 1,000 times the scale, with agents running at them instead.
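For readers unfamiliar with CRDTs (conflict-free replicated data types), the sketch below shows one of the simplest, a grow-only counter. Each agent only increments its own slot, and merging takes the per-agent maximum, so replicas converge regardless of the order in which they sync. It is a generic textbook example, not something described as being used at Abridge; the class and agent names are hypothetical.

```python
class GCounter:
    """Grow-only counter CRDT: one slot per agent, merged by element-wise max."""

    def __init__(self, agent_id: str) -> None:
        self.agent_id = agent_id
        self.counts = {}  # agent_id -> total increments observed from that agent

    def increment(self, n: int = 1) -> None:
        # An agent only ever writes to its own slot.
        self.counts[self.agent_id] = self.counts.get(self.agent_id, 0) + n

    def merge(self, other: "GCounter") -> None:
        # Max is commutative, associative and idempotent, so merges never conflict.
        for agent, count in other.counts.items():
            self.counts[agent] = max(self.counts.get(agent, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())


a, b = GCounter("agent-a"), GCounter("agent-b")
a.increment(2)   # agent-a works offline
b.increment(3)   # agent-b works concurrently
a.merge(b)
b.merge(a)
b.merge(a)       # repeating a merge changes nothing
print(a.value(), b.value())  # 5 5
```

Document editors use richer CRDTs for text, but the property that matters for multi-agent systems is the same: concurrent writers can reconcile without a central lock.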
    Jacob [00:56:58]: They’re going to really work.
    Chai [00:56:58]: So make sure that they scale, of course and fast and whatnot. Without a doubt, yes.
    How Agentic Does Abridge Become?
    Swyx [00:57:03]: Does Abridge become more agentic over time? What does the next, more agentic version of that look like?
    Swyx [00:57:10]: ‘Cause you’re already pretty proactive, with like the notifications.
    Chai [00:57:15]: And so I view that as a piece of being agentic, but I also view it as maybe some of the things we mentioned before, like reacting to labs, or doing work in the background, or doing
    Chai [00:57:25]: Even more capabilities on behalf of the clinician, who we believe has a super important role to play in terms of patient connection and so forth.
    What They Changed Their Minds On: PRDs, Prototypes, and Judgment
    Jacob [00:57:34]: I’m curious for both of you, what’s one thing you’ve changed your mind on in AI in the past year?
    Janie [00:57:39]: The one I flip-flopped on, and this is much more product specific, and probably the hotter take, is that prototypes are the end-all be-all and that PRDs are dead.
    Janie [00:57:51]: We’ve tried switching, and we continue to evolve the way product is developed. The products that we’re building are extremely complicated and nuanced, and it is very difficult for a prototype to capture the full complexity of what we can or can’t do with this data. Is this actually the right problem to be solving in a world where software has become so cheap? Yes, this is a cool-looking prototype, but should we be spending any of our precious hours here? If so, why? How does this deepen our moat in a world of decreasing moats? Does this require custom implementation from our customer to use? None of that gets captured in a prototype. So we’re continuously evolving the way that we develop product here, but even if it’s not written in the same traditional ways as it was two years ago, as a team we’ve gotten pretty high conviction that in a world of so much noise, crisp written clarity is more important than ever. It might now live in a markdown file that more teams and systems can use as context, but that’s probably one that is much more
    Swyx [00:59:06]: So you’re
    Janie [00:59:06]: Function specific to me.
    Jacob [00:59:08]: I love that.
    Swyx [00:59:09]: You’re disagreeing with the consensus
    Janie [00:59:10]: That PRDs are dead
    Swyx [00:59:11]: That’s great, yeah.
    Swyx [00:59:12]: So you are like
    Janie [00:59:14]: That prototypes are the thing.
    Janie [00:59:14]: We should partner with AI to create great documentation but first, probably most important, is strategically answering like why is this problem the one our company and our product should solve? What happens if the next 20 competitors build this? Why, what is our right to win and does this help us differentiate in any way or are we just adding noise? It’s important
    Swyx [00:59:39]: That’s a high bar. I don’t know if I could answer that
    Swyx [00:59:41]: Because a lot of the times the answer is let’s do it first.
    Janie [00:59:44]: And when the cost of doing it first is so expensive, and we just talked through the process of getting something out to customers, you need to have a higher bar for whether, as a business, we should invest here. As all of our roles evolve, part of product’s job, or really all of our jobs, becomes asking: should we do this thing? That is worth spending the time on up front. And then, as you think about prototypes, it’s still really valuable to quickly show, “Here are the 20 ways we could do it. Clinician, I would love your feedback: which one resonates more?” Or as you get into deeper fidelity, you can also make the prototypes higher fidelity and get them as close to production ready as possible. But beyond that, to get it out to customers, there are a lot of implementation details, security, compliance and edge cases, things that never get caught in a prototype, that need to be written out somewhere. So PRDs look different, but they’re still more important than ever.
    Jacob [01:00:52]: It’s interesting. I imagine a lot of that also is like given the context of the stage that Abridge is at.
    Jacob [01:00:58]: I feel like for so many early stage companies, it’s just a desperate race to... You throw like 30 things at the wall, you’re like, “Please, something just resonate with my end buyer,” and you find something, and that’s why the prototype-first approach is so powerful. But for you all, it’s like anything you’re going to do is across 200 systems, there’s like a whole implementation and change management side of things, and you get a few big bullets to fire at what you want those systems to do. And so being really thoughtful about that.
    Chai [01:01:25]: It makes a ton of sense and maybe the prototype first takes will all grow into your view of the world when they’re a bit more scaled.
    Janie [01:01:32]: The weekend demo versus “it works at the largest health systems” is a massive gap. I don’t think it means we can’t go fast. This is the fastest I’ve built in my career right now, and the
    Chai [01:01:47]: Compared to Loom?
    Janie [01:01:48]: From the complexity and the scale of the products we’re trying to build and the problems we’re trying to solve, I’d say yes. Maybe I updated a flow or shipped a new feature pretty quickly, but if you think about some of the products we’re building, we’re trying to collapse prior authorization, things that used to take 45 days across maybe 20 different touch points, into one. I’m building faster than I ever have, and so the thoughtfulness allows us just to go fast at the right things. It sounds contradictory but that
    Chai [01:02:28]: No
    Janie [01:02:28]: Thought up front
    Chai [01:02:28]: Go slow to go fast.
    Janie [01:02:29]: Exactly.
    Chai [01:02:30]: It’s interesting. In the... When a lot of things are changing and in the AI discourse, sometimes we lose sight of things that always stood the test of time. Judgment and clarity always matter. As an engineer, sometimes I don’t want a prototype. I would like to see... I want the written, the clarity that comes from writing, and then we build that. And again, for some things, of course, where it’s a small thing, yeah, just ship the prototype. Don’t sweat the details. So the interesting thing, the nuance that gets lost sometimes in discussion, is that sometimes we need to recalibrate our judgment, for sure, because the costs and gains have changed, but that doesn’t mean we go all the way to one end of the spectrum or the other.
    AI Tools, Claude Code, and Closing Notes
    Chai [01:03:11]: Outside of your specific tool, I always like to ask this question, any other AI tools that you guys are enjoying?
    Chai [01:03:16]: Claude Code. But, that feels, too basic of an answer.
    Chai [01:03:20]: Is all of Abridge engineering very built on Claude Code?
    Chai [01:03:23]: Yes.
    Chai [01:03:23]: Wow.
    Chai [01:03:23]: Very much so. I won’t
    Chai [01:03:26]: We also have Cursor as well.
    Chai [01:03:28]: Many of the
    Chai [01:03:29]: I’m just checking the boxes here.
    Chai [01:03:30]: Many of the tools are available, but it’s like, you look at just earlier in the day, you see an engineer’s screen. You see six different Claudes running at it. Sometimes the same person, I’ve seen them on the sofa now with the remote control as well on the mobile. But, very much so. One of the interesting things for me is, as a relatively new person to companies, Claude Code helps me onboard much faster, or any of these AI code... And I feel like I learn so much. I do love the memes of “Claude’s going to do this.” So, I’d like to see Claude,
    Chai [01:04:00]: The venture equivalent is “I’d like to see Claude go do a company at a billion dollars pre-revenue.” Like
    Where to Learn More: Whitepapers, Research, and AbridgeHQ
    Chai [01:04:06]: We always like to leave the last word in these conversations to you both. And so, any place you want to point folks where they can go learn more about Abridge, the work you’re doing, any of the research you guys have done, whatever. The floor is yours.
    Chai [01:04:18]: A couple places. If you... On our Abridge website, we have a lot of our whitepapers where we’ve done a lot of interesting work, such as, reducing a hallucination objection.
    Chai [01:04:27]: Very well-presented, by the way. I liked it. Yeah.
    Chai [01:04:29]: Thank you. Our science team rigorously defined what the problem is. And one of the interesting things, by the way, at Abridge, is we have multiple stats professors on staff as well. So in that specific whitepaper, Michael Oberst, who’s a professor at JHU. And so we have multiple... And from that comes very high rigor, and then also our taste for design comes from really good presentation. But setting that aside, and we’re going to have many more technical topics there, please follow our Twitter account as well, AbridgeHQ. And then the other thing I’ll plug a little is, we have an open house diving deep into AI and healthcare coming up with Andreessen Horowitz.
    Chai [01:05:07]: Amazing. Well, thanks so much.
    Janie [01:05:09]: Thanks.
    Chai [01:05:09]: This was super fun.
    Chai [01:05:10]: Thanks so much.
    Chai [01:05:10]: Thank you.


    This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.latent.space/subscribe
  • Latent Space: The AI Engineer Podcast

    🔬Doing Vibe Physics — Alex Lupsasca, OpenAI

    05/05/2026 | 1h 31min
    Some people are going crazy over GPT 5.5. Some people. This is the story of the Jagged Frontier. People who use AI to write emails or even code implementation work find the lift moderate whereas people pushing the limits of the model are figuring out that the limits just moved outwards.
    Alex Lupsasca has been tracking this limit for a year and a half now. “When GPT5 came out, it was able to reproduce one of my best papers (that took a very long time to come up with) in 30 minutes.”
    But Alex also notes that this shift was mostly invisible.
    I remember when GPT-5 came out… on Twitter, the reception was lukewarm. A lot of people were like, well, we expected a lot more, and it’s not better at writing email. And I remember thinking, well, okay, GPT-3 could write email. How much better can it get at writing email? That’s not the point. But at the science frontier, the capabilities were really taking off.
    We walk through his paper and more with him in today’s Science pod! Watch here.

    The “Oscar for physics”
    Alex made an early splash in his career with breakthroughs in our understanding of black holes. He’s also known for Black Hole Explorer and an iPhone app that makes visualizing black holes fun and interactive to regular audiences. Alex won the 2024 New Horizons in Fundamental Physics Breakthrough Prize. Known as the “Oscar for physics” this is arguably the most prestigious prize an early stage theoretical physicist can win.
    Alex first saw promise for AI in theoretical physics after he asked o3 for help on his research. In the podcast, Alex recalls asking GPT for help with a calculation that would have taken days, and getting a result in eleven minutes.
    He immediately recognized how impactful AI would be for his work, even though his physicist colleagues and the larger community gave it a lukewarm or skeptical reception.

    The Move 37 Moment for AI x Physics
    GPT-5 had just been released, and Alex tried asking it to solve a problem in a just-published paper. GPT-5 returned no answer. But Mark Chen, CRO of OpenAI, pushed a bit harder and had Alex prime the model with a textbook warmup problem, which it easily solved. After this “priming” trick, GPT-5 was able to reproduce his full result in eleven minutes (yes, the paper was released after the model’s training cutoff).
    “This changes everything.” Alex notes that we seem to be on the edge of a massive change in theoretical physics reasoning. A year prior, LLMs were just starting to do correct math. Now ChatGPT could reproduce his hardest paper in the time it takes to get a coffee.
    Alex was on sabbatical at Vanderbilt, and he joined OpenAI to start pushing the boundary of AI’s ability to accelerate physics.

    “AI solved the problem before the plane landed”
    Alex began to put GPT through its paces, reaching out to colleagues for problems they were stuck on. His old PhD advisor (Prof. Andrew Strominger at Harvard) had an insight about certain physical quantities known as “single-minus gluon tree amplitudes”.
    In certain cases, these amplitudes may be nonzero where they were previously thought to always vanish. The team pushed this intuition forward and came up with a formula for these quantities that appeared nonzero, but which was otherwise completely intractable.

    The team spent over a year on this problem without making real progress.
    Prof. Strominger planned to visit OpenAI to work on the problem the week after the initial conversation started. In that one week ChatGPT fully solved the problem, as Alex recalled, before Prof. Strominger’s plane even landed.
    What was interesting is not only that ChatGPT solved this problem, but how it solved it. The model quickly found a limiting case (known as the “half-collinear regime”) that in hindsight has a nice intuitive explanation. Taking this limit, the gnarly results collapsed down to a simple and intuitive formula!
    The last step was to prove this intuitive formula. The team started with a fresh session, gave a prompt with the context of what they previously learned, and let the model loose. Not only was ChatGPT able to reproduce the previous result, it was able to prove it using a technique unknown to the authors!

    The Vibe Physics moment
    With a concrete success in the bag, the team asked if they could generate new physics from scratch using ChatGPT. They took on what they felt to be a harder problem, looking at the graviton, a proposed particle that should appear when one combines gravity and quantum mechanics. They wrote up a simple prompt asking ChatGPT to perform the same research as the gluon paper but instead for gravitons. And then hit go!
    What came next was truly “vibe physics”, with ChatGPT pushing out 110 pages of novel physics, new calculations, and novel techniques. This was over the course of a day, with most interactions following the now-familiar pattern for anyone who uses a coding agent:
    GPT: Here's your [result].
    Would you like me to do [next step]?
    Alex: Yes, please do!
    GPT:
    And for those who look deeply, this really was not just a direct 1-1 mapping between gluons and gravitons. ChatGPT imported new techniques that were necessary due to the nature of gravitons, and used them flawlessly.
    They spent the next three weeks verifying all the results. And voila! A new paper featuring novel results in quantum gravity, generated in less than three days total. Truly a “Feel the AGI” moment.

    For those interested, there’s a blog post with the full transcript from initial prompt to final paper. Even if you know no physics, it’s crazy seeing pages of correct calculations fall out of simple prompts such as “Yes calculate outside of SD first. This is the first step.”

    Out-of-domain = new knowledge
    The thing that is qualitatively different between Vibe Physics and Vibe Coding is that Vibe Physics means actually extending the frontier of human knowledge. Looking at the Gluon and Graviton results, they seem in retrospect, like many results in physics and math, like natural extensions of what we already know. This is in fact part of what makes them beautiful. But this was a problem that stumped experts in the domain for a year. Although it does still have a bit of a recombinant flavor, this thing has never been done before.
    It may be that there are still large classes of problems that AI won’t do well on, and approaches that an AI might not think to take. This is the “taste” that everyone has been talking about. Alex told us that these capabilities, however, allow him to explore many possible avenues in order to map out much more ambitious problems to tackle. With AI able to output results basically as fast as we can conceive and validate them, the scope of what one theorist can hope to achieve has just gotten a lot, lot bigger.


  • Latent Space: The AI Engineer Podcast

    Physical AI that Moves the World — Qasar Younis & Peter Ludwig, Applied Intuition

    27/04/2026 | 1h 12min
    From building Applied Intuition from YC-era autonomy tooling into a $15B physical AI company, Qasar Younis and Peter Ludwig have spent the last decade living through the full arc of autonomy: from simulation and data infrastructure for robotaxi companies, to operating systems for safety-critical machines, to deploying AI onto cars, trucks, mining equipment, construction vehicles, agriculture, defense systems, and driverless L4 trucks running in Japan today. They join us to explain why “physical AI” is not just LLMs on wheels, why the real bottleneck is no longer model intelligence but deployment onto constrained hardware, and why the future of autonomy may look less like one-off demos and more like Android for every moving machine.
    We discuss:
    * Applied Intuition’s mission: building physical AI for a safer, more prosperous world, powering cars, trucks, construction and mining equipment, agriculture, defense, and other moving machines
    * Why physical AI is different from screen-based AI: learned systems can make mistakes in chat or coding, but safety-critical machines like driverless trucks, autonomous vehicles, and robots need much higher reliability
    * The evolution from autonomy tooling to a broad physical AI platform: starting with simulation and data infrastructure for robotaxi companies, then expanding into 30+ products across simulation, operating systems, autonomy, and AI models
    * Why tooling companies came back into fashion: Qasar on why developer tooling looked unfashionable in 2016, why Applied Intuition still bet on it, and how the AI boom made workflows and tools central again
    * The three core buckets of Applied Intuition’s technology: simulation and RL infrastructure, true operating systems for vehicles and machines, and fundamental AI models for autonomy and world understanding
    * Why vehicles need a real AI operating system: real-time control, sensor streaming, latency, memory management, fail-safes, reliable updates, and why “bricking a car” is much worse than bricking an iPad
    * Physical machines as “phones before Android and iOS”: Peter explains why today’s vehicle and machine software stack is fragmented across many operating systems, and why Applied Intuition wants to consolidate the platform layer
    * Coding agents inside Applied Intuition: Cursor, Claude Code, internal adoption leaderboards, and how AI tools are changing engineering workflows even in embedded systems and safety-critical software
    * Verification and validation for physical AI: why evals get harder as models improve, how end-to-end autonomy changes simulation requirements, and why neural simulation has to be fast and cheap enough to make RL practical
    * From deterministic tests to statistical safety: why autonomy validation is shifting from binary pass/fail requirements toward “how many nines” of reliability and mean time between failures
    * Cruise, Waymo, and public trust: Qasar and Peter discuss why autonomy failures are not just technical issues, how companies interact with regulators, and why Waymo is setting a high bar for the industry
    * Simulation vs. reality: why no simulator perfectly represents the real world, how sim-to-real validation works, and why real-world testing will never disappear
    * World models for physical AI: hydroplaning, construction equipment, visual cues, cause-and-effect learning, and where world models help versus where they are not enough
    * Onboard vs. offboard AI: why data-center models can be huge and slow, but onboard vehicle models need millisecond-level latency, low power, small size, and distillation-like efficiency
    * Why physical AI is not constrained by model intelligence alone: the hard part is deploying models onto real hardware, under safety, latency, power, cost, and reliability constraints
    * Legacy autonomy vs. intelligent autonomy: RTK GPS in mining and agriculture, why hand-coded path-following worked for decades, and why modern systems need perception and dynamic intelligence
    * Planning for physical systems: how “plan mode” applies to robotaxis, mining, defense, and multi-step physical tasks where actions change the state of the world
    * Why robotics demos are not production: the brittle last 1%, humanoid reliability, DARPA Grand Challenge-style prize policy, and the advanced engineering gap between research and deployment
    * Applied Intuition’s hard-earned lessons: after nearly a decade, Peter says they can look at a robotics demo and predict the next 20 problems the company will hit
    * Qasar’s advice to founders: constrain the commercial problem, avoid copying mature-company strategies too early, and remember that compounding technology only matters if you survive long enough to see it compound
    * Why 2014 YC advice may not apply in 2026: capital markets, AI company dynamics, and the difference between building in stealth with a deep network versus building as a new founder today
    * What Applied is hiring for: operating systems, autonomy, dev tooling, model performance, evals, safety-critical systems, hardware/software boundaries, and engineers with deep curiosity about how things work
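
The “how many nines” and mean-time-between-failures framing in the list above can be made concrete with a little arithmetic. The sketch below uses the conventional definitions (N nines means a failure probability of 10^-N per trial; MTBF is operating time divided by failure count). The function names are hypothetical, and nothing here is claimed to be how Applied Intuition actually computes its safety metrics.

```python
import math


def failure_probability(nines: float) -> float:
    """Per-trial failure probability for `nines` nines of reliability (3 -> 99.9%)."""
    return 10.0 ** (-nines)


def nines_from_failures(failures: int, trials: int) -> float:
    """Observed number of nines, given `failures` failed trials out of `trials`."""
    if failures <= 0 or trials <= 0 or failures > trials:
        raise ValueError("need 0 < failures <= trials")
    return -math.log10(failures / trials)


def mtbf_hours(operating_hours: float, failures: int) -> float:
    """Mean time between failures: total operating hours divided by failure count."""
    if failures <= 0:
        raise ValueError("MTBF is undefined without at least one observed failure")
    return operating_hours / failures


if __name__ == "__main__":
    # Illustrative inputs only, not real fleet data.
    print(failure_probability(5))
    print(nines_from_failures(2, 1_000_000))
    print(mtbf_hours(10_000.0, 4))
```

Unlike a binary pass/fail requirement, each of these quantities is an estimate that tightens as more trials or operating hours accumulate.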
    Applied Intuition:
    * YouTube: https://www.youtube.com/@AppliedIntuitionInc
    * X: https://x.com/AppliedInt
    * LinkedIn: https://www.linkedin.com/company/applied-intuition-inc
    Qasar Younis:
    * X: https://x.com/qasar
    * LinkedIn: https://www.linkedin.com/in/qasar/
    Peter Ludwig:
    * LinkedIn: https://www.linkedin.com/in/peterwludwig/
    Timestamps
    00:00:00 Introduction: Applied Intuition, Physical AI, and 10 Years of Building
    00:01:37 Physical AI vs. Screen AI: Why Safety-Critical Changes Everything
    00:02:51 The Origin Story: Tooling, YC, and the Scale AI Comparison
    00:05:41 The Three Buckets: Simulation, Operating Systems, and Autonomy Models
    00:11:10 Hardware, Sensors, and the LiDAR Question
    00:14:26 The Operating System Layer: Why Vehicles Are Like Pre-Android Phones
    00:19:13 Customers, Licensing, and the Better-Together Stack
    00:21:19 AI Coding Adoption: Cursor, Claude Code, and the Bimodal Engineer
    00:26:41 Verifiable Rewards, Evals, and Neural Simulation
    00:31:04 Statistical Validation, Regulators, and the Cruise Lesson
    00:40:25 World Models, Hydroplaning, and Cause-Effect Learning
    00:43:34 Onboard vs. Offboard: Latency, Embedded ML, and Distillation
    00:50:57 Plan Mode for Physical Systems and Next-Token Prediction Universally
    00:53:04 Productionization: The 20 Problems Every Robotics Demo Will Hit
    00:58:00 Founder Advice: Constraints, Compounding Tech, and Mature-Company Mimicry
    01:05:41 Hiring Philosophy: Hardware/Software Boundary and Engineering Mindset
    01:08:50 General Motors Institute, Education, and the Curiosity Mindset
    Transcript
    Introduction: Applied Intuition, Physical AI, and 10 Years of Building
    Alessio [00:00:00]: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, founder of Kernel Labs, and I’m joined by Swyx, editor of Latent Space.
    Swyx [00:00:10]: And today we’re very honored to have the founders of Applied Intuition, Qasar and Peter. Welcome.
    Qasar [00:00:17]: You guys really know how to turn it on to podcast mode. That was, you guys are real pros at this.
    Qasar [00:00:23]: They were just joking around right before this, and then they flipped it pretty quick.
    Alessio [00:00:29]: Oh, yeah, it’s good to have you guys. Maybe you just wanna introduce yourself so people know the voice on the mic and they’ll know what they’re hearing.
    Peter [00:00:33]: Oh, sure. Yeah, I’m Peter Ludwig. I’m the co-founder and CTO of Applied Intuition.
    Qasar [00:00:38]: And my name is Qasar Younis. I am the CEO and co-founder with Peter.
    Alessio [00:00:42]: Nice. Can you guys give the high-level overview of what Applied Intuition is? And I was reading through some of the Congress files, when you went out there, Peter, and eighteen of the top twenty global non-Chinese automakers, you two guys, you have customers in agriculture, defense, construction. I think most people have heard of Applied Intuition tied to YC when it was first started, and then you were kinda in stealth for a long time, so maybe just give people the high-level overview of what it is today, and then we’ll dive into the different pieces.
    Peter [00:01:10]: Yeah. So at Applied Intuition, our mission is to build physical AI for a safer, more prosperous world. And so we work on physical AI for all different types of moving systems, everything from cars to trucks to construction and mining equipment, to defense technologies. And we’re a true technology company, so we build and sell the technology, and we sell it to the companies that make the machines. We sell it to the government, really anyone that wants to buy a technology to make machines smart.
    Physical AI vs. Screen AI: Why Safety-Critical Changes Everything
    Qasar [00:01:38]: Yeah. And I think in the broader AI landscape, a lot of the focus, rightfully so, in the last three years has been on large language models, and so everything fits in a screen, whether it’s code-complete products or things like that. And what’s different about us is we’re deploying intelligence onto a lot of things that don’t have screens. They’re physical machines. There are sometimes screens within the cabin of, for example, a car or a truck or something like that, but most of the value we provide is putting intelligence into safety-critical environments. Those two words are really important, because learned systems can make mistakes if you’re asking for something like, “Tell me about these podcast hosts
    Qasar [00:02:28]: that I’m about to go meet.” But you can’t do that, obviously, when you run... As an example, we run driverless trucks in Japan right now, as we speak. We can’t have errors. Those are L4 trucks. Yeah.
    Alessio [00:02:40]: Yeah. Was that always the mission? I remember initially, I think people put you and Scale AI very similarly for some things about being kinda like on the data infrastructure side of things. What was the evolution of the company?
    The Origin Story: Tooling, YC, and the Scale AI Comparison
    Peter [00:02:51]: Well, from the very beginning, we always wanted to really be a technology company that helped generally push forward the industrial sector. And so we started off working in autonomy. Our very first customers were robotaxi companies. And we started off doing a lot of work in simulation and data infrastructure. And then over the years, we’ve expanded our portfolio. Now we have over thirty products, and it’s a pretty broad technology play within the landscape of physical AI.
    Qasar [00:03:19]: Yeah, I think the Scale reason is because we’re all YC Universe companies. But it was a very different company. Scale was, is, more of a services company, a data labeling company fundamentally. We started with, and still do, a lot of tooling. So, like, you think developer tooling is now in vogue again, thanks to the AI boom. But honestly, ten years ago, it was out of vogue. Doing a tooling company in 2016, 2017 was not, like, the thing to do because, I don’t know if you remember, the VCs’ general view was that tools are just workflows, and workflows ultimately are not really interesting. And we’ve come full circle with that. But when we started the company, our kind of... It’s kinda like in the periphery of what the company wants to be. It was like, from our earliest days, we wanna deploy software on physical machines, like on cars and on trucks and things like that. And obviously, we didn’t know that the transformer boom was gonna happen. We didn’t know that autonomy systems would become end-to-end. Those things we didn’t know. And why that’s important: when autonomy systems become end-to-end, those models can now be generalized to multiple form factors. And so back nine, ten years ago, tooling was a great way, and still is a great way, to build the technology and sell technology to our end customers, a lot of whom wanna build this stuff themselves. And so we just offer, like, a spectrum of solutions, from using just one part of a development suite of tools all the way to buying the full thing. The way to think about the company, or at least the way we think about the company, is, as Peter said, a technology provider. It’s kinda like what NVIDIA does or what an AMD does, but we just don’t do chips.
    Qasar [00:05:06]: We don’t do silicon. But we’re a technology provider fundamentally. And I think even, we used to joke when we started the company, like, we’re not the guys to build, like, Instagram. Like that was just... That’s not our... That’s just not us in a most fundamental way. I
    Alessio [00:05:20]: You have thoughts.
    Qasar [00:05:21]: Yes.
    Qasar [00:05:22]: Well, it’s... I mean, I think it’s just like... And I mean, we worked on Maps and stuff, Google Maps. Consumer products are extremely difficult for a lot of different reasons. It just, I think, doesn’t scratch the itch. I think we’re like Michigan guys who are kind of more of that traditional engineering kind of a realm, or lineage. We used to joke
    The Three Buckets: Simulation, Operating Systems, and Autonomy Models
    Peter [00:05:41]: I gotta say, though, what was clear ten years ago was that there was so much more that was possible with software and AI in vehicles
    Peter [00:05:47]: and that was generally the space that we started in ten years ago.
    Peter [00:05:51]: And the precise path that we’ve taken over the years, I think we’ve been strategic, and we’ve adjusted to make sure that we’re actually building stuff that’s valuable to the market. And like, the technology has changed so much. Like our own technology stack has completely changed, I would say, roughly every two years. And so now we’ve probably done, let’s say, four complete evolutions of our own technology stack. And I sort of see that cadence roughly keeping up.
    Peter [00:06:13]: And so the way even we think about engineering is almost on this two-year horizon, we’re preparing ourselves that, hey, like, we wanna invest the appropriate amount, but then also be very dynamic as the research gets published and as our research team figures out new advancements and adapting to that.
    Qasar [00:06:27]: Yeah. One thing that has been consistent is the type of people we’ve recruited. It’s engineers who fall into the sometimes very traditional, like, Google
    Qasar [00:06:38]: -gen suite, but way different from other companies. We are hiring folks who really know the intersection of hardware and software, who know really low-level systems. Obviously, traditional ML researchers and folks who’ve actually put ML systems into production. That’s been pretty consistent. I think that, like, you look at the mix of our engineering, eighty-three percent of the company is engineering, so it’s, like, a giant list.
    Qasar [00:07:05]: A lot of engineers.
    Alessio [00:07:06]: Which, by the way, a thousand engineers
    Qasar [00:07:07]: Yeah. A thousand engineers.
    Alessio [00:07:08]: that’s on your website, so I imagine it’s up to date.
    Qasar [00:07:11]: It is, it is up to date, yes. Yes.
    Alessio [00:07:12]: Okay. And then forty-plus founders.
    Qasar [00:07:15]: Yeah. We would tend to also... This was more luck than strategy. But we’ve recruited a lot of ex-founders. It’s been a great place for founders, YC and non, ‘cause obviously I know a lot of the YC folks. It’s kind of like we recruit a lot of Google people.
    Qasar [00:07:33]: For them to exercise both their technical and non-technical skills, because we’re on the applied side. We have a research team, we do fundamental research, we publish, and we’ve had great traction there. But fundamentally, the business wants to take this intelligence and deploy it into production, and there’s, like, a certain type of person that’s more interested in that.
    Alessio [00:07:54]: Yeah. You mentioned the tech stack, Peter, so I just wanted to give you some rein to just go into it. I’m interested in where Applied Intuition starts and ends. In some sense, what won’t you do? What do you do that’s common among all the verticals that you cover?
    Peter [00:08:10]: There’s a few buckets of work that we do, and we’ve been at this for almost ten years now, so the technology’s pretty broad. But we got started
    Qasar [00:08:17]: Yeah, with a thousand engineers, like, you could work on lots of things.
    Peter [00:08:19]: There’s lots of stuff, yeah, especially with AI tools to help.
    Peter [00:08:22]: So we got our start in simulation and simulation tooling and infrastructure. And so generally, if you’re trying to build a very complex software system that involves moving machines, you need to test that, and the best way to test it is a combination of virtual development in simulation and then also, obviously, real-world testing.
    Peter [00:08:39]: And then there’s a very careful process of that correlation between the simulation results and the real world results and ensuring that the simulator is in fact accurate to that. Simulation’s a very deep topic.
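
As a rough illustration of the correlation step Peter describes, one simple check is to run the same scenarios in simulation and in the real world, then compare a shared metric across the pairs. The sketch below computes a Pearson correlation and a mean absolute error in plain Python. The function names and the sample numbers are made up for illustration, and this is not a description of Applied Intuition's actual validation process.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between paired simulated and real-world measurements."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


def mean_abs_error(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Average absolute gap between simulated and real values for the same scenarios."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two equal-length, non-empty sequences")
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    # Hypothetical braking distances (meters) for four matched scenarios.
    simulated = [20.1, 25.3, 30.2, 35.0]
    real_world = [20.5, 25.0, 31.0, 35.8]
    print(pearson(simulated, real_world))
    print(mean_abs_error(simulated, real_world))
```

A high correlation with a large absolute error would suggest the simulator tracks trends but is biased, which is why both numbers are worth looking at together.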
    Peter [00:08:49]: We have a whole suite of products in that, and we could talk for many hours about that specifically. But that is one part of what we do as a company. Reinforcement learning as a subpart of that is also super critical. I think a lot of the best advancements happening in a lot of these AI systems right now in some way relate to reinforcement learning, and now we have lots of compute, and you can do tons of interesting things with reinforcement learning. The second bucket of work that we do is on operating systems technology. True operating systems. Like, think about schedulers and memory management and middleware and message passing and highly reliable networking and data links. Like, the reality is, if you want to deploy AI onto vehicles, you need a really good operating system. And when we were getting deeper into that space, there wasn’t really anything that we were happy with.
    Peter [00:09:39]: Like, things existed, absolutely, and we were using what was available in the market, and as an engineering organization, we roughly realized these things aren’t great. We think we can do this better, and so let’s build something. And that was the moment of inspiration that started our operating systems business, which is now a very real business for us. And in order to write and run great AI, you need a great operating system, and so that’s what got us into that. And then the third bucket that we work on, it’s true fundamental AI technology. Models. We do a lot of work in, as mentioned, the foundational research, but then also the world models and the actual autonomy models that are running on these physical machines, and that’s across cars, trucks, mining, construction, agriculture, and defense, and so that’s land, air, and sea.
    Qasar [00:10:31]: And also, a smaller subsector of that third bucket is the interaction of humans with those machines.
    Qasar [00:10:38]: So that’s a multimodal experience. Historically, if you’re moving a dirt mover or any of these machines, there are buttons you press, whether they’re actual physical tactile buttons or something like a touch screen. That fundamentally is changing to where you’re just talking to the machine, and you’re teaming with the machine.
    Alessio [00:10:58]: Voice?
    Qasar [00:10:59]: Yeah, voice, absolutely, yeah.
    Alessio [00:11:00]: Oh.
    Qasar [00:11:00]: And also the machine just being aware of who is in the cabin, what their state is. You can think about it from a safety systems perspective: the most simple version of this is, like, the driver is tired, right? You get those alerts when you’re driving your car that say,
    Qasar [00:11:15]: “maybe take a coffee break.” Take that a couple of orders of magnitude up. But this concept of teaming man and machine is important. When you think about running agents, or just running different instances of Claude doing work for you in the background, you can take that analogy, almost copy and paste it, and put it onto a farm, where you have a farmer who’s running a number of machines. Where they interact with the machine is where there’s maybe a critical decision or a disengagement or something like that, but generally speaking, the agent on the physical machine is running and making decisions on behalf of the farmer until there’s something critical. And that’s also what we work on. So that’s not pure autonomy. It’s a little bit of a mix, but it falls under autonomy. In the automotive sense, that’s typically defined in SAE levels as an L2++ system
    Qasar [00:12:05]: with a human in the loop. But just take that idea to other verticals.
    Hardware, Sensors, and the LiDAR Question
    Alessio [00:12:09]: Yeah. You’ve not mentioned hardware at all, like sensors, and obviously you mentioned you don’t do chips. I think even in AV there’s a big cameras-versus-lidars debate. In your space, what are some of those design decisions that you made, and are they driven by the OEM’s ability to put things on the machinery? And how much influence do you guys have on co-designing those?
    Peter [00:12:32]: Yeah. So we don’t make sensors. We’re not a manufacturer. Obviously, we use a lot of sensors in our autonomy products. In terms of what actually goes on the vehicles, we have a preferred set of sensors that we, let’s say, fully support, and then our customers can choose from those. And obviously if there’s a very strong opinion on supporting something else, we’ll add that to the platform as well. And the lidar question is at this point sort of the age-old
    Peter [00:12:59]: topic in autonomy, and the state of the industry right now is that lidar is hands down a useful sensor, specifically for data collection and the R&D phase of autonomy development. If you see, for example, a Tesla R&D vehicle, it actually has lidar on it
    Peter [00:13:17]: to this day, right? In the Bay Area we see these. You’ll see Model Ys or Cybercabs that have lidars on them just driving around. It’s useful because it gives you per-pixel depth information. If you pair a lidar with a camera, and you know that this camera’s looking this direction and this lidar’s looking this direction, then for each pixel of the camera you can see how far away that pixel is. You can then use that as a part of your model training, and that depth information then becomes a learned state of the camera data. And then when you’re doing the production system, you can remove the lidar
    Peter [00:13:52]: and now you can actually get depth with just the camera. And so that difference between a highly sensored R&D vehicle and the down-costed production vehicle, we use that across our whole portfolio of products. And of course the end goal is you want super low cost and super reliable.
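The lidar-as-training-signal idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the speakers’ actual pipeline: it assumes a simple pinhole camera, lidar points that have already been transformed into the camera frame by a known calibration, and hypothetical names (`PinholeCamera`, `project_lidar_to_depth_labels`, `masked_l1_depth_loss`). The lidar returns become sparse per-pixel depth labels, and a camera-only depth model is scored only on the pixels that have a label.

```python
from dataclasses import dataclass


@dataclass
class PinholeCamera:
    """Hypothetical pinhole intrinsics (all values in pixels)."""
    fx: float
    fy: float
    cx: float
    cy: float
    width: int
    height: int


def project_lidar_to_depth_labels(points_cam, cam):
    """Map lidar points to sparse depth labels {(u, v): depth_m}.

    Assumes points are already in the camera frame (x right, y down,
    z forward), i.e. the lidar-to-camera calibration has been applied.
    """
    labels = {}
    for x, y, z in points_cam:
        if z <= 0.0:  # point is behind the camera, cannot be seen
            continue
        u = int(round(cam.fx * x / z + cam.cx))
        v = int(round(cam.fy * y / z + cam.cy))
        if 0 <= u < cam.width and 0 <= v < cam.height:
            # If two returns land on one pixel, keep the nearest surface.
            if (u, v) not in labels or z < labels[(u, v)]:
                labels[(u, v)] = z
    return labels


def masked_l1_depth_loss(predicted_depth, labels):
    """Mean absolute depth error over labelled pixels only.

    predicted_depth is a row-major grid: predicted_depth[v][u].
    """
    if not labels:
        return 0.0
    total = sum(abs(predicted_depth[v][u] - d) for (u, v), d in labels.items())
    return total / len(labels)
```

At training time the loss would be minimized over many lidar-equipped R&D drives; at production time only the camera model runs, which is the "remove the lidar" step from the conversation.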
    Peter [00:14:08]: And then in certain use cases you have some more bespoke things. In defense, as an example, you often do things at night, so you care more about sensors like infrared. And you don’t wanna be putting energy out, so you don’t wanna use lidar or radar,
    Peter [00:14:23]: but you still need to be able to see at nighttime. So yeah, we work the whole gamut.
    The Operating System Layer: Why Vehicles Are Like Pre-Android Phones
    Alessio [00:14:27]: Cool. So that’s on the hardware level. Then on the OS level, what does that look like? What is unique? I drive a Tesla. Whenever I drive some other car that has a screen, it always sucks.
    Alessio [00:14:38]: It’s like a cheap Android tablet. It’s laggy and all of that. What does the OS of the autonomy future look like?
    Peter [00:14:46]: For most people, it’s really what you just described. When you think about an operating system in a vehicle, you’re thinking about the HMI, right? The human-machine interface, and absolutely that’s an important part of it, but that’s actually only one thin layer on top. When we talk about operating systems for AI in vehicles, there are many layers that go deep into the CPU critical realm and embedded systems, and you’re talking about the real-time control of
    Peter [00:15:13]: let’s say the electric motors or the engine and the actuators, and you have different redundancies for, let’s say, the steering actuation in the vehicle. All of these things need very core support in the operating system. And then of course for autonomy you have real-time sensor data that’s streaming in, and the latencies there are really important, right? Imagine you try to run Microsoft Windows
    Peter [00:15:35]: streaming your sensor data in or controlling the vehicle. The latencies are gonna be absurd. You can never do that. And so what’s special about what we do is we really have this system-level thinking, right? We care about every performance characteristic of the entire system, and because we’re doing a lot of the software, or all of that software, we can fine-tune and control all of those things. So we can very carefully tune the latencies for every aspect of the system. We can carefully tune the memory management. We can have the right fail-safes and fallbacks for different things. ’Cause you have to account for: what if there is a critical failure? What if there’s a cosmic ray that flips
    Peter [00:16:14]: a bit in the middle of the processor that causes some malfunction? You have to have a fail-safe for all of that, and so the core operating system is a part of that. And then the one last thing, which is a lot less exciting but is actually a very big topic, is reliability of updates.
    Peter [00:16:30]: So I have a Tesla, and you get updates fairly frequently, right?
    Peter [00:16:36]: Once a month. Most companies that are making vehicles
    Peter [00:16:40]: are basically never doing updates. And even if they are doing updates, they’re usually only updating maybe one module. Maybe they’re updating the HMI module. But they’re not able to update, let’s say, the CPU critical parts of the system.
    Peter [00:16:51]: You have to go into the dealer for that. And so with our operating system we can actually enable highly reliable updates of any system in the vehicle, and that’s way easier said than done. There’s lots of technically deep stuff in the tech stack to do that in a way that you’re not going to accidentally brick a vehicle.
    Peter [00:17:08]: Right? Imagine your
    Alessio [00:17:10]: That would be bad.
    Alessio [00:17:11]: Bad.
    Peter [00:17:11]: Bricking a car is very expensive,
    Peter [00:17:13]: and honestly, across the industry, maybe one of the most purely impactful things that we’ve done is that we’re now enabling the industry to actually do software updates.
    Alessio [00:17:22]: Just to clarify as well, who is the customer for this? I assume a lot of hardware manufacturers have their own firmware, and I’m sure some of them would just have you write it for them because you’re experts. And others would have their own. Who pays for this? Who invites you into the house? Is it the end user, or is it the manufacturer?
    Peter [00:17:41]: Yeah. So let me first make an analogy on the fragmentation of software. Physical machines today are more akin to the state of the phone market before Android and iOS existed, right? I worked on Android at Google, by the way, many years ago, and part of the reason that Larry at Google decided to get into Android was they wanted to run Google products on a bunch of phones, and they bought all of these phones from the industry, and it turned out they had like 50 different operating systems on these phones. And it was virtually impossible
    Peter [00:18:17]: for Google to make their app run on all 50 devices equally well. And so the solution was, well, what if they created a really great operating system and made it attractive to all of these phone makers? That was sort of the genesis for what Android was and why Android existed. It was a way for Google to get their products onto a really wide diversity of devices. The state of the physical industry right now is a little bit like that. Yes, these companies have firmware, but they have so many different operating systems, it’s so fragmented, and to actually get a modern AI application to run on these vehicles, you first have to consolidate the operating system, and so that’s why we’ve done that. And then, your specific question was who are our customers? Generally it’s the companies that are making these machines.
    Peter [00:19:06]: And we’re selling our technology to them to really simplify the architecture and then enable these AI applications to run on them.
    Customers, Licensing, and the Better-Together Stack
    Swyx [00:19:13]: How much is reusable across these? Do you have one OS that is just configured for everything, or is there some more customization that is needed?
    Peter [00:19:22]: Yeah, highly reusable. The fundamental technology is quite universal, right? Things that we do have to think about, though, are things like chipset support. If you’re coding, let’s say, an LLM and you start with an assumption that, “Hey, I’m gonna use CUDA, and I’m gonna run this on an NVIDIA chip,” then you don’t really have to think about the hardware in that sense. You’re just, “Okay, I’m in the CUDA/NVIDIA ecosystem, and I’m going to use that.” But the hardware, especially in safety-critical systems, is a lot more diverse. There’s not just one or two players. There’s a bunch of different chipsets that we have to support. And so our operating system doesn’t just run on the equivalent of x86. It has to run on a number of different architectures, on chips from a bunch of different companies. But again, we’ve been working on this for a long time now, so we have support for all of those chipsets. And then when you want to run the AI applications, we can do that reliably across a variety of providers.
    Qasar [00:20:19]: And I think that is heavily inspired by Android, right? Android has a huge suite of testing and it’s a reliable operating system that runs on thousands of devices. And we think we can do the same in all these physical moving machines, with the difference that we’re really in a safety-critical realm. Android isn’t.
    Alessio [00:20:40]: So on Android, I don’t need to use Gmail, I can use Superhuman. Like, what about your machinery? Like, can people bring somebody else’s automation to it, or is it kinda like all-in-one?
    Qasar [00:20:50]: You have to use us. No. Yeah, it’s totally open.
    Peter [00:20:56]: Yeah. Our philosophy is that we are a technology company, and so we license our technology to customers to use how they want. If they wanna license our autonomy tech and our operating system, then great, we’ll license those. If they just wanna license the operating system and then use different autonomy tech, that’s fine also, and we have great documentation and
    Swyx [00:21:17]: Or if they wanna use developer tooling.
    Peter [00:21:18]: Yeah, exactly.
    AI Coding Adoption: Cursor, Claude Code, and the Bimodal Engineer
    Swyx [00:21:19]: It’s, like, better together, obviously, if they work together. Is it all C++, I assume, with different compile targets?
    Peter [00:21:27]: We use a lot of C++.
    Peter [00:21:28]: Rust is sort of the new hot kid on the block
    Peter [00:21:32]: for a bunch of things as well. But yeah, the lower level you get, especially when you get to real-time constraints, you hit C++ at some point, and at some point maybe you work your way into assembly when needed.
    Swyx [00:21:44]: Oh, damn.
    Alessio [00:21:46]: I’m curious about the coding agent adoption, since you’re mentioning more esoteric languages. What’s the adoption internally? What have you learned?
    Peter [00:21:55]: Yeah. We use everything. Cursor was, I think, the hottest tool in the company for a good while. Now Claude Code, I think, has taken the reins on that. We have an internal leaderboard that we use just to sort of encourage adoption
    Peter [00:22:09]: within the company. And yeah, they’re phenomenally useful. Honestly, we take inspiration from some of those tools also in how we’re adapting some of that mindset to the physical realm. If it’s so easy to build an app for this or that thing that lives just on a screen, we’re now taking a lot of the same ideas and applying them to, “Okay, well, if you wanted a physical machine to do something, how easy can we make that, using our own tooling and platform as well?”
    Alessio [00:22:40]: Are you changing any of the OS architecture, like the way you expose services, to be more AI friendly?
    Peter [00:22:48]: Yeah, absolutely. In the early days of our tools infrastructure work, you had engineers that were experts in certain topics, but the things that you’re dealing with are oftentimes more mathematical or more abstract, where GUI tools are actually very useful for certain things. As an example, we have a product we call Sensor Studio, which helps you design the sensor suite for your autonomous vehicle, whether it’s a car, a drone, mining equipment, or a robot. You place sensors in different places. There’s a library. You can understand the trade-offs that you’re making in the design of that system, and that was a very GUI-intensive thing, ’cause it’s a little more like a CAD tool in that sense,
    Swyx [00:23:37]: Yep
    Peter [00:23:37]: if you’ve seen CAD tools. Nowadays, though, we expose all of the underlying APIs for that, and now, using AI agents, you can actually configure a sensor suite with just text and likely reach a better result than you could’ve through the GUI in the past, and we’re taking that thinking now through the whole product portfolio.
    Swyx [00:23:57]: Another thing I was thinking about is just in terms of AI adoption: does it change your hiring at least a little bit, or how do you manage engineers differently?
    Peter [00:24:08]: Yeah, absolutely, it does. We, I think like every company in the Valley right now, are evolving our hiring practices
    Peter [00:24:16]: because the skills required to be effective are changing so fast, right? You used to really select for just rote implementation ability, and now it is more the AI engineer skill set, right? Where it’s like, yeah, how to implement, but actually, just banging out code is no longer the core job, right? It’s actually knowing what questions to ask, knowing how to tie together these different AI tools. And so the interviews that we give now I think are way harder than they’ve ever been.
    Peter [00:24:46]: But we also allow selective use of AI tools to solve the problems. And I think in that you start to see more of a bimodal distribution of engineers, right? You start to see, wow, there’s this subset of people that really get it. They’re all in and they’ve clearly invested the hours needed to learn these tools and how to be effective.
    Peter [00:25:09]: And then there’s the group of people that haven’t done that, and the productivity gap is just enormous. And so we’re obviously trying to select for the people that are really into this.
    Swyx [00:25:20]: I first wrote my AI engineer piece three years ago, and when I first wrote about it, I was like, “Actually, not everyone should be an AI engineer,” ’cause I think there’s an extremist stance where every software engineer is an AI engineer. And my actual example of people who should not be adopting AI was embedded systems, operating systems, and database people. Are they adopting AI?
    Peter [00:25:41]: I think it’s the classic bitter lesson topic. Six months ago I would’ve said the same thing, but it’s becoming super useful for every domain.
    Qasar [00:25:53]: I’m sure.
    Peter [00:25:54]: Right? Like,
    Peter [00:25:56]: I think six months ago, or maybe a year ago, if you tried to use, let’s say, the latest Claude model for writing GPU shaders, the results were probably underwhelming. And if you use the latest model now to do that kind of task, you’re a little bit blown away, like, “Wow, that actually worked. That’s amazing.” And we see the same thing in the embedded realm. No question though, especially when you get into safety-critical systems, the human validation
    Peter [00:26:25]: is 100% key. You’re not gonna trust your life to AI-written software that’s not been very carefully checked by humans. And so I think now the challenge is really about that appropriate level of human validation for these safety-critical systems.
    Verifiable Rewards, Evals, and Neural Simulation
    Alessio [00:26:41]: Touching on the simulation side, I think verifiable rewards and reinforcement learning are the hottest thing. What have you done internally to build around that? And what lets you sleep at night? Like, if somebody’s just vibe coding something or
    Alessio [00:26:57]: wants to try something new, you have a good enough system. Because I think the opposite is also true: if it’s super easy to write anything,
    Alessio [00:27:04]: then it puts a lot of work on the verifiable
    Alessio [00:27:07]: side of it. What does that look like for people?
    Peter [00:27:10]: Yeah. So verifiability falls in a broader bucket of evaluations, right? How do you evaluate the results that you’re getting? I think this is probably the hardest problem right now, because as the models get better, it can be harder and harder to find the faults in the system.
    Peter [00:27:29]: And so the problem of doing proper eval to find those faults also keeps getting harder as the models get better. But it’s no less important than it’s ever been, right? There are still going to be edge cases that are not met and whatnot. And so it’s a big area of investment for us. On the reinforcement learning topic, the key thing is there are all these new requirements that come with the latest generation of these technologies. For example, end-to-end is the big thing right now in autonomy and physical AI, which is that you can now train these models that effectively take sensor data in and put control signals out, and get really good results out of that. But the way that you train and improve those models is really different from the previous generations. To do reinforcement learning on an end-to-end model, you now need to actually simulate all the sensor data, right? We call our work in this neural simulation, but
    Peter [00:28:26]: think of it like a hybrid of Gaussian splatting and diffusion methods, where you really care about performance. Performance is everything. If you can’t do enough simulation fast enough and cheap enough, you actually can’t get results that are worthwhile in the end. It also ties into a lot of our work in embedded systems, which is performance-critical work, and that performance optimization carries over to a lot of the model training work, because the only way to make it affordable is for it to be really fast.
    Qasar [00:28:58]: I think it’s worth a few minutes talking about our own evolving thoughts on verification and validation, going from
    Qasar [00:29:05]: traditional simulators, which you can think of like vehicle dynamics or something like that, where you’re just taking textbooks and taking those formulas
    Qasar [00:29:13]: and putting them into software, to now this neural sim/world model universe. I think that’s an interesting topic.
    Peter [00:29:20]: Yeah. So in more traditional development, right, you oftentimes would have more black-and-white answers to questions.
    Peter [00:29:28]: In Europe, as an example, there’s a regulatory system called Euro NCAP. It’s the European New Car Assessment Programme, and as part of that, the vehicles have to pass a bunch of tests, and those tests actually include safety systems. So automatic emergency braking for a child that runs in front of a car,
    Peter [00:29:51]: or let’s say an occluded child that runs out and you hit it. And so you end up with these binary answers of, well, did the car under test pass this specific test? And there’s a very well-known set of test cases
    Peter [00:30:05]: that the vehicle has to pass. And that was how the industry worked, let’s say, until 10-ish years ago. But what’s changed now is that with these models, everything is statistics, right? You no longer have a black-and-white answer; it’s, well, how many orders of magnitude or how many nines of reliability can I get in the system, and how can I prove that to be true? And the big unlock, honestly, for physical AI as an industry is that these models are just becoming much more reliable. Things actually work a lot better. The number of nines you can get out of these systems is now good enough that it actually becomes cost effective to really deploy these things. And so the big shift in verification and validation has been this: in the past it was strictly requirements, and are you meeting them or not? And now it’s more of a statistical verification and validation case, where it’s all about how many nines of reliability and mean time between failures, that sort of thing.
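To make the "how many nines, and how can I prove it" question concrete, here is a small sketch. It is a textbook zero-failure binomial bound, not a method the speakers describe, and it assumes independent, identically distributed trials (a strong assumption for driving scenarios). The function names are hypothetical. If a system survives n trials with no failures, then claiming a per-trial failure probability below p at confidence C requires (1 - p)^n <= 1 - C.

```python
import math


def nines(failure_probability):
    """Reliability expressed in 'nines', e.g. a 1e-3 failure rate is 3 nines."""
    return -math.log10(failure_probability)


def failure_free_trials_needed(failure_probability, confidence=0.95):
    """Trials with zero failures needed to support the claimed failure rate.

    Solves (1 - p)^n <= 1 - confidence for the smallest integer n,
    assuming independent trials.
    """
    n = math.log(1.0 - confidence) / math.log(1.0 - failure_probability)
    return math.ceil(n)


def mean_time_between_failures(operating_hours, failures):
    """Observed MTBF: total operating time divided by the failure count."""
    return operating_hours / failures
```

Each additional nine multiplies the required failure-free evidence by roughly ten (about 3/p trials at 95% confidence), which is one way to see why cheap, fast simulation matters so much for statistical validation.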
    Statistical Validation, Regulators, and the Cruise Lesson
    Swyx [00:31:04]: And is the target audience regulators, or even the customers? I imagine the customers are bought in, and it’s mostly regulators that need to be satisfied.
    Peter [00:31:15]: We do work with the US government, we do work of course with the European governments and the government of Japan, and the government is not like an AI lab by any means.
    Peter [00:31:25]: So
    Swyx [00:31:26]: They just care about the outcome.
    Peter [00:31:27]: They care about the outcome.
    Peter [00:31:28]: And so we do education in that regard, sort of teaching, “Hey, this is how we think validation should be done, and this is an approach that we think is reasonable,” and how to think about when a driverless system is actually safe enough to go on the roads and that sort of thing. But I wouldn’t say that the government is asking for it. We’re more teaching the government in that sense. Honestly, it’s more so for our own comfort, right? We want to build very safe systems, and then of course our customers care deeply about that as well. But in that context we’re also typically educating our customers.
    Qasar [00:32:01]: Yeah. Our first core value is around safety. So I think we can’t underline enough that us verifying and validating that the systems we’re deploying are safe to us is probably as important as some regulator or a customer saying,
    Swyx [00:32:19]: Of course. Okay. Yeah.
    Swyx [00:32:20]: You have to satisfy yourselves.
    Peter [00:32:22]: I’d say, as a whole across the world, regulation is oftentimes almost a lowest common denominator. You really have to substantially exceed what the regulators are expecting to make good products.
    Swyx [00:32:33]: Yeah. One thing I often talk about, and I try to make this relatable to the audience also, is Cruise, where they had an accident that basically ended the company. I wonder if people overreact to single incidents, because incidents are going to happen regardless, right? ’Cause it’s a statistical thing. I don’t know if regulators understand that you cannot extrapolate from a single incident, but we do because that’s all we have to go on. And your sample sizes are necessarily gonna be lower than, I don’t know,
    Swyx [00:33:00]: consumer driving.
    Qasar [00:33:01]: Yeah. I think the Cruise example wasn’t a technology failure. The real compounding issue there was just how the company talked to the regulators and what their behavior was, and I think that became more of the issue. If you look,
    Peter [00:33:19]: It definitely was a technology failure, but it was made much worse by the
    Swyx [00:33:23]: Put the car back on the woman.
    Qasar [00:33:25]: Yeah. And let me put it another way. There is a version where Cruise still exists.
    Swyx [00:33:29]: Right. Right.
    Qasar [00:33:30]: Right. It’s
    Swyx [00:33:30]: It was like the last straw
    Qasar [00:33:31]: It
    Swyx [00:33:31]: in like a long chain of
    Swyx [00:33:33]: like issues.
    Qasar [00:33:33]: So do you feel like... ATG had that horrific accident, with someone actually dying; that was a homeless person crossing the street. So yeah, I think we can’t overstate that ultimately, statistical validation of something is one part of it, but it’s not the only part of it. Consumer and, let’s say, mainstream adoption of these technologies is also gonna be part of that conversation. I think companies like Waymo are doing the industry a real service in the sense that they’re setting a high benchmark and they’re showing, in a very responsible way, how to deal with these. There have been Waymo incidents as well. They’ve just not been as significant as the Cruise one that you mentioned. But yeah, I think you’ll just continue to see that. I think probably the long-term question is really gonna be around this: it is very clear humans are way worse drivers statistically.
    Qasar [00:34:29]: There’s no debate. And so at what point... But we’re emotional animals.
    Swyx [00:34:34]: Yeah. So my thing is, we have to get to a point as a society where we accept horrific accidents that would never happen with a human, because statistically we understand that it is safer overall. In the same way that planes are safer; I think they’re the safest mode of transport that we have.
    Qasar [00:34:50]: Yeah. It’s more dangerous to drive to the airport than it is to get on a flight.
    Qasar [00:34:53]: So if you’re ever
    Qasar [00:34:54]: if you’re ever getting nervous about getting on a plane, just think “I just gotta get to the airport.”
    Swyx [00:34:58]: Yes, we’re flying.
    Qasar [00:34:59]: If I get to the airport
    Qasar [00:35:00]: I’ll be good.
    Swyx [00:35:00]: But then it’s, planes also concentrate the tail risk if planes
    Qasar [00:35:03]: Yeah. And
    Peter [00:35:04]: And I don’t think we honestly have to worry about there ever being accidents from these systems that are much worse than what humans would cause, ’cause humans do terrible things.
    Peter [00:35:14]: Like, people fall asleep at the wheel all the time.
    Swyx [00:35:16]: I have.
    Swyx [00:35:17]: Like, I’ll call, I’ve been a drowsy driver.
    Peter [00:35:19]: Kinda drunk drivers, and that’s
    Peter [00:35:20]: that’s the extreme end of the example. But with these AI systems, you have redundancies, you have fallbacks. Many things have to go wrong for there to actually be something catastrophic, because there are so many fallbacks that these systems have.
    Alessio [00:35:36]: Your simulation is so vast because there are so many use cases. What are maybe things that worked in a simulation and then you put it out and it’s like, “F**k, this
    Alessio [00:35:45]: just did not work at all?”
    Peter [00:35:47]: Yes.
    Alessio [00:35:47]: Is
    Peter [00:35:47]: That’s maybe a bit of a misconception about simulation there. So let me go a little bit more technical on this. At first go, no simulation is going to represent the real world. There’s always a process of sim-to-real matching
    Peter [00:36:02]: where you need the real-world feedback to feed into the parameters that are being used in the simulator, and you have to run that validation flow a number of times until you can get some confidence that the simulator is now accurately representing
    Peter [00:36:19]: what’s gonna happen in the real world. Now, if you have a situation where you’ve done that full validation and you thought that it was accurate and then there’s something different, those are much trickier cases, and that absolutely can happen, but really I think the validation process is a really important part. You can never skip the simulation validation process, where you’re actually ensuring that, hey, my sim-to-real gap here is small enough that I can trust these simulation results. And there are so many fun things that you can do when you get into it. I’ll give one fun example that came up recently: in these humanoid robotics systems, overheating actuators is a real problem, right? So obviously, phenomenal demos.
    Peter [00:37:01]: The most amazing
    Alessio [00:37:02]: For 10 minutes.
    Peter [00:37:03]: The most amazing I can get. I love watching robots do acrobatics like everybody, but these systems actually overheat, right? And one of the ways you can use simulation is you can actually have the temperature of those actuators be one of the parameters that’s represented
    Peter [00:37:18]: in the simulation. And if you’re doing reinforcement learning over a certain task, then the robot can actually adjust its motions in the simulation to account for the fact that, as it’s moving, it’s actually beginning to overheat this motor. But if you didn’t have that parameter of, let’s say, the heat of that motor represented in the simulation initially, then your RL policy will disregard that. And now you run that on the robot and the robot will overheat and fail.
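A toy version of the actuator example makes the point visible. Everything here is an assumption made for illustration: a first-order thermal model with made-up constants, a stand-in task reward equal to the torque magnitude, and two hand-written policies instead of a learned one. It is not how the speakers’ simulator works. What it shows is that temperature only influences behavior if it appears in both the simulated state and the reward.

```python
from dataclasses import dataclass


@dataclass
class ActuatorThermalModel:
    """Hypothetical first-order thermal model of one motor."""
    ambient_c: float = 25.0
    heat_per_torque_sq: float = 0.5   # deg C per second per (N*m)^2, made up
    cooling_rate: float = 0.05        # 1/s, made up
    limit_c: float = 90.0
    temperature_c: float = 25.0

    def step(self, torque, dt=0.1):
        heating = self.heat_per_torque_sq * torque * torque
        cooling = self.cooling_rate * (self.temperature_c - self.ambient_c)
        self.temperature_c += dt * (heating - cooling)
        return self.temperature_c


def shaped_reward(task_reward, temperature_c, limit_c, penalty_per_deg=1.0):
    """Task reward minus a penalty for every degree above the thermal limit."""
    return task_reward - penalty_per_deg * max(0.0, temperature_c - limit_c)


def aggressive_policy(temperature_c):
    """Ignores temperature, as a policy trained without it in the sim would."""
    return 4.0


def thermal_aware_policy(temperature_c):
    """Backs off the torque when the motor gets close to its limit."""
    return 4.0 if temperature_c < 80.0 else 1.5


def rollout(policy, model, steps=600, dt=0.1):
    """Run one episode; return (total shaped reward, final temperature)."""
    total = 0.0
    for _ in range(steps):
        torque = policy(model.temperature_c)
        temperature = model.step(torque, dt)
        # Stand-in task reward: more torque means the task gets done faster.
        total += shaped_reward(abs(torque), temperature, model.limit_c)
    return total, model.temperature_c
```

Calling `rollout(aggressive_policy, ActuatorThermalModel())` and `rollout(thermal_aware_policy, ActuatorThermalModel())` compares the two behaviors. Because `ambient_c` is a parameter, the same comparison can be rerun for a different deployment environment, such as a cold room.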
    Alessio [00:37:43]: I guess the question is, like, how do you have all of these parameters taken care of while also understanding the deployment environment? Like, temperature is like a great example, right? Well
    Alessio [00:37:53]: why did you make my robot worse when it runs in like a freezer?
    Alessio [00:37:57]: So it actually shouldn’t worry about that. So, yeah, how do you design these simulations?
    Peter [00:38:02]: This is honestly what makes simulation so hard, right? Simulation is fundamentally about trying to optimize the development of a system, right? Like, how can I build this system faster and better and cheaper, and what are all the levers that I have to actually accomplish that? And because simulation’s just a software program, you can change it a lot more easily than you can hardware systems. And then what’s particularly awesome about, let’s say, world models and using them as a part of simulation is that now the simulation doesn’t just scale with, let’s say, adding new math equations in
    Peter [00:38:36]: but we can actually scale the simulation environment now with additional real world data and that also unlocks a whole new field of robotics.
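The validation flow mentioned earlier in the conversation (feeding real-world measurements back into simulator parameters until the sim-to-real gap is small enough to trust) can be sketched roughly as below. This is an illustrative toy, assuming a one-parameter braking simulator and a simple bisection search; the function names, the friction parameter, and the tolerance are assumptions for the example, not a description of any production tooling.

```python
G = 9.81  # gravitational acceleration, m/s^2


def simulate_stopping_distance(speed_mps: float, friction: float) -> float:
    """Toy simulator: ideal braking distance for a given friction coefficient."""
    return speed_mps ** 2 / (2.0 * friction * G)


def sim_to_real_gap(friction: float, real_runs) -> float:
    """Mean absolute error (metres) between simulated and measured distances."""
    errors = [abs(simulate_stopping_distance(v, friction) - d) for v, d in real_runs]
    return sum(errors) / len(errors)


def calibrate_friction(real_runs, lo: float = 0.1, hi: float = 1.5,
                       tolerance_m: float = 0.05, max_rounds: int = 50):
    """Repeat the validate-and-adjust loop until the gap is within tolerance.

    Returns (friction, gap, rounds_used). Bisection works here because the
    simulated distance decreases monotonically as friction increases.
    """
    for round_idx in range(1, max_rounds + 1):
        mid = (lo + hi) / 2.0
        gap = sim_to_real_gap(mid, real_runs)
        if gap <= tolerance_m:
            return mid, gap, round_idx
        # Positive bias: the simulator stops too late, so friction is too low.
        bias = sum(simulate_stopping_distance(v, mid) - d for v, d in real_runs)
        if bias > 0:
            lo = mid
        else:
            hi = mid
    return mid, gap, max_rounds


if __name__ == "__main__":
    # Stand-in for field measurements: (speed in m/s, measured distance in m).
    measured = [(v, simulate_stopping_distance(v, 0.7)) for v in (5, 10, 15, 20)]
    friction, gap, rounds = calibrate_friction(measured)
    print(f"friction={friction:.3f}, gap={gap:.3f} m after {rounds} rounds")
```

Real simulators have many coupled parameters and noisy measurements, so the search would be far more involved, but the structure is the same: measure, compare, adjust, and only trust the simulator once the gap is under an agreed threshold.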
    Qasar [00:38:46]: There is a meniscus line that you cross where doing real-world testing is still better. In this sim-to-real gap, you can reproduce reality at exceedingly expensive cost, so nothing is free. So really you’re finding that line where you’re getting great performance and great feedback, whether it’s on the training side or on the eval side, but it’s way cheaper than doing it in the real world. At some point that doesn’t make sense. And so even from our earliest days in autonomy, our view was you’re still gonna do real-world testing. There’s not this magical land where you’re not gonna do that. And maybe a more nuanced version of this, from traditional software development: most of your testing for software in a vehicle, 95% of that, can be traditional CI/CD kind of flows that you would have in traditional web development. But now, let’s say you have a truck. Well, you can do like 4% of those in a rig which has all the components, the electrical and electronics of a truck, but doesn’t have the tires. And then you have the 1%, which is actually the vehicle. There’s a similar analogy in terms of using simulation for intelligent systems. You can do a lot in a simulator, and in using world models, but ultimately it’s physical AI. So you’re gonna deploy it on physical machines and
    Qasar [00:40:17]: the freezer example comes to light.
    Alessio [00:40:20]: The world model thing has been to me the hardest thing to
    Alessio [00:40:22]: wrap my head around. Like, we had Fei-Fei Li on the podcast.
    World Models, Hydroplaning, and Cause-Effect Learning
    Qasar [00:40:25]: We’ve been doing a small series with like another Intuition company, General Intuition as well.
    Qasar [00:40:31]: yeah, and I mean, lots of, lots of coverage on NeRFs and yes.
    Alessio [00:40:34]: Yeah. It feels like when we talk about the heliocentric system, right? In a world model, if you just feed visual data, the model might learn that the sun spins around the Earth. It makes sense, right? And it’s like, well, not really. And hydroplaning is another thing I think about: can a world model understand hydroplaning and what amount of water causes it to happen? To me it’s like, I don’t understand how you guys do it. I guess the real thing is when you’re doing both cars on the highway in Japan versus the excavator in a mine in,
    Qasar [00:41:13]: Arizona
    Alessio [00:41:13]: Arizona, wherever you’re deploying them.
    Alessio [00:41:15]: How much of it are you relying on the world models to like generate the simulations for you and then try and close the gap after versus like giving the world models as a tool to your engineers to like curate the simulations if that makes sense?
    Peter [00:41:28]: Yeah, totally. So I can say at a pure engineering level, I think if you’re hoping to do real-world deploys and you’re purely relying on a world model approach, you probably won’t get to something that works before you go bankrupt. So there is just a very practical mindset of, like, world models are amazing and they’re extremely useful for a lot of use cases, but there are a lot of other things that you need to do to actually get something started and something deployed and working. Most fundamentally, world models are all about understanding the world, but also understanding what’s going to happen. It’s the cause-effect relationship.
    Peter [00:42:01]: Right? And so, if you take some sort of construction tool, and that construction tool is gonna be doing some work on the earth in some way, it’s gonna be moving earth, the world model needs to understand that cause-effect relationship. Like, okay, when I take this material from here and put it over there, now I have things that are over here and not over there anymore. Data obviously is a big problem. The hydroplaning
    Peter [00:42:26]: one is actually a really great example because it’s quite non-obvious sometimes. Right? It’s raining, and this road has, let’s say, the appropriate curvature to it, so the water is running off the road and cars are driving faster here. Then you approach a road that’s very flat and water is now puddling on that road, and all of a sudden cars are driving slower because when they were driving faster they were starting to lose control. There are a lot of very nuanced visual cues in the scene, and so I do think in the world model concept there’s a good chance that the model actually would learn that you should just drive slower when these visual cues exist. That’s obviously the beauty of these kinds of models: they learn these non-obvious things.
    Swyx [00:43:14]: It doesn’t need to know about hydroplaning to know that it needs to drive slower.
    Peter [00:43:17]: Yes.
    Swyx [00:43:17]: Yeah. I also wanna ask questions about deploying models. I presume you use a lot of these world models for training data and simulation, but what about deploying them onto the systems in production? Presumably you have, like, GPUs on device
    Onboard vs. Offboard: Latency, Embedded ML, and Distillation
    Swyx [00:43:36]: but I keep saying on device. What’s the right term for that?
    Peter [00:43:40]: On machine.
    Swyx [00:43:41]: On machine.
    Peter [00:43:41]: Or embedded, yeah.
    Swyx [00:43:42]: Yeah. What is the embedded world like? Because for people who are not used to that world, this is very alien.
    Peter [00:43:49]: Yeah. So we actually call it onboard and offboard.
    Peter [00:43:52]: So like, onboard software and offboard software.
    Peter [00:43:54]: And the great thing about offboard software is you don’t have to care about time, and you can run really large models, right? So you can say, “Well, I don’t care if this model takes one second to give me a result or 10 seconds, because we have time.” And the models can be really big, and they can run in a data center or on a huge GPU, and you can obviously have distributed compute, et cetera. But onboard you don’t have any of those benefits. You’re like, “Well, I have this many milliseconds where I need an answer from this model.” And so a lot more of the energy then goes into, think of it like distillation, true efficiency, where literally every fraction of a millisecond counts. And you can’t have a situation where the model takes too long because then the vehicle can’t actually function.
    Peter [00:44:42]: And so you can still use a lot of the same techniques, and the models themselves you can think of as a derivative of larger models that you can run offline. You’re trying to get a model that still performs really well but is a small enough version that you can then run on this embedded system where you care about latency and power.
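Peter’s “think of it like distillation” remark can be illustrated with the standard soft-target knowledge distillation loss, in which a small student model is trained to match the temperature-softened output distribution of a large teacher. The sketch below is a generic textbook formulation in plain Python; it is an assumption that anything like it matches what the speakers actually run, and the logits are made-up numbers.

```python
import math


def softmax(logits, temperature: float = 1.0):
    """Numerically stable softmax over a list of logits, softened by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature: float = 2.0) -> float:
    """KL(teacher || student) on temperature-softened distributions.

    The temperature**2 factor is the usual scaling that keeps gradient
    magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]         # hypothetical large offboard model output
    close_student = [3.5, 1.2, 0.4]   # small model that roughly agrees
    far_student = [0.5, 4.0, 1.0]     # small model that disagrees
    print(distillation_loss(teacher, close_student))
    print(distillation_loss(teacher, far_student))
```

In practice this term is usually combined with the ordinary task loss, and the student is then profiled against the onboard latency and power budget, since a student that matches the teacher but misses its deadline is still unusable on the vehicle.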
    Qasar [00:45:03]: Yeah. And the broader point, which maybe is not obvious but is worth saying, is that in the physical AI world, we’re not really constrained right now by the intelligence of the models. It’s actually what Peter’s talking about, it’s actually deploying them in
    Swyx [00:45:19]: The hardware they give you.
    Qasar [00:45:21]: Yeah. On the hardware they give you.
    Qasar [00:45:22]: And there’s just the reality of safety-critical systems. So those end up being your limiting factors
    Qasar [00:45:29]: rather than, let’s say, a limiting factor for, a foundation model company
    Qasar [00:45:34]: is gonna be just capital maybe or researchers.
    Qasar [00:45:38]: So in that way we’re dealing with very interesting constraints. Those constraints force creativity.
    Swyx [00:45:47]: And I imagine nobody was deploying or giving you the hardware for transformers back in 2018, whatever, but now they are. What’s the evolution like? Just peel back the curtain a little bit.
    Peter [00:45:59]: Yeah. Transformers, first off, I think the paper was originally published in 2017.
    Swyx [00:46:02]: 2017.
    Swyx [00:46:02]: So there’s no time.
    Peter [00:46:04]: And I
    Swyx [00:46:05]: But I guess I’m saying, like, embedded ML systems usually have a lot fewer parameters, a lot less compute, and now it’s, like, orders of magnitude more.
    Peter [00:46:14]: Yeah, absolutely. What I was gonna say, though, was I think in the original paper in 2017, maybe in the last paragraph, somewhere in the paper, they talk about, like, “Oh, by the way, this technique might be useful for, like, images and videos as well.”
    Peter [00:46:30]: These last subjects.
    Peter [00:46:31]: And it took a few years for that impact to really hit. But like, now, we’re seeing transformers are everywhere.
    Swyx [00:46:39]: Yeah. Vision transformers.
    Peter [00:46:40]: And then the compute just keeps getting better and better. But you do have this fundamental trade-off, right? You have power, cost, and performance, and getting the right mix of those things in an embedded package that can also be, like, shaken and baked in all the
    Peter [00:47:00]: conditions that these things have to operate in. But yeah, I think they’re only going to keep getting better, and so we also try to plan our strategy knowing the rate of improvement of these systems.
    Swyx [00:47:11]: Yeah. So like, Google just released the Gemma 2B model
    Swyx [00:47:15]: that effective 2B model. Is that useful to you guys or is that too big?
    Peter [00:47:18]: You can run that model on an embedded system, definitely.
    Peter [00:47:21]: So yes, it’s useful in that regard. The bigger question is, like, what do you use it for in an embedded system? Like, you actually need to customize it quite a bit to make it useful for something. But yeah, you could run a two billion parameter model, definitely.
    Swyx [00:47:35]: It’s also interesting, like, what percent is a custom ML model that only does that thing versus a generalist LLM
    Swyx [00:47:41]: which probably is not that useful actually for your context.
    Peter [00:47:46]: Like, you, like, you can imagine different use cases, right?
    Peter [00:47:48]: So the
    Swyx [00:47:49]: The voice stuff, yes.
    Peter [00:47:49]: Yeah, the voice stuff. Totally, yes.
    Peter [00:47:51]: So for the actual autonomy elements, that’s 100% in-house. We do every bit of that: the data, simulation, the model, everything. But when you get into the more generic use cases, like a voice assistant kind of thing, that’s where these more generalist models like Gemma can actually be quite useful.
    Swyx [00:48:09]: Yeah. And then there’s also obviously a trade-off between, like, what percent must you do on machine, versus just call home.
    Peter [00:48:16]: Yeah. It’s all about latency.
    Swyx [00:48:17]: Latency.
    Peter [00:48:17]: It’s all about latency. Yeah.
    Swyx [00:48:18]: Yeah. Well, like, I think actually in a lot of contexts, especially in the US, you can just have a connection to the web.
    Qasar [00:48:26]: Yeah. I think, though, in most of our universe everything has to be fairly embedded and local, just because of the nature of it. Even in the US there’s a lot of, like
    Swyx [00:48:39]: Patchiness
    Qasar [00:48:40]: don’t have
    Qasar [00:48:41]: coverage, right? And if you look at the old world of autonomy within mining, which is long before transformers and neural networks of the CNN kind, they were really just hand-coded systems. They were just like, this machine is gonna run to that place with this
    Peter [00:49:03]: That was RTK GPS, like very accurate GPS.
    Qasar [00:49:05]: Yeah. And so that worked, and that worked for 20 years, so why would we actually need to use transformers or more modern end-to-end systems? Mainly because you can only really run a path and run it backwards. That provided a lot of value, but not as much as you get when the machine is actually intelligent. It’s seeing, it’s perceiving, it’s acting in a dynamic world.
    Alessio [00:49:28]: I looked up RTK, real-time kinematic, one to two-centimeter accuracy.
    Qasar [00:49:32]: Yeah. Fantastic. And fantastic in faraway lands where there’s not gonna be cell phone coverage.
    Peter [00:49:39]: Yeah, so it’s widely used on the legacy mining and agricultural autonomy systems today. So like, for example, a combine that can be precise within one or two centimeters as it’s driving down the field, they use RTK.
    Qasar [00:49:53]: Yes.
    Peter [00:49:53]: But it’s expensive.
    Qasar [00:49:54]: Yeah. And it’s autonomy, but it’s not intelligent in the way that I think all of us
    Qasar [00:49:58]: in 2026 would be talking about intelligence.
    Alessio [00:50:00]: In one of your blog posts, you mentioned research on large-scale transformers that are similar to those doing modern generative AI. What are, like, the big differences other than, “You’re absolutely right, I should steer the car,” so you probably wanna remove that?
    Peter [00:50:14]: We have a diversified bet strategy internally, and the reason we’ve done that is because we now operate in a bunch of industries, a bunch of geographies, and each of the approaches obviously has a different risk to it.
    Peter [00:50:27]: And so like, we’re not going to put all of our eggs in a single basket for a single approach because that approach may not work out.
    Peter [00:50:36]: And so that’s one of the bets that we have, and it has certain advantages in certain scenarios. But the way these things play out in practice is that it has certain benefits and also certain drawbacks. And then the research team works on the situations where it’s actually worse than these other approaches, to ultimately arrive at a really great solution for all of these things.
    Plan Mode for Physical Systems and Next-Token Prediction Universally
    Alessio [00:50:57]: Is there a plan mode for physical autonomy, like a planning step and then an action step, or?
    Peter [00:51:03]: So short answer is yes, right? Just like you can use Claude Code to plan out some complex coding task and get almost a specification written out, similar approaches absolutely can be applied to physical systems, because imagine you’re trying to accomplish some task. The easiest to think about is robotaxi, but I think
    Peter [00:51:23]: things get more interesting, let’s say, in the defense context or in the mining context. You actually do have to think about many steps in advance.
    Peter [00:51:32]: It’s not just this one thing, but to accomplish the goal, there’s a hundred steps, and so this concept of plan mode is, yeah, very applicable in those
    Alessio [00:51:40]: Yeah. I was gonna say, to me, driving feels like a great next-token prediction thing because you’re kinda like on a path and, like, it doesn’t really matter what you’ve done before. You can always turn around.
    Qasar [00:51:49]: It’s all planning. Yeah.
    Alessio [00:51:50]: Yeah. Versus, like, mining, it’s like, “Oh, man, I took a scoop out of this thing.” It’s like, now we can’t really
    Alessio [00:51:57]: I can’t really go there anymore. So is there, like, a huge difference? I guess, do you have a taxonomy of these different types? So there’s kinda like driving
    Alessio [00:52:07]: excavating, like, flying. How do you
    Peter [00:52:11]: So the interesting thing is, yeah, I think probably everything in the world can actually be boiled down to, like, a next token prediction problem.
    Peter [00:52:18]: And in any workflow, anything can be thought of almost as a sequence of steps or a sequence of trajectories or whatever you wanna call it, and it can be boiled down to that sort of thing. And in the mining case, you can imagine taking that scoop. Okay, that was that set of tokens, and now the model is understanding that the state space is different, and the next time I do token predictions, they’re going to be modified by that. But yeah, the remarkable thing about these techniques is just how universally applicable they are, right? It truly is incredible.
    Alessio [00:52:53]: What else is underrated about what you guys are building on the physical side? I mean, we were talking about it before the episode. There’s a lot of humanoid companies that do these great demos, and then I can’t buy it, so obviously it can’t all be there. In your case, you’re, like, in production on real streets with, like, a lot of customers. What are the things people are underestimating? The same way the Waymo demos seven years ago were great and then it took seven years to actually get them on the street. Can you share maybe the last one percent that was really hard to get done technically?
    Productionization: The 20 Problems Every Robotics Demo Will Hit
    Peter [00:53:27]: Yeah. So certainly, productionizing stuff is really challenging no matter what. I would split the answer into research and production. First, on the production side, there’s just so many problems that you find when you actually get the stuff to go in the real world. And so the classic problem in humanoids right now is these systems are actually pretty brittle.
    Peter [00:53:48]: And I’m not talking about any one company, but just as an industry, these systems are pretty brittle. Interestingly, I saw the other day that I think China is doing a marathon with humanoids.
    Qasar [00:54:00]: What?
    Peter [00:54:00]: Yeah. So in government, and not China specifically, but in any government, there’s a concept called prize policy. There are different ways of influencing an industry to go in a certain direction. You can regulate it, right? You can do mandates, or you can actually just do these competitions. So the US version of this was the DARPA Grand Challenge. That
    Alessio [00:54:20]: That worked.
    Peter [00:54:21]: But it really worked. It
    Alessio [00:54:22]: That really worked
    Peter [00:54:22]: took the whole industry. But I think China is literally doing this marathon because they know that the reliability of these humanoids is a problem. And so what cooler way to solve that than to have a competition where humanoids need to run twenty-six miles, right?
    Alessio [00:54:37]: Are we there? Can robots run a marathon?
    Peter [00:54:40]: I think it’s happening any day now.
    Peter [00:54:42]: So it’s
    Alessio [00:54:43]: So we’re there.
    Qasar [00:54:43]: By the way, in automotive there’s also a version of this, which is, like, the 24 Hours of Le Mans, right?
    Qasar [00:54:48]: It’s like Porsche wins the 24 Hours of Le Mans
    Alessio [00:54:51]: New product
    Qasar [00:54:51]: and then literally puts those products into production. I would actually break it down further. You talk about research and you talk about production. There’s actually a step in the middle, which is advanced engineering, and I think a lot of the industry is moving into advanced engineering, where it’s not fundamental research. We’re coming in with novel techniques, but it really is advanced engineering for production. So what are the subcomponents that are gonna limit getting into production? Once you’re in production, you’re dealing with another set of problems, which is the deployment and maintenance of those machines that exist. So I’d say, at least in our field, we’re mostly in advanced engineering, in the automotive parlance.
    Peter [00:55:29]: Honestly, every step is hard though.
    Alessio [00:55:33]: Paul, this way you’re worth 15 billion dollars, so don’t answer.
    Qasar [00:55:36]: You bleed every step.
    Qasar [00:55:38]: Yeah. And I think
    Peter [00:55:39]: It’s fun. I don’t know, I find it really enjoyable. Yeah, but what’s also fun is, so we’ve been doing this now for almost ten years, and we’ve seen so many bad times. And so right now we can look at any company in this space and get a demo, and I can write down a list. I know exactly the next 20 problems they’re gonna hit.
    Peter [00:55:59]: And I can guess also how they’re going to try to solve each of those, and I can guess which ones are gonna actually work.
    Qasar [00:56:04]: Yeah. It’s not because we’re, like, particularly, like, geniuses.
    Peter [00:56:07]: We’ve just seen this stuff now.
    Qasar [00:56:07]: Yeah. We’ve seen enough of this stuff. We’ve lived enough of this stuff. That’s our own kind of mental model of the world as leads in the company. We’ve tried so many things, and we’re talking about the wins here. Like
    Qasar [00:56:21]: There
    Peter [00:56:21]: Plenty of losses there.
    Qasar [00:56:21]: There’s plenty of losses among that many people doing that many different things, and so that kinda, like, gets baked into your, like
    Qasar [00:56:29]: mental model of the world.
    Peter [00:56:30]: Yeah. But I would say and in general, like, we’re excited about robotics for sure, and like
    Peter [00:56:34]: the
    Qasar [00:56:36]: Massive opportunity
    Peter [00:56:37]: massive opportunity, and what’s happening now in the industry is, like, none of these concepts are new, right? What’s new is, like, this stuff is actually working now.
    Peter [00:56:46]: Right? People have wanted to use neural nets in robotics for a long time, but now, again, we have the data sets, we have the simulation technologies, where stuff is actually starting to really work, and yeah,
    Peter [00:56:58]: we’re gonna be part of that for sure.
    Alessio [00:57:00]: Do you have requests for startups or like, advice against starting certain startups? There’s a lot of, like, scale-up robotics, companies. It’s like what do you think are things
    Qasar [00:57:10]: A lot of, a lot of applied intuitions for other things.
    Qasar [00:57:14]: I think you hit a you hit a certain, what is it, badge when YC
    Peter [00:57:21]: X for Y
    Qasar [00:57:21]: right, or literally similar names. I think my biggest advice in this commercialization of technology is about constraints. We talked about hardware constraints, but there are also constraints on the commercial side, which is: we’re gonna only do things that fit in this box. That is, I think, very good for founders. The reason I think it’s not often focused on is that you have plenty of access to capital, and the technical problems are so hard you’re like, “I already have a constraint,” which is just getting this technical problem solved. And I think the venture community, generally speaking, tends to be not very technical. For them, if you just say, “If we solve this thing, it’s gonna be a lot of money,” that’s kind of enough. But you as a founder, and I’m not giving you advice on how to pitch VCs, that’ll work for VCs, you still gotta run a sustainable business. And going back to that question you asked earlier about what’s maybe not obvious about our company: this is truly compounding technology. A lot of the work that we do just compounds. We don’t throw it away. It gets better. The operating system work gets better. The dev tooling gets better. The models get better. I think you see it in Waymo as an example. Waymo is a company that was, I would say, very interesting for a long time, but not worth one hundred and twenty-six billion dollars, right? What happens is that the human brain just doesn’t emotionally understand compounding effects, and that’s gonna happen in our universe. So now if you’re a founder, you’re at the beginning of that long walk.
If you can put a little constraint on the commercial side, that makes it a little more likely for you to see the other end of that walk, ’cause if you can get to the other end, you will get the big return from compounding technology. A lot of people just don’t make it. So yeah, to summarize: think a little bit about the equation of how you use money and where you use the limited resources and limited engineers that you have. I think sometimes founders falsely take very mature companies’ strategies and apply them to their nascent company. They’re like, “Oh, well, Steve Jobs says be completely vertical.” Well, yeah, Apple in 2007 is very different from Apple in 1978 and 1982. Those companies were different. They were literally just taking electronics from other manufacturers and putting them in an enclosure. And so just be a bit more nuanced in your commercial approach, as it informs your technical approach.
    Founder Advice: Constraints, Compounding Tech, and Mature-Company Mimicry
    Alessio [01:00:03]: Do you feel differently today? Like, you just joined X, right?
    Alessio [01:00:06]: You’ve been building this company
    Alessio [01:00:08]: you’ve been building this company in stealth, and now you’re like, “Well, I should probably be talking about what I’m doing.” I think a lot of founders are in a similar way where they wanna raise a lot of money to signal they’re strong, and you raise a lot of money without spending it.
    Qasar [01:00:20]: And to hire. And to hire, yeah.
    Alessio [01:00:21]: You obviously like that. Do you think that’s still possible to, like, have a very narrow approach of, like, “Hey, we’re kinda like building a compounding thing without a grand vision right away,” versus
    Qasar [01:00:32]: It’s, it’s very difficult to answer very general questions
    Alessio [01:00:35]: Well
    Qasar [01:00:35]: so maybe I reframe it as: is it possible to build a product that has a small, let’s say, problem space and hope that the problem space will grow? Maybe that’s a different way of asking the same question, but more answerable. I think always yes. That is the old YC advice: go really deep, rather than very broad and shallow.
    Qasar [01:01:00]: With very broad and shallow, unfortunately, especially in hard tech companies, there’s just too many problems, and you’re gonna do all of them in a very mediocre way, and so the full product is actually fairly mediocre. So yeah, I’m still in the camp of find a small problem space. The other question you’re asking, which is tangential, is, like, should you build in stealth and anonymity? Well, yeah, if you’re a YC COO
    Qasar [01:01:28]: you can be
    Swyx [01:01:29]: Oh, Travis Kalanick.
    Qasar [01:01:29]: And we, yeah, we worked together at Google. We have a long history, which is another way of saying we have big networks. Of our first 400 people, the majority were Googlers. Like, a majority of the company came from this giant company we worked at, and that’s just very different. If you’re a founder who doesn’t have that experience, you have to do these things. So just don’t take my version of the world, or whatever other founder’s, Jensen’s version of the world. They are in a different time and space.
    Qasar [01:02:02]: And most importantly, their companies are in a different phase.
    Qasar [01:02:06]: And so then if you wanna take inspiration from other really young companies, that’s also bad because most of them are gonna fail.
    Qasar [01:02:11]: So the only solution you really have is to use first-principles thinking and say, “Based on my skills, my co-founder’s skills, the skills of my early team members, and what I’m hearing from customers, what’s a product space that I should build?” And
    Qasar [01:02:26]: Yeah. Does that make sense?
    Swyx [01:02:27]: Yeah, it does.
    Alessio [01:02:27]: Yeah. Sam Altman said he regrets a lot of the advice that he’s given at YC.
    Alessio [01:02:33]: So I’m always curious to ask, founders like you who’ve now been
    Qasar [01:02:36]: So I
    Alessio [01:02:36]: Just a long time ago
    Qasar [01:02:37]: everyone who leaves YC, like, does the opposite.
    Qasar [01:02:41]: well, Sam was president, I was COO.
    Qasar [01:02:43]: Right? So and we’d have a CEO, so we worked together, extremely closely would be an understatement
    Qasar [01:02:48]: ‘cause the firm was also small. The
    Alessio [01:02:50]: Yep
    Qasar [01:02:50]: YC wasn’t as big as, like, an OpenAI is. I directionally agree with that, but I would say that’s less of a YC function, it’s more that the market
    Qasar [01:03:02]: has changed.
    Qasar [01:03:03]: It is a different world. The AI industry, the AI companies, I should say more specifically, and how they relate to the other YC companies and the market, is just so fundamentally different. The amount of money raised is different, the number of investors, the sheer number of seed funds. One of our early investors is Floodgate, and they did some analysis in the late 2000s, like, the double O’s, where they were like, “There’s, like, a single-digit number of funds that were like Floodgate,” which were writing sub-$1 million checks, first checks, and were not an accelerator or incubator. And Ann, who’s one of the co-founders there, with Mike, they said that today, or, like, today as in three, four years ago, they tried to do this analysis and they, like, lost count at, like
    Qasar [01:03:46]: 350 funds or something like that. So we’re just in a different environment, so the YC advice from 2014-
    Qasar [01:03:55]: just would not apply in 2026. But Sam is, like, way better at saying these things than me.
    Qasar [01:04:00]: Like, he says it in a shorter, more interesting way than me. I can just give you, like... if you ask me, “What is the purpose of a car?” I, like, open the owner’s manual and I say
    Qasar [01:04:13]: “Number one, look, there’s a steering wheel,” instead of, like, “It can change your life and will be there.”
    Alessio [01:04:21]: Yeah, it gives you autonomy and freedom.
    Qasar [01:04:22]: Yeah, exactly. Yeah.
    Swyx [01:04:24]: And then for Peter, I was just kinda curious if there’s any particular tech or research problem that you would call out as very meaningful for you guys if it was solved, and unsolved, and if anyone is working on it, they should get in touch with you.
    Peter [01:04:40]: Yeah, I think generally, making models very efficient, right? Because we have to run on actual vehicles, physical AI is literally taking very large AI and making it very small and very efficient. And so we’re constantly just at that boundary of these limitations of, like, well, you have a great model, but now we need to make it faster and smaller. So that in general, as a field. And then I would say also, folks that are just really passionate about evaluating this technology. Model evals is a hugely difficult topic, especially in safety-critical systems. And we have, I think, a really great engineering team and researchers that work on this now, but it’s a big area of investment. And so yeah, folks that are passionate about model performance, both in terms of capability and literally latency, and then evaluation of models.
    Hiring Philosophy: Hardware/Software Boundary and Engineering Mindset
    Alessio [01:05:41]: Awesome. Any specific engineering roles that you guys are hiring for? And especially, like, who are the people that succeed at your company as engineers? I think that’s always the most important thing.
    Qasar [01:05:50]: Yeah. fly.co/careers, I think there are literally hundreds of roles. We’re looking at all the topics we talked about, from dev tooling and physical AI to operating systems, to autonomy and AI within physical machines. The types of engineers, that’s a great question. That’s actually more interesting than
    Qasar [01:06:09]: the roles, ‘cause we’re a large enough company, we’re roughly
    Alessio [01:06:11]: Hiring everything.
    Qasar [01:06:12]: Everything, yeah. We hire everything.
    Qasar [01:06:14]: Yeah. I think we’re a Sunnyvale company, and I think just from this conversation and our backgrounds, you can kind of predict a little bit of what that means. We tend to hire fairly serious people who understand low-level systems, not just with a superficial understanding of technology; engineers’ engineers, almost. We definitely hire folks who have diverse skill sets. We hire tons of specialists as well, to be very clear, but they’ve seen production, and I think that really informs how you build technology.
    Peter [01:06:53]: Yeah. I would say people that really appreciate the hardware-software boundary.
    Qasar [01:06:56]: Yeah, exactly.
    Peter [01:06:56]: Definitely in the vibe coding era, there is a crop of engineers that don’t think about hardware at all.
    Peter [01:07:05]: And we don’t have that luxury, and so people that are a little more passionate about going a little bit deeper.
    Qasar [01:07:09]: Yeah, if you were to contrast us with, like, an AI lab or something, that’s where you’re gonna get the biggest contrast, which is, like, we’re just dealing with reality. What other things? All of the classic stuff. You want folks who work hard and who love the technology and like a podcast like this, or rather
    Qasar [01:07:30]: Like, if you made it to this part of the podcast
    Qasar [01:07:33]: you’re probably qualified for or you’re interested in this.
    Swyx [01:07:37]: Yeah. And Peter said that he likes the podcast as well, which is like
    Swyx [01:07:42]: really cool.
    Qasar [01:07:43]: I’m a fan. Yeah.
    Swyx [01:07:44]: Yeah. Specifically on the hardware-software boundary part, it’s something I think about with our education system in the States, but also maybe just in general. I feel like there is that retreat away from that classical computer science or EE education
    Qasar [01:07:59]: Computer engineering or Yeah.
    Swyx [01:08:01]: And like, is there a point where you just do it yourself? Like, ‘cause at this point, you guys are the world experts on this, and actually you shouldn’t wait for some college system to spit them out for you.
    Peter [01:08:11]: You mean in terms of education and upskilling, that kind of thing?
    Swyx [01:08:14]: Yeah. Yeah, just grab, like, young
    Qasar [01:08:16]: General Motors already did it.
    Swyx [01:08:17]: Smart kids.
    Peter [01:08:19]: GMI.
    Qasar [01:08:19]: Literally.
    Swyx [01:08:19]: Is there a Harvard University?
    Qasar [01:08:21]: Yeah, that’s where I went to for undergrad. Went to the General Motors Institute.
    Swyx [01:08:25]: That did not come up. I saw HBS.
    Swyx [01:08:27]: I didn’t
    Qasar [01:08:27]: Everyone sees HBS.
    Qasar [01:08:31]: The Harvard brand, Lewis is high.
    Swyx [01:08:34]: What’s General Motors Institute like? What
    Qasar [01:08:36]: It started 100 years ago to answer this exact question, literally the question you just asked, which is, like
    Qasar [01:08:40]: not enough engineers in Michigan. You’re talking about the early days of the modern corporation
    Qasar [01:08:45]: General Motors being... There’s a great book, Alfred P. Sloan’s My Years with General Motors, that is highly recommended, which basically talks about what becomes a modern corporation. But a part of that is they’re like, “We’re basically buffering on engineers.” So they started a school, and actually even Google, as recently as probably 10 years ago, was thinking of starting a university. Internally there were discussions on it. So yeah, we definitely upskill folks as well. The amount of training we do internally is actually surprising. Yeah. But it’s a luxury you have when you’re at our size.
    General Motors Institute, Education, and the Curiosity Mindset
    Qasar [01:09:20]: When you’re, like, 25 engineers
    Swyx [01:09:22]: No.
    Qasar [01:09:22]: you just gotta survive. So again, take advice that’s relevant for your company rather than, like, immediately start trying to take high schoolers
    Qasar [01:09:29]: and make them engineers.
    Swyx [01:09:30]: But I, like I did go up to a class that you taught ‘cause, like, it sounds like you can teach a lot.
    Peter [01:09:36]: Yeah. Well, I think honestly, one of the most amazing use cases of these large models now is education, right?
    Peter [01:09:42]: Like, I’ve taken an engineer, a very good engineer with an aerospace engineering background, and in a relatively short time span he’s doing very competent front-end work, very competent back-end work, with the help of these models.
    Peter [01:09:57]: And like, not only can you do the implementation with them, but you can also just learn, right? It’s like you ask questions and you don’t feel embarrassed ‘cause the model’s
    Peter [01:10:04]: not gonna, model’s not gonna call you out on anything.
    Qasar [01:10:07]: Yeah. I think the thing you probably need more than an engineering degree, though engineering degrees are, like, very important, like, I don’t know if there’s a way to shortcut, like, fluid dynamics or heat transfer
    Peter [01:10:17]: The fundamental stuff
    Qasar [01:10:17]: the fundamental stuff, at least on the mechanical side, is an engineering mindset, and not everybody actually has that. Some people are emotionally drawn towards arts or something else, and that’s completely fine. There’s no judgment there. But I think the engineering mindset, maybe in a more usable way, is, like, wanting to understand a lower level, and the lower level, and the lower... Like, how do photons move?
    Peter [01:10:42]: And extreme curiosity.
    Qasar [01:10:44]: Extreme curiosity. Like, what is light? What is a radio wave? Like, these really fundamental questions.
    Peter [01:10:49]: Right. And if you get curious enough about software, you ultimately end up in hardware.
    Peter [01:10:55]: And so
    Swyx [01:10:56]: That’s the Alan Kay quote. Yeah.
    Qasar [01:10:57]: Yeah, exactly.
    Swyx [01:10:58]: So I’m trying to make analogies and then do all these things. Like, you’re kind of a blend between new General Motors and Tesla autonomy division for everyone else.
    Qasar [01:11:07]: We do work in all these other fields. I think if you talk to our trucking customers, they wouldn’t even perceive that. In some sense they’re like, “Oh, you guys did some automotive stuff, but you’re really helping us.” So
    Swyx [01:11:18]: Automotive is not trucking?
    Qasar [01:11:19]: No, no. That’s
    Swyx [01:11:20]: It’s, like, a whole
    Qasar [01:11:21]: It’s separate. There are different problems. The mass... And you have the general categories of on-road and off-road. I think that’s what you’re thinking. So there’s on-road and off-road, but within on-road there’s all these subclasses
    Swyx [01:11:33]: Oh, okay
    Qasar [01:11:33]: of machines. Especially when you look at a delivery robot that doesn’t have a human in it. That’s actually very different, because now you’re not concerned with, like, the actual feeling that you have
    Qasar [01:11:45]: when you’re in a self-driving system. You don’t have to account for that. You can
    Swyx [01:11:48]: Just brake.
    Qasar [01:11:48]: You can brake hard.
    Qasar [01:11:50]: And you don’t care about jerk and all of these metrics don’t, or become in
    Peter [01:11:53]: The way to think about it, honestly, is a little bit like this: any system that you as a human would need special training to operate, you can think of a little bit differently. So, like, the license to operate a truck is different from the license to operate a car
    Peter [01:12:04]: which is different from the license to fly a plane. It’s different from You get it, right?
    Swyx [01:12:08]: Awesome, guys. Thank you for taking the time.
    Qasar [01:12:10]: Yeah, thanks for having us.
    Peter [01:12:11]: Thanks for having us.
    Peter [01:12:11]: Thank you. [outro music]


  • Latent Space: The AI Engineer Podcast

    AIE Europe Debrief + Agent Labs Thesis: Unsupervised Learning x Latent Space Crossover Special (2026)

    23/04/2026 | 54min
    Today, we check in a year after the first Unsupervised Learning x Latent Space Crossover special to discuss everything that has changed (there is a lot) in the world of AI. This episode was recorded just after AIE Europe, but before the Cursor-xAI deal.
    Unsupervised Learning is a podcast that interviews the sharpest minds in AI about what’s real today, what will be real in the future and what it means for businesses and the world - helping builders, researchers and founders deconstruct and understand the biggest breakthroughs.
    Thanks to Jacob and the UL production team for hosting and editing this!
    Jacob Effron
    * LinkedIn: https://www.linkedin.com/in/jacobeffron/
    * X: https://x.com/jacobeffron
    Full Episode on Their YouTube
    We discuss:
    * swyx’s view from the center of the AI engineering zeitgeist: OpenClaw, harness engineering, context engineering, evals, observability, GPUs, multimodality, and why conference tracks now reveal what matters most in AI
    * Whether AI infrastructure has finally stabilized: why “skills” may be the minimal viable packaging format for agents, why infra companies have had to reinvent themselves every year, and why application companies have had an easier time surviving model volatility
    * The vertical vs. horizontal AI startup debate: why application companies can act as the outsourced AI team for enterprises, why some horizontal companies still matter, and why sandboxes may be the clearest reinvention of classic cloud infrastructure for the AI era
    * The “agent lab” playbook: starting with frontier models, specializing for your domain, then training your own models once you have enough data, workload, and user behavior to justify the cost and latency savings
    * Why domain-specific model training is real, not just marketing: how companies like Cursor and Cognition can get users to choose their in-house models, and why search, domain specialization, and distillation are becoming more important
    * Open models, custom chips, and alternative inference infrastructure: why swyx has turned more bullish on open source, why non-NVIDIA hardware is suddenly getting real attention, and why every 10x speedup can unlock new product experiences
    * What it means to sell to agents instead of humans: why agent experience may mostly just be good developer experience by another name, why APIs and docs matter more than ever, and how pretraining-data incumbents are compounding advantages in an agent-first world
    * Why memory and personalization may become the next big wedge: today’s models mostly reward frequency of mentions, but in the future, swyx expects product choice to be shaped much more by personalized memory systems
    * The state of the AI coding wars: why coding has become one of the largest and fastest-growing categories in AI, how Anthropic, OpenAI, Cursor, and Cognition have all ridden the wave, and why the category may still have more room to run
    * Capability exploration vs. efficiency: why the industry is still in a token-maxing, experiment-heavy phase where people are rewarded for spending more rather than less
    * Claude Code vs. Codex and the strange stickiness of coding products: why first magical product experiences may matter more than expected, and why the bigger mystery may be why only a few names have emerged as real winners so far
    * What the end state of the coding market might look like: two major players, a longer tail of niche products, and possible disruption if Microsoft, Mistral, xAI, or the Chinese labs push harder into coding
    * Where application companies still have room against the labs: why frontier labs are trying to expand into verticals like finance and healthcare, but still leave space for focused companies that own the workflow and the last mile
    * Why coding may be a preview of every other AI market: the first category to truly go parabolic, the clearest example of foundation model companies colliding with application companies, and a template for how future vertical AI markets may develop
    * Why AI valuations now feel unbounded: from billion-dollar ARR products built in a year to trillion-dollar market caps, swyx and Jacob unpack how the AI market has broken traditional startup intuitions about scale and durability
    * Consumer AI vs. coding AI: why ChatGPT’s consumer category may have plateaued on frequency and product design, while coding continues to feel like a daily-use category with real momentum
    * The next product frontier beyond coding: consumer agents, computer use, and “coding agents breaking containment,” with swyx’s thesis that 2025 was the year of coding agents and 2026 may be the year they begin to do everything else
    * Whether foundation models are really killing startup categories: why swyx is less worried for early founders, more worried for mid-size startups and traditional SaaS, and why building something ambitious may now be the best job interview for a frontier lab
    * AI vs. SaaS and the internal culture war around adoption: the tension between AI-native employees who want to rip out expensive software and skeptics who think quick AI-built replacements create fragile systems
    * Why traditional SaaS may be under real pressure: swyx’s own experience spending six figures on event and sponsor management software, the temptation to rebuild it cheaply with AI, and the broader question of whether teams will trust custom AI-native replacements
    * Biosafety, security, and frontier model access: why swyx raised biosafety at a dinner with Anthropic’s Mike Krieger, why Krieger argued security is the bigger issue, and what restricted model releases reveal about Anthropic vs. OpenAI
    * The era of giant models: why 10T+ parameter systems may only be a temporary rationing phase before bigger clusters arrive, why labs may increasingly keep their most powerful models private for distillation, and why scale alone no longer feels like a complete answer
    * Memory as the slowest scaling factor in AI: why context windows have improved far more slowly than people hoped, why million-token context still has not changed most real workflows, and why memory may be the key bottleneck for the next generation of systems
    * What swyx changed his mind on in the past year: becoming more bullish on open models, more convinced that the top tier of agent startups behaves very differently from the median AI company, and more optimistic about fine-tuning and specialized model adaptation
    * “Dark factories” and zero-human-review coding: the next frontier after zero human-written code, where models not only write the code but ship it without human review, forcing companies to rethink testing and verification from first principles
    * Why RL and post-training may matter more than people assumed: even if the resulting models get thrown out every few months, the data, workflows, and domain-specific improvements persist
    * Synthetic rubrics, Dr. GRPO, and multi-turn RL: why reinforcement learning is becoming much more domain-specific and multi-step than many people realize, opening the door to much deeper customization
    * The next frontier after coding: memory, personalization, and world models, including why swyx thinks world models matter not just for robotics or gaming, but for giving AI something closer to lived understanding
    * Fei-Fei Li, spatial intelligence, and the Good Will Hunting analogy: the idea that today’s LLMs may know everything by reading it all, but still lack the lived experience that turns knowledge into a deeper kind of intelligence
    Timestamps
    * 00:00:00 Intro preview: AI coding wars, startup pressure, and market structure
    * 00:00:28 Welcome to the Latent Space × Unsupervised Learning crossover
    * 00:01:17 What AI builders are focused on now: OpenClaw, harnesses, and infra
    * 00:04:33 Why AI infra is harder than apps, and where startups can still win
    * 00:06:39 Should companies train their own models?
    * 00:09:28 Open models, custom chips, and the new inference race
    * 00:11:25 Designing products for agents, not just humans
    * 00:16:49 The state of the AI coding wars in 2026
    * 00:19:27 Capability exploration, token-maxing, and why coding is going parabolic
    * 00:21:41 What the end state of the coding market could look like
    * 00:23:50 Where app companies still have room against the labs
    * 00:27:02 Why AI valuations and market swings feel unprecedented
    * 00:28:56 Consumer AI vs. coding AI, and why sticky products still matter
    * 00:32:28 What the next breakthrough product experience might be
    * 00:32:53 2026 thesis: coding agents break containment and eat the world
    * 00:35:27 Are foundation models wiping out startup categories?
    * 00:37:33 AI vs. SaaS, vibe coding, and internal team tensions
    * 00:40:01 Biosafety, security, and the politics of restricted model releases
    * 00:42:19 Giant models, compute constraints, and the limits of scale
    * 00:44:30 Memory as the real bottleneck in AI
    * 00:44:57 Why swyx changed his mind on open models
    * 00:47:44 Dark factories and the future of zero-human-review coding
    * 00:49:36 Why post-training and RL may matter more than people think
    * 00:51:50 Memory, world models, and the next frontier of intelligence
    * 00:53:54 The Good Will Hunting analogy for LLMs
    * 00:54:21 Outro
    Transcript
    [00:00:00] swyx: Isn’t that crazy? That number is just mind boggling.
    [00:00:03] Jacob Effron: What is the state of the AI coding wars today?
    [00:00:05] swyx: We’re in a phase of sort of like capability exploration. The general thesis that I have been pursuing now is that, the same way that 2025 was the year of coding agents, 2026 is coding agents breaking containment to do everything else.
    [00:00:16] Jacob Effron: Do you worry about the foundation models just getting into a bunch of these startup categories?
    [00:00:21] swyx: Mid-size startups. Yes.
    [00:00:23] Jacob Effron: What do you think the end state of this market is
    [00:00:25] swyx: for the market structure to, to significantly change? There would be
    [00:00:28] Jacob Effron: Today on Unsupervised Learning, we had a fun episode in what’s really become an annual tradition: a crossover episode with our friends at Latent Space.
    swyx and I sat down and we talked about everything happening in the AI ecosystem today: what we thought of the various changes at the model layer, what’s happening in the infra world, the coding wars, and a bunch of other things. It’s a ton of fun to do this with someone I really respect and another great podcaster in the game.
    Without further ado, here’s our episode. Well, swyx, this is super fun to be back with another Unsupervised Learning x Latent Space crossover episode.
    [00:01:02] swyx: Yeah,
    [00:01:02] Jacob Effron: I feel like there are a lot of places we could start, but one thing I always find fascinating about the way you spend your time is that you obviously are at the epicenter of this engineering movement and community, and you run these events and conferences and put on these awesome talks, and I think you just have a great pulse on the zeitgeist of what’s going on.
    [00:01:16] swyx: Yeah.
    [00:01:17] Jacob Effron: Maybe to start, just what are the biggest topics people are thinking about right now?
    [00:01:21] swyx: Yeah, so I just came back from London, where we did AIE Europe, and we’re doing roughly one per quarter now, which...
    [00:01:27] Jacob Effron: Yeah, you’ve really upped the...
    [00:01:27] swyx: Hopefully.
    [00:01:28] Jacob Effron: ...upped the pace.
    [00:01:29] swyx: We’re trying to match AI speed, you know?
    [00:01:30] Jacob Effron: Yeah, exactly. The topics would be completely different, I imagine.
    [00:01:33] swyx: Yeah. You know, I definitely curate the tracks, so you can see what I think when you see the track list and the speakers that I invite. Obviously OpenClaw is, like, the story of the last four or five months, and then just below that I would consider harness engineering and context engineering to be two related topics in agents and RAG.
    And then there’s a long tail of evergreen stuff like evals, observability, GPUs, and LLM infra in general. We also have other updates on multimodality and generative media, let’s call it.
    But the first three that I mentioned are definitely top of mind for people.
    [00:02:13] Jacob Effron: I think harnesses in particular are so interesting. There was this tweet from Harrison Chase, the LangChain CEO, that caught my eye recently where he said it finally feels like we have stability around the infrastructure for AI.
    And I think what he basically was implying is, like, look, over the past two, three years, as a company at the epicenter of AI infrastructure, it was a bit like playing whack-a-mole, right? You were constantly moving around with however the building patterns were evolving.
    [00:02:36] swyx: For Harrison, for sure, right? Like, he’s basically had to reinvent the company every year since he started LangChain, right? It was LangChain, LangGraph, and LP agents, and I think he’s one of the most nimble, adept, sharp people about this. Yeah.
    [00:02:49] Jacob Effron: Saying now is finally the time for stability.
    [00:02:51] swyx: Yeah.
    [00:02:52] Jacob Effron: Yeah. Do you buy that, or what do you make of that take?
    [00:02:56] swyx: I think that it’s very expensive to say “this time is different” sometimes, but when you’re just writing code, it’s actually okay to just try to make a call, and I think it may not even matter if this call is right or not.
    I just don’t even care that much, because you can be right on a thesis, but if you don’t figure out how to monetize the thesis, then who cares if you said something first? That said, it does feel like, for example, we went through a lot of different ways of packaging integrations up with agents.
    And it feels like we’ve landed at skills, which is, like, the minimal viable format, which is just a markdown file with some scripts attached to it, and I don’t see how it can be more simple than that. And so there is some justification for the stability around harnesses. I feel like there may be more adaptation with regards to maybe the real-time elements or subagents or memory or any of those agent disciplines, let’s call it, in agent engineering.
    But if the thesis is that agents are LLMs with tools in the loop, with a file system, that can do retrieval, with skills and all this standard tooling that now seems to be relatively consensus, then probably that makes sense. I just think there’s no point trying to stake your reputation on this thesis that we’re there, because if it changes again, just change with it.
    It’s fine.
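    The two ideas in this answer, a skill as "a markdown file with some scripts attached" and an agent as an LLM with tools in a loop over a file system, can be sketched in a few lines. This is only an illustration under stated assumptions, not any particular harness: the `SKILL.md` file name, the `Skill` and `run_agent` names, the action dictionary shape, and the scripted stand-in for the model are all hypothetical.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable


@dataclass
class Skill:
    """A skill as described above: markdown instructions plus attached scripts."""
    name: str
    instructions: str
    scripts: list[Path] = field(default_factory=list)


def load_skill(skill_dir: Path) -> Skill:
    """Load a skill folder. The SKILL.md file name is an assumed convention."""
    instructions = (skill_dir / "SKILL.md").read_text()
    scripts = sorted(p for p in skill_dir.iterdir() if p.suffix in {".py", ".sh"})
    return Skill(skill_dir.name, instructions, scripts)


def run_agent(model: Callable, tools: dict, task: str, max_steps: int = 5) -> str:
    """Tools in a loop: ask the model, run the tool it requests, feed the result back."""
    history = [("user", task)]
    for _ in range(max_steps):
        action = model(history)
        if "answer" in action:  # the model decided it is done
            return action["answer"]
        result = tools[action["tool"]](action["arg"])
        history.append(("tool", result))
    return "step limit reached"


def scripted_model(history: list) -> dict:
    """Hypothetical stand-in for an LLM call: read the skill first, then answer."""
    role, content = history[-1]
    if role == "user":
        return {"tool": "read_file", "arg": "SKILL.md"}
    return {"answer": f"read {len(content)} chars of skill instructions"}


def demo() -> str:
    """Build a throwaway skill folder on disk and run the loop against it."""
    with tempfile.TemporaryDirectory() as tmp:
        skill_dir = Path(tmp) / "release-notes"
        skill_dir.mkdir()
        (skill_dir / "SKILL.md").write_text("# Release notes\nSummarize merged PRs.\n")
        (skill_dir / "collect.sh").write_text("git log --oneline\n")
        skill = load_skill(skill_dir)
        tools = {"read_file": lambda rel: (skill_dir / rel).read_text()}
        return run_agent(scripted_model, tools, f"Use the {skill.name} skill")
```

    Swapping `scripted_model` for a real model call is the only change a working harness would need in this sketch; the loop, the tool table, and the skill folder stay the same, which is roughly the stability argument being made.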
    [00:04:33] Jacob Effron: Yeah. I’ve always been struck by how that is much more challenging for infrastructure companies than application companies. Obviously, on the application side you’ve seen Bret Taylor from Sierra, Max from Legora. They’re like, look, we build what’s ahead of the models, and we’re willing to throw everything out every three months as the models get better and better.
    But the thing you at least have there is an end customer that’s decently sticky. They will mostly stick; they’ll give you a shot, at least, at building these things. What I’ve always found more challenging about having to reinvent yourself every three months at the infrastructure layer is that developers are definitely a pickier audience, maybe, than an accounting firm or a bank.
    And so it’s definitely a more challenging position to be in, to have to constantly reinvent yourself.
    [00:05:17] swyx: Yeah. And when they churn, it’s, like, very complete. They’ll leave for the hot new thing, because there’s no defensibility, I guess. Even if you are a database, people can migrate workloads off databases.
    It’s a known thing. So I think basically what we’re talking about is the vertical versus horizontal debate in AI startups. And the way I think about it also is just that when you are Legora, when you are Abridge, you are the outsourced AI team, right? Your job is to apply whatever state-of-the-art AI methods...
    [00:05:55] Jacob Effron: Yeah. Like this translation layer between model capabilities and your own customers.
    [00:05:57] swyx: Yeah, to the end customers. And, like, well, if they didn’t have you, they would have to hire in-house, and they’re not gonna hire in-house, so they have you. And I think that’s reasonable, very robust to whatever trends and discoveries people make in the engineering layer.
    I do think there are sort of useful horizontal companies being built, but they’re all very much, sort of, reinventions of classic cloud in the AI era, the primary one being sandboxes, which is another form of compute, guys, let’s not get too excited about it.
    But, I mean, the workloads are enormous.
    [00:06:38] Jacob Effron: Right.
    [00:06:38] swyx: Yeah.
    [00:06:39] Jacob Effron: It’s interesting, and I feel like, as part of the questions that folks are asking around infrastructure, there’s a lot around the extent to which companies should have their own AI teams and what they should be doing in-house.
    I think there are questions around: should people be training their own models? Should people be doing RL in-house based on the data they have? I feel like one has to evolve their takes on this every three months, given the pace. But where are you at on this today?
    [00:07:00] swyx: I think, well, I mean, actually all models have gone up. And obviously I’m involved in Cognition, and also Cursor’s doing a lot of its own model training. And I think that is some part of what I’ve been calling the agent lab playbook, where you start off with the state-of-the-art models from the big labs and you specialize for your domain.
    But once you have enough workload and enough high-quality data from your users, then you can obviously train your own models and save a lot on cost and latency and all that good stuff. You also get a marketing bonus of calling it some fancy name and putting out some research.
    [00:07:38] Jacob Effron: From my seat, I can’t tell how much of it is actual value that’s provided to the end user and how much of it is that marketing bonus, right? It seems some combination of the...
    [00:07:45] swyx: I think it’s both.
    [00:07:46] Jacob Effron: Yeah.
    [00:07:46] swyx: No, there actually is real value. And you know that for a number of reasons. One, even when it’s not subsidized, people do choose it as one of the top four or five.
    This is both Composer 2 and SWE-1.6, each one of the top five models. In a fair market, in a free market, yeah, in a model switcher, people do choose it, and it’s not subsidized, so that’s as good as it gets. But beyond that, domain-specific models, for example for search, which both companies have, absolutely make a ton of sense.
    Everyone says, yeah, we should always do this. And honestly, I think the infrastructure for that is becoming easier with Thinking Machines’ Tinker thing as well as primary, like, lab stuff. I mean, this is one of those reversals of the bitter lesson, where you first bootstrap on the large, general-purpose models to get big.
    And as you get very well-defined workloads that are high quantity but not high variance, then you just distill down to a smaller model and run that on your own, right? Which totally makes sense.
    [00:08:50] Jacob Effron: What I’m less clear on is the kind of DIY RL use case, which I think is really mostly around, you know, improved, uh, quality for, for different things.
    Obviously there’s probably like more efficient ways to, you know, get a smaller model that’s faster and cheaper. And it’ll be interesting to see whether, you know, obviously you had, you know, uh, two, three years ago this whole class of companies that were, you know, pre-training and claiming better outcomes in, in their domains, then getting kind of cooked as each model iteration improved.
    You know, I wonder whether a similar story plays out in the, uh, in the RL space. Yeah, for the focus on, on pure outcomes and quality, not the cost side, where clearly your own models for cost at scale makes a ton of sense.
    [00:09:28] swyx: I think these are two sides of the same coin.
    Like you basically always want to hold, uh, quality constant or trade off a little bit of quality for a drastic decrease in cost. And that’s true for everyone. Uh, one element I wanted to bring out, which is very much in favor of open models, is custom chips. So this would be Cerebras, but also Taalas. And then there’s a huge range of stuff in between.
    This has been a huge story this past year on just like everything non-Nvidia is getting bid up, including like freaking MatX is working for, which is very, which is very rewarding for me, but I think one of those things where like, oh, like suddenly, because the number of alternative, uh, hardware is increasing and the inference speed that you can get is insanely high,
    like, um, we’re talking thousands of tokens per second instead of less than a hundred. So the trade off for quality doesn’t hold as much anymore because the speed is so high.
    [00:10:24] Jacob Effron: Have you seen a lot of companies go all in on the alternative chip?
    [00:10:26] swyx: So Cognition has, yeah, on Cerebras, uh, and, and so has OpenAI.
    Um, uh, and so no, I don’t think so beyond that, uh, and that, do you think that’s like a, that’s mostly, that’s foreshadowing of, that’s, yeah. I used to be kind of a skeptic in terms of like, okay, so what if I get my inference at a hundred tokens per second sped up to 200 tokens per second? It’s only 2x faster.
    It’s not that big a deal. Um, but when you, uh, I think every 10x does unlock a different usage pattern. Um, and you, we have proof in Taalas and, and some of the others that you can actually, um, drastically improve inference speed. And what happens from there? I don’t even really know, like it’s, it’s so hard to predict when entire applications just appear at once.
    Yeah. Uh, and it also isn’t that expensive, right? So like, um, this is one of those things where like, I, I think the, the investment cycle is gonna be multi-year. Um, and I would caution people to not dismiss it too, too quickly.
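The 2x-versus-10x point is easier to feel with wall-clock numbers. The following is a minimal sketch, assuming a 2,000-token response (an illustrative figure, not one quoted in the conversation) and a constant decode speed with no network or queueing overhead. The speeds are the round numbers mentioned above: about a hundred tokens per second, 200, and "thousands".

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Wall-clock seconds to stream a response at a fixed decode speed."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Assumption: one agent response is 2,000 output tokens.
RESPONSE_TOKENS = 2_000

# Round-number decode speeds taken from the discussion.
SPEEDS = {"baseline": 100, "2x": 200, "10x": 1_000}

latencies = {
    name: generation_seconds(RESPONSE_TOKENS, tps) for name, tps in SPEEDS.items()
}

for name, seconds in latencies.items():
    print(f"{name:>8}: {seconds:5.1f} s")
```

Under these assumptions, doubling the speed takes a 20-second wait to 10 seconds, which is still a wait, while a 10x speedup brings it to 2 seconds, which is close to interactive. That is one way to read the claim that each 10x unlocks a different usage pattern.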
    [00:11:25] Jacob Effron: Yeah. I mean, one other like infra question I was curious to get your thoughts on is obviously it seems increasingly a lot of the cutting edge infra companies are building for agents as the buyers of their product or users of their product, right?
    [00:11:35] swyx: Ooh,
    [00:11:36] Jacob Effron: and
    [00:11:37] swyx: another huge theme. Yeah. Yeah.
    [00:11:38] Jacob Effron: And I’m trying to figure out like what. What, what do you have to do differently about selling into agents? Um, are they just the ultimate rational developers? Uh, or is there, you know,
    [00:11:46] swyx: no, absolutely not. Um, I think they are easily prompt-injected and, uh, very tuned towards like, basically compounding existing winners.
    [00:11:57] Jacob Effron: Yeah,
    [00:11:57] swyx: so like if, like, congrats if you won the lottery for getting into the training data right before 2023, because now you’re like installed in there for the foreseeable future. But yeah. Uh, you know, one stat that Vercel, uh, CTO Malte dropped at my conference was that there are now, uh, 60% of traffic to Vercel’s, um, like app arch, like admin app architecture for like configuring Vercel applications, uh, is bot.
    It’s not, it’s not human. Uh, so like your primary customer is agents now. Um, and it’s mostly co like mostly coding agents, mostly people using CLI or MCP or whatever. But yeah, I mean, I think, step one, if it doesn’t exist as an API that agents can use, it doesn’t exist. Right, right. Which I think is like, uh, it’s a good hygiene thing anyway, to, to make everything API available, but now there’s like an extra, um,
    push on like product people to not only work on the UI, um, you should probably work on the CLI stuff. Beyond that, I think honestly there is like, so I, I come from the sensibility of, I think everything that you are trying to do for agent experience now, which is the term that Matt Biilmann and Netlify are trying to coin, is the same thing that you should have been doing for developer experience.
    That you should have had good docs, you should have had a consistent API, uh, that is mostly stateless. Um, you should have, I guess, discoverable or progressive disclosure or like search or like whatever. And so now that people have energy in like finding these customers to do that, that’s great. Um, do I believe in
    extending beyond that into something like AEO, um, for gaming the chatbots? Not necessarily, but obviously there’s gonna be huge advantages for people who figure out the short term wins. Yeah. And short term wins can compound.
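The checklist above (a consistent API, mostly stateless, discoverable with progressive disclosure) can be sketched in a few lines. This is only an illustration under stated assumptions: the `deploy` and `logs` commands, their parameters, and the request shape are all hypothetical and are not any real vendor's API.

```python
from typing import Any, Callable


def _deploy(args: dict[str, Any]) -> dict[str, Any]:
    # Hypothetical handler: pretends to queue a deployment.
    return {"status": "queued", "project": args["project"], "env": args.get("env", "preview")}


def _logs(args: dict[str, Any]) -> dict[str, Any]:
    # Hypothetical handler: pretends to return recent log lines.
    return {"project": args["project"], "lines": []}


# Hypothetical command registry; names and fields are illustrative only.
COMMANDS: dict[str, dict[str, Any]] = {
    "deploy": {
        "summary": "Deploy a project to an environment.",
        "params": {"project": "required", "env": "optional, default 'preview'"},
        "required": ["project"],
        "handler": _deploy,
    },
    "logs": {
        "summary": "Fetch recent logs for a project.",
        "params": {"project": "required"},
        "required": ["project"],
        "handler": _logs,
    },
}


def handle(request: dict[str, Any]) -> dict[str, Any]:
    """Stateless entry point: each request carries everything needed to answer it."""
    command = request.get("command")
    if command is None:
        # First level of disclosure: only names and one-line summaries.
        return {"commands": {name: spec["summary"] for name, spec in COMMANDS.items()}}
    spec = COMMANDS.get(command)
    if spec is None:
        return {"error": f"unknown command {command!r}", "hint": "send {} to list commands"}
    if request.get("describe"):
        # Second level: full parameter details, only when asked for.
        return {"command": command, "summary": spec["summary"], "params": spec["params"]}
    args = request.get("args", {})
    missing = [name for name in spec["required"] if name not in args]
    if missing:
        # Consistent, machine-readable errors that tell the caller how to recover.
        return {"error": f"missing required params: {missing}", "hint": "resend with describe=True"}
    handler: Callable[[dict[str, Any]], dict[str, Any]] = spec["handler"]
    return handler(args)
```

An agent (or a human developer) can start with `handle({})`, drill into `handle({"command": "deploy", "describe": True})`, and then call `handle({"command": "deploy", "args": {"project": "blog"}})`. Nothing here is agent-specific, which is the point being made: the same properties serve both audiences.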
    [00:13:43] Jacob Effron: Do you think these compounding advantages to like the, the pre-training data cutoff companies, like, you know, obviously over some period of time, I imagine that doesn’t persist.
    And so as you think about like. I dunno, three, four years from now what the, you know, selection criteria end up being. Do you think it still mirrors exactly what you were saying before? Like it’s exactly what you should have been doing all along to sell a good product to developers?
    [00:14:01] swyx: It could be, except that I think in three, four years we’ll probably have much better memory and personalization.
    So then general AEO or GEO doesn’t really matter as much. So I think whatever memory or personalization system we end up with will probably determine what you end up choosing much more than, than what is currently the case, which is just frequency of mentions, let’s call it. Yeah,
    [00:14:26] Jacob Effron: yeah.
    [00:14:26] swyx: Uh, so you just spam quantity and I think that’s, I mean, that’s something I’m looking forward to.
    I do think, like, like, you know, I, I think that the fundamental exercise to work through for yourself is if you start a new, um, sort of, uh, disruptor company now, there’s a, there’s a big incumbent that everyone knows, like, like Supabase. Supabase is like, kind of like the Postgres, like database, uh, incumbent.
    If you wanna start like a new Supabase, how would you compete with them? And I don’t necessarily have the answer, but I, I, I do think like people like Resend, like relatively new, I think they started like 2023, and still there was, there was a recent survey where like, people checked what Claude recommends by default.
    If you just don’t prompt it with anything, just say, gimme an email provider, and it says Resend in like 70, 70% of cases. Like the fact that you can get in there with like such a relatively short existence, I think is, is encouraging.
    [00:15:14] Jacob Effron: Yeah.
    [00:15:14] swyx: I do think like, um, you do want to do whatever it is to, to like to, to get in that very short mentions list because, um, it’s not gonna be 20 of them, it’s gonna be like three.
    [00:15:26] Jacob Effron: No, definitely. It feels like, uh, you know, probably more, more consolidation than ever. Uh, or, or kind of like, you know, uh, a winner take most market than maybe the, the, the physics of go-to market in the past. Yeah. Might have, uh, enabled.
    [00:15:38] swyx: The other thing also is like, semantic association is gonna be very important, uh, in the sense that like, you want to do like the combo articles where you’re like, use my thing with Vercel, with blah, blah.
    And like that all gets picked up in a, in a corpus. And so that’s probably one thing that you, you wanna do well. I don’t know what else. Uh, it’s, it’s, it’s, it’s one of those things where like, I think I feel, I feel I’m behind, uh, I don’t know how you feel about this, but like,
    [00:16:04] Jacob Effron: I think AI is just everyone constantly feeling like they’re behind some, uh,
    [00:16:08] swyx: yeah.
    With,
    [00:16:09] Jacob Effron: I wanna meet the person that doesn’t feel behind,
    [00:16:11] swyx: but like with, with AX, right? Like, so, so like, my, my stance was that exactly what I said before, like everything that you, that you should do for agents is something that you should have done for humans anyway. Yeah. And so, to the extent that you’re just getting more energy to, to do things for agents, great.
    But like, uh, it’s hard to articulate what new thing apart from just like more spam, um, that you should be doing. Anyway, that would be my take right now. Um, I I, I do think like there, there will be more turns at this. I think the personalization turn that is coming, um, will be big. And I don’t know what that looks like because like basically we’re kind of, we feel kind of tapped out on the memory side of things.
    [00:16:49] Jacob Effron: Yeah. I, I guess since we last chatted, you know, you, you took this role over at Cognition, um, and you obviously have a, have a front row seat to the AI coding space today. You know, I feel like coding in many ways, you know, people view it as this, like, I mean, besides being like the, the mother of all markets and this massive opportunity, I think it’s kinda a preview of like, what’s to come for many other spaces.
    Both. Yeah. You know, I feel like agents are most advanced in coding. I also feel like the, you know, competition between foundation models and application companies, you know, and, uh, mirrors what we may see in other spaces. And so maybe for our listeners, can you just lay out like what is the state of the AI coding wars today?
    [00:17:25] swyx: Um, it is massive, right? Like, uh, and I don’t think necessarily, last time we talked about this, we appreciated the size of what
    [00:17:32] Jacob Effron: No, I wish we did.
    [00:17:33] swyx: State of AI coding wars today, um, both OpenAI and Anthropic have made it their P0s to compete in coding. Um, and Anthropic is like 2.5 billion in ARR just from Claude Code.
    The way they recognize ARR is up for debate. Uh, OpenAI, I don’t think a public number is known, but let’s call it 2 billion as well. And then Cursor is like, rumored to be 2 billion, you know? And, and those, those are like the public numbers that are known? Yeah. Um, so like huge markets that have just been created in the past one year.
    Like, like Anthropic, just like Claude Code just recently celebrated their one year anniversary, which is, yeah, pretty nice. Um, so, and then I think, like the other thing that I see is there’s, there’s some other people who are like, oh, here’s like the, the sort of relative penetration of, uh, Claude use cases, right?
    Like, and it’s like coding 50% and then legal, whatever, health, uh, it’s like the, the remaining ones. And there was a very popular tweet that was like, okay, look at the, the empty space and all these other use cases. If you are a new founder today, you should be betting on the other stuff because on, on a sort of catch-up, yeah, theory.
    And my, my pushback is the same pushback that, uh, I had on app over Google, which is like, well, well why is this time different? Like, why, if it went from let’s say 10 to 50% in the past year, why can’t it keep going? Uh, and like getting that wrong is actually a very painful one because you could have just done the momentum bet
    instead of the mean reversion bet. So I, I, I think that that is the, the state of things now, that people are very, very much into psychosis. Um, they are getting rewarded for spending more rather than spending less. And I think we’re not in that phase of efficiency. We’re in a phase of sort of like capability exploration.
    So I think people who are more crazy, who are more, uh, creative, um, get rewarded comparatively. Yeah.
    [00:19:27] Jacob Effron: Well, it’s interesting. I mean, it feels like behind these like token-maxxing leaderboards and whatnot is this, it’s like the first phase of this transition from a workforce perspective is you just gotta show your employer like, Hey, I, I use these tools.
    [00:19:37] swyx: Here’s my number of tokens I cost, and that’s it. They don’t care about the quality. Right. It is, uh, maybe distasteful to someone who cares about the craft and, and all that. Um, but directionally everyone just wants you to go up regardless. And so, um, it is not very discerning, and it’s probably very sloppy, but I think it’s net fine because we’re still probably underusing AI just in general.
    Yeah. Um, and so I think that’s like very interesting. Like we had on the podcast, uh, Ryan Lopopolo from OpenAI, who spends a billion tokens a day. Yeah. Um, and that’s, for those counting at home, it’s like something like $10,000 worth a day of API tokens, if they, they did market rates, um, and like most of us can’t afford that.
    Yeah. But like. And, and, and probably a lot of what he does is slop.
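For those counting at home, the two round numbers above (a billion tokens a day, roughly $10,000 a day at market rates) imply a blended price per million tokens. The sketch below only restates that arithmetic. The 30-day month is an assumption, and real API pricing differs between input, output, and cached tokens, so the blended figure is a rough average and not any provider's list price.

```python
TOKENS_PER_DAY = 1_000_000_000  # "a billion tokens a day"
DOLLARS_PER_DAY = 10_000        # "something like $10,000 worth a day"
DAYS_PER_MONTH = 30             # assumption, for a rough monthly figure


def implied_price_per_million(tokens: int, dollars: float) -> float:
    """Blended dollars per one million tokens implied by a total spend."""
    return dollars / (tokens / 1_000_000)


blended_rate = implied_price_per_million(TOKENS_PER_DAY, DOLLARS_PER_DAY)
monthly_spend = DOLLARS_PER_DAY * DAYS_PER_MONTH

print(f"implied blended rate: ${blended_rate:.2f} per million tokens")
print(f"rough monthly spend:  ${monthly_spend:,}")
```

At those figures the implied blended rate is $10 per million tokens, and the monthly bill would be on the order of $300,000.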
    [00:20:25] Jacob Effron: Right.
    [00:20:25] swyx: But like, he’s going to dis, he’s like, if there were a new capability, he would discover it first before you because he was, he was trying and you were not trying. Right. And like, you only do things that work like, well, good for you.
    But like the, the people who are going to discover the next hot thing are living at the edge.
    [00:20:42] Jacob Effron: Right, and increasingly living at the edge is just having the compute budget to like run these experiments. I mean, kind of similar to what living at the edge on the research side has always been. You know, it was constrained in many ways by the amount of compute you had to run these experiments.
    It feels similarly on the, almost on the builder or like actualizing these tools now.
    [00:20:56] swyx: Yeah. The other thing that’s, I mean, very obvious is Anthropic is kind of like the high price premium player, um, where, you know, restricting limits or restricting model releases even is like the name of the game.
    Whereas Codex is like, come on in guys, use our SDK, use our login and we don’t care. We’re gonna reset limits, whatever. You do want to try to exploit the subsidies where you can get it. And definitely Codex is super subsidized right now. Gemini also very subsidized. Um, and comparatively, like, I think you should make hay, I guess, while, while that’s going on. It’s not that bad to be a capabilities explorer on just the $200 a month plan from Claude Code or from OpenAI.
    Um, and, uh, I I, I, my sense is that people aren’t even there yet.
    [00:21:41] Jacob Effron: How do you think this, like, market ultimately plays out? I mean, it’s obviously such a big market that, you know, any slice of that market is interesting for, for anyone going after it. But I think what, what makes people so interested in the coding market particularly is it feels like it’s kind of this
    foreshadowing of what will happen in other, you know, any other kind of application market that the foundation models eventually turn to and RL their models against and gather data around. And so how do you think, you know, like does there end up being room for lots of different kinds of players or like, what do you think the end state of this market is and is that, do you think that’s applicable to other markets?
    [00:22:10] swyx: I feel like there will be, I mean, status quo is probably the most likely outcome, which is there are two big players and there’s a small range of longer tail people that, um, fit other use cases that the, the two big players don’t. That feels right to me. I think that, um, for it to, for the market structure to, to significantly change, there needs to be significant change in like the economics or like the, the brand building or like the, the, the, the value propositions of the, of the companies involved, and I
    haven’t seen any in the last six months that, that have really changed the stories materially. So I feel like they would just keep going until something, something else happens. Something else happens, meaning like Microsoft wakes up and like goes like, guys, we have GitHub, we have, uh, you know, we, we, we’ll, we’ll do something much bigger here other than just Copilot.
    Um, and, uh, that would be a big change. Um, MSL has put out a model now, and I was at a breakfast with, uh, Alex Wang, where they were like, yeah, like, we, we really, really want to go after the coding use case. We haven’t done anything yet, but like, don’t underestimate them. Right. Um, and, and similarly for the Chinese labs.
    Um, I think they’re trying to go after it. Like Z.ai is doing stuff, GLM, uh, Z.ai and GLM is the same thing. Um, uh, and, and so it’s, so like everyone’s trying to get a piece of that pie. I, I feel like the, the status quo has been pretty stable for the past, like almost a year I’ll say.
    [00:23:39] Jacob Effron: Yeah. And is the room for the, not like, you know, for, for the application companies more on like the enterprise side or like where do the, where do the, like what surface area do the model companies leave for application companies?
    [00:23:50] swyx: Yeah, that’s a good one. Um. It’s very much evolving. Um, it, I, I, I will say because OpenAI did not have this, the, this level of attention on coding, yeah, uh, a year ago, we just don’t have that much history. Right. Um, and it seems like, for example, so the big push at OpenAI now is the super app. Um, is that a consumer thing?
    Is that like a product, like, portfolio rationalization thing? How much is that gonna take away attention from coding at the time when they actually do want to put more into coding? I think it’s, it’s very unclear. So I do think like there’s, there’s all these, like in both big labs, there’s, uh, sorry, both OpenAI and Anthropic, and DeepMind and xAI are, are separate cases.
    Um, they are trying to see the other TAM expansion areas. So Claude Code for finance, yeah, um, uh, Claude Cowork, all those, all those things. Whereas I think Cursor and Cognition are like comparatively just focused on coding and so I, I do think they leave space and I do think for the other verticals that also means the same thing.
    Right? That, uh, that they’re not gonna be that, um, intensely focused on, on, on that domain. Except for, I, I think I would mark out finance and healthcare as like the next ones, um, that they’re clearly going after. Uh, I, I would say comparatively, healthcare seems more thorny. There, there, there’ve been some announcements about it, but like, I would respect the, the finance work a lot more just because like the, the path to money is a lot clearer.
    [00:25:12] Jacob Effron: Yeah, no, I mean, obviously like, I, I think, you know, maybe similar to, to the space that’s being left in these other domains, you know, there’s obviously. Uh, a lot that’s required to actually implement these tools in enterprises, uh, versus, you know, maybe just giving them, uh, giving model access to, to folks outta the box.
    [00:25:27] swyx: Yeah, yeah. Yeah. So the, the agent lab thing is like, we’ll do the last mile for you. Whereas I think the model labs tend to just trust the model and, and be minimalist about it. Both of them work.
    [00:25:38] Jacob Effron: Yeah.
    [00:25:38] swyx: I, I don’t, I don’t necessarily think one, uh, beats the other, uh, for every, for every use case. Um, all I, all I do know is that it does seem like,
    uh, the large enterprises do want a dedicated partner that isn’t just the model labs, which is kind of interesting.
    [00:25:55] Jacob Effron: We, we’ve been in this phase of, of pure capability exploration. And so I think nothing has been, you know, better for the large labs, right? I mean, they’re always gonna be, uh, uh, the frontier of, of capability exploration.
    And so I think have a very good relationship with a lot of these enterprises. But ultimately over time, like. The, uh, the incentive structure of these labs is always gonna be maximal, you know, token consumption for, uh, for the end customers they work with. And there’s just, I think, so few companies that have actually gotten to massive scale.
    Maybe coding again is the most interesting. So it’s the first space that really is just completely gone, you know? Yeah. You must love it every day. Like absolutely insane. And. I think it
    [00:26:32] swyx: gets even. Okay. I mean, like, I think we, we say good things about Cursor and Cognition, but the sheer liftoff of like both Anthropic and OpenAI,
    ‘cause they, they, they have independent valuations. I mean, let’s throw xAI in there because it’s now IPOing at 1.2 trillion. That number is just mind boggling. Like I, I feel like in normal investing or normal startups, there’s kind of like a ceiling market cap or valuation. Totally. That, that like you, you reach and you go like, all right, let’s, it’s gonna be chiller from now on.
    And these guys are not slowing down. No.
    [00:27:02] Jacob Effron: Well, I also think the dynamic is fascinating about some of these later stage companies is, is, you know, in the past, I feel like in, in venture world, if you got to a certain level of scale, the question around you was really more a valuation question. And this is like why there was different phase, like, you know, types of venture people did and like the late stage growth people were just incredible at like, you know, a little bit of what’s the ultimate market opportunity of this company, but also what’s the right way to, to value it.
    Like we know it’s, it’s in some band of an outcome that is like, sure, there’s some variance to it, but it’s like relatively understood what that band is and then maybe you get over time surprised to the upside. Whereas any kind of like later, even the labs themselves, any later stage company, the bands of what that company might be worth right now, even in a year or two years, are so massive because of how fast the ecosystem changes that it’s like,
    even for later stage companies, every three months could be an existential level event to the upside or to the downside. Yeah. Um, and I think that, like, you are obviously seeing it in the, in the positive with code, which, you know, if you think about a company like Anthropic, you know, that, for a while, it was like unclear if they were going to have access to enough capital, um, to really stay in the, in the race, right?
    And then coding hit at the exact right time. They had the perfect model for it. They executed brilliantly. Um, and you know, now are, are, you know, uh, you know, one of the most valuable companies in the world.
    [00:28:13] swyx: Uh, at the same time, I, I don’t find, I, I have zero sympathy for OpenAI because they’re crushing it and they’re all rich.
    You know, this is like a high class champagne problem to have to, uh, to be number two at coding or whatever. Like, who cares? Like, you’re, you’re doing great.
    [00:28:27] Jacob Effron: Yeah. It’s funny though. I can’t even, I mean, you would be closer to this, uh, you know, given that you’re in the AI coding space, but it’s like a lot of people I talk to think Codex is just as good, if not better than Claude Code.
    Right. I think one thing that I’ve been really surprised by, and maybe, maybe Claude Code is a better product in some ways, I’m curious your thoughts, is just in consumer AI with ChatGPT, you saw this big first mover advantage, right? Where admittedly today, like, I don’t know, Claude, Gemini, great products.
    Not sure, not abundantly clear ChatGPT’s any better, but like, people stick with ChatGPT, it’s the first thing that introduced them.
    [00:28:56] swyx: They stay, but they’re not growing anymore. I don’t know if you’ve seen
    [00:28:59] Jacob Effron: Right. But that to me is more of like a, a, a product problem than it is. They’re not like, it’s not like they’ve like lost share to someone else.
    My understanding is the overall problem with consumer AI today is much more of a how do you take this tool and, you know, for, for folks like us, like knowledge workers, it’s like this incredible magic tool, but it’s not necessarily a daily active use tool for a lot of people around the world today. And what are the like products?
    It’s, it’s kind of a category wide problem. Like in coding, for example, like. The entire space has gone parabolic. There may be some relative growth in, uh, in other consumer AI players, but it’s not like consumer AI as a category is like going parabolic and they’re not capturing most of that thing. I think it’s actually the larger problem is much more, hey, the category has kind of hit a bit of a plateau of people haven’t figured out how to bring, you know, tons more users on board.
    Yeah, yeah. Or increase the frequency of those users. And so it seems more of a category wide problem than it is, you know, a massive market share change. I was gonna draw the comparison to, to the coding space where Claude Code is the first product, obviously, to introduce people to this magical experience.
    You know, by all accounts, Codex is, is pretty damn close to as good, if not better. Um, but like still that first product, you, you would’ve thought that would not be a super sticky, uh, you know, product surface area. And it actually has, it turns out, I, it feels like the first lab to introduce you to an experience really does, uh, keep a lot of, uh, a lot of the focus.
    [00:30:12] swyx: I, I think, maybe it’s like still, still early days. You know, ChatGPT is like three plus years old and, yeah, Claude Code is only one. Just turned a year. Yeah. So give it time, you know? Yeah. Like, yeah. I mean, definitely some, a lot of people have switched to Codex. Maybe that will keep going. I, it’s like really hard to tell.
    Uh, yeah. I, I, I do, I do think that, because we are in this like, high volatility, high temperature phase, um, the loyalty and stickiness to first movers and category creators, I don’t think is as high as it might be in some other, uh, areas in our careers that we’ve looked at.
    [00:30:47] Jacob Effron: Yeah. Though, I mean, I’ve been surprised by the Claude Code thing.
    I, I would’ve thought that, like, in many ways I always worried about the enterprise.
    [00:30:52] swyx: You think it would’ve been gone by now?
    [00:30:53] Jacob Effron: Not gone. But I would’ve, I I always worried that the, that the consumer business of these companies would be quite sticky. And then the enterprise API business. Uh, was actually like, you know, in some ways like your least loyal buyers, like they would, they would move to,
    [00:31:05] swyx: right, right.
    But, but they worked out that it wasn’t the enterprise API it was enterprise product.
    [00:31:09] Jacob Effron: Totally. And maybe that was the, that was the secret that like, but the amount of lock-in or just default behavior that has happened in that space, uh, is, is more than I might’ve imagined with two products that by all accounts are pretty damn similar.
    Yeah.
    [00:31:22] swyx: No fight there. Uh, I will say I do think that Codex is still in like a catch up. Like in terms of personal experience. Um, the only thing I like out of, out of Codex is the, is like Spark and like yeah. Uh, the, I, I feel like the skills integration is a little bit better. I feel like, uh, the, the speed is a bit better.
    Maybe ‘cause it’s in, is written in Rust or whatever. Um, very minor things that you’re like almost like telling yourself rather than like objectively assessing between the two of them. I, I, I do think, like vibes wise, I think that’s going on. Um, the, the, you know, I, I feel like the, the missing question, uh, in, in this whole debate is like, why is this so concentrated in only two names, right?
    Yeah. Like, um, how, where, like, where is the Gemini, you know, presence, where’s the xAI presence? Um, and like they are trying, it’s just they haven’t made that much progress yet.
    [00:32:12] Jacob Effron: But what the, what the Claude Code moment does show, and it actually in some ways makes you a little more bullish on the potential for someone else to catch up because it does feel like if you’re the first person to introduce some magical net new product experience, that that actually might be stickier than one might have imagined.
    [00:32:27] swyx: Right, right, right. Okay. Yeah.
    [00:32:28] Jacob Effron: And so it’s, everyone can believe they have a shot at that.
    [00:32:29] swyx: What do you think that new product experience might be like? I, I, it’s, it’s like, and this is a failure of imagination on my part. Like, I always wonder, like, people always say this like, well, the, the thing that will save us is like being first to the next new thing.
    Like what is it?
    [00:32:41] Jacob Effron: Yeah.
    [00:32:42] swyx: It’s like,
    [00:32:45] Jacob Effron: I dunno, something around like, uh, consumer agent, computer use, like hybrid. I think, obviously, I think we’re like scratching the surface on the consumer side.
    [00:32:53] swyx: So my, my current theory is like the, OpenClaw is like a vision of things to come.
    [00:32:58] Jacob Effron: Totally.
    [00:32:58] swyx: Um, and uh, it’s good that OpenAI has like the association with OpenClaw, but by no means do they have the right to win it.
    The general thesis that I have been pursuing now is that, the same way that 2025 was the year of coding agents, 2026 is coding agents breaking containment to do everything else. Um, and so coding agents continue to still win, but because they generate software and software eats the world, so like, it’s kind of like the transitive
    property of like software eats the world, coding agents eat software, therefore coding agents eat the world. Um, which is like an interesting,
    [00:33:30] Jacob Effron: yeah, and breaking containment always an easier phrase in the consumer context than the enterprise one. You’ve seen people run these really cool, uh, experiments in their own personal lives.
    I think like,
    [00:33:37] swyx: yes.
    [00:33:38] Jacob Effron: Figuring out, you know, how you, obviously everyone’s focused, you know, on the enterprise side now around how you create these experiences. I feel like the vibes, you know, people love to have these narratives of like, everything is completely shifted. It’s like I actually, you know, OpenAI,
    organizationally, uh, you know, volatility aside, is, you know, great products, great team, great models. Like everyone else in the world is incentivized for there to be two, three more. Everyone would love more, like, great model companies. And so I feel like the, the natural forces of the world revolt when any one company, you know, is too much the star of the show, right?
    There’s so many people in the ecosystem that are incentivized for that not to happen. And so I think I’d be shocked if we don’t have, uh, uh, a reversion of vibes, not maybe completely the other way, but at least a little bit more equal at some point over the next six, 12 months.
    [00:34:24] swyx: I, I think there’s just a kind of different stages when, when you talk about the world wanting more model companies, I think about like the neo labs.
    [00:34:30] Jacob Effron: Yeah.
    [00:34:31] swyx: And I mean, I don’t know, is it fair to say none of them have really broken through in the past year?
    [00:34:35] Jacob Effron: I think that’s totally fair,
    [00:34:37] swyx: which is rough. Um, and well, how are we gonna, how are we gonna grow that diversity in, in, in choice? Like, um, that’s, this is it.
    [00:34:46] Jacob Effron: Yeah. It’ll be really interesting to see what, what, what ends up happening with that.
    And you’ve seen, you know, folks like Nvidia, you know, very incentivized to make sure there’s, there’s a broader platform of, of other model providers.
    [00:34:57] swyx: I think, uh, I don’t know, people say this, but I, I, I don’t think they try that hard. Nvidia tries harder to build neo clouds
    [00:35:05] Jacob Effron: Yeah.
    [00:35:06] swyx: Than neo labs.
    [00:35:07] Jacob Effron: Well, they try pretty damn hard to build neo Cloud, so
    [00:35:09] swyx: that’s,
    [00:35:09] Jacob Effron: yeah.
    [00:35:10] swyx: But like, you know, let’s call it like the, the CoreWeaves of the world, in a much happier place, you know, than any neo lab built on top of them.
    [00:35:18] Jacob Effron: Yeah. One might argue it’s, it’s easier to, to enable a neo cloud to be successful than it is, uh, you can’t will a neo lab into existence the same way, so
    [00:35:25] swyx: Nvidia has more direct control over it.
    Uh, for sure.
    [00:35:27] Jacob Effron: What else is kind of catching your eye today on the startup side? I mean, you worry, there’s obviously this whole narrative of like, you know, the foundation models, you know, they announce a product and every stock goes down 15%. Like
    [00:35:36] swyx: Yeah.
    [00:35:37] Jacob Effron: Do you, do you worry about the foundation models just kind of eating into a bunch of these startup categories?
    [00:35:43] swyx: Not really. I, I think actually, like, okay, there’s the point of view of like being an investor in startups, and there’s a point of view of like, do you wanna start something? And I think honestly, like the, the downside for all these is so minimal, in a sense of like, the worst you do is you just get hired into one of these labs anyway.
    So I, I think the, the market for people who just do things and try things and try to execute in like a competent way, even if like it doesn’t work out commercially, even if it just wasn’t that great anyway, like, that’s your job interview to go into, into one of these things anyway. So, um, I don’t feel that from a, from a very, very small startup perspective.
    Mid-size startups, yes. Uh, I will say there’s been a lot of dead, um, LLM infra, a lot of LLM infra consolidation, like the, the, uh, Langfuses of the world getting absorbed into, into ClickHouse. And I, I think, like, people have maybe worked out the domain-specific playbook, uh, and like, I think that’s okay.
    Um, and, and yeah, I’m not that, not that worried about, uh, okay. So, um, I, I would say I’d be more worried about traditional SaaS, like low-NPS SaaS. This is the whole AI versus SaaS debate that has, that’s been going on. Uh, and, and like literally I’m going through that exact thing in my company, so I’m like kind of thinking through this on a very visceral, visceral level, right?
    On one hand you have the people who say you vibe coders don’t appreciate the amount of work that goes into a CRM and like, yeah, you think you can rip out Salesforce? So did the 30 entrepreneurs before you, right? Like, like, you know, you classically underestimate the things that you don’t deeply know. And, and, and the target audience is not you. Uh, at the same time, like we have never been able to build software so easily and customize software so easily and like, yeah, you’re not gonna use 90% of the things in Salesforce. So like, yeah.
    [00:37:33] Jacob Effron: What’s the typical, so what have you, what have you done internally?
    [00:37:34] swyx: So we have, there’s the main SaaS that we use for event management and sponsor management, and we paid 200K a year for that. Not, not huge, but like chunky for, for, for my, my scale. Um, and like, yeah, I could probably spend 2,000 and, and build like a custom version of that. Um, the, the, the trick has been dealing with my, the rest of my team and getting them on board.
    Yeah. ‘cause I’m the most ethical person on my team, but like, I can’t make that decision myself. And I think in the same way I’ve been talking with other CEOs and team leaders as well, it’s like, well, you can be super Claude-pilled. You can be super LLM psychosis and think that’s okay, but like, you have to bring your team with you.
    And I think like the, the sort of widening disparity in LLM psychosis in companies is causing real, real rifts, because on one hand, the people who are less AI native are not getting with the picture. They’re not, they’re actually like behind, they’re actually not waking up to the fact that like everything you think is necessary is not actually that necessary.
    And in fact, it would actually be better for you if you just like held your nose and went in and came out the other side, yeah, only talking to agents in natural language, and like your life would actually be better and you just, you’re just like close-minded. There’s that perspective. The other perspective is, oh, you vibe coder.
    You, you did this in a weekend and you got the 80% solution and now the rest of your employees have to pick up the rest of your s**t, right, that you, that you thought you were so hot, so amazing, uh, at, but like, actually you didn’t figure it out. And like, actually LLMs are still useless at this and blah, blah, blah.
    So like, I think there’s this huge debate going on in every company right now. Um, and like, um, you know, I have a small microcosm of it, but like, yeah, it, it’s making me hesitate to, to pull the trigger. But like I will at some point, it’s like maybe I’ve put it off for one year, but not like five. Yeah, but like, so, so like SaaS is definitely getting squeezed.
    Um, it does make me wonder, like, I, I do think that there’s an opportunity for a more AI native, um, system of record thing that is not just Postgres. Um, or not just MongoDB, although both are very good. Maybe it’s like a Convex, or like, people, yeah, bring up Convex a lot. I don’t know, like, like, I, I just feel like the sort of quote unquote Firebase of, of AI apps isn’t really a thing yet.
    Um, beyond what we have. Uh, which, which is fine. It’s, it’s, it’s just, we could probably start in a more sort of rapid iteration cycle first before scaling up to like a Postgres or MongoDB, which are more sort of old tech.
    I was at a dinner with, uh, Mike Krieger, the CPO of Anthropic, and, and he, we were just kind of going around the room going like, what are people most worried about?
    Yeah. And, uh, for me, uh, I, instead of security, I brought up biosafety. Yeah,
    [00:40:21] Jacob Effron: classic.
    [00:40:22] swyx: Um, actually, like I said, it was cliche and classic, and the rest of the table were, were like, what do you mean? Someone sitting at home can manufacture a virus that wipes out half of humanity,
    [00:40:32] Jacob Effron: almost like the OG Geoffrey Hinton.
    Like, this is why you should be scared.
    [00:40:35] swyx: I’m like, yeah, like, read the, you know, risk reports. Like this is like the thing. Um, I think, and Mike was just sitting there knowing he was sitting on Mythos and going like, actually it’s security. Um, and I think like, um, I think the, there’s, there’s, part of it is very good marketing. Like too good. Yeah, like I would actually advise Anthropic to tone down the marketing because also it’s, it is just a very good model and you don’t have to make so many marketing claims around it. At the same time, it is not really a private model if you give it to 40 companies, each of whom have like 10,000 employees or whatever. Right. It’s not, it’s not private, it’s, it’s like there’s bad actors in there.
    [00:41:18] Jacob Effron: Yeah. Hopefully, hopefully not as, uh, as bad as releasing it widely, but, uh, no, I mean, it’s an interesting, you know, it’s an interesting case study for how all, I mean, many model releases might go. I mean, you know, this might be the first model release that looks like the rest of ‘em from now on, right?
    [00:41:31] swyx: It, it, so it’s, it’s the, there’s an overall product strategy, uh, for Anthropic of like, uh, you know, restrict access, bundle, uh, product with model maybe.
    Whereas, uh, OpenAI has definitely been a lot more sort of philosophically aligned on like, we will just enable access everywhere and we don’t know what you, what will come out of it. Right.
    [00:41:51] Jacob Effron: Right. Though, I mean, this current moment, uh, obviously the cynical take is also just ties to the amount of compute that both companies
    [00:41:56] swyx: Yeah.
    Right, right, right. Yeah, I think, I think that’s true. I do think like the, the, this is the, the, the scale, the dawn of like larger than 10 trillion parameter models is very interesting. I don’t think it, I think it’s a temporary phenomenon because we have much larger compute clusters coming online for everyone over the next like three to five years.
    And this is like already written in, in the cards.
    [00:42:18] Jacob Effron: Yeah.
    [00:42:19] swyx: So to the extent that like, you know, will we have rationing of models, uh, above 10 trillion, uh, in like two years? I don’t think so. I think everyone will have
    [00:42:29] Jacob Effron: No, we’ll just have rationing of the next phase.
    [00:42:30] swyx: Right. Right. But like, that’s as it should be almost, like, um, my, my classic example, which, this is just me theorizing, not anything confirmed by Google. When Google announced Gemini, they actually announced three sizes, which was Flash, Pro, Ultra. They never released Ultra. They only have Pro and Flash. Um, so my theory is they have Ultra sitting in a basement and they just keep distilling from it for, for Flash and Pro.
    Um, which like, yeah, I mean, I, I actually think that’s as it should be for any lab, that they, that they do that.
    [00:43:02] Jacob Effron: Yeah. Just because those are the models that people actually wanna end up using. And it’s just like cost prohibitive.
    [00:43:06] swyx: It is more, yeah, it’s cost. Yeah. It’s, it’s not the want, it’s just, just, just the cost.
    Um, I do think, like, uh, it is interesting that, uh, for a while I was, I was considering the theory that models capped out at, like, 2 trillion, and I think that’s proving to be wrong. And well then if I’m wrong, how wrong? How wrong am I? Do we do 200 trillion? Do we do two quadrillion, whatever? Um, and I don’t think we have the straight answer to that, but like, uh, it’s interesting that we are continuing to scale number of parameters when everyone kind of, like, can see that we’re not going to get like the next thousand x or million x from this paradigm.
    So like the others, like the Ilyas of the world, are working on other, um, model architecture improvements. We need a different scaling law, I guess, because like, we’re, I, I feel like people already feel like we’re tapped out on this. Like the, the end, the end state of this is we turn most of the world into data centers and like, I don’t know.
    I don’t know if we want that.
    [00:44:08] Jacob Effron: Yeah, I mean, uh, if the, if, if, if the returns to intelligence are there, maybe, uh, maybe not so bad.
    [00:44:13] swyx: I, I, I think there, there’s just a sheer amount of like, like unscalability that like is rankling people’s sensibilities right now. Um, especially in terms of like context lengths.
    Um, my classic quote is that context length is like the slowest scaling factor in, in LLMs.
    [00:44:30] Jacob Effron: Yeah.
    [00:44:30] swyx: Um, we, like, we took maybe three years to go from like 4,000 context length to a million and that’s about it. Yeah. Like Gemini has had a million token context length for two years now. Um, and no one’s using it.
    Like, so like yeah, it’s memory. Memory is probably gonna be the, the biggest limiting constraint on all these things.
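    As a rough check on the figures swyx quotes here (about 4,000 tokens to about 1,000,000 over roughly three years), the growth can be worked out directly. This is only a sketch of the arithmetic: the inputs are his approximate spoken numbers, and the helper name `growth_summary` is made up for illustration.

```python
import math

def growth_summary(start_tokens: int, end_tokens: int, years: float) -> dict:
    """Summarize growth between two context lengths over a span of years."""
    factor = end_tokens / start_tokens   # total multiple
    doublings = math.log2(factor)        # how many times the length doubled
    return {
        "factor": factor,
        "doublings": doublings,
        "months_per_doubling": years * 12 / doublings,
        "annual_multiplier": factor ** (1 / years),
    }

# Approximate figures as spoken in the episode, not measured values.
summary = growth_summary(4_000, 1_000_000, 3)
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```

    Under those assumptions the jump is a 250x increase, a little under eight doublings. His point is about what came after: by his account the ceiling then stayed at a million tokens for two years.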
    [00:44:50] Jacob Effron: Yeah. Certainly seems that way. I guess I’m curious over the last year since you recorded last, like what’s one thing you’ve changed your mind on?
    [00:44:57] swyx: I feel like I was kind of bearish on open models like last year.
    Um, in a sense of, like, I, I had just done the podcast with Ankur
    [00:45:07] Jacob Effron: Yeah.
    [00:45:08] swyx: Of Braintrust where he, and he, I mean, you know, he has a good cross section of all the top AI companies and he says market share of open source is 5% and going down. Um, I think that’s changed. I think it’s going up. Um, and even if,
    [00:45:22] Jacob Effron: even though the capability gap does seem to be increasing, depending on the time.
    [00:45:26] swyx: It’s hard to tell. Yeah, it’s, it’s really hard to tell. ‘cause like, okay, for, for listeners, capability gap increasing is like on public benchmarks. And let’s say you’re comparing Mythos versus like, I don’t know, GPT-OSS or like GLM 5.1. And, um, it’s, it is really hard to tell. ‘cause even if they were closing, you will also not believe that they were closing that much because it’s very easy to game the benchmarks.
    Yeah. So you just don’t really, really know. Um, all you know is like, uh, there’s somewhat objective OpenRouter stats on like what people choose in a free market. And people do choose some of these open models in significant volume, except that a lot of them are heavily discounted. So you need to kind of like price adjust, uh, these things.
    So even if, even if that were true, which I, I’m not sure, like I, I, I feel like the number’s just up now instead of down. Uh, I think the separation between what the top-tier agent labs are doing versus the average startup in AI or the average GPT wrapper is significant enough that you should not worry about the, the, the sort of mean industry number.
    And you should, you should cohort things into like, here’s the median, here’s like the bottom 80% and here’s the top 20%. And the top 20% acts very differently than the bottom 80 percent. And so the top 20%, which is all I care about, um, is definitely going towards more open models. Um, the Fireworks and the Togethers are crushing.
    Um, and, uh, and so will all the fine tuners, right? So like, um, I think maybe last time we even said things like, fine-tuning as a service doesn’t work. Well, now it’s gonna work. It’s, it’s a derivative of the open market, uh, open models market.
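    To make the "price adjust" point concrete, here is a minimal sketch that compares open models’ share of raw token volume with their share of spend. Every number and model name below is hypothetical, invented purely for illustration; none of it is OpenRouter data, and the `Usage` record and `open_share` helper are assumptions rather than anything described in the episode.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Usage:
    model: str
    is_open: bool
    tokens_millions: float    # volume served, in millions of tokens
    price_per_million: float  # USD per million tokens

def open_share(rows: List[Usage], weight: Callable[[Usage], float]) -> float:
    """Fraction of the weighted total that belongs to open models."""
    total = sum(weight(r) for r in rows)
    open_total = sum(weight(r) for r in rows if r.is_open)
    return open_total / total

# Hypothetical usage rows: open models are priced far below the closed one.
rows = [
    Usage("closed-a", False, 600, 10.0),
    Usage("open-b", True, 300, 1.0),
    Usage("open-c", True, 100, 0.5),
]

token_share = open_share(rows, lambda r: r.tokens_millions)
spend_share = open_share(rows, lambda r: r.tokens_millions * r.price_per_million)
print(f"open share by tokens: {token_share:.1%}")
print(f"open share by spend:  {spend_share:.1%}")
```

    With these made-up inputs, open models look large by token count and small by dollars, which is why a raw volume chart and a spend chart can tell different stories about the same market.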
    [00:47:01] Jacob Effron: Well, and also in the workload scaling to the point where people care about cost and speed, you know, more and more.
    [00:47:06] swyx: Yeah.
    [00:47:06] Jacob Effron: And that like the, you know, moving from just pure use case discovery of like, what can these models do to, okay, we know what they’re gonna do at scale now let’s do ‘em cheaper and faster.
    [00:47:14] swyx: Yeah. Yeah. Um, so, so like, uh, that change, I, I think, is probably the most significant in, in my mind. And like, I, I always like to do the mental math of like, uh, this is, like, think about, uh, scheduling a learning rate: when you’ve been wrong once, yeah, what else were you wrong on? Um, and I, I’m kind of working through it. I, I, to me, the, the, the other thing was the coding one, um, which obviously I, I have now come full 360 on, but I think like people are not appreciating dark factories enough, which I don’t know if you’ve discussed on the pod yet.
    [00:47:44] Jacob Effron: No.
    [00:47:45] swyx: Um, uh, and so this is a kind of a StrongDM slash Simon Willison term. Uh, the, the general idea is, okay, there’s different levels of AI coding psychosis you can have. Um, the, the very first level, which I, I, by the way, encountered first at Cognition five months ago, was zero, uh, human-written code. Yeah.
    Right. Which like, seems like a reasonable thing now, was less reasonable five months ago. The next frontier, that sounds as crazy today as, as zero coding was in the past, is zero human review.
    [00:48:17] Jacob Effron: Yeah.
    [00:48:18] swyx: Like, just, just check it in without even reviewing it, and very few people are doing that, but OpenAI is, is exploring this and I feel like it’s, it’s definitely the only scalable way to do this.
    Uh, which it just means like you have to just kind of like flip the SDLC or change large amounts of what, what you normally do. Um, which is probably things you should have done anyway. More testing, more, you know, more automated verification or whatever. But like that is a frontier at which, like when you have unlocked that in your companies, um, you are just gonna produce much more quantity of software than, than you’ve ever had.
    Uh, and it’s gonna be like so much, so disposable, so cheap that you can probably innovate in quality a lot as well. Like that, that quantity helps you get to quality.
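    One way to picture "zero human review" is a merge gate where automated checks alone decide whether a change lands. The sketch below is only an illustration of that idea, not how OpenAI or any other company mentioned here actually does it: the function name, the check names, and the toy checks themselves are all hypothetical stand-ins for real test suites, linters, and scanners.

```python
from typing import Callable, Dict, List, Tuple

# A check takes the proposed diff text and returns True when it passes.
Check = Callable[[str], bool]

def auto_merge_decision(diff: str, checks: Dict[str, Check]) -> Tuple[bool, List[str]]:
    """Return (merge?, names of failed checks). No human reviewer is consulted."""
    failed = [name for name, check in checks.items() if not check(diff)]
    return (len(failed) == 0, failed)

# Hypothetical toy checks standing in for real automated verification.
CHECKS: Dict[str, Check] = {
    "includes_tests": lambda d: "def test_" in d,
    "no_hardcoded_secret": lambda d: "API_KEY=" not in d,
    "small_change": lambda d: len(d.splitlines()) <= 500,
}

good_diff = "def add(a, b):\n    return a + b\n\ndef test_add():\n    assert add(1, 2) == 3\n"
bad_diff = "API_KEY='abc'\ndef add(a, b):\n    return a + b\n"

print(auto_merge_decision(good_diff, CHECKS))
print(auto_merge_decision(bad_diff, CHECKS))
```

    The design point is the one swyx makes: once review is removed, all of the trust has to move into the checks, so the investment shifts to more testing and more automated verification.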
    [00:49:00] Jacob Effron: Yeah.
    [00:49:01] swyx: Which I think people are very uncomfortable with. ‘cause like people associate more quantity with slop.
    [00:49:07] Jacob Effron: Right. No, it’s back to exactly the discussion we’re having on like the reaction to these tokenmaxxing scoreboards and the, and the idea that like, today, maybe that’s not the most, uh, the, the, the, the best sign of, of, of productivity and efficiency, but going forward
    [00:49:18] swyx: yeah, you, but you still get rewarded for it.
    So they’re like, f**k it, whatever. But like, uh, I, I, I think like the, the, the people who are, who are doing well, who do well, who do most well in 2026, are not the cynics who go like, oh, that’s just slop. I’m not gonna participate in that. They’re like, okay, like this is happening with, with or without me. Bend this the right way.
    [00:49:36] Jacob Effron: Yeah, no, I love that. Um, I mean, I think for, for me, like a kind of related thing on, on the open source model side is, for so long, I really didn’t think it made any sense to do any sort of RL, post-training, pre-training, anything you could do to like improve kind of overall quality. Certainly for like latency and cost, it always made sense to me.
    But for overall quality, like God, you just get that for free in the models like three, six months later. I, I think what I’m starting to change my tune on a little bit is, you know, hearing all these app companies talk about, like, you know, we build stuff and then we throw it out three months later, as, as like the models improve.
    You’re like, okay, well then what you’re doing for capability improvement is just another version of that, right? Like, I still don’t think that like your RL or like post-train is gonna make you have a better model for like years and years to come. But maybe, I, I think you still have to be pretty rigorous on like, is that the single best thing you can do to solve a customer problem?
    And like, you know, oftentimes, like, it’s literally just like, add more data and like feed more data even via connectors to these models or like, I don’t know, do some clever engineering on the back end or whatever it is. But if the single best thing you can do for that three month time period to improve your customer’s outcomes is, you know, post-training in some way that like really improves the output of the model, even if you throw it out three months later because the general models get up there,
    It still might have been worth doing. And so I think I’m like more open to
    [00:50:45] swyx: you, you throw out the results, but you don’t throw out the raw data.
    [00:50:47] Jacob Effron: Totally.
    [00:50:48] swyx: And like, so like
    [00:50:48] Jacob Effron: Right. Then you just run it again. And so basically there’s some, obviously at the level of cost of like $10 million, maybe that’s too much, but there’s some level of cost where
    [00:50:55] swyx: No,
    [00:50:55] Jacob Effron: it’s the, it’s
    [00:50:56] swyx: not even 10 million,
    [00:50:56] Jacob Effron: right?
    No, of course it’s not. Uh, you know,
    [00:50:58] swyx: yeah.
    [00:50:58] Jacob Effron: There’s obviously some level of investment, uh, at which it’s the equivalent of just like staffing four engineers to go build something for three months.
    [00:51:04] swyx: Yeah. Uh, so the other thing I really, uh, for, for listeners, I’m just gonna leave some, some droplets of info. Uh, look into like the, the long trajectory, the synthetic rubrics work that people are doing, which is very important, uh, including, uh, something that’s called Dr. GRPO.
    I’ll just, I’ll just leave those key search terms in there. Um, I, I think what it means is that RL is going much more multi-turn than people think, and that means that you can customize the models in way more specific dimensions than traditional, let’s call it SFT, or uh, uh, you know, like a, a sort of shallow RL, um, that was done a year ago.
    Um, so like hundreds of turns.
    [00:51:44] Jacob Effron: Yeah.
    [00:51:45] swyx: Uh, and, and, and I think that that leads you down a path of like complete domain specificity.
    [00:51:50] Jacob Effron: What else? Like are you, you know, uh, of these like unanswered questions in AI today? Are you like looking for, you know, in the next year? Are you, you, uh, you know, paying close attention to,
    [00:51:58] swyx: I, I have a few theses for like, what is the sort of next frontier. Uh, one is memory, which, memory and personalization, we talked about. The other is really, uh, world models, which we’ve done a small little series on, from Fei-Fei Li, yeah, of course, to, uh, even Moonlake. Um, and, uh, General Intuition. And there’s a lot of debate as to like the relative importance of this.
    I think a lot of it, it manifests as like 3D static worlds that you kind of inhabit for a little bit and you walk around and they’re like, cool, but like, how does this help me with my B2B SaaS? Right. And
    [00:52:29] Jacob Effron: it’s like all the hype now is robotics, right?
    [00:52:31] swyx: Yeah. Um, and there’s a, obviously a correlation between, uh, world models and embodied, uh, vision and experiences, which leads to robotics. Uh, but I think world models is very interesting in just in improving intelligence itself. Um, from the next, from the next token prediction paradigm. Um, and so I think people are kind of testing their edges around that. One of our top articles this year so far has been on adversarial world models.
    Um, I, I do think, like, uh, if you don’t do anything else, just read Fei-Fei’s essay on spatial intelligence, on why, um, LLMs don’t, don’t have it. And she is, she may, she may not have the solution yet, but she has the right problem statement. Yeah. And so everyone else is trying to solve that problem statement in their own way.
    Um, and let’s see who wins. But like, I, I don’t think it does you any favor to equate world models to robotics or world models to gaming or some kind of like, uh, or like the current manifestations, because what is at stake is a much more important conception of intelligence than just answering questions.
    It is, does, does, does, does the AI understand what a table is? Like, what, what matter is, what physics is? It is almost like, for, for those who are movie fans, it’s like Good Will Hunting where, um, Matt Damon like knows everything because he read it in a book, but he’s never lived.
    [00:53:54] Jacob Effron: Great, great scene with Robin Williams.
    [00:53:55] swyx: With Robin Williams, and I, I look at that scene and I go like, that’s exactly the, the, the difference between like a very intelligent LLM who knows everything but hasn’t experienced anything.
    [00:54:04] Jacob Effron: Wow. That’s an awesome note to end on. Uh, that’s a, have you used that before? That’s great.
    [00:54:08] swyx: Yeah. So, so one thing I’ve done with Latent Space is I moved to like, uh, adding daily writeups.
    Yeah. And so one, one of the times I was doing this daily writeup, I wrote that.
    [00:54:16] Jacob Effron: That’s a great one. I love that. Um, well, so it’s been a ton of fun. Thanks so much for, for coming, man.
    [00:54:21] Jacob Effron: I’m Jacob Effron and this has been Unsupervised Learning. A podcast where I get to talk to the smartest people in AI and ask them tons of questions about what’s happening with models and what it means for businesses in the world.
    As I hope is clear, I have a ton of fun doing this. It’s a nights and weekends project in addition to my day job as an investor at Redpoint, but our ability to get these incredible guests on really comes from folks like you subscribing to the podcast, sharing it with friends. It’s really what ultimately makes this whole thing work.
    And so please consider doing that. And thank you so much for your support and listening. We’ll see you next episode.


    This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.latent.space/subscribe
About Latent Space: The AI Engineer Podcast
The podcast by and for AI Engineers! In 2025, over 10 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0. We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. Striving to give you both the definitive take on the Current Thing down to the first introduction to the tech you'll be using in the next 3 months! We break news and exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al. Full show notes always on https://latent.space www.latent.space