Transcription of the podcast as requested.
AWS_Podcast_Episode_747.mp3. This is episode 747 of the AWS Podcast, released on November 24, 2025.

Hello, everyone. Welcome back to the AWS Podcast. I'm Simon Elisha, back with another great episode for you, and I'm joined by a very special guest: Byron Cook, who is a Vice President and Distinguished Scientist here at AWS. Welcome back, Byron. How are you going?

- Hey, how's it going?
- Good, thanks. So you are back. We've had you on the podcast a couple of times, years ago now, and it was long overdue to have you back, because as is the case with many scientific advancements, the uses are not always evident to the general populace, but as technology changes, stuff happens. So obviously we're going to talk about generative AI, because you can't not. But first, let's talk about your specific domain, which is automated reasoning, because it is a fascinating domain, and not many people know about it when I speak to customers about the concept. So let's briefly unpack that, and then we'll get into the utility. Tell us about automated reasoning and why you're so interested in it.

- Yeah. In the past, the story would have started in the seventies and eighties, with people attempting to prove programs correct. And we would have talked about practical applications: microprocessor verification, which is where I got my start; reasoning about genetic regulatory pathways; reasoning about railway switching systems; reasoning about operating systems; and then, at Amazon, reasoning about policies and virtualization and networking and cryptography and storage and all that kind of stuff. But actually the story goes way back. Turing wrote one of the first papers in the space, called "Checking a Large Routine," where he shows basically how to take a program and map it into the world of mathematics, of mathematical logic, and then reason about the mathematical artifact that the program is really representing. And we basically use the same tricks today. But then a really interesting thing happened: with the invention of the digital computer, people right away began to think, well, what about artificial brains? And so the term artificial intelligence was coined
at this workshop at Dartmouth. It was John McCarthy, I think, who hosted it, and they invited folks from different disciplines: cognitive science, logic, statistics, information theory, you name it. Claude Shannon was there. And they coined that term, artificial intelligence. So if you had talked to people in the '60s and asked them what artificial intelligence is, it's actually highly likely you would have heard about my area, automated reasoning.

- Interesting.

- It's the manipulation of symbols in a logic system to try to deduce things. And so there has been a sort of running competition, or collaboration if you will, between these different areas: views based on statistics; views based on cognitive science, on neurons and so on; views based on Bayes' theorem, for example; and views based on mathematical logic. And at various points along the way, the different approaches have been ahead or behind.

- And they've all got different strengths, too.

- Yeah, exactly. And one of the things that happened is there was this notion of AI winters. With funding, there was a big pendulum swinging back and forth: tons of funding, then no funding, then tons of funding, then no funding. And what happened over time is that these different researchers got more and more specific about the exact problems they were taking on with their tools, and so the areas diverged.
So later we would call these tools machine learning and automated reasoning, and automated reasoning became the domain of "how do we reason about programs," as opposed to "how do we reproduce artificial intelligence in general."

- Yeah, sort of an analogy of human thought. And it's interesting, because if you think about why this is relevant, and why you've been working on this within Amazon for so long: we do things at huge scale. It's difficult for a lot of folks to understand the scale we get to operate at, and it's a real privilege to get to do that. But at that sort of scale, it's kind of beyond anyone's purview to figure out, well, how do you know this will work? How do you know unexpected things won't happen? And this is where this discipline really comes in. This is not about guessing or assuming or "sometimes it's okay." This is about: will this happen, or will this not happen?

- Yeah. At the top of my mind, there are three trend lines that drove the adoption of automated reasoning in Amazon, and then I'm sure we'll get to how it becomes neuro-symbolic AI and how it connects to generative AI and agentic AI. But the three big drivers, pre the general public getting super excited by generative AI, were these. One: there was actually a very radical increase in the capabilities of one of the most foundational automated reasoning technologies, called propositional satisfiability (SAT) solving. There's an international competition for these tools, and the tools are getting better and better and better, remarkably so. There were some real breakthroughs: one around 1999, and then some huge breakthroughs about 10 years ago.
So those tools are just getting better and better: if you encode your problems into the formats these tools can handle, they're far better able to solve them now than they were before. That's one trend. Another trend is the move to a services model, and the view that software is not shrink-wrapped, it's operated. That drives a different economies-of-scale situation, where you're encouraged to constantly rewrite the code. We write some software, we deploy it, we take on technical debt; we know the software probably isn't going to be able to scale to where we think it might go, but we wait and see how customers are using it, and then we're constantly rewriting it. That opens up the desire for teams to replace code, because they're constantly rebuilding it anyway. So that's an entry point for formal methods, as opposed to other places I've worked, where it's like: they wrote it perfectly the first time, and there's this airplane it's flying in.

- Yeah, it's real hard to go and change it then.

- Yeah, exactly. And then the third thing is that, with the move to the cloud, a bunch of conservative, security-obsessed customers looked at the economics and liked them a lot, but they were nervous about security. And this work became the reason they were actually excited to get off their New Jersey data centers, say. They wanted to move to AWS because of the cost, and we had a customer say, you know, we'll move orders of magnitude more workload over to AWS if you can prove the correctness of the cryptography; that's super meaningful to us, that's a threat model we're really worried about.
And so basically it drove a bunch of business into AWS. I'd say those three things were the perfect storm, and I was just really lucky: great place, great time, when I joined Amazon.

- Well, it's interesting too that with highly complex, detailed areas of research like this, which are shifting all the time, the jump to the practical side is often quite difficult. And what's been interesting to me is to see automated reasoning sort of dip its toe in the water within AWS, and then suddenly it's available in the hands of every AWS user. If you're talking about provable security, if you're using different capabilities, like trying to understand "what can connect to my VPC," et cetera, you've managed to surface this capability in a consumable form. Our customers get to use this every day, sort of unconsciously.

- Yeah, my frustration is that we're not even giving them the half of it, right? The problem we have is that "automated reasoning" is a little bit of a misnomer, because there's actually a spectrum. Some tools are fully automated, but traditionally the scope of the problems they could solve was rather limited. That's what you have with IAM Access Analyzer and VPC Reachability Analyzer, for example; they're based on propositional satisfiability and its cousin, satisfiability modulo theories (SMT), which is a slight extension of it. And those have pretty significant limitations on what you can do. Traditionally, if you were willing to put a PhD, or a person with a PhD, to work running these tools, then you could do tremendously more, because now they could guide the tool to do really remarkable things.
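The fully automated, SAT-based style of analysis mentioned above can be sketched in a few lines. This is purely illustrative, not AWS code: a reachability-style question is encoded as propositional clauses in conjunctive normal form, and a naive brute-force search stands in for a real SAT solver, which would prune the same exponential space far more cleverly. All variable names here are invented for the example.

```python
from itertools import product

def satisfiable(cnf, variables):
    """Brute-force SAT check: try every truth assignment.
    A clause is a list of (variable, polarity) literals; a clause
    holds if any of its literals holds under the assignment."""
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(any(env[v] == pol for v, pol in clause) for clause in cnf):
            return env          # satisfying assignment (a counterexample)
    return None                 # unsatisfiable: the property is proven

# Question: can an instance be internet-reachable while port 22 is closed?
# Model: reach -> port_open and reach -> public_ip, written as CNF.
cnf = [
    [("reach", True)],                         # assume: reachable
    [("port_open", False)],                    # assume: port 22 closed
    [("reach", False), ("port_open", True)],   # reach implies port_open
    [("reach", False), ("public_ip", True)],   # reach implies public_ip
]
print(satisfiable(cnf, ["reach", "port_open", "public_ip"]))  # None: impossible
```

Unsatisfiability here is a proof: no assignment exists, so the unwanted situation cannot occur under the model. Real analyses encode the actual policy or network semantics instead of these toy variables.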
And the challenge has been that there aren't that many people able to drive that: they don't have the background, they're hard to manage, just a whole bunch of reasons.

- Exactly.

- And so we've been dreaming for a long time about, in today's lingo I would say it this way, language models that are trained over proofs, such that the language models can run automated reasoning tools, or what I might call mechanical theorem provers. That was a dream we had, and we were thinking about how to do it, like training over all of our papers on arXiv and that kind of stuff. But what magically happened was that the mathematics community got really into the Lean theorem prover. Lean's founder and main architect is Leonardo de Moura, who works at Amazon. The world of mathematics embraced Lean for communicating with each other about proofs, and it also allows people who aren't at the top universities to learn the foundations of mathematics. Lean caused people, at that scale, to write down the foundations of mathematics, and that became really good training data for language models. The language models began to get really good at driving Lean. DeepMind, DeepSeek, a whole bunch of model providers are actually using Lean, with reinforcement learning for example, to build models. For that reason, the models are usually pretty good at driving these theorem provers, and that gives us an ability we didn't have before.

- Yeah.

- So that's a really exciting piece of the puzzle, and it's changing how we're thinking about a lot of things right now.

- That's fascinating.
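For a flavor of what a machine-checked Lean statement looks like, here is a tiny Lean 4 example; it just applies a standard library lemma, and is nothing Amazon-specific:

```lean
-- Commutativity of natural-number addition: the proof term reuses the
-- library lemma Nat.add_comm, and Lean mechanically checks it.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Proofs like this, written at scale by the mathematics community, are exactly the kind of flawless, checkable training data being described.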
And, like I said, it's an interesting side effect of the general development of this domain that it picked up this capability.

- Yeah. And there's this idea of the world model, right? If you think about it, what a theorem prover is giving you is more real data, as opposed to human-authored data. The view is that the volume of human-authored data is actually fairly limited, and also flawed. The thing about a theorem prover is that the data is not flawed, and it actually gives you an infinite amount of data: there are an infinite number of statements you can make and then prove or disprove with a theorem prover. So that becomes incredible training data. And what we're seeing, and what a lot of model providers are finding, is that if you train using these kinds of tools, you get transfer: the models are more logical, more able to do good design, more able to do the sorts of tasks that use the same part of your brain that you would use when you're trying to drive a mathematical proof through.

- So it's leaning on that mental model, if you like, that approach to problem solving, for other things. Meaning it's better at the other things.

- What we found is that it's basically the Lean theorem prover, but there are others: Isabelle, HOL Light. Actually, the founders of all of those tools work at Amazon these days. But in the world of neuro-symbolic AI, the world where you train models, do reinforcement learning, or integrate these things in various other ways, Lean is kind of the lingua franca. And so that's pretty exciting.

- That's really cool, that's really cool. So we've been skating around it a little bit; let's jump right in.
So obviously, generative AI happened. It brought great utility and great capability, and then folks started noticing that it would give you confidently incorrect answers: hallucinations. So then came the challenge of: I'm getting answers; sometimes they're right, sometimes they're wrong. How do I know? How do I prove or verify? And a lot of work has gone on in applying automated reasoning checks through things like Bedrock Guardrails, et cetera. Talk to us about how, through the lens of automated reasoning, we can think about the answers produced by large language models, and how we seek to actually verify truth and correctness and catch hallucinations.

- Sure. But it's worth pointing out, and we're both Amazon employees, so you'll appreciate this, and hopefully the listener can understand what we're saying: Amazon, unlike other companies I've worked at, has all these mechanisms to drive connection to the customer. Before I joined Amazon, that sounded kind of corny to me, but what I've found is that it actually is a big driver for why automated reasoning has been successful. In the 2014-2015 time frame, it was the financial services institutions in New York; I happened to be in New York and London. Customers with mathematical backgrounds had the need, and they began driving on it hard, and some very influential financial services institutions began to approach other financial services institutions and say to them: we need to support this, and we need to help Amazon understand how valuable this work is that's brewing inside the company. And that same thing happened with hallucinations.

- Interesting.
- So when generative AI happened, the board members of most enterprise organizations had their grandchildren show it to them, began playing with it over the holiday break, and by February every C-suite leader of every enterprise organization needed to know what their gen AI plan was. So they all diverted resources, formed teams, and went and did a bunch of stuff. And I began to get calls; again, I'm in New York and London, so financial services and a bunch of those sorts of industries up and down Sixth Avenue were calling me. My calendar was filling with meetings with customers, and I kept hearing the same thing: we can't trust this, we can't use it; the prototypes are cool, but there were issues around privacy, issues around sovereignty, and issues around incorrectness, hallucination. And those became the things that scientists, including myself, began tackling. So, to answer your question directly about how I think about it: we're delving into the area of neuro-symbolic AI, the intersection of these fields. With much help from my friends who are more on the statistical machine learning side, I'd begun to appreciate that there are multiple ways you can combine these tools. One is that you can use automated reasoning to generate data that you might train over, and that would include reinforcement learning.
Another way is to just allow the generative AI to do whatever it wants, with whatever magic it's using, and then, when it's done, take that output and prove its correctness. And another way is to integrate them more deeply: in the inference step, say, to provide hooks for logical reasoning. A cousin of that one is agentic tools, where you provide agentic access to the formal reasoning tools, and it acts as a multi-agent system collaborating; maybe the theorem prover provides a verification result together with a certificate, which you audit to make sure the theorem prover was really called and your other agents aren't just hallucinating that answer. So those are the different ways, and you're seeing all of them. And then when it comes to agentic AI, you have questions around: well, what are the agents doing? How do we know they're doing the right thing? How do we compose them? And can we plan over the compositions? So there's a whole rich area there, and what we're seeing within Amazon, and I can't really speak to all the products that'll come out, but I can speak to Automated Reasoning checks in Bedrock Guardrails, is that there are different points in that space and different solutions we can put together, and we're basically using all of them.

- Yeah, that's the thing: it's not one solution, it's lots of insertion points. Let's maybe unpack one accessible example, if I can put it that way, for folks. One example we often talk about is a generative AI solution that's doing mortgage-approval-type work, assessing different things. The question is always: well, how do I know it's right?
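The certificate-and-audit pattern described above can be sketched with a deliberately simple stand-in. This is a hypothetical illustration, not the Bedrock mechanism: an untrusted "agent" makes a claim and attaches evidence, and a cheap, independent checker re-verifies the evidence rather than trusting the claim.

```python
def untrusted_agent(n):
    """Stands in for an LLM agent: claims whether n is composite,
    and attaches a certificate (a factorization) when it is."""
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return {"claim": "composite", "certificate": (d, n // d)}
    return {"claim": "prime", "certificate": None}

def audit(n, result):
    """Auditor: never trusts the claim, only re-checks the evidence.
    Verifying a factorization is far cheaper than finding one."""
    if result["claim"] == "composite":
        p, q = result["certificate"]
        return 1 < p and 1 < q and p * q == n
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

r = untrusted_agent(91)
print(r["certificate"], audit(91, r))  # (7, 13) True
```

The asymmetry is the point: producing the answer may involve opaque magic, but the certificate can be checked by simple, trusted code, so a hallucinated answer fails the audit.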
- Now, mortgage approvals are a good example because there are lots and lots of rules, but the challenge is that there are lots and lots of rules. So how does automated reasoning get from "there's a whole bunch of financial institution policies around lending" to "is my answer correct?" Help us bridge that gap.

- Yeah. I could make a statement that anyone who's worked in formal reasoning for any length of time would agree with violently, but that people who haven't would be super puzzled by. And that statement is that over 80% of the work is figuring out what it is you want to prove. I'll give you some examples, pre-generative-AI. We launched IAM Access Analyzer, right? The tool automatically takes all your policies, that whole big soup, is able to reason about them at the semantic level, and is able to prove properties. IAM Access Analyzer walks you through an exercise that basically tricks you into writing the specification: it asks you a set of questions and builds something that we're now going to prove about your system, and whenever your system diverges from that spec, we flag it to you and ask you about it, and the spec evolves. But also: what are the semantics of the IAM policy language? What are they, precisely? That turned out to be a multi-year effort.

- Wow. How could it be?

- It turns out that no one really knew. And it turns out these days, if you ask hard enough questions of AWS, what's going to happen is they're eventually going to pull out someone from the IAM Access Analyzer team who has a background in mathematical logic, because those are the folks who own the answer these days. And we see that increasingly people build on that semantics.
Within AWS, we use that semantics for a lot of other purposes beyond IAM Access Analyzer, and sometimes when teams break the semantics, they're required to roll back the changes. So it actually becomes very important. And it's the same thing with EC2 networking: same story, years of arguing, many meetings, many people in the meetings. And I've worked on device drivers, I've worked on biological systems, I've worked on railways, and so on. It's always the same: you get three biologists who are working on models of leukemia or cancer, and you can't get agreement between the three of them. So it really comes down to: what are the ground rules, what are the assumptions we're making when we do this proof, and what are the important properties? Imagine "all data at rest must be encrypted." Sounds reasonable. Okay, well, what do you mean by encryption? Do you mean a Caesar cipher? Oh, no, no, no, a Caesar cipher is totally insecure. Okay, but you just said all data at rest must be encrypted, and a Caesar cipher is encryption. So you didn't really mean "encryption"; you actually meant something else, right? And that refinement loop takes a really long time. And that's now what your customers face when they try to do mortgage approvals, or Family and Medical Leave Act questions, or airline ticket return policies, or HR rules around leaves of absence, and how your stock interacts with your leave of absence, and so on. It turns out it's really hard to work all that out. It's really hard to figure out all the corner cases, and it's hard to get agreement. And achieving agreement often becomes very controversial, because when you're setting policy ahead of time, you're actually making an infinite set of decisions all at the same time.
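That corner-case hunting can be made concrete with a toy sketch; all rules and numbers here are invented for illustration. Two plausible-sounding lending rules are checked exhaustively over a small grid of cases, and the disagreement that falls out is exactly the kind of corner case that sends the policy authors back to the table.

```python
from itertools import product

def rule_a(income, debt, first_time):
    """One stated policy: approve when debt-to-income is below 40%."""
    return debt / income < 0.40

def rule_b(income, debt, first_time):
    """Another stated policy: first-time buyers below 45%, others below 35%."""
    return debt / income < (0.45 if first_time else 0.35)

# Enumerate a small grid of cases and collect every disagreement.
conflicts = [
    (income, debt, first_time)
    for income, debt, first_time in product(
        [50_000, 100_000], [15_000, 38_000], [False, True]
    )
    if rule_a(income, debt, first_time) != rule_b(income, debt, first_time)
]
print(conflicts)  # [(100000, 38000, False)]: a 38% DTI repeat buyer
```

A real system would enumerate symbolically with a SAT/SMT solver rather than over a hand-picked grid, so the "infinite set of decisions" is covered rather than sampled.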
Yes, and people begin to find all the corner cases where they don't like the answers, and so it's very hard to reach consensus. So we're basically giving customers a big stick and helping them poke at the question of what is truth, and it's actually a very difficult challenge. Add to that the fact that the average customer won't understand mathematical logic, and we don't want to require that they do. So we're basically having to maintain, not a digital twin, but a logical twin: we want to maintain a natural language representation of your rules together with the actual symbolic, logical view. It might be that we want to take the logical view and communicate it to you in natural language, or it might be that we want to take your natural language and convert it into logic, while appreciating that you don't understand logic, and then help you understand what's hidden in that logic and what all the weird corner cases are. We could zoom into all the techniques we developed for our Automated Reasoning checks, and they're super interesting. And what we're finding all around the company is that those are actually basic building blocks, and we're reusing a lot of those capabilities in different parts of the company. So it's a very, very exciting time right now, let me say that.

- It is, it is. So if I'm a customer, let's keep picking away at this mortgage approval situation. If I'm a customer using Automated Reasoning checks in Guardrails, for example, I can upload documents, I'll have policies, et cetera. Is automated reasoning basically interpreting that and converting it into that mathematical model for me?