Archive FM

Data Skeptic

Graphs and ML for Robotics

Duration:
41m
Broadcast on:
04 Nov 2024
Audio Format:
other

We are joined by Abhishek Paudel, a PhD student at George Mason University with a research focus on robotics, machine learning, and planning under uncertainty, using graph-based methods to enhance robot behavior. He explains how graph-based approaches can model environments, capture spatial relationships, and provide a framework for integrating multiple levels of planning and decision-making.

(upbeat music) - You're listening to Data Skeptic: Graphs and Networks, the podcast exploring how the graph data structure has an impact in science, industry, and elsewhere. - Well, welcome to another installment of Data Skeptic: Graphs and Networks. Today we're talking about the interesting problem of room classification. Given a floor plan, can you determine what's a bathroom, what's a bedroom, and so on and so forth, specifically doing that with graph neural networks? Asaf, what are your thoughts on this implementation? - What I like about today's guest is that he shows us how to use graphs to add more relevant features to machine learning. The pretty thing about graphs is that without adding more data, just by utilizing the data we have, we can add more dimensions to our data set. You merely need to model your data as a graph, and then you can use graph properties as features. For example, as we'll soon hear in this episode, if we take rooms and model them as nodes, and then model, let's say, the doors as edges, patterns in the graph we've made could help us better classify the different rooms. But even cooler, because it's a graph, we can look for patterns not only in the immediate surroundings of the node (the room, in our case), meaning its neighbors, but also in nodes that are two or three hops away from it. Abhishek refers to this as topology adaptive graph convolution. Another thing that Abhishek did was to create a Waze-like app, a navigation app for robots to navigate in apartments. Like Waze, the robots use algorithms such as Dijkstra's and A* to find the most efficient route to get where they're supposed to go. But since it's a graph, you can also add tiers to the data, making the rooms, say, the first tier, the furniture the second, and the items on the furniture the third. So you can map everything in the house, down to the level of the toothpaste sitting beside the sink in the bathroom. Each of them can be a node by itself.
Personally, the part I liked most in the conversation was about how our guest makes robots reflect on their mistakes. I can't shake the image in my mind of a distressed robot crying to itself, "What did I do wrong? What did I do wrong?" Doing this, let's say, soul searching in robots sounds really interesting. - Yeah, a very neat aspect of machine learning that is somewhat new. I don't know if this is the first use case, but classic reinforcement learning is very Pavlovian: you get minus one for doing the wrong thing and plus one for doing the right thing. But here it's more like saying, you made an error during this epoch of time, go reflect on your error, with even the potential to simulate other circumstances on one's own infrastructure and then figure out how to build a better algorithm or policy for the next one. A really interesting learning style. - We evolved past behaviorism, even in robots, right? We see that even in robots it doesn't work as well as we wanted; we need something more elaborate. It's not so simple. So it's cool that a phenomenon we see in humans, we also see in robots. - And to your point about how they extract topological features from this graph, that is the most common use case I observe in industry where people are involved with graphs: they extract features from graphs, not just nearest neighbors but second- and third-level neighbors, or structures of the graph, like how many triangle or bow-tie structures you have, and then those just become new features in traditional machine learning. Although that's different from what Abhishek does here, where they're actually using a graph neural network that inherently takes the structure into account, which is a nice feature. But we'll have to get more into the innards of graph neural networks in a future intro. Definitely want to pick your brain on your thoughts there. - Let's do it.
- Let's jump right into the interview then. (upbeat music) - So I'm a computer science PhD student at George Mason University in Virginia. My research interests are in the areas of robotics, machine learning, and planning under uncertainty. In my research, I'm more practically interested in how to improve robot behavior and the ability to adapt to multiple environments when the robot is using some kind of machine learning. So yeah, mostly in that area of improving robot learning. - So I did take one robotics course in graduate school, and this was some time ago, and I was kind of surprised at how, let's say, simulation-based it was. We didn't talk about physical robots; we were talking about FastSLAM and things like that, all in software. I was surprised how disconnected it felt from the physical side. Do you have the same impression, or perhaps has that changed over time? - That has changed over time. I develop algorithms that are first demonstrated to work in simulation, and once we are sure that, okay, this works in simulation, then we try to port that onto a real hardware robot. At this point, I have a lot of algorithms that I've demonstrated to work in simulation, and I'm working on transferring those things onto a real robot so that we can demonstrate they also work there. There's definitely a balance to be had, right? You can do a lot of things in simulation, but until you face the real world, you cannot really say that this is going to work. A lot of the folks who do low-level motion control are heavily focused on demonstrating the capabilities on real hardware, because that's the meat of their work: unless you can demonstrate that the robot can handle low-level controls on the real robot itself, just doing it in simulation might not be as effective a way to demonstrate the work.
Other folks who do more high-level planning, for example where the robot should go, whether it should enter a bathroom or a kitchen if it's trying to find some object, those kinds of high-level planning can be demonstrated even without the need for a real robot. You can demonstrate that, okay, the robot enters the bathroom to find, let's say, a toothpaste instead of going to the kitchen, even without a real robot moving from, maybe, a living room to a bathroom. So whether you actually demonstrate on a real robot or not definitely depends on the focus of the kind of work you are doing. - And how has the topic of deep learning and deep neural networks impacted the field? - Oh yeah, I mean, everyone is doing that, right? Robotics is not removed from the boom of AI and deep learning. Advances in deep learning have actually brought a lot of progress in robotics, especially with the computer vision boom starting around 2012 with ImageNet. A lot of vision-based understanding has improved, which has allowed the perception systems of robots to improve, and improved perception systems have resulted in a lot of improvement in robot behavior. The robot can identify objects better, and once it identifies objects better, it can go and find those objects more efficiently. And with the recent boom in NLP, people are trying a lot of things with LLMs and robots. There's a lot of interest recently, but I still think there's a lot of work to be done in that area with new deep learning tools like LLMs. It's a really exciting time to be in robotics.
A lot of research with LLMs is being done, and you have the opportunity to bring those kinds of new, exciting ideas into robotics to improve how the robot behaves in the real world. - Well, I recently read your paper, "Room Classification on Floor Plan Graphs Using Graph Neural Networks." For listeners who haven't taken a look yet, could you give a quick summary of the effort? - It seems like a little deviation from what we were talking about with robotics, but this was an old paper of mine, from around 2021. That was when I was just beginning to work on robotics. I was basically getting started in robotics for the first time, and there were a lot of ideas about how to map an environment: how the robot maps the environment with its LiDAR sensors, and so on. I was learning a lot about robot mapping, and when the robot maps the environment, it's maybe building a 2D grid of where the obstacles are and where the free space is. These ideas got me interested in the following: suppose a robot is building a map of some building, maybe a household or an apartment building, and I ask it to bring me a toothpaste, like in the earlier example. It would have to know where it is first: is it currently in a living room, a kitchen, or a bathroom? Once it identifies that, it would have to reason about where the toothpaste might be. Maybe it's in the bathroom, so it has to look for where the bathroom is, and wherever it goes, it has to identify what room it is currently in. That's the motivation that got me into this idea of room classification with floor plans.
The robot is given a map, and it has to identify what rooms are in the map before it goes on to finding where the toothpaste might be. That's the seed idea that got me interested in this research of classifying rooms in a map based on whatever data you have about the environment. That's basically how the idea got started. - Well, what is an underlying data set you can tap to ask questions like this? (upbeat music) - This episode is brought to you by WorkOS. If you're building a B2B SaaS app, at some point your customers will start asking for enterprise features like single sign-on, SCIM provisioning, role-based access control, and audit trails. That's where WorkOS comes in, with easy-to-use and flexible APIs that help you ship enterprise features on day one without slowing down your core product development. Today, some of the hottest startups in the world are already powered by WorkOS, including ones you probably know, like Perplexity, Vercel, Jasper, and Webflow. WorkOS also provides a generous free tier of up to one million monthly active users for its user management solution, making it the perfect authentication and authorization solution for growing companies. It comes standard with rich features like social logins, bot protection, MFA, roles and permissions, and more. If you're currently looking to build SSO for your first enterprise customer, you should consider using WorkOS. Integrate in minutes and start shipping enterprise plans today. Check it out at WorkOS.com. That's WorkOS.com. (upbeat music) - Well, what is an underlying data set you can tap to ask questions like this? - We looked into data sets for floor plans specifically, because with floor plans what you have is basically the structure of the map, the top view of the map. We looked into what we could find on the internet, and we came across a data set of houses in Japan.
Houses and apartments in Japan. I'd have to check, but I think it's called the LIFULL HOME'S data set. It contains floor plans from around 143,000 houses and apartments in Japan, along with labels for what the rooms were and what the room coordinates were. That's where we found the data for the experiment we had in mind. - Well, I'm curious if you could expand on the data set. I'm imagining it could be something like ImageNet, except pictures of floor plans, although that sounds especially unmanageable. I'd rather have a CAD file or something like that. What is the nature of the data? - So the LIFULL HOME'S data set, I think, is the original data set that we came across, and it's very raw floor plan data; I think it was even raster images. We did not work directly with the LIFULL data set, because we found other work where someone had already converted those raster floor plans into a vectorized format and extracted the coordinates of the rooms. A lot of the work was made easier for us by, I think, the House-GAN paper. What they did was completely different: they were working on the idea of generating floor plans, basically generative adversarial networks, but for houses and floor plans. They took the LIFULL data set, extracted vectors from it, and got the room coordinates and everything out of the raster images. Then they used that data set to train a generative adversarial network to come up with new floor plans. We borrowed a lot from the House-GAN work that converted these raster images to a vectorized format with coordinates.
We didn't have to do it ourselves; someone else had done it for us, right? And that's what a lot of research is about: you don't have to start everything from scratch; you can build on top of other people's work. That's what we did with the data processing part of this work on room classification. Basically, this data set had room bounding boxes, room categories, the coordinates of the walls for each room, which wall corresponded to which room, whether there were doors on the walls, and so on. There were also RGB images, the actual images that were used to generate the floor plans, but we didn't really use those, because we just needed the vectorized form of the data. - Well, having that House-GAN data set sounds like a great starting point. - Yes. - But it's still not clear how you go from coordinates and polygons to saying a graph is the right tool here. - Right. - What brought that inspiration around? - So in robotics, there's this idea we talked about of the occupancy grid map representation, which is a 2D grid where each grid cell can either be free or an obstacle. When you have a large grid and each cell is either free or an obstacle, you can represent a big floor as an occupancy grid map. And then there's this other representation of a map, what we call the topological map. That representation would basically have prominent landmarks. For example, a landmark could be, let's say, a point in a living room, and then you have a path, a straight line, connecting that point in the living room with maybe the door, and the robot can move from that point in the living room to the door. That door would be represented as a node, and the point in the living room would also be represented as a node.
And if the robot can move from, let's say, the front door to the living room, there's an edge that connects these two nodes. These topological maps are used in robot planning. If the robot is at the door and it wants to go to the bedroom, and there's an edge from the door to the living room and then from the living room to the bedroom, you can plan a path from the door to the bedroom through the living room. This idea of the topological map is what inspired us to represent the rooms and the whole floor plan as a graph. - So should I expect one node per room, or would there be many nodes that represent various positions inside one room? - In this research, we represented one room as one node in the graph. If there are, let's say, five rooms in one floor plan, we would have five nodes. As for the connections, the edges, I'd have to refer back to the paper because it's from 2021 and I don't want to be wrong about this, but I think we connected two room nodes if they were within some distance threshold of 3%. If the two nodes were within a 3% threshold relative to the whole floor plan's length, we connected those two rooms with an edge. That's how we constructed the graph from these floor plans. - And then is the edge just a representation that I can move from one room to another? - Not exactly. For example, sometimes a living room can be separated by one wall from, let's say, a dining room, but there might not be a path from the living room to the dining room; maybe you have to go through the kitchen first. But if they only have a wall between them, our method still connects that living room with the dining room.
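As a rough sketch of the construction described here, one could connect room nodes whose bounding boxes fall within a small fraction of the floor plan's extent. The 3% figure comes from the conversation, but the use of box-to-box gap distance and the floor plan diagonal as the reference length are assumptions, not the paper's exact rule:

```python
import math

def build_room_graph(rooms, threshold_frac=0.03):
    """Connect two room nodes if the gap between their bounding boxes
    is within a fraction of the floor plan's diagonal.

    `rooms` maps a room name to its bounding box (xmin, ymin, xmax, ymax).
    """
    # Floor plan extent = bounding box over all rooms.
    xs = [v for r in rooms.values() for v in (r[0], r[2])]
    ys = [v for r in rooms.values() for v in (r[1], r[3])]
    diag = math.hypot(max(xs) - min(xs), max(ys) - min(ys))

    def box_gap(a, b):
        # Axis-aligned gap between two rectangles (0 if they touch or overlap).
        dx = max(a[0] - b[2], b[0] - a[2], 0)
        dy = max(a[1] - b[3], b[1] - a[3], 0)
        return math.hypot(dx, dy)

    names = sorted(rooms)
    edges = set()
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            if box_gap(rooms[u], rooms[v]) <= threshold_frac * diag:
                edges.add((u, v))
    return edges
```

Note that, as discussed in the episode, an edge only says two rooms are close, not that you can walk directly between them.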
The edge connections are based only on how far apart these rooms are, not on whether you can go directly from one room to another. As I said, the topological maps in robotics were the inspiration, but we did not exactly represent these floor plans as topological maps where you can navigate through the edges from node to node. There are some subtle differences here. That was the inspiration, but we thought this was a better representation, because we weren't really leveraging the topological map itself in this research; that could be a different focus. We wanted to make it more amenable to what we could learn through graph neural networks. - Well, just out of curiosity, I have to say I can't remember a time when I was ever in a home where you could go from the kitchen directly into the bathroom. - Right. - That's a structural insight that maybe you could take advantage of. Do you have any thoughts on, you know, maybe not having that in your model? - In the few months after we did this work, we could see a lot of improvements we could make. The thing you mentioned, actually having edges represent whether you can traverse between nodes, is a really good insight. But we didn't work on that, because we had some kind of time limit; I was doing this as part of a course project that I had to submit on time. So there were limitations on what we could and could not explore, and we did not explore that side of things because of the time constraint before we could finish this research. But definitely, if we'd had more time, even after we submitted this paper, we thought about that.
But I got more interested in robot planning and planning under uncertainty, and this work was kind of left behind in that sense. But I agree with you that there are a lot of improvements we could make to the current approach. - I'm curious if you could frame the machine learning problem a little more. In textbook machine learning, you'd get some tabular data where one column is the objective and the other columns are your features. How do you get to the final representation here? - Once we have this graph, we need to formulate this as a classification problem, a node classification problem, as we call it in graph theory. With these nodes that represent rooms, what we are trying to output, by inputting this graph structure into some machine learning algorithm, is a class for each of the nodes in the graph. One node could be classified as a bedroom, another node as a kitchen. We need to have some features associated with each of the nodes in the graph, and those features are what get passed into a machine learning algorithm like a neural network. For each room, we had information about the room's coordinates. I think we have six features for each of the room nodes. The first one is the area of the room: once you have the coordinates of the rooms extracted from the data, you can compute the room's area. Living rooms generally have a larger area than, say, a bathroom, so area is an important feature for distinguishing between rooms. We computed the area from the coordinates and put that as the first feature of the feature vector. The next features, trivially, are length and width. So: the area, the length of the room, the width of the room, and how many doors a room has.
Maybe a living room has multiple doors: one entrance door, one that might lead to a kitchen, and another that might lead to a bedroom. So the number of doors in a room can also be a distinguishing feature. The other two interesting features are whether a room is a parent room or a child room. There were rooms labeled as closets, and many bedrooms have a walk-in closet that's fairly big. One of the distinguishing features of a closet is that it's generally inside a bedroom. So we came up with this new feature that defines whether a room is a child room or not: a closet that's inside a bedroom gets a feature saying it's a child room, and conversely, the bedroom is a parent room. Those are the features we used for running the machine learning algorithms. - And can you expand on what those algorithms were? - They're basic neural network algorithms. We only experimented with neural networks, but more specifically with graph neural networks, because the nodes represent rooms and the edges represent whether they are close or not. A graph neural network is designed to capture the relationships between nodes, and we expected it to have better results than regular neural networks like the multi-layer perceptron. We had the multi-layer perceptron as the baseline we compared against our approach that leverages graph neural networks. The multi-layer perceptron does not take into account the graph structure we built: it takes the node features, but no node interaction is built into a multi-layer perceptron. The other three models we experimented with were different variants of graph neural networks. The first is the graph convolutional network; I think it was one of the first, and it was inspired by the idea of convolutional neural networks.
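The six node features described above (area, length, width, door count, and the parent/child flags) could be assembled roughly like this. The field names and the bounding-box containment test for parent/child rooms are illustrative assumptions, not the paper's exact schema:

```python
def room_features(room, floor_rooms):
    """Six-dimensional feature vector for one room node: area, length,
    width, number of doors, and parent/child flags.

    `room` is a dict with a "bbox" (xmin, ymin, xmax, ymax) and a
    "num_doors" count; `floor_rooms` is the list of all rooms on the floor.
    """
    xmin, ymin, xmax, ymax = room["bbox"]
    length, width = xmax - xmin, ymax - ymin

    def contains(outer, inner):
        # True if `inner`'s bounding box lies inside `outer`'s.
        return (outer is not inner
                and outer["bbox"][0] <= inner["bbox"][0]
                and outer["bbox"][1] <= inner["bbox"][1]
                and outer["bbox"][2] >= inner["bbox"][2]
                and outer["bbox"][3] >= inner["bbox"][3])

    is_parent = any(contains(room, other) for other in floor_rooms)
    is_child = any(contains(other, room) for other in floor_rooms)
    return [length * width, length, width, room["num_doors"],
            float(is_parent), float(is_child)]
```

For a bedroom containing a walk-in closet, the bedroom's vector ends in (1.0, 0.0) and the closet's in (0.0, 1.0), which is the parent/child distinction described above.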
That's why it's called graph convolution: it's a generalized form of the convolutional neural network. In CNNs, one pixel gets information from the surrounding pixels during the convolution operation, the neighboring pixels, and aggregates that into a new feature. The idea is similar with graph convolution. In image convolution, a pixel has neighboring pixels; generalizing that idea over graphs, you have a node surrounded by its edges and neighboring nodes, and you perform a similar convolution-like operation over that set of nodes. It takes information from the surrounding nodes to compute the feature embeddings, and that's how it's able to capture information from its neighbors to come up with a feature that we then use for classification. That's a very high-level view of graph convolution. The graph attention network is another variant of the graph neural network that we used. It builds on the idea of graph convolution: sometimes you may want to prioritize one neighbor more than another, and the attention mechanism does that for you. The other one is GraphSAGE, which is also similar to graph convolution, but it does what we call sampling and aggregation; that's what the SAGE in GraphSAGE stands for. You aggregate over the neighboring nodes to get a node's features, but with GraphSAGE you might not aggregate over all of the neighbors; instead, you sample a few neighboring nodes and aggregate over those. The aggregation could be a mean operation or a max operation.
In graph convolution you are probably always taking a mean, but with GraphSAGE you can have multiple variants of the aggregation operation: mean, max, whatever you like. And we sample which nodes we take into account for aggregation. We also used another variant called the topology adaptive graph convolutional network. With the three graph convolution variants we talked about, to compute an embedding we take into account the node embeddings of the neighboring nodes, and with one convolution operation we only take into account the immediate, one-hop neighbors. What topology adaptive graph convolution does is that even in one step, it computes a node's embedding using information not just from its immediate neighbors, but from its two-hop or three-hop neighbors. So in one step, you get a lot of information from farther-away nodes. As you can see, there's a variety here. We wanted to experiment with not just one kind of graph neural network; we wanted to explore multiple varieties so we could see which graph neural network algorithms would perform better on this room classification task. - Quick question for you about features before we jump back into algorithms. Obviously, this is a room classification problem, so the objective is: what type of room is this? You don't give it that as a feature, but do you tell it the neighbors? Like, you know, it's connected to a bathroom or it's connected to a hallway. - In the features for a node, there's no hint of what the labels of the neighboring nodes are. That is the meat of a graph neural network: you don't have to encode what neighbors are there.
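The difference between one-hop aggregation and the multi-hop reach of topology adaptive graph convolution can be shown with a toy, unweighted mean-aggregation pass. Real GCN and TAGCN layers use learned weight matrices and normalized adjacency powers; this sketch only illustrates how information propagates through the graph:

```python
def mean_aggregate(features, adj, hops=1):
    """Repeatedly replace each node's feature with the mean of its own
    and its neighbors' features.

    With hops=1 a node only sees its immediate neighbors; with hops>1
    information reaches it from 2- or 3-hop neighbors, which is the
    intuition behind topology adaptive graph convolution's wider filters.

    `features`: dict node -> value; `adj`: dict node -> list of neighbors.
    """
    for _ in range(hops):
        features = {
            n: (features[n] + sum(features[m] for m in adj[n]))
               / (1 + len(adj[n]))
            for n in features
        }
    return features
```

On a path graph a - b - c with a signal only at `a`, one hop leaves `c` untouched, while two hops let the signal reach it, which is exactly the "farther-away neighbors" effect described here.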
In the features themselves, right? That information is carried by the edges of the graph, and the graph neural network takes into account not just the node features but also the edges to compute an embedding. Even if you don't explicitly list a node's neighbors as a node feature, the graph neural network is able to leverage that implicitly through the message passing mechanism, which takes into account what other nodes this node is connected to via edges. That's the brilliance of graph neural networks, I would say: you don't need to encode the neighbors into the feature space. - So let's maybe start by talking about your baseline. You trained a multi-layer perceptron, which lacks this graphical structure information. Hopefully your other approaches did a better job, but let's start with this baseline. How did it perform? - It performed okay, I would say; not really great. I think we got a test accuracy of 65% for the multi-layer perceptron, compared to, let's say, the GCN. And here's a surprising finding: the GCN was not able to perform as well as the multi-layer perceptron. This was the first graph convolutional neural network that was proposed. One thing I should note is that these were the best performances picked from all the experiments we ran. So the multi-layer perceptron's best performance, over the many rounds of hyperparameter tuning we did, was 65%, and the graph convolutional network could not beat it; it had a test set accuracy of, I think, 54 or 55; yeah, 54%. But when you look at the other networks that built on this vanilla graph neural network, they have better performance: GraphSAGE got 80%, and the topology adaptive variant got 81%, which is higher than the MLP and the regular graph convolutional network.
So the baseline MLP is outperformed by GraphSAGE and the topology adaptive graph convolution in our results. - Yeah, so there's a big margin there, give or take 15%, I guess, that you get just from the graph components. - You're right, yeah. - Do you have a sense of what about the graph algorithms makes one more successful than another? - So the TAGCN, I think, outperforms the others most likely because of its inherent ability to capture information not just from immediate neighbors but from farther-away neighbors, two or three hops out. I think that has to be the reason the topology adaptive graph convolutional network outperforms the rest. - And do you have a thought about whether this is, let's say, world-class performance? If you spent a lot more effort on algorithms, would you find some other solution, or would you invest in feature engineering? I guess, how would you move it forward if you were going to pursue this more? - Yeah, definitely. This was a very time-constrained experiment and piece of research, I would say, so there are still a lot of opportunities for improvement and for making the results better. A few papers that have cited this research have better results than what we had: they built on top of what we did, modified a few things, and maybe used new algorithms that came out after our paper, since new graph neural network algorithms keep appearing. So there are papers citing us with better results than ours; there's definitely room for improvement. And there's also more to explore in feature engineering: we have how many features? Six very simple features, and we don't even take into account the images of the floor plan.
- So graphs are very useful in robotics, and your paper is one of many that establishes that. But robotics is also a pretty big field with a lot of specializations. Do you have any advice for an aspiring roboticist? How important is it for graphs and graph theory to show up on their syllabus? - Even in my lab, we do a lot with graphs for robots. For example, I can talk about one of my colleagues' research: what he does is represent the whole house as a graph, and not just a regular graph like we used, but a hierarchical graph. The hierarchical graph would look something like this: there's a root node, and from the root node you have the first hierarchy of rooms. The root might represent an apartment, and inside the apartment there are rooms, so from the root node you have a first level of rooms. Inside the rooms you might have a table and a bed, which form another level of the hierarchy, and on the table you might have objects: a pen, a mouse, a keyboard, a laptop. He leverages these hierarchical graphs for what in robotics is called task and motion planning: given a task, he uses the hierarchical graph to produce a plan that moves objects from one room to another in the most efficient way. That's one example of leveraging graph structures in robotics. And the other one we talked about earlier: this idea of path planning is all about graphs. To plan a path from where the robot is to, let's say, a goal location, a lot of path planning algorithms are built on graph algorithms. For example, we leverage Dijkstra's algorithm, which is a graph algorithm, to plan a shortest path from where the robot is to where it wants to go.
Maybe you want to go from a door to a microwave inside the room, right? So you represent the whole thing as an occupancy grid, where each grid cell is a node. And then if you can traverse from one grid cell to another, there's an edge between them, and you can leverage Dijkstra's algorithm or the A* algorithm from graph theory to plan the most efficient path from where the robot is to where it wants to go. So there's a lot of path planning that leverages graph algorithms, to name just a few examples. But yeah, graphs, graph theory, and graph algorithms are really important in robotics. So definitely, if you are looking into starting or learning robotics, having this idea and the concept of graphs and graph theory would definitely be a plus. A must, I would say. - To what degree do you think of robots as agents, maybe even having their own utility functions? - I mean, a lot of the problems, maybe almost all of the problems, can be formulated as some kind of optimization problem, right? Where you have some objective function that you are trying to either minimize or maximize. Dijkstra's algorithm is an optimization problem: you're trying to obtain a plan that minimizes the distance the robot travels from start to goal, and that's the shortest path. A lot of the problems can definitely be formulated in terms of having a utility function, where the result you're trying to obtain either minimizes or maximizes that utility function, right? For example, in robotics, a lot of the reinforcement learning paradigm is doing this. The robot might get a reward if it reaches the goal, and there might be a negative reward if it starts wandering around more than it needs to, or something like that. Once it finds the goal, there might be a large reward.
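The occupancy-grid planning just described can be sketched directly: every free cell is a node, adjacent free cells share a unit-cost edge, and Dijkstra's algorithm returns the shortest path. The small map below is invented for illustration (0 = free, 1 = obstacle); A* would add a heuristic to the same skeleton.

```python
# Grid-based path planning as described above: each free cell of an
# occupancy grid is a node, traversable neighbors share an edge, and
# Dijkstra's algorithm finds the shortest path. Map is invented
# for illustration (0 = free, 1 = obstacle).
import heapq

def dijkstra(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    prev = {}
    pq = [(0, start)]
    while pq:
        d, cell = heapq.heappop(pq)
        if cell == goal:
            # Reconstruct the path by walking predecessors back to start.
            path = [cell]
            while cell in prev:
                cell = prev[cell]
                path.append(cell)
            return d, path[::-1]
        if d > dist.get(cell, float("inf")):
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nd = d + 1  # unit cost per move
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(pq, (nd, (nr, nc)))
    return float("inf"), []  # goal unreachable

grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
cost, path = dijkstra(grid, (0, 0), (2, 0))  # e.g. door -> microwave
print(cost, path)
```

With unit edge costs Dijkstra and breadth-first search coincide; the priority-queue version shown here is what generalizes to weighted grids and to A*.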
So a lot of the reinforcement learning algorithms are trying to optimize how to get the maximum reward, and then trying to direct the robot's behavior toward getting this maximum expected reward. - Well, I know these are papers from a few years ago, and it's just one facet of your research. What are you most excited about today? - I'm excited about robotics, right? So I'm doing my research in robotics, and a lot of what I think about right now is planning under uncertainty. The robot might not have all the information it needs about the environment; a large part of the environment might be unknown. Its sensor readings might mislead it because they are noisy. You have to take a lot of those things into account while you are trying to make the robot behave the way you want, let's say, the way humans would. To that end, I'm excited about how to imbue this idea of introspection in robots. The robot does something, right? And it might make mistakes, or it might perform badly. I'm interested in this idea that the robot has to realize when it's making mistakes or when it is behaving poorly. I ask the robot that's in the living room to go fetch me a knife or a fork, and then it goes into a bathroom. It has to realize: okay, I went to the bathroom, I did not find the knife, maybe that was not the best thing to do, right? This idea that when it does something wrong, when it does something poorly, it realizes that it made a mistake or performed poorly, and then uses that information to correct its behavior. This is an idea that I have been interested in for a long time, and a lot of my research is geared towards developing techniques that are theoretically sound and practically feasible, and that leverage this idea of introspection to improve a robot's behavior when it is deployed in unknown environments, environments it has never seen before.
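The reward structure mentioned above, a large reward for reaching the goal and a penalty for wandering, can be sketched in a few lines. The numbers and the `episode_return` helper are invented for illustration; designing a reward that induces the behavior you actually want is a research problem in itself.

```python
# A toy sketch of the reward shaping described above: a large reward
# for reaching the goal, a small penalty for every step of wandering.
# The constants and helper are invented for illustration.

STEP_PENALTY = -1.0   # discourages wandering around more than needed
GOAL_REWARD = 100.0   # large reward once the robot finds the goal

def episode_return(num_steps, reached_goal):
    """Total (undiscounted) reward accumulated over one episode."""
    return num_steps * STEP_PENALTY + (GOAL_REWARD if reached_goal else 0.0)

# A direct trip beats a wandering one, which beats never arriving,
# so maximizing expected return pushes the robot toward efficient search.
print(episode_return(6, True))    # direct route
print(episode_return(20, True))   # wandered before finding the goal
print(episode_return(50, False))  # never found it
```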
Or environments it has seen before but might not have full access to, right? Those kinds of things, yeah. - Well, I presume introspection requires some form of feedback. How else would you know that there was a mistake and where the mistake is? - Right. - What form does that feedback take? - If I ask the robot, again, same example, to bring me a fork, and the robot is in the living room. Let's say it first goes to the bathroom, explores it, and realizes there's no fork, right? Now it has to go somewhere else. Maybe it goes to the bedroom and realizes, okay, no fork in the bedroom, until it finally goes to the kitchen and finds the fork. Now, after it finds the fork, it can leverage the information it gathered while traveling from the living room to the bathroom to the bedroom and finally to the kitchen. Once it has this additional information about the environment, it can think about what it would have done if it had all that information beforehand, and generate a new plan. And this new plan would be a better plan, right? Because once it has reached the kitchen and found the fork, the better plan would have been to just go from the living room to the kitchen directly. Then it can compare this new plan with the old plan that it actually executed while finding the fork. That is the intuition behind the kind of introspection that I leverage in my research. - I've got the intuition of it in hand. You know, if I'm the robot, once I've got the fork, I can rethink: oh, I should have just come directly here. - Yes. - Are there popular methodologies to do that? How do you express that mathematically? - So mathematically, one thing we do is think in terms of performance, right? Here the performance would maybe be represented in terms of the distance the robot traveled to find the object.
Once you compare the actual distance the robot traveled to find the fork against a new, imaginary plan, where you compute the distance from the living room directly to the kitchen, some kind of approximate distance the robot would have needed to travel, you get a quantitative measure of how much better it could have done. And then we can build on top of that. We can compare different plans it would have generated from different learned models, right? So maybe the robot uses one model, some kind of model based on its prior learning experience, and with that model it goes from the living room to the bathroom, then the bedroom, then the kitchen to find the fork. If it had used a model from a different learning experience, it would have gotten a different result. So you can reason about its different learned behaviors, or what in robotics we call policies. A policy basically takes the state of the environment and generates an action, right? So if the robot has multiple policies, it can introspect about what the other policies could have done. That's one of the ideas that I leverage in my research. So whether to leverage existing knowledge or to try out new things, a lot of that can be built on this idea of introspection, where the robot is constantly monitoring its behavior and identifying whether it performed well or not. - All right, well, first wrap-up question then: you being in the field of robotics, do you feel comfortable predicting anything in the next couple of years? Where is it going, and how might it impact our everyday life? - I mean, I don't have a good answer, I guess, right? Robotics as a field is quite broad, and people consider robotics to be moving very slowly, right? Because it's hard, you know? Robotics is hard.
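The hindsight comparison Abhishek describes can be sketched as a regret-like score: the length of the executed plan minus the length of the plan the robot would have chosen with full knowledge. The room-to-room distances below are invented for illustration; a real system would get both lengths from its planner.

```python
# Sketch of the hindsight comparison described above: score the plan the
# robot actually executed against the plan it would have chosen with
# full knowledge, and use the gap (a regret-like measure) as an
# introspection signal. Distances are invented for illustration.

def path_length(path, distances):
    """Sum pairwise distances along a sequence of rooms."""
    return sum(distances[(a, b)] for a, b in zip(path, path[1:]))

# Symmetric room-to-room distances (made up, in meters).
d = {}
for (a, b), meters in {
    ("living", "bathroom"): 4.0,
    ("bathroom", "bedroom"): 5.0,
    ("bedroom", "kitchen"): 6.0,
    ("living", "kitchen"): 3.0,
}.items():
    d[(a, b)] = d[(b, a)] = meters

executed = ["living", "bathroom", "bedroom", "kitchen"]  # what the robot did
hindsight = ["living", "kitchen"]                        # best with full knowledge

regret = path_length(executed, d) - path_length(hindsight, d)
print(regret)  # how much better the robot could have done
```

The same scoring works for comparing two policies: replay each policy's plan through `path_length` and the one with the smaller total did better on that episode.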
And it's very difficult to get a robot to do things intelligently, right? It's very, very difficult. So robotics as a field is sometimes, I think, underappreciated, because it's not moving as fast as people think, compared to machine learning, computer vision, NLP, where you see the boom, right? I think people and researchers still expect a lot of work to be done before there are very usable and trustworthy robots out in the wild, and I think we still have a lot of work to do in robotics to make that happen. A lot of the researchers in this area also have this vision: when are we getting these trustworthy robots out in the wild? And I feel like there's a long way to go, and every roboticist is contributing their own very small part, an island of their own. I think it might take a while to connect those islands and get a real intelligent robot that you can trust and not worry about a lot. I'm excited about the future in that sense, because there's a lot of work to be done before we can reach there, and that's what excites me, right? A lot of the work needs to be done, and I'm doing the work, a lot of people are doing the work. The future is bright, although it may seem a little far. Yeah. - And is there anywhere listeners can follow you online? - Yeah, so I am available on LinkedIn. I don't use other social media, so connect with me on LinkedIn. - Sounds good. Well, links in the show notes for those who wanna follow up. Thank you so much for taking the time to come on and share your work. - Thank you, Kyle. This was a really interesting conversation, and thank you for the opportunity. (upbeat music)