AI on the Edge: Robotics

May 14, 2020 | AI, AI on the Edge, robotics

(Author: Jim Gunderson, PhD – GunderFish – jpgunderson @ gunderfish.com)

We know the vision – 

robots working tirelessly alongside people. Robots handling the dull, dirty, and dangerous tasks. Robots designed to free people up to focus on the hard stuff, the stuff that takes judgment, discernment, and other uniquely human capabilities. But where are those robots? Instead we see many applications of robots as sensor platforms, steered by operators, monitored minute by minute, requiring people to do the routine work that the robot should be taking care of.

What is the Problem?

Our vision of robots often includes the system doing its job on its own, and only calling for help when there is a problem. And for many robotic systems this is the case. Automated manufacturing can run 24/7 without human oversight, and your dishwasher and laundry run without a thought of “I have to watch it.” But these systems function in very tightly controlled domains. When the domain is more complex, it becomes more challenging to design a system that can routinely run on its own.

In a complex environment things change. A person walks by and puts down a clipboard; someone moves a bench or pushes a cart out of the way. The weather changes and clouds block the sun, changing the way things look; the wind piles snow up against a door to the courtyard. Harry decides that he is going to grow a beard, while Chris gets a radical new haircut and a purple dye job.

These are all things that a person can easily adapt to, but can be challenging for a machine.  The way this is currently handled is by providing human oversight in proportion to the complexity of the environment.  We have a 3D printer that is churning out masks during the COVID-19 outbreak. It operates in a very non-complex environment, so minimal oversight is required.  I turn it on, select a mask design, press print. I wait a few minutes to make sure the first layer is going down correctly, then walk away.  I come back 6 hours later to remove the parts, check or change the feed stock, and start the process over again.  A very stable environment equals very little need for human interaction and very little need for intelligence in the robot.

But let’s look at another extreme – a mobile security robot.  The idea is simple – a mobile robot to patrol warehouses and other facilities and report on problems: break-ins, fire, flooding, unsecured doors and windows, unauthorized people, etc.  This might seem simple, but it is quite complex – because the environment is very complex.   

What we want is a robot that behaves appropriately in most situations. In this context that means it should take the ‘correct’ actions for most of the situations it encounters. If it sees an employee, it should (perhaps) greet the employee by name (“Morning Mr Anderson”) and continue with whatever task it is on. If, on the other hand, it encounters an intruder late at night, the appropriate behavior is very different: sound an alarm, notify its human supervisor, and start recording high-definition video and audio for possible use in prosecution. Very different responses to very similar sensor data.

To accomplish this, the robot needs to do more than just analyze sensor data to detect a human in a space – it needs to map that sensor data onto a semantic model of the situation and be able to ‘reason’ about the objects in its model. Further, it needs to be able to consider a number of possible responses to those objects and select the appropriate course of action. It needs to use AI to recognize, to reason, and to plan. It also needs to know when it needs help, when it needs to call its supervisor, because almost any environment with people in it will exceed the capabilities of the on-board intelligence at some point. As a result, each robot requires a support network that allows it to ‘call home’ and get instructions from a human operator. But how do we give the robot the ability to behave appropriately when it is at work?
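
To make this concrete, here is a minimal sketch (in Python) of what "map sensor data onto a semantic model and choose a response" might look like. The names here (DetectedPerson, AUTHORIZED, the action strings) are invented for illustration; a real system would sit on top of far richer perception, context, and planning layers.

```python
from dataclasses import dataclass
from datetime import time

# Hypothetical semantic representation produced by the perception layer.
@dataclass
class DetectedPerson:
    identity: str | None      # e.g. "Chris Anderson", or None if unrecognized
    location: str             # semantic location, e.g. "warehouse aisle 3"
    timestamp: time           # when the detection happened

AUTHORIZED = {"Chris Anderson", "Harry Smith"}   # stand-in for a personnel database
BUSINESS_HOURS = (time(7, 0), time(19, 0))

def choose_response(person: DetectedPerson) -> list[str]:
    """Map one semantic object onto a course of action (greatly simplified)."""
    in_hours = BUSINESS_HOURS[0] <= person.timestamp <= BUSINESS_HOURS[1]
    if person.identity in AUTHORIZED and in_hours:
        return [f"greet:{person.identity}", "continue_patrol"]
    if person.identity in AUTHORIZED and not in_hours:
        return ["log_after_hours_presence", "notify_supervisor", "continue_patrol"]
    # Unknown person: escalate rather than guess.
    return ["sound_alarm", "notify_supervisor", "record_high_def_video"]

# The same kind of sensor event, two very different responses.
print(choose_response(DetectedPerson("Chris Anderson", "lobby", time(9, 15))))
print(choose_response(DetectedPerson(None, "warehouse aisle 3", time(2, 40))))
```

The point is not these particular rules but the shape of the loop: sensor data becomes a semantic object, the object is reasoned about against goals and context, and the output is a chosen behavior rather than a raw alert.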

What is the Solution?

To be an effective force multiplier, the robot needs to be able to handle routine interactions on its own and escalate non-routine situations to a human. As discussed above, there are three key capabilities that the robot needs: the ability to map the objects it encounters into semantic representations, the capability to reason about these objects and their behaviors, and the capability to choose its own behaviors to meet the system’s goals. With these three capabilities on board, the robot can handle the routine and also recognize when things are likely to be outside its capabilities and require help from a person.

The basis for all of this is the difference between seeing (sensing) something and understanding what those inputs mean. The sensor readings indicate that there is some thing in the environment. AI on the Edge allows the system to represent the object that the sensors indicate. Once the system ‘knows’ what the object is, then it can reason about that object to assess the appropriate course of action.  We call this process of mapping sensor data into a conceptual model reification.

Reification

There are two aspects of AI on the Edge that are needed to operate effectively in dynamic and uncertain environments:

For more complex systems, there is a mapping between the perception of sensory information and some abstract representation that is manipulated to enable task completion. This is the process of recognition. Conversely, the system must be able to know what features of the environment can be used to complete a task, and to know what to expect the environment to look like after an action is taken. This is preafference. The combination of these two processes is called reification. (paraphrased from “Robots, Reasoning, and Reification”)

In order to function effectively on the edge, a system needs to be able to recognize what things are in its environment, reason about those representations with respect to the system’s goals, and predict what the effect of possible actions will be. These capabilities enable the system to handle low-level, routine tasks on its own, and to ‘know’ when something requires outside help from a human.
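
One way to picture reification in code is as a pair of mappings: recognition turns raw observations into labeled objects, and preafference produces the observation the system expects to see if its model of an object is right, so expectation can be checked against reality. The sketch below is purely illustrative; classify() and predict_observation() stand in for whatever pattern matching and predictive models an actual robot would carry.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    shape: str
    color: str
    temperature_c: float
    position: tuple[float, float]

@dataclass
class WorldObject:
    label: str                      # semantic label, e.g. "cardboard_box" or "person"
    position: tuple[float, float]

def classify(obs: Observation) -> WorldObject:
    """Recognition: sensor data -> semantic object (stand-in for real pattern matching)."""
    label = "person" if obs.temperature_c > 30.0 else "cardboard_box"
    return WorldObject(label, obs.position)

def predict_observation(obj: WorldObject) -> Observation:
    """Preafference: what should the sensors report if our model of this object is right?"""
    if obj.label == "cardboard_box":
        return Observation("box", "brown", 21.0, obj.position)   # ambient temperature, unmoved
    return Observation("human", "varied", 36.0, obj.position)

def matches(expected: Observation, actual: Observation, temp_tol: float = 3.0) -> bool:
    """A mismatch between expectation and reality means the model needs updating, or help."""
    return (expected.shape == actual.shape
            and abs(expected.temperature_c - actual.temperature_c) <= temp_tol)

obs = Observation("box", "brown", 21.5, (4.0, 12.0))
obj = classify(obs)
print(obj.label, matches(predict_observation(obj), obs))   # -> cardboard_box True
```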

Reification of objects

A robot patrolling an empty warehouse has an easy job. There is nothing to get in the way, there is nothing that changes, there is, well, nothing. So if one night there is suddenly something (anything), that is cause for an alarm. Any change indicates a problem. In the real world, warehouses and office buildings are not like that. There are things all over: in a warehouse there are forklifts and shelves and pallets of materials, in office buildings there are desks and chairs and potted plants. In both types of facility there are people moving about during the day and sometimes at night.

Now the robot is confronted with a challenge – suppose one night there is an object in the middle of an aisle or hallway. What is it?  The ability to recognize and understand the properties of an object is key to AI on the Edge. That object could be a box left on the floor, or it might be a person. If it is a person, it could be Chris Anderson from accounting who forgot her purse, or it could be an intruder.  With raw sensor data the robot can determine the relative location, the size, perhaps the shape, and maybe the temperature – but none of these is sufficient to determine what the object is.  So the robot has no option except to contact a person and ask for help.  The human operator takes a look using the onboard video system and concludes that it is a large box and tells the robot to ignore it.  No big deal, problem solved. Except that the operator had to take a hand in this trivial task.

An hour later the robot goes down the same hallway, and sees the same object, calls it in again, and is told to carry on.  And after a few more hours, the human gets tired of the constant alarms, and re-routes the robot, unfortunately opening a potential security hole in the facility.

Let’s look at this situation with AI on the Edge – the robot comes down the hallway and the sensors detect the object. The general shape is extracted from the data fusion of the 3-D sensor and the video image. It matches a pattern in the system for a 2’ moving box (recognition).  The color matches the expected color of a cardboard box, and the temperature matches the ambient temp. This is the expected sensor data for the current classification (preafference). The robot sends in a report (just as a mobile security officer would call it in) and continues on its patrol.  On the next pass the robot expects to see the box in the same place and it confirms that the location and orientation are the same (again matching the expected behavior using preafference). Since the box was called in previously, and everything is as expected, no additional report is made.  Reification enables the robot to confirm that the object is exactly what it appears to be, and the robot can plan and act appropriately.
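
A sketch of the patrol-side logic this enables is below, assuming the perception layer already returns labeled objects with positions; names like known_objects and report() are placeholders. The robot reports a new object once, then on later passes checks it against the preafference expectation and stays quiet unless something has changed.

```python
# Hypothetical patrol memory: objects already reified and reported on earlier passes.
known_objects: dict[str, dict] = {}   # object id -> expected state

def close_enough(a: tuple[float, float], b: tuple[float, float], tol: float = 0.25) -> bool:
    return abs(a[0] - b[0]) <= tol and abs(a[1] - b[1]) <= tol

def report(message: str) -> None:
    print(message)   # stand-in for the real reporting channel

def on_detection(obj_id: str, label: str, position: tuple[float, float]) -> None:
    expected = known_objects.get(obj_id)
    if expected is None:
        # First sighting: reify it, call it in once (as a human officer would), remember it.
        report(f"New object: {label} at {position}")
        known_objects[obj_id] = {"label": label, "position": position}
    elif expected["label"] == label and close_enough(expected["position"], position):
        pass   # Matches the preafference expectation: nothing to report.
    else:
        # Expectation violated (moved, changed, or misclassified): escalate to a person.
        report(f"Object {obj_id} no longer matches expectations, requesting operator review")

on_detection("obj-17", "cardboard_box", (4.0, 12.0))   # first pass: reported once
on_detection("obj-17", "cardboard_box", (4.0, 12.1))   # next pass: as expected, stays quiet
```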

Reification of speech

On a midnight patrol of the office, the robot picks up the sounds of speech. Speech is a lot more than language – in addition to words and phrases there is tone and volume, pauses and pacing. While many systems (either on the edge or in the cloud) provide real-time speech-to-text conversion, almost all of the non-verbal aspects of speech are removed. With better AI on the Edge, speech can be analyzed for volume: was it shouted or said at a conversational level? Was it a stressed voice or a calm one? Was it a whispered “I don’t think anyone is coming” or part of a discussion about a meeting, “I don’t think anyone is coming”? These are the types of nuances that currently require a person to disambiguate, but could be processed on the edge with the right AI. Of course, if there is any question, the human can be brought into the picture to take over, just as a security officer might call on their supervisor.
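
As a rough illustration, even a simple energy measure separates a shout from a whisper. The sketch below uses plain NumPy to compute the RMS loudness of a mono audio frame and bucket it; the threshold values are made up for illustration, and real prosody analysis would also look at pitch, pacing, and spectral stress cues.

```python
import numpy as np

def loudness_dbfs(frame: np.ndarray) -> float:
    """RMS level of a mono frame of samples in [-1.0, 1.0], in dB relative to full scale."""
    rms = np.sqrt(np.mean(np.square(frame)))
    return 20.0 * np.log10(max(float(rms), 1e-9))

def speech_level(frame: np.ndarray) -> str:
    """Very coarse bucketing; the thresholds here are illustrative, not calibrated."""
    db = loudness_dbfs(frame)
    if db > -10.0:
        return "shouted"
    if db > -30.0:
        return "conversational"
    return "whispered"

# Example: a quiet synthetic tone comes out as whispered.
t = np.linspace(0.0, 1.0, 16000)
quiet = 0.01 * np.sin(2 * np.pi * 220.0 * t)
print(speech_level(quiet))   # -> whispered
```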

Reification of space

For a mobile robot, space is key. Many systems currently use very advanced position-maintenance algorithms and mapping software to enable the robot to accurately track its location and navigate in crowded environments. What is often missing is the ability to adapt to changes in the environment without relying on external support. In dynamic situations the robot may need to create new routes to avoid obstacles that block hallways or aisles, or determine the optimal route to a trouble spot on the fly to respond to an incident. This requires more than just a static map with pre-planned patrols: the robot must be able to understand its environment well enough to plan new routes and then navigate them to get to its destination. That means having levels of AI on the robot, at the edge.
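
A minimal sketch of on-board re-planning, assuming the robot keeps a simple occupancy grid of its map (0 = free, 1 = blocked): when a blocked aisle is discovered, the cell is marked occupied and a fresh route is computed locally. This uses a plain breadth-first search rather than a real navigation stack, but it shows why the map and the planner need to live on the robot.

```python
from collections import deque

Grid = list[list[int]]   # 0 = free cell, 1 = blocked cell

def plan_route(grid: Grid, start: tuple[int, int], goal: tuple[int, int]):
    """Breadth-first search over the occupancy grid; returns a list of cells, or None."""
    rows, cols = len(grid), len(grid[0])
    frontier = deque([start])
    came_from = {start: None}
    while frontier:
        current = frontier.popleft()
        if current == goal:
            path, node = [], current
            while node is not None:           # walk the parent links back to the start
                path.append(node)
                node = came_from[node]
            return path[::-1]
        r, c = current
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and (nr, nc) not in came_from:
                came_from[(nr, nc)] = current
                frontier.append((nr, nc))
    return None   # no route at all: time to call the supervisor

# A blocked aisle forces a detour; marking the obstacle and re-planning happens on the robot.
warehouse = [[0, 0, 0],
             [0, 1, 0],
             [0, 1, 0]]
print(plan_route(warehouse, (0, 0), (2, 2)))
```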

Reification of people

One of the critical tasks for a mobile security robot is making sure that unauthorized people are detected and reported effectively. In some cases this can be straightforward. In a locked-down office or warehouse, any person in the space should be reported. So, as long as the robot can reliably detect a person, it can do its job. There can be challenges to identifying a person, but typically there are a number of clues (shape, motion, temperature, and others) that can be used with data fusion and pattern recognition to identify a person, even if they are trying to obscure their signatures. So sounding the alarm when there is someone in a place where no one should be is relatively easy.

But it becomes more challenging when some people are authorized and other people are not. Now it is no longer sufficient to simply detect a person; it is necessary to detect that the right person is in the right location at the right time. The first requirement is access to some form of database to determine who is authorized. The second is being able to detect and identify a specific person. One step may be using a machine-readable identification token (an ID card of some variety), but then it is also necessary to make sure the right person is holding the token. A human security officer does this easily: look at the photo, look at the face. For the robot it requires some form of facial identification, almost certainly using AI at the Edge. And even then, a change in glasses, a new beard, even bad lighting can create problems, so the human must be on tap to help out their robotic junior partner.
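
To illustrate the two checks, here is a sketch of badge-plus-face verification. The authorized_badges table and face embeddings are placeholders for a real credential database and an on-board face-recognition model, and the similarity threshold is invented; the key design choice is that an uncertain match escalates to the human supervisor instead of guessing.

```python
import numpy as np

# Stand-in for a credential database: badge id -> (name, enrolled face embedding).
authorized_badges = {
    "badge-0417": ("Chris Anderson", np.array([0.12, 0.85, 0.51])),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify_person(badge_id: str, live_embedding: np.ndarray, threshold: float = 0.8) -> str:
    """Check the token, then check that the face matches the token's owner."""
    record = authorized_badges.get(badge_id)
    if record is None:
        return "alarm: unknown or missing badge"
    name, enrolled = record
    score = cosine_similarity(live_embedding, enrolled)
    if score >= threshold:
        return f"authorized: {name}"
    # A new beard, bad lighting, or the wrong person holding the badge: let a human decide.
    return f"escalate: face does not match badge holder {name} (score {score:.2f})"

print(verify_person("badge-0417", np.array([0.10, 0.80, 0.55])))
```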

But calling for backup is more than just sending an alert. The person needs to quickly get up to speed on what is happening, and build their situational awareness.  

Situational Awareness and Reification

When a robot calls for help there is a moment when the human needs to assess the situation and build their own model of what is going on. Why did the robot call? What is going on? What is the problem? The need to quickly build situational awareness is a critical step whether the call for backup comes from a robot or from a human officer on the scene.

One of the powerful benefits of a security robot with AI on the Edge is that when control is passed to the human, the robot can use the semantic information to communicate what and why in ‘human’ terms. The robot can report:

“I have encountered a person who has no ID, and refused to stop. I am following them in the cafeteria.”

or

“There is a package that has been left in the lobby. I have monitored it for 15 minutes and no person is near it and no person has come by to claim it.”

Reports like this can quickly give the supervisor an accurate picture of what has happened and why the robot requested backup. Without the semantic information, the supervisor might need to spend critical time replaying a video feed, or reading a log file to get situational awareness.  And in the security business speed and accuracy are critical.   
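
A small sketch of how such reports could be generated: the SituationEvent fields and the wording templates below are invented for illustration, but the point is that the report is rendered directly from the robot's semantic model of the incident, not reconstructed by a person from video and log files.

```python
from dataclasses import dataclass

@dataclass
class SituationEvent:
    kind: str          # e.g. "unidentified_person" or "unattended_package"
    location: str      # semantic location name
    duration_min: int  # how long the robot has been tracking the situation
    action: str        # what the robot is currently doing about it

def situation_report(event: SituationEvent) -> str:
    """Render the semantic model of an incident as a short report for the supervisor."""
    if event.kind == "unidentified_person":
        return (f"I have encountered a person with no ID in the {event.location}; "
                f"they refused to stop and I have been {event.action} for "
                f"{event.duration_min} minutes.")
    if event.kind == "unattended_package":
        return (f"There is a package left in the {event.location}. I have monitored it for "
                f"{event.duration_min} minutes and no one has come to claim it.")
    return f"Unrecognized situation in the {event.location}; requesting operator review."

print(situation_report(SituationEvent("unattended_package", "lobby", 15, "monitoring")))
```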


[Figure: Modern control loop – smart sensors, complex spaces, and human agents all interacting.]

Wrapping it up

A robot can be a tireless worker, doing repetitive tasks over and over and freeing up people to do ‘people stuff.’ A robot can also be a remote extension of a person, allowing them to see through its eyes, hear through its ears, and act in their place. The first frees a person from doing dull tasks; the second allows a person to ‘be’ somewhere they are not. Neither of these applications really requires much intelligence in the robot.

But to be a true force multiplier, robots must carry their own intelligence along with them. They need to be smart enough to handle the complexities of the real world and share some of the burden with the humans. They need to be smart enough to respond appropriately to the routine matters of their jobs, and smart enough to know when to call in their boss – the person. Finally, they need to be smart enough to deliver a ‘human readable’ situation report that enables their supervisor to quickly gain situational awareness and respond appropriately themselves. With mobile security robotics it becomes possible for the team of robot and human to do more than either can do individually, and to provide significant value by incorporating AI on the Edge.


About the Author

Jim Gunderson is part of the team at GunderFish. He has been working on Artificial Intelligence and robotics for decades. With a PhD in Computer Science and a focus on AI, Dr. Gunderson has an ideal perspective on the technology and the impacts of the AI revolution. He is the author of numerous technical papers and co-authored the book “Robots, Reasoning, and Reification.” He is an invited speaker for plenary talks and international presentations on topics ranging from the social effects of AI to consciousness and robotics, including a joint TEDx presentation with Dr. Louise Gunderson that featured an intelligent mobile robot as a co-presenter.