Why Would Google Want To Be Your Babysitter?

Google wants to be your babysitter?  That’s a pretty interesting tag line, and one that Protocol has offered us.  According to their piece, parent Alphabet has filed a patent in the space.  “Not that many of us are leaving home much, but in that distant future when the world returns to normal, Google wants to be in charge of looking after your kids.”  While I doubt that Google engineers are swarming over babysitting robot prototypes, I do think that they’re exploring what could be done with what I’ve been calling “contextual services”.  I’ve talked about them mostly in the context of the wide-area network, as part of carrier cloud, but they could have applications within a workplace or home.

The basic theory of contextual services is that in order for a form of artificial intelligence to work appropriately in a real-world scenario, it needs to know what the real world is.  Actions are judged in context, meaning what’s happening around you in that real world.  Turning on a light if it’s dark is smart, but it wastes money if it’s already bright daylight.  A robot cleaner should stop short of hitting a wall, but it has to know where the wall is, where it is, and what “short” means, in order to do that.

We already use machines to monitor us.  In most hospitals, patients are hooked up to monitors that constantly check critical health parameters and signal if something is wrong.  There are some situations where a response could be automatic; a pacemaker implant responds to EKG conditions directly, rather than signaling a nurse to run out and find the patient.  So, it’s not outlandish to think that we could employ some sort of contextual toolkit to watch our families, pets, etc.

Not outlandish, but maybe outlandishly complicated.  I’ve fiddled with notions of contextual services for about a decade now, and with some robotic concepts for over 30 years.  In the process, I’ve uncovered some issues, and some principles too.  They suggest Google might be on to something.

My instinctive approach to something like an automated home was a central control intelligence.  A supergadget sits somewhere, wires and wireless connecting it to all manner of sensors and controlled devices/systems (vacuums, heaters, etc.).  This works pretty well for what I’ll call “condition-reactive” systems like heaters, because you can set simple policies and apply them easily; the conditions are few and fairly regular.  It’s cold?  Turn up the heat.  It doesn’t work as well for what I’ll call “interactive” systems, which means things that have to relate to humans, pets, or other things that are not really under central control at all.
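A condition-reactive policy really is as simple as it sounds.  Here’s a minimal sketch of the thermostat case; the function name, thresholds, and action strings are all hypothetical, just to show the shape of the logic.

```python
# A minimal sketch of a condition-reactive policy: compare a sensor
# reading against simple thresholds and return a control action.
# Setpoint, deadband, and action names are illustrative only.

def thermostat_policy(temp_f, setpoint=68.0, deadband=2.0):
    """Return a heating action based on the current temperature."""
    if temp_f < setpoint - deadband:
        return "heat_on"      # it's cold: turn up the heat
    if temp_f > setpoint + deadband:
        return "heat_off"     # warm enough: stop heating
    return "hold"             # within the deadband: do nothing
```

The deadband is there for the “fairly regular” part: without it, the system would chatter on and off around the setpoint.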

In order for an automated home to handle interactive systems, it has to be able to understand where all those uncontrolled things are, and from that and a timeline, infer things like their motion.  It also has to have a set of policies that define abnormal relationships between the uncontrolled elements and either other elements or controlled/controllable things.  No, mister robot vacuum, don’t sweep up Junior, and Junior, don’t keep following the vacuum or jumping in front of it.

The more complicated the controllable things are, the more useful an automated thing can be.  If all you can do is turn lights on and off, or set the heat up and down, you’re not getting much automation.  If you can unlock doors, change room lighting based on occupancy, warn people about risky or unauthorized movement, and even perhaps step in by closing a gate, then you’ve got something a lot more valuable.  The problem is that you also have a lot of complex stuff for your central supergadget to handle.

My view has evolved to the point where I tend to think of automated facilities in terms of hierarchies of autonomous systems.  A home robot, for example, might have a controller responsible for moving its wheels and controlling their speed, and another to sense surroundings via optical interpretation, motion/proximity, etc.  A third system might then apply policies to what the sense-surroundings system reported, and send commands to the motion-control system.  This approach leads to easier coding of the individual elements, and guarantees that no task starves the others for execution time.  If one system can’t respond to a condition, it reports it up the line.  I hit something and there shouldn’t be anything there to hit!  What do I do now?
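The escalation rule in that hierarchy can be sketched in a few lines.  In this hypothetical example, a low-level controller handles the events it has policies for and kicks everything else up to its parent; the class, event, and action names are mine, not from any real product.

```python
# Sketch of hierarchical autonomous controllers: each level handles
# what its own policies cover, and escalates anything else upward.

class Controller:
    def __init__(self, name, policies, parent=None):
        self.name = name
        self.policies = policies   # maps event -> action
        self.parent = parent

    def handle(self, event):
        action = self.policies.get(event)
        if action is not None:
            return (self.name, action)
        if self.parent is not None:
            return self.parent.handle(event)   # kick it upstairs
        return (self.name, "alert_human")      # top of the hierarchy

supervisor = Controller("supervisor", {"unexpected_obstacle": "replan_route"})
motion = Controller("motion", {"wall_ahead": "stop_and_turn"}, parent=supervisor)
```

Here the motion controller deals with an expected wall on its own, but “I hit something and there shouldn’t be anything there to hit” is not in its policy table, so it lands on the supervisor, and anything the supervisor can’t handle ends with a human being alerted.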

We can apply this to automating a home.  A robot vacuum has an internal control system that’s responsible for having it cover a space without getting trapped, hitting things, or missing spots.  That system does its thing, blind to the rest of the home, until it encounters something that it has no policies to handle.  Say, Junior keeps jumping in front of it.  Our robot vacuum might see this as an obstacle that doesn’t stay in place, and so can’t be avoided through normal means.  When this happens, it kicks the problem upstairs.

There are also situations where “upstairs” kicks a condition downward.  Suppose there’s a power failure.  Yes, the process of going into power-fail mode would normally involve many predictable steps; emergency lighting is an example.  If we assume our supergadget has a UPS, we could have it signal a power-fail condition downstream to other backup-power control points.  If there is a human or pet in a room, set emergency power on there.  That rule could be applied by a lower-level controller.  For our robot vac, it might use battery backup to move into a corner so somebody doesn’t trip over it, and of course we’d likely alert a human agent about the new condition.
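The downward direction is just a broadcast plus local policy.  A sketch of the occupancy rule above, with hypothetical room-state and action names:

```python
# Sketch of "upstairs kicks a condition downward": the central point
# broadcasts a power-fail event, and each local control point applies
# its own policy.  State keys and action names are illustrative.

def power_fail(rooms):
    """rooms maps a room name to its state dict; returns per-room actions."""
    actions = {}
    for room, state in rooms.items():
        if state.get("occupied"):
            actions[room] = "emergency_power_on"
        else:
            actions[room] = "stay_dark"
    return actions
```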

This can be visualized at a high level as a kind of finite-state problem.  There is a normal state to the home, in terms of lighting conditions, power, temperature of the rooms and refrigerators and other stuff, detection of smoke or odors, etc.  Everything in every room has a place, which can be static (furniture) or dynamic (Junior, Spot, or the robot vacuum).  Dynamic stuff has a kind of position-space, where “green” positions are good, “yellow” positions are warnings, and “red” means intervention is necessary.  Same with the state of things; body temperature, heart and breathing rate, and even time since a last meal or drink could be mapped into the three-zone condition set.
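Mapping any measured value into that three-zone condition set is the same small function everywhere; only the band boundaries change.  A sketch, with hypothetical bands for the body-temperature example:

```python
# Sketch of the three-zone mapping: a value inside the green band is
# fine, inside the yellow band is a warning, and anything outside
# yellow is red (intervention needed).  Band values are illustrative.

def zone(value, green, yellow):
    """green and yellow are (low, high) bands; yellow should contain green."""
    lo, hi = green
    if lo <= value <= hi:
        return "green"
    lo, hi = yellow
    if lo <= value <= hi:
        return "yellow"
    return "red"

# Example: body temperature in degrees Fahrenheit
# zone(98.6, (97.0, 99.5), (96.0, 101.0)) -> "green"
```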

It’s interesting to me that the contextualization needed for home automation and Google’s hypothetical babysitting goal matches so closely with the state-mapping needs of lifecycle automation of operations processes.  It shouldn’t be a surprise, because the underlying approaches are actually very similar.  What makes it interesting, therefore, is less its novelty than the opportunity that might be represented.  A good state/event-model system could be used to babysit, control warehouses and movement of goods, automate operations processes, drive vehicles, and so forth.  There is, in short, an opportunity for an underlying platform.

Could we use TOSCA or SDL to describe an arbitrary state/event system?  At least at some level, I think we could.  The open questions with both are how a hierarchy represented in our model can be related to a map of the real world we have to keep in mind, and how events are represented and associated with processes.  I’d love to see some useful work done in these areas, and I think the reward for something insightful could be very substantial.
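Whatever the description language, the runtime underneath boils down to a transition table.  Here’s a sketch of the core of such a state/event system; all the state, event, and process names are made up for illustration, and a TOSCA or SDL description would, in effect, be a way of generating this table.

```python
# Sketch of a generic state/event model: a transition table maps
# (state, event) to (process, next_state).  Entries are illustrative.

TRANSITIONS = {
    ("normal",   "power_fail"):     ("enter_backup_mode", "degraded"),
    ("degraded", "power_restored"): ("resume_normal",     "normal"),
    ("normal",   "red_zone"):       ("alert_human",       "intervening"),
}

def dispatch(state, event):
    """Return the process to run and the next state for this event."""
    process, next_state = TRANSITIONS.get((state, event),
                                          ("log_unhandled", state))
    return process, next_state
```

An unrecognized (state, event) pair is logged and leaves the state unchanged, which is one reasonable default; a real system would want an explicit policy for that case too.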