AAAI-15 Conference – Day 3 – Daniel Lee James

My observations from the 3rd full day of sessions at the AAAI-15 artificial intelligence conference in Austin, Texas.

Intelligent Decisions

I missed most of this particular talk, so a bit of this is second-hand. Here’s what I’ve been able to gather from it.

Talk is mostly about what has been happening with IBM research, most of which seems to be going down the track of aggregating algorithmic solutions; a sort of overloaded shotgun approach.

Have developed a synapse chip that presents a significant performance improvement that can possibly be applied to neural simulation.

Cognitive systems development. In practical terms, the design support system is made for working in partnership. Part of goal is to be able to effectively answer questions that weren’t programmed into the system. General strategy is to train multiple classification algorithms at once, favor the one with the best upper bound, and have humans interpret the result for feedback. Cycle of processing: search => read => find => record => aggregate. User will not provide all of the data, but may relate how to extract more, which can lead to better interactive modeling.

Solution to another problem involved painting a scenario wherein you allow the system to “max out” a budget in order to find a solution to a problem.

Meta-learning: multiple learners together give you better results. They, in effect, learn which algorithm works best for a given input, without interpolation. Select winner by clustering, etc. More or less the Watson way, and the Watson Development Platform is available for use.
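To make the "learn which algorithm works best for a given input" idea concrete for myself, here's a minimal sketch. It is not IBM's implementation; the toy learners, the fixed centroids and every function name are my own assumptions. Inputs are assigned to the nearest cluster centroid, and each cluster remembers which learner was right most often on its training points.

```python
from collections import defaultdict

def nearest_centroid(x, centroids):
    """Return the index of the centroid closest to x (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centroids[i])))

def fit_selector(samples, labels, learners, centroids):
    """For each cluster, count how often each learner is right and keep the best one."""
    hits = defaultdict(lambda: defaultdict(int))
    for x, y in zip(samples, labels):
        cluster = nearest_centroid(x, centroids)
        for name, model in learners.items():
            hits[cluster][name] += int(model(x) == y)
    return {cluster: max(scores, key=scores.get) for cluster, scores in hits.items()}

def predict(x, learners, centroids, winners):
    """Route x to the learner that won the cluster x falls into."""
    return learners[winners[nearest_centroid(x, centroids)]](x)

# Hypothetical toy learners: each one is only reliable in part of the input space.
learners = {
    "threshold_low": lambda x: int(x[0] > 1.0),
    "threshold_high": lambda x: int(x[0] > 9.0),
}
centroids = [(1.0,), (9.0,)]          # assumed to come from some clustering step
samples = [(0.5,), (1.5,), (8.5,), (9.5,)]
labels = [0, 1, 0, 1]

winners = fit_selector(samples, labels, learners, centroids)
print(winners)
print(predict((8.5,), learners, centroids, winners))
```

A real system would obviously learn the clusters and use held-out data to pick winners; the point is just that the selection is per-region rather than a blend of all learners.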

meinoff@us.ibm.com

“Deployed: Robust System for Identifying Procurement Fraud”

Fraud risk in enterprises. $3.5 trillion in losses. 18 months to capture fraud on average by way of auditing, which captures only a small percentage. Average loss of $1 million in 20% of the cases. Clear that opportunities exist to improve.

Taxonomy of fraud events (created by IBM). Fraud by vendors vs fraud by employees, both often involving collusion; e.g. artificially inflating prices, bribes, lower quality product or service, falsifying performance, fake vendors, fictitious orders, etc. Example of a lady in India ordering USB sticks that she would end up getting for free; she had access to her boss’s email and would delete the relevant messages. Ended up ordering thousands of USB sticks for free and selling them. So a WIDE range of cases that need detection.

Example of conjoining events & observations which alone may not indicate anything; e.g. hear fight between husband and wife (happens all the time and not necessarily alert on its own), hear loud noise at night (ditto), see husband carrying bag out in the morning (ditto). Combined, however, is pretty suspicious.
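One way to see why the combination is suspicious when no single event is: treat each observation as independent evidence and combine likelihood ratios with Bayes' rule in odds form. This is my own illustration, not something shown in the talk, and the prior and likelihood ratios below are made-up numbers.

```python
import math

def posterior(prior, likelihood_ratios):
    """Combine independent observations via Bayes' rule in log-odds form.

    Each likelihood ratio is P(observation | fraud) / P(observation | no fraud).
    """
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))

prior = 0.001                 # assumed base rate of something being wrong
one_event = posterior(prior, [5.0])
three_events = posterior(prior, [5.0, 5.0, 5.0])
print(one_event, three_events)
```

With these invented numbers a single event leaves the probability under 1%, while three together push it above 10%, which is the "alone it's nothing, together it's worth a look" effect. The independence assumption is a big simplification.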

Here, multiple analytics techniques applied for a procurement fraud analytics tool:

data capture/prep => text analytics => anomalous event detection => social network analysis => importance weighting => unsupervised learning => investigation & alert triggers (scores and confidence levels) => supervised learning

Risk event groups: profile risk (problems with vendor profile that indicate closer look needed), perception risk (problems with perception of vendor), transactional risk (issues with transaction patterns and history), and collusion risk (problems with relationships between associated parties).

Vendor scoring includes things like registration with Dunn, roundness of dollar invoice, perception index, how well P.O.’s line up with invoices, etc.
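The "roundness" feature is easy to picture in code. This is purely my guess at what such a feature could look like (the talk gave no definition): count the trailing zeros of the whole-dollar amount, and treat anything with cents as not round at all.

```python
from decimal import Decimal

def roundness(amount):
    """Number of trailing zeros in a whole-dollar amount; 0 if there are cents."""
    value = Decimal(str(amount))
    if value != value.to_integral_value():
        return 0
    whole, zeros = int(value), 0
    while whole != 0 and whole % 10 == 0:
        whole //= 10
        zeros += 1
    return zeros

print(roundness(5000), roundness("4999.37"), roundness(12500))
```

An invoice for exactly 5000 scores higher than one for 4999.37; on its own that means little, but it is the kind of weak signal that gets weighted in with the others.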

Uses sequential probabilistic learning, an “online” learning algorithm, for evaluating collusion. Input weights and confidences => determine edge probabilities => assign edge weights => infer probability of collusion => output collusion confidence.
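I didn't catch the actual update equations, so the following is only a stand-in to show the shape of that flow. It uses a noisy-OR update (my assumption, not necessarily what the authors use): each piece of evidence on an edge arrives with a weight and a confidence, the edge probability is updated online, and the edge probabilities along a chain of relationships are multiplied to get a collusion confidence. All names are hypothetical.

```python
class Edge:
    """A relationship between two parties, updated one observation at a time."""

    def __init__(self):
        self.p_none = 1.0  # probability that none of the evidence so far is genuine

    def observe(self, weight, confidence):
        """Noisy-OR update: each observation independently supports the link."""
        self.p_none *= 1.0 - weight * confidence

    @property
    def probability(self):
        return 1.0 - self.p_none

def collusion_confidence(edges):
    """Probability that every link in a chain holds, assuming independent edges."""
    result = 1.0
    for edge in edges:
        result *= edge.probability
    return result

employee_vendor = Edge()
employee_vendor.observe(weight=0.5, confidence=0.8)   # e.g. a shared address
employee_vendor.observe(weight=0.5, confidence=1.0)   # e.g. repeated approvals
print(employee_vendor.probability)
```

The useful property for an "online" setting is that each new observation only touches one running number per edge, so nothing has to be recomputed from scratch.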

Shown to be better than the leading competition in solving more types of fraud problems. Seamlessly combines various models to effectively analyze procurement risks. Since last year it has been actively monitoring about $45 billion in spend and 65k vendors across the world on a daily basis. Still a work in progress, but currently is quite useful and accomplishing things that haven’t been done for 15 years.

Then showed demo on real data. Landing web page is a sort of dashboard of statistics. Risk analysis report shows word clouds for countries and types of issues, with various filter and NLP options, summary charts, score distribution, country to country risk assessment, etc.

“Emerging: Design and Experiment of a Collaborative Planning Service for NetCentric International Brigade Command”

German, French and US army. Cooperative planning. Optimization of collaborative operations. Information flows very fast at tactical levels. Using constrained optimization techniques, conducted experiments 2008-2012.

Joint planning today is at the division level and is very slow. Division decision level takes 6-8 hours, down to platoon or squad, which needs decisions like NOW. Situations involve enemy and friendly forces, of course. Execution involves phasing and coordination; who’s doing what at what time. Highly asynchronous and time sensitive. All sorts of variable aspects; mobility, size, firepower, effectiveness of troops, etc. Example of embassy taken by terrorists. Contingencies can cause unintended cross fire between friendly forces.

Created a planning service “ORTAC” that can be accessed by US/FR/GE command and control. CLP(FD) in SICStus Prolog; involves constraint graph, branch and bound algorithm, predicates and constrained predicates, etc. Navigation constraint model is used to calculate costs (on timing, security capacity, etc) for paths; more or less a routing optimization problem. Introduce a deconfliction model to minimize conflicts in the graph overall. Then apply (global) search algorithm with “probes”: metric computation => relaxed problem solving => variable ordering => branch and bound… Then tune the algorithm to consider temporal and spatial deconfliction.
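The real thing is constraint logic programming in Prolog, which I won't try to reproduce. As a reminder to myself of what branch and bound means in a routing setting, here's a toy Python sketch under my own assumptions: edges carry a cost and a travel time, costs are non-negative, and the only constraint is a total time budget. The graph and names are invented.

```python
def cheapest_path(graph, start, goal, max_time):
    """Depth-first branch and bound: minimise cost subject to a time budget.

    graph maps node -> list of (neighbour, cost, time); costs must be >= 0.
    """
    best = {"cost": float("inf"), "path": None}

    def branch(node, path, cost, time):
        # Bound: drop branches that break the constraint or cannot beat the incumbent.
        if time > max_time or cost >= best["cost"]:
            return
        if node == goal:
            best["cost"], best["path"] = cost, path
            return
        for neighbour, edge_cost, edge_time in graph.get(node, []):
            if neighbour not in path:  # keep paths simple (no revisits)
                branch(neighbour, path + [neighbour], cost + edge_cost, time + edge_time)

    branch(start, [start], 0, 0)
    return best["path"], best["cost"]

graph = {
    "A": [("B", 1, 5), ("C", 4, 1)],
    "B": [("D", 1, 5)],
    "C": [("D", 2, 1)],
}
print(cheapest_path(graph, "A", "D", max_time=20))
print(cheapest_path(graph, "A", "D", max_time=5))
```

With a generous time budget the cheap-but-slow route wins; tighten the budget and the search is forced onto the faster, costlier one. Deconfliction between several units would add constraints across paths, which this sketch doesn't attempt.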

Many participants in the experiment, government and corporate; conflict and threat warning, plan computation and time/space deconfliction/repair. Can propose alternative “better” plans. Robots and humans alike considered in deployment analysis. The paper includes a lot of military acronyms; author says you can contact him if you have any questions. The experiment was successful enough to prove feasibility of a system like this, so research and development is continuing.

Questions raised about effectiveness with broken communications on the field… There’s a political barrier that complicates it; each country wants to control its own resources, etc. It is something that would involve cooperation at other levels than just the tech.

“Deployed: Activity Planning for a Lunar Orbital Mission”

NASA Ames research. Problems/contributions: LADEE activity scheduling system (LASS) for activity planning to meet deadlines, using AI to help formulate activity planning modeling and processing to manage orbit; does involve issuing various live commands to equipment on the spacecraft, and can be applied to “snap-to-orbit”.

Mission objectives: examine lunar atmosphere, sample dust. “Sunrise Terminator” is a position important for collecting data, among other positions with equally interesting names. Predictions of crossing times important and affect overall planning of spacecraft trajectory. Variable aspects that had to be taken into consideration in live planning and update to planning included things like purpose of observation, when science activities could or could not occur, multiple concurrent plans that need to be coordinated, etc. Instrument activity plans have to be coordinated with strategic, tactical, etc activity plans.

Numerous fairly involved, detailed slides of the flow and other aspects shown; will have to reference the paper/presentation.

Showed a number of JPL space projects this has been successfully used in. Dynamic Europa: automated constraint reasoning system. SPIFe: client interface. Activity Dictionary: encoded activities. Template Library: partial pre-defined plans.

Didn’t really go much into the actual algorithms or what was novel in terms of AI. It kinda came across as fairly routine software engineering, and a little bit of a stretch calling it AI.

“Asking for Help Using Inverse Semantics”

IkeaBot (yes, for building Ikea furniture 😉 )

While testing, they noticed the bot was flailing. Solution was to shove it twice. Not really a good solution. So… that made them think about how to get the robot to ask for help when it needs it. How does the robot determine what it wants the person to do? And how does it know how to communicate it?

Looking at prior NLP work… tried to find crossover solution for two main types of solutions.

STRIPS-style symbolic planner for assembling furniture, includes pre and post conditions. Can hardcode mappings from failures to intervention methods, but that’s a fragile solution.
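For anyone who hasn't seen STRIPS-style actions, here's a minimal sketch of the idea: an action has preconditions, facts it adds and facts it deletes. The action and fact names are my own inventions, not the IkeaBot's actual domain. The interesting part for this talk is that a failed precondition tells you what is missing, which is the starting point for asking a human for help.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset
    add_effects: frozenset
    delete_effects: frozenset

    def applicable(self, state):
        """An action can run only if all its preconditions hold in the state."""
        return self.preconditions <= state

    def apply(self, state):
        """Return the successor state, or report which preconditions are missing."""
        if not self.applicable(state):
            missing = self.preconditions - state
            raise ValueError(f"{self.name} blocked; missing {sorted(missing)}")
        return (state - self.delete_effects) | self.add_effects

attach_leg = Action(
    name="attach_leg",
    preconditions=frozenset({"holding_leg", "table_top_flipped"}),
    add_effects=frozenset({"leg_attached"}),
    delete_effects=frozenset({"holding_leg"}),
)

state = frozenset({"table_top_flipped"})
print(attach_leg.applicable(state))
print(sorted(attach_leg.apply(state | {"holding_leg"})))
```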

Introduce inverse semantics… involving forward semantics on “groundings”, probabilistic determination of the surrounding objects. Generation of semantics starting with looking at a context free grammar to describe the language used to talk about the groundings/objects. E.g. “hand me the (most likely the) white leg (that I need)”. Suggests that the formula presented (a product over a sum*product) for inverse language understanding *is* language understanding (seems a little stretchy).
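My reading of the "product over a sum of products" is roughly: score a candidate request by how likely a listener is to pick the intended object, normalised over all the objects they might pick instead. Here's a toy sketch of that reading. The 0.9/0.1 match probabilities, the object attributes and the function names are all assumptions of mine, and the real model works over a grammar and learned groundings rather than word matching.

```python
def listener(words, objects):
    """Forward-semantics stand-in: p(object | words) from per-word attribute matches."""
    scores = {}
    for name, attributes in objects.items():
        score = 1.0
        for word in words:
            score *= 0.9 if word in attributes else 0.1   # product over words
        scores[name] = score
    total = sum(scores.values())                          # sum over candidate objects
    return {name: score / total for name, score in scores.items()}

def best_request(target, objects, candidates):
    """Inverse semantics: pick the wording most likely to be understood as target."""
    return max(candidates, key=lambda words: listener(words, objects)[target])

objects = {
    "white_leg": {"leg", "white"},
    "black_leg": {"leg", "black"},
    "white_top": {"top", "white"},
}
candidates = [("leg",), ("white",), ("white", "leg")]
print(best_request("white_leg", objects, candidates))
```

"Leg" alone is ambiguous between the two legs and "white" alone is ambiguous between the leg and the top, so the longer request wins. That's the behaviour the talk was after: say enough to disambiguate given what is actually in the scene.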

Tested, of course, by introducing complications for the IkeaBot. Initial success rate was relatively low (20%-50%). Human written got high rate (as a control?). Inverse semantics approach reached about 64%.

Collaborative planner infers human actions needed to help robot, and generates natural language, and mathematical framework unifies language understanding and generation.

“Learning Articulated Motions from Visual Demonstration”

Motivation: want robot to understand the underlying kinematics of objects in a household environment. Doors, cabinets, trash cans, faucets, knobs, etc. How does the bot learn these things? An option is to mark objects in the environment, but this obviously doesn’t generalize beyond specialized environments. Would rather them learn in an unstructured environment with humans roaming about.

RGB-D as input only. Trajectory construction: extract features and motion over time. Trajectory clustering: collect into object parts. Pose estimation: can be noisy, but observe and predict possible movement for the parts. Articulation learning: learns movement. Object model persistence: remember what has been learned in a way that can be used in other environments. Predicting object motion.

Qualitative results: train in one room with a set of objects. Then take into another environment with similar objects. Compared to state-of-the-art, this solution provides more robustness and better fidelity and accuracy of motion. Over 43 different test environments, it was successful over 2/3 of the time.

Recent work: learning done with a human co-operator, which seems to be pretty effective in increasing smoothness and accuracy by way of observation of aided movement.

“Tell Me Dave: Context-Sensitive Grounding of Natural Language to Manipulation Instructions”

Common scenario is to give a grounding command to a robot, and it has to interpret it correctly and do something intelligent. Problem with grounding is converting a command into an action plan/sequence of actions.

Example. Making sweet tea => “Heat up a cup of water, add tea bag. Mix well.”. So in the environment there’s a table, a cup, a microwave oven, a stove, a sink, etc. Many instructions presumed and missing from the action description. Command also may not be in sync with the environment (might not be a cup, but *might* be a glass or something “close enough”). May be a multitude of ways to perform the task (e.g. multiple vessels, a microwave or a stove?). So the grounding is subject to all of these circumstances.

Common approach is to define action templates. However, fragile and typically doesn’t handle ambiguity or many to many situations well. E.g. “heat cup of water” vs “heat cup of milk”.

Another approach is to create a search path (a graph/tree). However, search space can become exponential.

Proposed here is a learning model (CRF with latent nodes). Clauses. E.g. “get me a cup” can be scoped in the tree with feedback about what’s available in the environment. Model solved using VEIL-template, which appears to be a function of the clause + environment + instruction sequence template + original object mapping.

Created an online gaming environment (crowdsourced testing). Able to collect 500 real templates by recording movements made by online users (I think). Those templates went into training the bot. Results show that robot is able to “fill in the gaps”, and shows 3x improvement over other solutions. Video showed a bot cooking and serving a meal, which was kinda cool.

“Learning to Locate from Demonstrated Searches”

Elderly care scenario. Want robot to find grandma’s glasses. Want this to happen for any grandma.

To make this work, would like this to be applicable in an optimal way in a novel environment. Introduce notion of “priors”, which are, e.g., beliefs about the likelihood of an object being in a particular place.

Given a location, then target probability distribution, then figure out an optimal search plan. For each location we have some features and determine a log probability. All we get to observe are “pasts” prior to our current situation. Scores end up being time-optimal search stories. Learning algorithm is iterative, starting with some weights. Make adjustments with feedback. Expected time to locate the object is part of what’s fed into the inference engine. “bag of words” technique used to capture/represent features.

Naive approach equations shown in contrast to a better approach that involved “transposing sums”. As well… “admissible” heuristics are derived from “relaxations”. Introducing these sorts of heuristics shows good results as the complexity of the scenario scales up, by a factor of 10. Basically these changes shift the search path to try the “most probable” locations first. While not purely optimal, results showed what a human would perceive to be near optimal (and I guess not totally dumb; e.g. robot: “glasses? I’ll check the fridge!”).
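To convince myself why "most probable first" is the right instinct, here's a small sketch of expected time to locate. It assumes a much simpler setting than the talk: the prior probabilities sum to 1, searching a location always finds the object if it is there, and travel between locations is ignored. Under those assumptions, ordering by probability per unit of search time minimises the expected time. The locations and numbers are invented.

```python
def expected_time(order, prob, cost):
    """Expected time until the object is found when searching locations in order."""
    elapsed, total = 0.0, 0.0
    for location in order:
        elapsed += cost[location]
        total += prob[location] * elapsed
    return total

def greedy_order(prob, cost):
    """Search locations by decreasing probability per unit of search time."""
    return sorted(prob, key=lambda location: prob[location] / cost[location], reverse=True)

prob = {"nightstand": 0.5, "kitchen_table": 0.3, "sofa": 0.15, "fridge": 0.05}
cost = {"nightstand": 2, "kitchen_table": 1, "sofa": 3, "fridge": 2}

order = greedy_order(prob, cost)
print(order, expected_time(order, prob, cost))
```

Note the kitchen table goes first even though the nightstand is more likely, because it is quicker to check. Once travel time between locations matters, the problem gets much harder, which is presumably where the heuristics in the talk come in.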

“Fully Decentralized Task Swaps with Optimized Local Searching”

Multi-robot task allocation. Which bot should do which task? Applicable to many scenarios; e.g. warehousing, RoboCup soccer, etc. Generate a task cost matrix. Translate into assignment matrix.

Background for task swapping… minimize total travel distance. Start with initial assignment. Update by swapping if not optimal. Use duality to optimize the solution. Use only local and single-hop communication (instead of global). Idea is to decompose the global solution into localized bits.
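Here's a bare-bones sketch of the swapping part only, under my own assumptions: a cost matrix indexed as cost[robot][task], and every pair of robots allowed to compare notes (a truly decentralized version would restrict this to single-hop neighbours). It leaves out the duality-based machinery from the talk entirely, so plain pairwise swaps like this can get stuck in a local optimum on other inputs.

```python
def total_cost(assignment, cost):
    """Sum of cost[robot][task] over the current assignment."""
    return sum(cost[robot][task] for robot, task in enumerate(assignment))

def swap_until_stable(assignment, cost):
    """Swap tasks between two robots whenever that lowers their combined cost."""
    assignment = list(assignment)
    improved = True
    while improved:
        improved = False
        for i in range(len(assignment)):
            for j in range(i + 1, len(assignment)):
                task_i, task_j = assignment[i], assignment[j]
                if cost[i][task_j] + cost[j][task_i] < cost[i][task_i] + cost[j][task_j]:
                    assignment[i], assignment[j] = task_j, task_i
                    improved = True
    return assignment

cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
initial = [0, 1, 2]                      # robot k starts with task k
final = swap_until_stable(initial, cost)
print(final, total_cost(initial, cost), total_cost(final, cost))
```

Each swap only needs the four numbers the two robots already know, which is what makes the local, communication-light version attractive.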

“Toward Mobile Robots Reasoning Like Humans”

Robot teammates with humans. Semantic navigation, search, observe, manipulate with autonomy, natural communication and explanations of “why”. A bit of related work cited that applies to just about everything described later. Work is centered around perception. Example: “stay to the left of the building and navigate to a barrel behind the building”. As humans, we can immediately upon seeing the building imagine a barrel behind it.

This approach mimics that behavior. Semantic classifiers applied to 2D visual image, then use that to generate 3D label points, given the resulting “plane cloud”, we predict a “building”. Based on the predicted building, the bot “hypothesizes” a barrel behind it. Robot can then generate path with pref to left side of building, per the command. Adjusts and corrects prediction as it moves. Introduced architecture (lot on the slide), includes world modeling, navigation, path planning, search actions, NLP, etc.

Semantic classification tries to label regions with a pre-defined set of labels. From there, cluster 3D objects, applying labels tempered by Bayesian probabilities and field of vision limitations.
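The "tempered by Bayesian probabilities" bit presumably means something like a recursive Bayes update on each object's label as more frames come in. A minimal sketch of that, with invented labels and likelihoods (the talk's actual classifier outputs and field-of-view handling are not something I captured):

```python
def update(belief, likelihood):
    """One Bayes step: posterior(label) is proportional to prior(label) * p(obs | label)."""
    unnormalised = {label: belief[label] * likelihood[label] for label in belief}
    total = sum(unnormalised.values())
    return {label: value / total for label, value in unnormalised.items()}

# Start with no opinion, then fold in two classifier readings of the same cluster.
belief = {"building": 1 / 3, "barrel": 1 / 3, "ground": 1 / 3}
reading = {"building": 0.6, "barrel": 0.3, "ground": 0.1}   # assumed classifier output

for _ in range(2):
    belief = update(belief, reading)
print(belief)
```

Two moderately confident readings already push "building" to roughly 78%. A cluster that drops out of the field of view would simply skip the update and keep its last belief.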

Imitation learning is used to teach spatial relationships and models of environment. Show the robot what it means to navigate in different environments. Robot extrapolates and weighs features for future use in live environment. Hypothesizing the unseen environment involves summations; equations shown, along with live environment examples, along with a crapload of small font sample commands, etc. During tests… 35 of 46 tests were successful. Some tests with bad commands excluded. Videos shown of real robot and interpretation of what the bot sees… going around the building, ignoring a fire hydrant, ignoring a barrel that’s not behind the building, and navigating to the barrel behind the building. All from a real robot over gravel and bumpy terrain.

“Learning to Manipulate Unknown Objects in Clutter by Reinforcement”

Autonomous robot for rubble removal. Major challenge for search and rescue. Previous work achieved 94% success on regular objects, 74% with irregular objects. Added strategy of pushing objects. Wanted to be able to get the robot to learn without any human intervention. Involves a lot of random interactions; a lot of trial and error. All by trying and observing success. Video shown of it in action, touch, touch, move box, move cylinder, push cylinder, pick and move box, pick up and move cylinder.

Unreadable slide of overview of the integrated system (thin, small yellow font on blue rectangles), and MC Botman in the house overloading the microphone… will have to see the paper. Something about clustering and segmentation of surfaces into objects, then breakdown into “facets”. Looks like this led to the bot translating visual image into conceptual mapping of disparate non-uniform objects. Touch and push appears to play into extracting features of the object, though it doesn’t look like tactile sensor based; rather success or failure of grasping. Functions shown for action evaluation, and yes it does appear to be a matter of iterations evaluating push-grasp feedback. Bandwidth selection equation shown that makes use of a “Bellman error” to adjust predicted. Plus makes use of reinforcement learning.
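Since I couldn't read the slide, here's the textbook version of a Bellman error as a placeholder: plain tabular Q-learning, which is almost certainly simpler than what the paper does (no kernels, no bandwidth selection). State and action names are invented.

```python
def bellman_error(q, state, action, reward, next_state, actions, gamma=0.9):
    """Gap between the one-step lookahead target and the current estimate."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    return reward + gamma * best_next - q.get((state, action), 0.0)

def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Move the estimate a fraction alpha of the way toward the target."""
    delta = bellman_error(q, state, action, reward, next_state, actions, gamma)
    q[(state, action)] = q.get((state, action), 0.0) + alpha * delta
    return delta

actions = ["push", "grasp"]
q = {}
first = q_update(q, "cluttered", "push", 1.0, "clear", actions)
second = q_update(q, "cluttered", "push", 1.0, "clear", actions)
print(first, second, q[("cluttered", "push")])
```

The error shrinks as the estimate catches up with the observed outcomes, which is the feedback signal the push/grasp trial-and-error loop would be driven by.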

“Learning and Grounding Haptic Affordances Using Demonstration and Human-Guided Exploration”

Humanoid bot, Curi. Learning from demonstration. Human guided exploration. Involves: action demo => human guided exploration => affordance model.

Video shown. First assist Curi by physically hand-holding the action. Then bot tried on own a few times, but with a little bit of corrective assistance. Then create affordance model; 2 Markov models… successful trajectories vs failures.

10 successful, 10 failed cases. Feed into offline classification. Precision is generally higher than recall. Skills with the best performance have distinct and continuous haptic signals. Online testing involved variations, and was successful 6 out of 7 times. Curi detects shake top affordance, pour top affordance…
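The two-model classification idea can be sketched with plain first-order Markov chains over discretised force readings. I'm assuming this as a stand-in (the talk's models are likely richer, e.g. with hidden states), and the "L"/"H" low/high force symbols and training sequences are made up. Train one chain on successes, one on failures, and label a new trajectory by whichever chain gives it the higher likelihood.

```python
import math

def fit_chain(sequences, symbols, smoothing=1.0):
    """Estimate transition probabilities with add-one style smoothing."""
    counts = {a: {b: smoothing for b in symbols} for a in symbols}
    for sequence in sequences:
        for a, b in zip(sequence, sequence[1:]):
            counts[a][b] += 1
    return {a: {b: c / sum(row.values()) for b, c in row.items()}
            for a, row in counts.items()}

def log_likelihood(chain, sequence):
    """Log probability of the transitions in sequence under the chain."""
    return sum(math.log(chain[a][b]) for a, b in zip(sequence, sequence[1:]))

def classify(sequence, success_chain, failure_chain):
    """Label by whichever model explains the sequence better."""
    if log_likelihood(success_chain, sequence) > log_likelihood(failure_chain, sequence):
        return "success"
    return "failure"

symbols = "LH"
success_chain = fit_chain(["LLHHLL", "LHHHLL"], symbols)
failure_chain = fit_chain(["LLLLLL", "LLLLLH"], symbols)

print(classify("LLHHL", success_chain, failure_chain))
print(classify("LLLLL", success_chain, failure_chain))
```

This also makes the "distinct and continuous haptic signals" remark plausible: if successes and failures produce similar force patterns, the two models give similar likelihoods and the classifier has nothing to go on.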

Note that no visual information is actually recorded, just trajectory information… (hrm…)

(note to self: Curi’s approaching the uncanny valley (observation from video). And ignoring visual info (going blind basically) seems pretty limiting and difficult to generalize from).

“Apprenticeship Scheduling for Human-Robot Teams in Manufacturing”

On verge of a revolution in manufacturing in that we will see more robots and people working together. Early or lateness, broken parts or tools introduce complications. How do we do this efficiently? How do we allocate authority over workflow? How do we handle implications of adoption?

Task allocation, ordering of tasks, timing of tasks (duration), balance, deadlines, agent capabilities, etc… Tercio introduced. Uses heuristic techniques among others.

Looking at human acceptance of this, tested on two humans, two robots. Fetching tasks. Assembly tasks. Only the humans can build (legos). Team efficiency was better if the bot had more control over the flow. Plus the humans liked handing over some of that control to the bots. Further examination of that shows that people preferred robots over human control. So further questions include how to have robots coordinate teamwork.

(Note to self: the implications of where this heads are scary and seem wrong. Egads. Regardless, figuring out the psychology of this is something worth more study.)

“Following a Target Whose Behavior Is Predictable”

What makes a good robot videographer? E.g. motorcycle jump video, sports, political events, etc. Line of sight, viewpoint. Good camera settings and beauty too, but focus on the former.

Video shown of underwater bot following a yellow target.

Robot needs to retain a belief about target, anticipate actions, search for target, consider dynamics of environment. Range from fully cooperative to fully adversarial targets.

Modeling the target… compute cost to go, giving pref to single path. Incorporate speed of convergence to the goal, which depends somewhat on how rationally the target behaves, and include perceived rationality of the target into the equation. Particle filter applied. Cast as a finite-horizon POMDP problem. Robot has limited time to compute, so use Monte Carlo tree search. Each node maintains times visited and expected reward.
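For reference, this is roughly what "each node maintains times visited and expected reward" looks like in a Monte Carlo tree search. The UCB1 selection rule and the exploration constant are my assumptions (it's the usual choice, but the talk didn't specify), and the action names are invented.

```python
import math

class Node:
    """A search-tree node tracking visit count and mean reward."""

    def __init__(self):
        self.visits = 0
        self.value = 0.0     # running mean of rewards backed up through this node
        self.children = {}   # action -> Node

    def update(self, reward):
        """Back up one simulation result into the running mean."""
        self.visits += 1
        self.value += (reward - self.value) / self.visits

    def select(self, c=1.4):
        """Pick the child action with the highest UCB1 score."""
        def ucb(child):
            if child.visits == 0:
                return float("inf")   # always try unvisited actions first
            return child.value + c * math.sqrt(math.log(self.visits) / child.visits)
        return max(self.children, key=lambda action: ucb(self.children[action]))

root = Node()
root.children = {"left": Node(), "right": Node()}
for action, reward in [("left", 1.0), ("left", 1.0), ("right", 0.0)]:
    root.children[action].update(reward)
    root.update(reward)
print(root.select(c=0.0))
```

Because the search can be stopped after any number of simulations and still return its current best action, it suits the "robot has limited time to compute" constraint.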

Animation shown of agent pursuing target in a maze. Seems to still be able to follow even when the target is out of the line of sight / field of vision. Still work to be done, because apparently there are some performance issues, etc. Not really scalable.

“Multi-Agent Rendezvous”

E.g. automated taxis meeting to load balance passengers, or an underwater vehicle near a surface vehicle. What’s the best strategy for solving the task? Goal: minimize time and resources for rendezvous. How much prior knowledge and communication is there? What if it’s an unknown environment? Compare to how humans do it.

Bots discover their environment as they go, eventually encounter each other. No prior knowledge or communication. Introduce cost-reward model and “distinctiveness”, which represents expensive choices.