Abstract
Humanoid robots are increasingly being integrated into human spaces to facilitate independent functional living. Teaching robots the skills to navigate human spaces, which are characteristically non-stationary, is extremely challenging. Personal robot assistants are personal in the sense that the robot is expected to learn intuitively from its human partner as the more knowledgeable other. While Learning from Demonstration (LfD) has been shown to be a viable way to teach robots intuitively from human action skills, current work has focused on specific areas of concern under controlled conditions that are rarely applicable or extensible to the real world. The following problems are not fully addressed. The low-level definitions of the action skills are generally learned with little regard to the context in which the motion is to be applied. Where such context exists, its distribution does not allow reaction in real time, and the context grows stale and irrelevant over time; the training takes too long and requires expert skills that are not readily available in the end-user environment. The complexity of defining and executing the task demands an ensemble of techniques to tackle the multi-objective problems of learning in human spaces. In this work, LfD is extended to employ scaffolding and co-evolution to achieve life-long learning. A general architecture and context-aware middleware are provided to address the ontological relationships between tasks, sub-tasks, objects, and their corresponding affordances in a consistent manner. Further, the middleware addresses the communication challenges in distributing knowledge and inference by leveraging low-latency communication protocols. This thesis presents a brief overview of the state-of-the-art methods for tackling the challenges in implementing a scaffolded framework for teaching humanoid robots from human action skills.
This captures the breadth and scope of the problem of decomposing human demonstrations and encoding them so that the robot can learn collision-free trajectories. The overall aim is to intuitively instil situational adaptation and life-long learning in robots operating in stochastic human-centric spaces. The learning process is purely statistical and involves finding patterns in the observations and stitching together a sequence representing the observed behavioural cues. The robot is expected to learn object affordances and to solve the correspondence issues arising from differences in form factor and embodiment. This provides the basis and direction for the rest of the thesis. Using the framework developed, both low-level and high-level context information is captured as a series of related knowledge graphs. The domain over which multi-stage tasks are defined, together with their context, represents a distribution of priors encoded as knowledge hypergraphs in a symbiotic ecosystem deliberately modelled on protocooperative mutualism. Using a form of digital commons, the robots use the state-space knowledge and constraints to limit unguided exploration and to reduce the dimensionality of the search space by exploiting the system's long-term memory. A robot that knows a basic set of ‘truths’ about itself and its immediate environment can be immediately useful with little or no prior training. Under the model of cooperative mutualism in informationally structured environments, members of this symbiotic ecosystem contribute to and benefit from the shared domain knowledge and continually coevolve to perform more complex tasks. Several methods of reasonably disambiguating the multi-sourced knowledge and inferring quasi-optimal decision knowledge are leveraged.
Overall, this thesis presents the architecture and the ontological structures to encourage scaffolded, life-long, pseudo-inductive learning in humanoid robots learning from human action skills. The concepts, system architecture and justification of choices are presented. Several sources of uncertainty exist in teaching humanoid robots from action skills: noisy data from the sensors used to capture observations, variations in the trajectories used for the same demonstration, decisions on the size of the observation window for continuous or expressive actions, misclassification of models, and more. These may also include situations where the human expert can confirm a learned task but is incapable of fully demonstrating it due to physical limitations such as disability. The robots are modelled as facultative mutualists involved in a protocooperative relationship projected onto a hypergraph whose hyperedges are continuously refined and evolved over the lifetime of the learners. The ranking of the hyperedges may be based on spatial separation or provided by an authoritative reference, such as the human expert, to allow disambiguation. Each member of the ecosystem is able to capture a degree of truth about its domain and contributes to a shared digital commons through a producer-consumer middleware that adds very little overhead to task execution. Context is restricted by topics, each of which defines a domain made up of densely connected related nodes, allowing complex learning to be decomposed into tractable overlapping sub-problems.
Date of Award | May 2019 |
---|---|
Original language | English |
Awarding Institution | |
Supervisor | Zhaojie Ju (Supervisor) & Honghai Liu (Supervisor) |