Interactive Assistive Control Via Value Function Estimator Optimization

Class: PhD Research Project
Instructor: Karen Liu
Date: Fall 2017-Winter 2019 (Continued as Personal Project)
Language: C++
TA'ed: No
Code: N/A

Assistive Getup - Using VF Approximation to Derive Optimal Assistance.

(Started Fall 2017) This was my primary project as a PhD student. The environment consists of an Agent Needing Assistance (ANA) and an Assisting Bot (BOT). The underlying hypothesis was that the trained baseline function (used to stabilize training and reduce variance between epochs/policy updates) acts as a value function (VF) approximator, and that if the BOT's assistance (i.e. a force vector applied to ANA) is included as part of ANA's state, this VF approximator can be optimized over the assistance to propose the best assistance for any given state.
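The idea of optimizing the VF approximator over the assistive force can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `ValueFn` interface, the `proposeAssistForce` function, the box bound on the force, and the choice of finite-difference gradient ascent as the optimizer. The project's actual optimizer and network interface are not described above, so this is only one plausible way to search for a force that maximizes the estimated value while the rest of ANA's state is held fixed.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// 3D assistive force applied to ANA.
using Force = std::array<double, 3>;

// Hypothetical interface: returns the estimated value of ANA's state
// (e.g. q and qdot, held fixed) when paired with a candidate assist force.
using ValueFn =
    std::function<double(const std::vector<double>& state, const Force& f)>;

// Sketch only: projected gradient ascent on the force, using central
// finite differences of the value estimate. Each force component is
// kept inside [-bound, bound].
Force proposeAssistForce(const ValueFn& V,
                         const std::vector<double>& state,
                         Force f,
                         double bound,
                         int iters = 200,
                         double step = 0.1,
                         double eps = 1e-4) {
    assert(bound > 0.0);
    for (int it = 0; it < iters; ++it) {
        // Estimate dV/df one component at a time.
        Force grad{};
        for (std::size_t i = 0; i < f.size(); ++i) {
            Force fp = f;
            Force fm = f;
            fp[i] += eps;
            fm[i] -= eps;
            grad[i] = (V(state, fp) - V(state, fm)) / (2.0 * eps);
        }
        // Move uphill on the value estimate, then project onto the box.
        for (std::size_t i = 0; i < f.size(); ++i) {
            const double next = f[i] + step * grad[i];
            f[i] = std::max(-bound, std::min(bound, next));
        }
    }
    return f;
}
```

In use, `V` would wrap a forward pass of the trained baseline network, and the returned force would serve as the assistance proposal for the current state. A gradient-free or sampling-based optimizer could be substituted without changing the interface.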

Below is a video of the system in action (if the video does not play automatically, the link plays it at full size):

Assisted getup using baseline network optimization to derive assistive force.

In the video, the ANA figure on the left responds to assistance, in the form of a 3D force vector, that follows a random trajectory. It uses a policy to derive control torques from its pose q, joint angular velocity qdot, and the 3D assistive force vector. This ANA is forward simulated a single step to derive contact information. The coupled ANA-BOT pair on the right uses the same policy, but here the BOT provides the assistive force through optimal control derived from the assistance proposal produced by the value function optimization.
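As a rough illustration of how the assistive force can be made part of what the policy sees, the sketch below concatenates pose, joint velocity, and force into a single input vector. The function name `buildObservation` and the ordering [q, qdot, force] are assumptions made for this example; the project's actual observation layout is not specified here.

```cpp
#include <array>
#include <vector>

// Hypothetical helper: assembles the policy input by appending the 3D
// assistive force to ANA's pose q and joint velocity qdot.
// The [q, qdot, force] ordering is assumed for illustration.
std::vector<double> buildObservation(const std::vector<double>& q,
                                     const std::vector<double>& qdot,
                                     const std::array<double, 3>& assistForce) {
    std::vector<double> obs;
    obs.reserve(q.size() + qdot.size() + assistForce.size());
    obs.insert(obs.end(), q.begin(), q.end());
    obs.insert(obs.end(), qdot.begin(), qdot.end());
    obs.insert(obs.end(), assistForce.begin(), assistForce.end());
    return obs;
}
```

Because the force occupies fixed slots at the end of the vector, the same layout lets an optimizer vary only those three entries while the pose and velocity entries stay fixed.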

The problem with this project was that the results proved inconclusive. While the improved behavior is evident in the example above, many configurations did not show such definitive results, and some performed no better despite optimal assistance proposals.

Initially we worked in 2D with a partial (waist-down) humanoid that had a much smaller state (16 DOFs vs. 33), and in this case the VF approximator provided a good optimal force proposal. As the state grew in size, however, the VF approximator seemed to suffer from the increased dimensionality and yielded a much less definitive force proposal.