bnew

Veteran
Joined
Nov 1, 2015
Messages
45,543
Reputation
7,433
Daps
136,942




1/1
Extremely thought-provoking work that essentially says the quiet part out loud: general foundation models for robotic reasoning may already exist *today*.

LLMs aren’t just about language-specific capabilities, but rather about vast and general world understanding.


To post tweets in this format, more info here: https://www.thecoli.com/threads/tips-and-tricks-for-posting-the-coli-megathread.984734/post-52211196





1/4
Very excited to announce: Keypoint Action Tokens!

We found that LLMs can be repurposed as "imitation learning engines" for robots, by representing both observations & actions as 3D keypoints, and feeding into an LLM for in-context learning.

See: Keypoint Action Tokens


2/4
This is a very different "LLMs + Robotics" idea to usual:

Rather than using LLMs for high-level reasoning with natural language, we use LLMs for low-level reasoning with numerical keypoints.

In other words: we created a low-level "language" for LLMs to understand robotics data!
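
A rough sketch of what that low-level "language" could look like in practice: 3D keypoints are serialized to integer text, demonstrations are stacked into a prompt, and the LLM's completion is parsed back into keypoints. Everything here (the `obs:`/`act:` layout, the centimetre scaling, the function names) is an assumption for illustration, not the authors' actual tokenization.

```python
from typing import List, Sequence, Tuple

Keypoint = Tuple[float, float, float]  # (x, y, z) in metres


def tokenize_keypoints(points: Sequence[Keypoint], scale: int = 100) -> str:
    """Flatten 3D keypoints into space-separated integers (centimetres when scale=100)."""
    return " ".join(str(round(c * scale)) for p in points for c in p)


def build_prompt(
    demos: Sequence[Tuple[Sequence[Keypoint], Sequence[Keypoint]]],
    query_obs: Sequence[Keypoint],
) -> str:
    """Stack (observation, action) demos as text, then leave the last action open."""
    lines = []
    for obs, act in demos:
        lines.append("obs: " + tokenize_keypoints(obs))
        lines.append("act: " + tokenize_keypoints(act))
    lines.append("obs: " + tokenize_keypoints(query_obs))
    lines.append("act:")  # the LLM is expected to continue from here
    return "\n".join(lines)


def parse_action(completion: str, scale: int = 100) -> List[Keypoint]:
    """Turn the LLM's numeric completion back into 3D keypoints."""
    values = [int(tok) / scale for tok in completion.split()]
    if len(values) % 3 != 0:
        raise ValueError("expected a multiple of 3 numbers")
    return [(values[i], values[i + 1], values[i + 2]) for i in range(0, len(values), 3)]
```

The prompt from `build_prompt` would be sent to whichever LLM is used, and `parse_action` applied to its reply; the LLM call itself is left out since the tweets do not say which model or API is involved.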

3/4
This works really well across a range of everyday tasks with complex and arbitrary trajectories, whilst also outperforming Diffusion Policies.

Also, we don't need any training time: the robot can perform tasks immediately after the demonstrations, with rapid in-context learning.

4/4
Keypoint Action Tokens was led by the excellent @normandipalo in his latest line of work on efficient imitation learning, following on from DINOBot (http://robot-learning.uk/dinobot), which we will be presenting soon at ICRA 2024!

 


1/1
Excited to announce Tau Robotics (@taurobots). We are building a general AI for robots. We start by building millions of robot arms that learn in the real world.

In the video, two robot arms (one four-axis, one five-axis) are fully autonomous and controlled by a single neural network conditioned on different language instructions. The other two arms are teleoperated. The entire hardware cost in the video is about $1400. The video is at 1.5x speed.


 








1/7
Introducing HumanPlus - Shadowing part

Humanoids are born for using human data. We build a real-time shadowing system using a single RGB camera and a whole-body policy for cloning human motion. Examples:
- boxing
- playing the piano/ping pong
- tossing
- typing

Open-sourced!

2/7
Which hardware platform should HumanPlus be embodied on?

We build our own 33-DoF humanoid with two dexterous hands using these components:
- Inspire-Robots RH56DFX hands
- @UnitreeRobotics H1 robot
- @ROBOTIS Dynamixel motors
- @Razer webcams

We open-source our hardware design.

3/7
Naively copying joints from humans to humanoids does not work due to gravity and differences in actuation.

We train a transformer-based whole-body RL policy in IsaacGym simulation with realistic physics, using the AMASS dataset, which contains 40 hours of human motion.

4/7
To retarget from humans to humanoids, we copy the corresponding Euler angles from SMPL-X to our humanoid model.

We use open-sourced SOTA human pose and hand estimation methods (thanks!)
- WHAM for body: WHAM
- HaMeR for hands: HaMeR
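
A minimal sketch of the "copy the corresponding Euler angles" idea, assuming a lookup table from human joints to robot joints. The joint names, the axis choices, the limits, and the clipping step are all made up for illustration; the thread does not give the real mapping.

```python
from typing import Dict, Tuple

Euler = Tuple[float, float, float]  # (roll, pitch, yaw) in radians

# Hypothetical mapping: (human joint name, Euler axis index) -> robot joint name.
JOINT_MAP: Dict[Tuple[str, int], str] = {
    ("left_elbow", 1): "l_elbow_pitch",
    ("left_shoulder", 0): "l_shoulder_roll",
}

# Hypothetical robot joint limits in radians (lower, upper).
LIMITS: Dict[str, Tuple[float, float]] = {
    "l_elbow_pitch": (-1.5, 1.5),
    "l_shoulder_roll": (-0.5, 2.0),
}


def retarget(human_euler: Dict[str, Euler]) -> Dict[str, float]:
    """Copy the mapped Euler angle onto each robot joint, clipped to its limits."""
    targets = {}
    for (human_joint, axis), robot_joint in JOINT_MAP.items():
        angle = human_euler[human_joint][axis]
        lo, hi = LIMITS[robot_joint]
        # Clipping is an added safety assumption, not something stated in the thread.
        targets[robot_joint] = min(max(angle, lo), hi)
    return targets
```

In a shadowing loop, `human_euler` would come from the pose estimator on each camera frame and the resulting targets would be handed to the whole-body policy.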

5/7
Compared with other teleoperation methods, shadowing
- is affordable
- requires only 1 human operator
- avoids singularities
- natively supports whole-body control

6/7
Shadowing is an efficient data collection pipeline.

We then perform supervised behavior cloning to train skill policies using egocentric vision, allowing humanoids to complete different tasks autonomously by imitating human skills.

7/7
This project would not be possible without our team of experts, spanning computer graphics, robot learning, and robot hardware:
- co-leads: @qingqing_zhao_ @Qi_Wu577
- advisors: @chelseabfinn @GordonWetzstein

project website: HumanPlus: Humanoid Shadowing and Imitation from Humans
hardware:










1/6
Introducing HumanPlus - Autonomous Skills part

Humanoids are born for using human data. Imitating humans, our humanoid learns to:
- fold sweatshirts
- unload objects from warehouse racks
- perform diverse locomotion skills (squatting, jumping, standing)
- greet another robot

Open-sourced!

2/6
We build our customized 33-DoF humanoid, and a data collection pipeline through real-time shadowing in the real world.

3/6
Using the data collected through shadowing, we then perform supervised behavior cloning to train skill policies using egocentric vision.

We introduce Humanoid Imitation Transformer. Based on ACT, HIT adds forward dynamics prediction on image feature space as a regularization.
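
One way to read "forward dynamics prediction as a regularization" is as an extra term in the training loss: the usual behavior cloning error on actions, plus a penalty on how badly the model predicts the next frame's image features. The sketch below assumes an L1 action loss, a mean-squared feature loss, and a made-up weight `reg_weight`; the thread does not give the actual formulation.

```python
from typing import Sequence


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error between two equal-length vectors."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def mse_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def hit_style_loss(
    pred_actions: Sequence[float],
    demo_actions: Sequence[float],
    pred_next_features: Sequence[float],
    next_features: Sequence[float],
    reg_weight: float = 0.5,  # hypothetical weight, not from the source
) -> float:
    """Behavior cloning loss plus a forward-dynamics penalty in feature space."""
    bc_term = l1_loss(pred_actions, demo_actions)
    dynamics_term = mse_loss(pred_next_features, next_features)
    return bc_term + reg_weight * dynamics_term
```

The intuition is that the policy is pushed to keep using what the cameras see, since it also has to predict how the image features will change.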

4/6
Compared to baselines, HIT uses
- binocular vision, which gives it implicit stereo cues for depth
- visual feedback more effectively, avoiding overfitting to proprioception when only a small number of demos is available

5/6
Besides vision-based whole-body manipulation skills, our humanoid has strong locomotion skills:
- outperforming H1 default standing controller under strong perturbation forces
- enabling more whole-body skills like squatting and jumping

6/6
This project would not be possible without our team of experts, spanning computer graphics, robot learning, and robot hardware:
- co-leads: @qingqing_zhao_ @Qi_Wu577
- advisors: @chelseabfinn @GordonWetzstein


project website: HumanPlus: Humanoid Shadowing and Imitation from Humans
hardware:

