Regarding direct training deployment on real robot #1

Open
yolo01826 opened this issue May 18, 2024 · 2 comments

Comments

@yolo01826

Dear Author,

I really like your work and have a few questions I'd like to ask you.

  1. First, if the keypoint tracking method is trained on a single task with 100 human demonstrations and 10 robot teleoperated demonstrations, can it yield a well-performing tracking model?
  2. Does the keypoint information passed into the policy consist only of pixel coordinates, or does it also include some form of rotation?
  3. The paper mentions using 4 A100 GPUs for training. Is it possible to train a good single-task policy on a single 4090 GPU instead?

Thank you.

@AlvinWen428
Collaborator

Hi, thanks for your interest in our work.

  1. Yes. A single task makes it easier to learn a good track transformer.
  2. Only the 2D pixel coordinates are fed into the policy. Because there are multiple point tracks, the policy can infer the rotation by itself (a rough sketch of how the tracks could be packed into a policy input is shown after this list).
  3. Yes, but you need to tune the hyperparameters.
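
To make point 2 concrete, here is a minimal sketch (not the authors' code) of how 2D point tracks could be flattened into a policy input; the array shapes, normalization, and image size are assumptions for illustration only.

```python
import numpy as np

def tracks_to_policy_input(tracks: np.ndarray, image_size: float = 128.0) -> np.ndarray:
    """tracks: (T, N, 2) array of pixel (u, v) coordinates for N tracked points over T steps.

    Returns a flat (T * N * 2,) vector. With several points per object, the policy
    can recover orientation from their relative layout, so no explicit rotation is passed.
    """
    T, N, _ = tracks.shape
    normalized = tracks / image_size   # normalize pixel coordinates to [0, 1] (image size assumed)
    return normalized.reshape(T * N * 2)

# Example: 16 tracked points over an 8-step horizon.
example = np.random.rand(8, 16, 2) * 128
print(tracks_to_policy_input(example).shape)  # (256,)
```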

I hope my answers will be helpful.

@dblanm

dblanm commented May 30, 2024

Hey! I'd like to follow up with some questions about training and deployment on a real robot.

  1. Is there an example of the data needed to provide real-world demonstrations with a robot?
  2. Will you provide code for the real-world experiments with the UR5?
  3. How are you sending commands to the UR5: via ROS, via RTDE with movep and movej, or something else?
  4. Related to 3, could you explain the pipeline (ATM tracker & policy, then sending commands to the robot)? A rough sketch of the loop I imagine is below.
  5. Could you also specify in more detail what is needed to run this on real hardware: the required inputs, camera viewpoint requirements, any transformations applied to observations before they reach the policy, calibration, limits on robot/human demonstration velocities, the GPUs required for inference, etc.?
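
For reference, this is roughly the loop I have in mind, using the ur_rtde Python bindings; the tracker/policy functions, camera grab, and robot IP are placeholders and not the actual ATM code.

```python
# Rough deployment loop (placeholder tracker/policy; robot IP is hypothetical).
import numpy as np
from rtde_control import RTDEControlInterface
from rtde_receive import RTDEReceiveInterface

rtde_c = RTDEControlInterface("192.168.1.10")   # hypothetical UR5 address
rtde_r = RTDEReceiveInterface("192.168.1.10")

def get_camera_frame():
    """Placeholder for the real camera grab."""
    return np.zeros((128, 128, 3), dtype=np.uint8)

def track_points(frame):
    """Placeholder for the ATM track transformer: returns N x 2 pixel coordinates."""
    return np.zeros((16, 2))

def policy_step(frame, tracks, tcp_pose):
    """Placeholder for the ATM policy: returns a 6-DoF TCP target [x, y, z, rx, ry, rz]."""
    return tcp_pose

for _ in range(200):                      # fixed number of control steps
    frame = get_camera_frame()
    tracks = track_points(frame)
    tcp_pose = rtde_r.getActualTCPPose()  # current Cartesian pose of the tool
    target = policy_step(frame, tracks, tcp_pose)
    rtde_c.moveL(target, 0.1, 0.1)        # or servoL for a higher-rate command stream

rtde_c.stopScript()
```

Is this close to what you actually do, or is the command interface different?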

Thanks for the help!
