This library provides implementations of single- and multi-object instance localization from RGB-D sensor data (MS Kinect, ASUS Xtion, etc.), based on the PERCH (Perception via Search) and D2P (Discriminatively-guided Deliberative Perception) algorithms.
- Detect single objects in 3D space (in a tabletop setting) in under 10s
- No pretraining required
- Works with depth data from typical RGBD cameras
- Achieves the high detection accuracy required for tasks such as robotic manipulation
- Ubuntu 16.04+
- ROS Kinetic (active development only on Kinetic)
- Create a catkin workspace and clone the following into its `src` folder (the realsense package is only needed to work with a real camera):

  ```bash
  git clone https://github.com/SBPL-Cruz/improved-mha-planner -b renamed
  git clone https://github.com/SBPL-Cruz/sbpl_utils.git -b renamed
  git clone https://github.com/IntelRealSense/realsense-ros
  git clone https://github.com/SBPL-Cruz/ros-keyboard
  ```
- Clone the `roman_devel` branch of this repo into the `src` folder of your catkin workspace:

  ```bash
  git clone https://github.com/SBPL-Cruz/perception -b roman_devel
  ```
- Install OpenCV 2.4 if not already installed. You can follow the steps on the OpenCV website.
- Install the GSL, VTK, GLEW, and SDL2 libraries:

  ```bash
  sudo apt-get install libgsl-dev libvtk6-dev libglew-dev libsdl2-dev
  ```
- Check the parameters (frame names etc.) in the launch file `object_recognition_node/launch/roman_object_recognition_robot.launch`
- Check the camera parameters in `sbpl_perception/config/roman_camera_config.yaml` (currently configured to use the Realsense)
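  For orientation, a camera config of this kind usually stores the sensor's resolution and intrinsics. The block below is a hypothetical sketch only; the actual key names and values live in `sbpl_perception/config/roman_camera_config.yaml` and may differ:

  ```yaml
  # Hypothetical sketch only; the real keys in roman_camera_config.yaml may differ.
  # Values shown are typical for a 640x480 Realsense stream, not measured ones.
  width: 640      # image width (pixels)
  height: 480     # image height (pixels)
  fx: 615.0       # focal length x (pixels)
  fy: 615.0       # focal length y (pixels)
  cx: 320.0       # principal point x (pixels)
  cy: 240.0       # principal point y (pixels)
  ```

  The intrinsics your camera actually publishes can be read from its `camera_info` topic (e.g. with `rostopic echo`) and compared against the file.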
- Build the packages. The build has been tested with the catkin tools `catkin build` command:

  ```bash
  catkin build
  ```

  If you get compilation errors in `octree_pointcloud_changedetector.h`, follow the steps here to fix them.
- To test with real data, you can download a sample bag file from these links:
  - Bag 1
  - Bag 2
- Or, if using a robot, run the Realsense using:

  ```bash
  roslaunch realsense2_camera rs_rgbd.launch camera:=/head_camera publish_tf:=false
  ```
- Launch the code and the RViz visualization (the transforms between the camera and the robot base should be published by another node or by a bag file). The launch file is configured to use 4 cores for parallelization; to change this, edit the number in the `mpirun -n 4` line:

  ```bash
  roslaunch object_recognition_node roman_object_recognition_robot.launch urdf:=false
  ```
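  As a sketch of the core-count change mentioned above: the number after `mpirun -n` sets how many MPI processes run, and editing that number is all that is needed. The substitution below is shown on a sample line, with 8 as an assumed target value:

  ```shell
  # Demonstrate the edit on a sample line; to apply it for real, run
  # sed -i with the same substitution on the launch file itself.
  echo 'mpirun -n 4 object_recognition_node' | sed 's/-n 4/-n 8/'
  # prints: mpirun -n 8 object_recognition_node
  ```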
- Run the command

  ```bash
  rostopic pub /requested_object std_msgs/String "data: 'crate'"
  ```

  to launch the code. This starts the algorithm once the input point cloud and the transform between the camera and the robot base have been received. The input point cloud, the successors, and the output pose of the crate (the crate model is published as a marker with the detected pose) can be seen in RViz. The RViz config file to load is stored in `object_recognition_node/rviz/realsense_camera_robot.rviz`.
- With the sample bag file and 4 cores, the runtime should be ~12 s for Bag 1 and ~9 s for Bag 2.
- Sample RViz output when this config is used:
- Tweak `table_height` so that no points of the table or floor are visible in `/perch/input_point_cloud`
- Tweak `xmin`, `xmax`, `ymin`, `ymax` so that no visible points of the required objects are excluded from the point cloud
- Tweak the downsampling leaf size to trade off speed against accuracy
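
The three knobs above could sit in a config fragment like the following. This is a hypothetical sketch (the actual parameter file and key names under `sbpl_perception/config/` may differ, and units are assumed here to be metres), but it shows the role of each value being tuned:

```yaml
# Hypothetical sketch; the real file and key names under
# sbpl_perception/config/ may differ. Units assumed to be metres.
table_height: 0.70            # raise until no table/floor points remain
                              # in /perch/input_point_cloud
xmin: -0.50                   # workspace bounds; widen these if points of
xmax:  1.50                   # the required objects are being clipped
ymin: -1.00
ymax:  1.00
downsampling_leaf_size: 0.005 # larger leaf = faster but less accurate
```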