FastPoseCNN: Real-time Monocular Category-Level 6D Pose and Size Estimation Framework

Created by Eduardo Davalos Anaya and Mehran Aminian from St. Mary's University.

[Figure: intermediate data representations used by the model]

Our method uses multiple intermediate representations to reconstruct an object's physical pose and size parameters. By decoupling these parameters, the framework achieves better performance and excellent inference speed.

Introduction

[Figure: overall model architecture]

This PyTorch project is the implementation of my thesis, FastPoseCNN: Real-time Monocular Category-Level 6D Pose and Size Estimation Framework. Note that this thesis is a proof of concept and requires more development to become a stable and commercially viable solution. That being said, FastPoseCNN provides an excellent tradeoff between speed, accuracy, and universality.

The project directory and file structure is organized as follows:

FastPoseCNN
|   README.md
|
|---datasets                                # location of all datasets
|   |
|   |---NOCS                                # dataset used for most experiments
|        
|---source_code
    |   environment_linux.yaml              # dependency files (strict for linux)
    |   environment.yaml                    # relaxed dependency files
    |
    |---FastPoseCNN
        |   .env                            # environmental variables file
        |   config.py                       # Contains hyperparameter container 
        |   setup_env.py                    # Script to setup environment vars.
        |   train.py                        # Script for all training routines 
        |   evaluate.py                     # Script for all evaluation routines
        |   inference.py                    # Script for inference tests
        |   ...
        |   
        |---lib                             # Directory with all PyTorch GPU code
        |   aggregation_layer.py
        |   gpu_tensor_funcs.py
        |   ...
        |   |
        |   |---ransac_voting_gpu_layer     # PVNet's hough voting implementation
        |
        |---tools                           # Numpy+PyTorch generic tools
            create_meta+.py
            visualization.py
            ...

Requirements

The specific libraries and their versions can be found in environment.yaml (less strict) and environment_linux.yaml (stricter, Linux-only requirements). Overall, the most important dependency requirements are the following (an environment setup sketch follows this list):

  • python==3.8.5
  • pytorch==1.8.0
  • torchvision==0.8.2
  • cudatoolkit==10.2
  • numpy==1.19.2
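If you use conda, the environment can be created directly from one of these dependency files. A minimal sketch, assuming you start at the repository root (the environment name is defined inside the YAML file and may differ):

conda env create -f source_code/environment.yaml
conda activate FastPoseCNN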

Also, this project uses the Hough voting scheme and implementation from PVNet. The authors performed fantastic research, and without their released code, this project would not have been possible. Below, we provide a brief citation to their GitHub repository and project page.

PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation
Sida Peng, Yuan Liu, Qixing Huang, Xiaowei Zhou, Hujun Bao
CVPR 2019 oral
Project Page

We placed their Hough voting scheme within the lib directory. Instructions to compile the CUDA source code are provided in the installation section of the PVNet GitHub repository.
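For reference, CUDA extensions like this one are typically compiled in place with setuptools. A minimal sketch, assuming the layer ships with a setup.py as in PVNet (defer to PVNet's installation instructions for the authoritative steps):

cd source_code/FastPoseCNN/lib/ransac_voting_gpu_layer
python setup.py build_ext --inplace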

Datasets

For this research, we used the NOCS CAMERA and TEST datasets. Beware: the CAMERA dataset is very large (~140 GB). These datasets can be downloaded from the NOCS project page linked below.

We would like to personally thank the NOCS authors for providing these datasets. Below is another brief citation:

Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation
Created by He Wang, Srinath Sridhar, Jingwei Huang, Julien Valentin, Shuran Song, Leonidas J. Guibas from Stanford University, Google Inc., Princeton University, and Facebook AI Research.
CVPR 2019 oral
Project Page

Training

Before running the train.py script, we recommend that you modify the HPARAM variable, which defines the overall hyperparameters used during training. More information about these hyperparameters can be found in the config.py file.
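A minimal sketch of what this looks like inside train.py; the preset names come from this README, while any other attribute shown is an illustrative assumption:

# Select the hyperparameter container for this run (illustrative sketch).
import config

HPARAM = config.MASK_TRAINING    # stage 1: segmentation (mask) training
# HPARAM = config.HEAD_TRAINING  # stage 2: pose/size head training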

For training, we used the config.MASK_TRAINING and config.HEAD_TRAINING preset HPARAMs to train the model in a two-stage scheme. Once you have modified your hyperparameters, you can run the training script with the following command:

python train.py 

Any hyperparameter can be overridden by adding --<HPARAM NAME>=<HPARAM VALUE> to the command, as shown below.
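For example (these hyperparameter names are illustrative assumptions; see config.py for the actual names):

python train.py --NUM_EPOCHS=50 --BATCH_SIZE=4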

Evaluation

Before evaluating, download the NOCS dataset and the weights provided in the releases page. Additionally, modify the NOCS dataset by renaming four folders to match the structure shown below; this simplifies the loading of the datasets' samples.

NOCS
|
|---camera
|   |
|   |---train
|   |   ...
|   |   
|   |---val
|       ...
|
|---real
    |
    |---train
    |   ...
    |
    |---test
        ...
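As a sketch, assuming the extracted NOCS archives produce folders named train, val, real_train, and real_test (an assumption; adjust the source names to match your download), the reorganization could look like:

# Hypothetical folder names; verify against your extracted archives.
mkdir camera real
mv train camera/train
mv val camera/val
mv real_train real/train
mv real_test real/test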

After making these modifications, please execute the following commands:

python create_meta+.py --DATASET_NAME=camera --SUBSET_DATASET_NAME=train
python create_meta+.py --DATASET_NAME=camera --SUBSET_DATASET_NAME=val
python create_meta+.py --DATASET_NAME=real   --SUBSET_DATASET_NAME=train
python create_meta+.py --DATASET_NAME=real   --SUBSET_DATASET_NAME=test

After all these steps, you should be able to execute the evaluate.py routine. Just remember to modify the CHECKPOINT hyperparameter to reflect the location of the downloaded weights.

python evaluate.py --CHECKPOINT=<weights path>

Here is an example output of the FastPoseCNN framework using the weights provided in this repository.

[Figure: example output]