This is a monodepth2 model implemented in TF2.x, from the original paper "Digging Into Self-Supervised Monocular Depth Estimation".
- Here is the result after training for 13 epochs (without fine-tuning).
- tensorflow==2.3.1
- (for GPU) cudatoolkit=10.1, cudnn=7.6.5
I'm using a normal GTX 1060 Max-Q GPU. FPS for single-image depth estimation:
- using `tf.saved_model.load` (I think this is serving mode):
  - encoder: ~2 ms (500 FPS)
  - decoder: ~2 ms
  - overall: >200 FPS (but when used together with YOLOv4, it drops to ~120 FPS)
- using `tf.keras.models.load_model` with `model.predict()`:
  - overall: ~100 FPS (I forget the details...)
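For reference, here is a minimal timing sketch for the serving-mode path; the model path and the input shape (1x192x640x3) are assumptions, so adjust them to your own export:

```python
import time
import numpy as np
import tensorflow as tf

# Load the exported encoder and grab its default serving signature
# (the path is hypothetical -- point it at your own SavedModel folder).
encoder = tf.saved_model.load("saved_models/encoder")
infer = encoder.signatures["serving_default"]

image = tf.constant(np.random.rand(1, 192, 640, 3).astype(np.float32))

infer(image)  # warm-up call so graph tracing isn't counted
start = time.perf_counter()
for _ in range(100):
    infer(image)
print("encoder: %.2f ms / image" % ((time.perf_counter() - start) * 1000 / 100))
```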
Using the `@tf.function` decorator on `Train.grad()` and `DataPprocessor.prepare_batch()` allows a much larger batch_size.
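A minimal sketch of what that looks like, assuming a free-standing train step (the real `Train.grad()` is a method, and `compute_losses` here is a hypothetical stand-in for the repo's loss computation):

```python
import tensorflow as tf

# Compiling the train step into a graph cuts per-step Python overhead,
# which is what makes the larger batch_size feasible.
@tf.function
def grad(model, optimizer, batch):
    with tf.GradientTape() as tape:
        loss = compute_losses(model(batch))  # stand-in for the repo's losses
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```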
```bash
# Start training (from scratch):
python Start.py --run_mode train --from_scratch True --data_path <path_to_kitti>

# Continue training:
python Start.py --run_mode train --weights_dir <weights_folder_path> --data_path <path_to_kitti>
```
Check intermediate results:
- method 1: run the training code in debug mode
  ```bash
  # remember to comment out the "tf.function" decorator for Train.grad() and DataPprocessor.prepare_batch()
  python train.py --weights_dir <folder_path> --debug_mode True --data_path <path_to_kitti>
  ```
- method 2: run simple_run.py (recommended; see Notes below for details)
  ```bash
  python simple_run.py --weights_dir --data_path --save_result_to --save_concat_image
  ```
The models (converted from the official PyTorch model, trained on the KITTI Odometry dataset, input size 640x192):
Note: since the model is trained on the Odometry split, results on the Raw data won't be perfect.
- Depth and Pose encoder-decoder
- TF2 data loader for KITTI dataset (KITTI_Raw and KITTI_Odom)
- training code
- evaluation code for depth and pose
- See where to improve: the pose net could use `pose = mean(inv(pose), pose)` to constrain the poses predicted for the two frame orderings (see the sketch after this list)
- Add new ideas from "Unsupervised Monocular Depth Learning in Dynamic Scenes" and "Depth from Videos in the Wild":
  - implement the Motion-Field model, replacing (rot, trans) with (rot, trans, trans_residual, intrinsics) for more information
  - construct the corresponding losses: RGBD and motion consistency losses
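A sketch of the pose-constraint idea above, assuming the pose net outputs 4x4 homogeneous transforms for both frame orders (this is a naive matrix mean; a proper SE(3) average would treat the rotations separately):

```python
import tensorflow as tf

def symmetrized_pose(T_fwd, T_bwd):
    """T_fwd: [B, 4, 4] transform for (t -> t+1); T_bwd: same for (t+1 -> t)."""
    T_bwd_inv = tf.linalg.inv(T_bwd)   # inverted, this also maps t -> t+1
    return 0.5 * (T_fwd + T_bwd_inv)   # i.e. pose = mean(inv(pose), pose)
```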
First, the evaluation code, i.e. `eval_depth.py` and `eval_pose.py`, is finished.
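For reference, these are the standard depth metrics the monodepth2 paper reports, which `eval_depth.py` presumably mirrors (a sketch over numpy arrays of matched ground-truth and predicted depths):

```python
import numpy as np

def depth_errors(gt, pred):
    """Standard Eigen-style depth metrics on flattened, masked depth arrays."""
    thresh = np.maximum(gt / pred, pred / gt)
    a1, a2, a3 = [(thresh < 1.25 ** k).mean() for k in (1, 2, 3)]
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean((gt - pred) ** 2 / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3
```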
Then `simple_run.py` is also finished; try it with:
```bash
python simple_run.py --weights_dir --data_path --save_result_to --save_concat_image
```
- data_path: path to a video or image file
- weights_dir: path to a folder containing the necessary weights (.h5)
- save_result_to: optional, a folder path where the results will be saved
- save_concat_image: optional, show the result concatenated with the original image for comparison
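A concrete call might look like this (the paths are hypothetical; check `simple_run.py` for the exact flag semantics):

```bash
python simple_run.py --weights_dir ./weights --data_path ./demo.mp4 \
    --save_result_to ./results --save_concat_image
```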
Next move:
- try new ideas from similar papers, e.g. struct2depth
Now you can train your own model using `train.py` and `new_trainer.py`. For now I have only trained for 1 epoch, and the results seem to be heading in the right direction. See the reconstructed image and the disparity image under assets/first_epoch_res.jpg.
Next steps will be:
- completing the evaluation code
- trying new ideas from similar papers, e.g. struct2depth, to improve the model
Just for personal use, but please feel free to contact me if you need anything.
I haven't used argument parsing, so you can't run it with one command: you need to change some path settings to run the demo. No worries though, it's just simple code, merely for single-image depth estimation (for now). The pose networks and training pipeline are still in progress and need a double check. Though it runs without errors, I can't guarantee total correctness. Anyway, take one minute and you'll know what's going on in there.
`simple_run.py`: as the name suggests, it's a simple run; the important functions are all there, with no encapsulation.
`depth_esitmator_demo.py`: a short demo where the model is encapsulated in classes. But I recommend reading `simple_run.py` first.
`depth_decoder_creater.py` and `encoder_creator.py` are used to:
- Useful part: build the model in TF2.x the same way as the official monodepth2 implementation in PyTorch.
- Skippable part: weight loading. Weights were extracted from the official torch model as `numpy.ndarray`s, then loaded directly into the TF model layer- and parameter-wise. It's trivial, and I will upload the converted `SavedModel` directly, so no worries.
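A sketch of that layer-wise loading idea, under assumptions about the export format (`encoder_weights.npz` and `build_encoder` are hypothetical names):

```python
import numpy as np

weights = np.load("encoder_weights.npz")  # arrays exported from the PyTorch model
model = build_encoder()                   # stand-in for the repo's encoder_creator

for layer in model.layers:
    if layer.weights:  # skip parameter-free layers (e.g. activations, pooling)
        # NOTE: PyTorch conv kernels are (O, I, H, W) and must be transposed
        # to TF's (H, W, I, O) before this assignment will line up.
        layer.set_weights([weights[w.name] for w in layer.weights])
```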
- The official repo: https://github.com/nianticlabs/monodepth2
- a TF1.x version implementation (complete with training and evaluation): https://github.com/FangGet/tf-monodepth2