
Monodepth2-TF2

It's a monodepth2 model implemented in TF2.x; original paper: "Digging Into Self-Supervised Monocular Depth Estimation".

  • Here is the result after training for 13 epochs (without fine-tuning).

(result image)

Dependencies

tensorflow==2.3.1
(for GPU) cudatoolkit=10.1, cudnn=7.6.5

Performance

I'm using an ordinary GTX 1060 Max-Q GPU. FPS for single-image depth estimation:

  • using tf.saved_model.load (I believe this is serving mode):
    • encoder: ~2 ms (~500 FPS)
    • decoder: ~2 ms
    • overall: >200 FPS (but when used together with YOLOv4, it drops to ~120 FPS)
  • using tf.keras.models.load_model with model.predict():
    • overall: ~100 FPS (I didn't record the details)
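
For reference, a minimal sketch of the two loading paths compared above (the folder path and the 640x192 input size are assumptions; adjust them to your own export):

import tensorflow as tf

# Hypothetical export location of the depth encoder; adjust to your own folder.
ENCODER_DIR = "saved_models/encoder"

# Path 1: serving-style loading via tf.saved_model.load (the faster option above).
loaded = tf.saved_model.load(ENCODER_DIR)
infer = loaded.signatures["serving_default"]

# Path 2: Keras loading plus model.predict() (the slower option above).
keras_model = tf.keras.models.load_model(ENCODER_DIR)

# Dummy batch of one RGB frame at the 640x192 training resolution.
image = tf.random.uniform((1, 192, 640, 3))
features_serving = infer(image)              # dict keyed by output names
features_keras = keras_model.predict(image)  # numpy array(s)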

Example usage:

Using the @tf.function decorator on Train.grad() and DataPprocessor.prepare_batch() allows a much larger batch_size (a sketch of this pattern follows the commands below).

# Start training (from scratch):
python Start.py --run_mode train --from_scratch True --data_path <path_to_kitti>

# Continue training:
python Start.py --run_mode train --weights_dir <weights_folder_path> --data_path <path_to_kitti>
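
The @tf.function tip above boils down to the following pattern (a generic sketch, not the exact signatures of Train.grad() or DataPprocessor.prepare_batch() in this repo):

import tensorflow as tf

# Generic gradient-step sketch: wrapping it in tf.function compiles the step into
# a graph, cutting per-step overhead and allowing a larger batch_size.
@tf.function
def train_step(model, optimizer, batch):
    with tf.GradientTape() as tape:
        outputs = model(batch, training=True)
        loss = tf.reduce_mean(outputs)  # placeholder loss, just for the sketch
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

# Tiny usage example with a stand-in model and random data.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.build(input_shape=(None, 8))
optimizer = tf.keras.optimizers.Adam(1e-4)
loss = train_step(model, optimizer, tf.random.uniform((32, 8)))

For debugging (method 1 below), the decorator has to be commented out so intermediate tensors can be inspected eagerly.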

Check intermediate result:

  • method 1: run the training code
# remember to comment out the @tf.function decorator for Train.grad() and DataPprocessor.prepare_batch()
python train.py --weights_dir <folder_path> --debug_mode True --data_path <path_to_kitti>
  • method 2: run simple_run.py (recommended; see the Notes below for details)
python simple_run.py --weights_dir --data_path --save_result_to --save_concat_image

The models (converted from the official PyTorch model, trained on the KITTI Odometry dataset, input size 640x192):

Note: the model is trained on the Odometry split, so when applied to the Raw data the results won't be perfect.

TODO

  • Depth and Pose encoder-decoder
  • TF2 data loader for KITTI dataset (KITTI_Raw and KITTI_Odom)
  • training code
  • evaluation code for depth and pose
  • see where to improve: the pose net can use pose = mean(inv(pose), pose) to put a consistency constraint on the poses predicted for the two frames (see the sketch after this list)
  • Add new stuff from "Unsupervised Monocular Depth Learning in Dynamic Scenes" and "Depth from Videos in the Wild":
    • implement the Motion-Field model, replacing (rot, trans) with (rot, trans, trans_residual, intrinsics) to carry more information
    • construct corresponding losses: rgbd and motion consistency losses
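
A rough illustration of the pose-averaging idea from the TODO list above (not code from this repo; it treats the poses as 4x4 matrices and uses a naive element-wise mean rather than a proper SE(3) average):

import numpy as np

def constrained_pose(T_forward, T_backward):
    """Average the forward pose with the inverse of the backward pose.

    T_forward:  4x4 pose predicted from (frame_a, frame_b)
    T_backward: 4x4 pose predicted from (frame_b, frame_a)
    Ideally T_forward == inv(T_backward), so averaging the two estimates acts
    as a soft consistency constraint between the two predictions.
    """
    return 0.5 * (T_forward + np.linalg.inv(T_backward))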

Up-to-date notes

First, the evaluation code, i.e. eval_depth.py and eval_pose.py, is finished.

simple_run.py is also finished; try it with:

python simple_run.py --weights_dir --data_path --save_result_to --save_concat_image
  • data_path: path to a video or image file
  • weights_dir: path to a folder with necessary weights (.h5)
  • save_result_to: optional, a folder path where the result will be saved
  • save_concat_image: optional, save the result concatenated with the original image for comparison
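
A concrete call could look like the one below; all paths are placeholders, and I'm assuming the boolean-style flag takes an explicit value, matching the --from_scratch True / --debug_mode True pattern used above:

python simple_run.py --weights_dir ./weights_h5 --data_path ./test_video.mp4 --save_result_to ./results --save_concat_image True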

Next move:

  • try new stuff in similar papers, e.g. struct2depth

History note

April

Now you can train your own model using train.py and new_trainer.py. For now I have only trained for 1 epoch, and the results seem to be heading in the right direction. See the reconstructed image and the disparity image under assets/first_epoch_res.jpg.

Next step will be:

  • completing the evaluation code.
  • trying new stuff from similar papers, e.g. struct2depth, to improve the model

March

Just for personal use, but please feel free to contact me if you need anything.

I haven't added argument parsing yet, so you can't run it with a single command; you need to change some path settings when you run the demo. No worries though, it's simple code, currently just for single-image depth estimation. The pose network and training pipeline are still in progress and need a double check. It runs without bugs, but I can't guarantee total correctness. Anyway, take one minute and you'll see what's going on in there.

simple_run.py: as the name suggests, it's a simple run; the important functions are all there, with no encapsulation.

depth_esitmator_demo.py: a short demo where the model is encapsulated in classes. But I recommend looking at simple_run.py first.

depth_decoder_creater.py and encoder_creator.py are used for:

  • Useful part: building the model in TF2.x the same way as the official monodepth2 implementation in PyTorch.
  • Negligible part: weights loading. The weights were extracted from the official torch model as numpy.ndarrays, then loaded directly into the TF model layer- and parameter-wise. It's trivial, and I will upload the converted SavedModel directly, so no worries.
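
For illustration, the layer-wise loading described above follows roughly this pattern (a toy sketch; the layer names, shapes, and dict of extracted arrays are made up, not the ones used by the creator scripts):

import numpy as np
import tensorflow as tf

# Toy stand-in for the encoder that encoder_creator.py would build to mirror
# the official PyTorch architecture.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, padding="same", name="conv1", input_shape=(192, 640, 3)),
    tf.keras.layers.Conv2D(16, 3, padding="same", name="conv2"),
])

# Pretend these arrays were extracted from the PyTorch checkpoint and already
# transposed into TF layout (HWIO for conv kernels); keys and shapes are made up.
extracted = {
    "conv1": [np.random.randn(3, 3, 3, 8).astype(np.float32), np.zeros(8, np.float32)],
    "conv2": [np.random.randn(3, 3, 8, 16).astype(np.float32), np.zeros(16, np.float32)],
}

# Layer-wise loading: hand each layer its arrays in the order layer.get_weights() uses.
for layer in model.layers:
    if layer.name in extracted:
        layer.set_weights(extracted[layer.name])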

Credits
