
geometric-perception

Classifies objects based on their geometry using a 3D camera, neural networks built with TensorFlow + Keras, and image processing from OpenCV.

See a demonstration and explanation of how this all works on YouTube.

But I do not have a depth cam!

No worries. You can use a virtual depth cam via Blender (2.8): load VirtualDepthCam.blend into Blender and run the embedded Python script. Instead of placing real objects in front of a real depth cam, you can place 3D models in front of a virtual depth cam in Blender and still have it classify objects in real time. (You do need a powerful CPU to render at a decent frame rate, though.)

How to start

  • Clone this repository
  • Find yourself a 3D camera. I've used an Occipital Structure Core. Or use a virtual depth camera from Blender.
    • When using the Structure Core depth camera you should download their SDK.
    • You can use SimpleStreamer.cpp to get camera data from the camera to python.
      • Adjust the pipe for your operating system; Windows named pipes are used here.
    • Or alternatively, on Windows, you can use SimpleStreamer.exe.
  • After setting up the named pipe to transfer the camera data, it is time to start the Python code that captures it. Run CaptureData.py. By default this launches SimpleStreamer.exe to set up the pipe for streaming the camera data, and shows you this GUI:

Data Capture Control

  • By now the named pipe should be connected and images should be streaming to the GUI.
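
On the Python side, the streaming step boils down to reading fixed-size raw depth frames from the pipe. A minimal sketch, assuming a 640x480, 16-bit frame layout (check SimpleStreamer.cpp for the actual format; the pipe name below is made up):

```python
import numpy as np

# Assumed frame geometry; check SimpleStreamer.cpp for the real values.
WIDTH, HEIGHT = 640, 480
FRAME_BYTES = WIDTH * HEIGHT * 2  # 16-bit depth values

def read_depth_frame(stream):
    """Read exactly one raw depth frame from a byte stream, e.g. an
    open named pipe. Blocks until a full frame has arrived."""
    buf = b""
    while len(buf) < FRAME_BYTES:
        chunk = stream.read(FRAME_BYTES - len(buf))
        if not chunk:
            raise EOFError("stream closed before a full frame arrived")
        buf += chunk
    # Interpret the bytes as a HEIGHT x WIDTH array of uint16 depth values
    return np.frombuffer(buf, dtype=np.uint16).reshape(HEIGHT, WIDTH)

# On Windows a named pipe can be opened like a file (hypothetical name):
#   with open(r"\\.\pipe\SimpleStreamer", "rb") as pipe:
#       frame = read_depth_frame(pipe)
```

Looping over read_depth_frame gives a stream of NumPy arrays ready for the processing steps below.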

Capturing images

  • First you will want to Record background for a few seconds. Then turn it off.
  • Then put the object into the scene.
  • Apply median filter to reduce noise.
  • You can play with the number of frames to Average or the Background margin to reduce noise further.
  • Next tick Shift to center, which will shift and normalize the data. This shows you what data will go to the neural network.
  • Using the Data Capture Control you can capture images using the Capture button.
  • It will store data in the data folder.
  • When you are pleased with the captured data, you need to move it manually into a subfolder of the data folder. Name this folder after the label you want to use when training the network, like geometric-perception\data\cube\ for instance.

Training the neural network

  • You should have captured at least 2 different objects and put the data in geometric-perception\data\ as mentioned above. I recommend taking 32 images per object, rotated around a single axis, so 360 / 32 = ~11º of rotation per image.
  • OPTIONAL: You will likely want to use a RAM drive for the images generated by the ImageDataGenerator, to avoid wearing out your HD or SSD; it will probably speed things up a little as well.
  • Set the location for the storage of the images being generated in img_gen_dir in TrainNeuralNetwork.py.
  • Now run TrainNeuralNetwork.py. It should find the images in the data directory, associate each sub-folder name with its images as their label, and start training.
  • Play around with the lr (learning rate) and number of epochs and batch_size if you want.
  • If you want to tweak the image generation, have a look at the parameters given to ImageDataGenerator in Training.py.
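
The label association mentioned above works the way Keras' flow_from_directory does: each sub-folder name under data/ becomes a class label for the images inside it. A small sketch of that mapping (the helper name is hypothetical):

```python
import os

def collect_labeled_images(data_dir):
    """Pair each image under data_dir/<label>/ with its label, mirroring
    how Keras' flow_from_directory infers class labels from sub-folders."""
    samples = []
    for label in sorted(os.listdir(data_dir)):
        class_dir = os.path.join(data_dir, label)
        if not os.path.isdir(class_dir):
            continue  # skip loose files sitting directly in data_dir
        for fname in sorted(os.listdir(class_dir)):
            samples.append((os.path.join(class_dir, fname), label))
    return samples
```

So a file stored at data\cube\0001.png is trained with the label "cube", with no extra bookkeeping needed.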

Classifying objects

So after creating images and training the network, you can have it classify objects. This uses the same Data Capture Control GUI that was used for capturing images.

  • Run CaptureData.py and you should get the UI shown below.
  • Data Capture Control
  • Record the background, without any objects in the scene, by turning the Record background checkbox on and off. You can play with the recording time. For me ~4 secs works just fine (without frame averaging).
  • Next check Apply median filter
  • Check Shift to center
  • Check Classify objects
  • Put an object in the scene now that you have used for training.
  • Play with these settings to improve accuracy and reduce noise:
    • Background margin (mm), which adds a margin to the recorded background
    • The number below Apply median filter, which is the kernel size used for the filter. Use an odd number like 1, 3, 5, 7, 9, etc.
    • Area size cut-off, which controls noise removal by ignoring small objects: when finding contours, contours below this size are skipped. If the slider is all the way to the left, no contour is ever skipped.

A few notes

The code is not intended for production use. I spent some time putting structure into the project, but it is certainly not perfect. The intent of this software is to offer decent controls to play around with and get a feel for how a neural network behaves and performs, not to be fully foolproof. For instance, turning off Crop while classifying objects will break things, as matrices used in the code, such as the frame-averaging buffers, suddenly change shape.
