Luxonis
Jun 23, 2022

Custom Model Training

DepthAI
Machine Learning
What’s one of the things that makes our camera systems so powerful? Deep learning. What does that mean? Instead of having to program computers or devices to do things, you can train them. And that's huge. Just like you can train a dog, you can train any of our powerful computer vision devices. Don’t worry, we’ll show you how.

No Training, No Problem

First off, we want to note that you don't have to train anything to get up and running with an OAK camera. There are all sorts of pre-trained models you can run right away. To name just a few sources:
  • Intel/OpenVINO (here)
  • OpenCV.ai's modelplace.ai (here)
  • PINTO0309's (here)
  • Roboflow Universe (here)
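As a rough illustration of what running one of these pre-trained models can look like, here is a minimal sketch. It assumes the DepthAI v2 Python API and the `blobconverter` package are installed, and it pulls the `mobilenet-ssd` model from the OpenVINO model zoo to run on the camera's color preview. The helper names (`label_name`, `build_pipeline`, `print_detections`), the stream name, and the 0.5 confidence threshold are illustrative choices, not anything prescribed by the libraries.

```python
# Pascal VOC class names, in the order used by the zoo's mobilenet-ssd model.
VOC_LABELS = (
    "background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus",
    "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike",
    "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor",
)


def label_name(index):
    """Return a readable class name for a detection's label index."""
    if 0 <= index < len(VOC_LABELS):
        return VOC_LABELS[index]
    return f"unknown({index})"


def build_pipeline(confidence=0.5):
    """Camera preview -> MobileNet-SSD on the device -> detections to the host."""
    # Imported lazily so label_name() works without the camera libraries.
    import blobconverter
    import depthai as dai

    pipeline = dai.Pipeline()

    cam = pipeline.create(dai.node.ColorCamera)
    cam.setPreviewSize(300, 300)   # mobilenet-ssd expects 300x300 input
    cam.setInterleaved(False)      # planar layout, as the network expects

    nn = pipeline.create(dai.node.MobileNetDetectionNetwork)
    # Downloads (and caches) a compiled .blob for the named zoo model.
    nn.setBlobPath(blobconverter.from_zoo(name="mobilenet-ssd", shaves=6))
    nn.setConfidenceThreshold(confidence)
    cam.preview.link(nn.input)

    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName("detections")  # assumed stream name
    nn.out.link(xout.input)
    return pipeline


def print_detections(max_messages=50):
    """Connect to the first available camera and print a few results."""
    import depthai as dai

    with dai.Device(build_pipeline()) as device:
        queue = device.getOutputQueue(name="detections", maxSize=4, blocking=True)
        for _ in range(max_messages):
            for det in queue.get().detections:
                print(label_name(det.label), round(det.confidence, 2))
```

With a camera plugged in, calling `print_detections()` should print a class name and confidence for each detection. Swapping in a different model mostly comes down to changing the blob and the label list.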
And here’s a glimpse at models you can run right away from Roboflow (a fantastic place for dataset development, labeling, and management):
There are lots more too, including reference applications you can run just by downloading and hitting run (e.g. here). But you'll want to train things, trust us.

Some History

Geoffrey Hinton (often called the godfather of AI) helped both invent and kill Deep Learning back in 1986, before it was even called that. How? Well, the paper he published that year with David Rumelhart and Ronald Williams showed that it was possible to train neural networks (the invention) through a technique called “backpropagation,” while also making clear how much computation this would take to be useful (what killed it). In 1986, training a single useful neural network was far beyond what all of the world's computers combined could practically deliver. And thus Deep Learning, and really the whole Artificial Intelligence space, plunged into an "AI Winter". Research slowed to a crawl.
This all changed around 2009-2012, when folks realized that, well, all of the world's computers from 1986 were WAY less capable than a single modern GPU. In 2009 ImageNet was released, which provided the first usable large-scale dataset (more on the importance of datasets later), and by 2012 a Google team that included Andrew Ng had trained a network that learned to detect cats in YouTube videos. And boom! The Deep Learning boom was on: Siri, Alexa, Cortana, Google Photos. A whole slew of AI-based things you've heard of, and even more you haven't, all started taking over markets.
Almost overnight at Google, hundreds of man-years of work on what had been the best algorithms in the world were outperformed and replaced by machine learning models. This is the power of being able to train. The world's best algorithms can be outperformed. But more importantly, all sorts of previously intractable problems are now readily solvable.

OK, I'm Convinced. Training is Cool. But How Do I Do It?

Fortunately, we have prepared open-source training scripts that let you get to work training right away. And you can even train for free. Below is the model from our YOLOv5 tutorial, trained on some grocery-store items. You can follow along with the tutorial that trained it yourself, here. And there are a bunch more training tutorials here covering YOLOv3, YOLOv4, MobileNet SSD, and even DeepLabv3+ for semantic segmentation.
Can't get to the store to test? No problem. When training a model, you can test how it performs on our cameras by feeding in images or videos from your computer over USB or Ethernet. Don't have the thing you're trying to detect near you? We have you covered there too:
  • Feed video (or stills) into a Luxonis camera from your computer, here
  • You can even stream YouTube directly into a Luxonis camera, here, with `--video https://youtu.be/9rlI3Xg9g_A` (to test your custom Johnny 5 detector).
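For a sense of what feeding video from your computer involves under the hood, here is a minimal sketch. It assumes the DepthAI v2 Python API, OpenCV, a local video file, and a MobileNet-SSD style `.blob` you have already trained or downloaded. The function names, stream names, and the 300x300 input size are illustrative assumptions, and the sketch does not handle YouTube links itself.

```python
def denormalize_bbox(bbox, frame_width, frame_height):
    """Map a normalized (xmin, ymin, xmax, ymax) box to pixel coordinates.

    Detection networks report boxes in the 0..1 range; values slightly
    outside that range are clamped to the frame edges.
    """
    def clamp(value):
        return min(max(value, 0.0), 1.0)

    xmin, ymin, xmax, ymax = (clamp(v) for v in bbox)
    return (
        int(xmin * frame_width), int(ymin * frame_height),
        int(xmax * frame_width), int(ymax * frame_height),
    )


def run_on_video(video_path, blob_path, nn_size=(300, 300)):
    """Send frames from a video file to the camera and print detections."""
    # Imported lazily so denormalize_bbox() works without these installed.
    import cv2
    import depthai as dai

    pipeline = dai.Pipeline()

    # Host -> device: frames arrive over USB/Ethernet instead of the sensor.
    xin = pipeline.create(dai.node.XLinkIn)
    xin.setStreamName("in_frame")      # assumed stream name

    nn = pipeline.create(dai.node.MobileNetDetectionNetwork)
    nn.setBlobPath(blob_path)          # your own compiled model
    nn.setConfidenceThreshold(0.5)
    xin.out.link(nn.input)

    # Device -> host: detection results.
    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName("nn")           # assumed stream name
    nn.out.link(xout.input)

    width, height = nn_size
    with dai.Device(pipeline) as device:
        q_in = device.getInputQueue(name="in_frame")
        q_nn = device.getOutputQueue(name="nn", maxSize=4, blocking=True)

        cap = cv2.VideoCapture(video_path)
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break

            # Resize to the network input and convert HWC -> planar CHW.
            planar = cv2.resize(frame, nn_size).transpose(2, 0, 1).flatten()
            img = dai.ImgFrame()
            img.setType(dai.ImgFrame.Type.BGR888p)
            img.setWidth(width)
            img.setHeight(height)
            img.setData(planar)
            q_in.send(img)

            for det in q_nn.get().detections:
                box = denormalize_bbox(
                    (det.xmin, det.ymin, det.xmax, det.ymax),
                    frame.shape[1], frame.shape[0],
                )
                print(det.label, round(det.confidence, 2), box)
        cap.release()
```

A call such as `run_on_video("groceries.mp4", "my_model.blob")` (both file names hypothetical) would push each frame through the network on the camera, which lets you check a freshly trained model without pointing the camera at anything.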
That covers how you train! It's actually not that hard these days. Machine learning sure has come a long way.
So get to it! Have questions? Need help developing your dataset? We’re here to help! Join our Discord here.

Erik Kokalj, Director of Applications Engineering