ksaye/vision-ai-developer-kit-audio

Using the Vision AI Developer Kit for Audio

Overview

This repo demonstrates how to use the Vision AI Developer Kit (VAI DevKit) to develop a neural network model that processes audio. For information on using vision on the VAI DevKit, refer to the Vision AI DevKit main page.

Background

Processing video or images through a neural network involves converting images, most commonly JPEG, into a NumPy array from which features can be extracted and calculated. At the highest level, most Vision AI projects follow this pattern and add capabilities on top of it.
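As a rough illustration of the image-to-array step described above, here is a minimal sketch that decodes JPEG bytes into a NumPy array. It assumes the Pillow and NumPy packages are installed; the helper names `jpeg_bytes_to_array` and `make_sample_jpeg` are hypothetical and are not part of this repo or of the DevKit SDK. The sample image is synthetic so the sketch runs without a file on disk.

```python
import io

import numpy as np
from PIL import Image


def jpeg_bytes_to_array(jpeg_bytes: bytes) -> np.ndarray:
    """Decode JPEG bytes into a float32 array of shape (height, width, 3) in [0, 1].

    Hypothetical helper for illustration only.
    """
    with Image.open(io.BytesIO(jpeg_bytes)) as img:
        rgb = img.convert("RGB")  # force three colour channels
        pixels = np.asarray(rgb, dtype=np.float32)
    return pixels / 255.0  # scale 8-bit values to the range [0, 1]


def make_sample_jpeg(width: int = 64, height: int = 48) -> bytes:
    """Build a small synthetic JPEG (horizontal grey gradient) in memory."""
    gradient = np.linspace(0, 255, width, dtype=np.uint8)
    frame = np.stack([np.tile(gradient, (height, 1))] * 3, axis=-1)
    buffer = io.BytesIO()
    Image.fromarray(frame).save(buffer, format="JPEG")
    return buffer.getvalue()


if __name__ == "__main__":
    array = jpeg_bytes_to_array(make_sample_jpeg())
    print(array.shape, array.dtype)
    # A very simple "feature": the mean intensity of each colour channel.
    print("per-channel mean:", array.mean(axis=(0, 1)))
```

Once the image is an array, feature extraction is ordinary array arithmetic; a real model would typically also resize and normalize the array to match its expected input.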

Vision has a few challenges: it requires a camera, and a camera has only a limited field of view. To detect objects in a complete circle, you often need four or more cameras, which costs more and requires specialized hardware to process that much data and run the networks.

Audio, using just a microphone, is a much more cost-effective approach for many use cases. The advantages of audio include:

  • Lower price
  • Full 360° coverage
  • No dependency on light

While audio is not the answer to every use case, it works for many. With your eyes closed, listen to the sounds around you and think about how you were "trained" to recognize each one.

Resources

Get a kit

You can purchase the DevKit from Arrow Electronics.