Hysia Video to Online Platform [V1.0]

_{* This project is supported by
Cloud Application and Platform Lab
led by Prof. Yonggang Wen}

An intelligent multimodal-learning based system for video, product and ads analysis. You can build various downstream applications with the system, such as product recommendation, video retrieval. Several examples are provided.

V2 is under active development currently. You are welcome to create a issue, pull request here. We will credit them into V2.

Updates

V1.3:

To be updated.

V1.2:

A light model database
model exporter

V1.1:

Docker support
Frontend separation
More models
Improve documents

Highlights

Multimodal learning-based video analysis:
- Scene / Object / Face detection and recognition
- Multimodality data preprocessing
- Results align and store
Downstream applications:
- Intelligent ads insertion
- Content-product match
Visualized testbed
- Visualize multimodality results
- Can be installed seperatelly

Showcase

1. Upload video and process it by selecting different models

2. Display video processing result

3. Search scene by image and text

4. Insert product advertisement and display insertion result

Download Data

Here is a summary of required data / packed libraries.

File name	Description	File ID	Unzipped directory
hysia-decoder-lib-linux-x86-64.tar.gz	Hysia Decoder dependent lib	1fi-MSLLsJ4ALeoIP4ZjUQv9DODc1Ha6O	`hysia/core/HysiaDecode`
weights.tar.gz	Pretrained model weights	1O1-QT8HJRL1hHfkRqprIw24ahiEMkfrX	`.`
object-detection-data.tar.gz	Object detection data	1an7KGVer6WC3Xt2yUTATCznVyoSZSlJG	`third/object_detection`

For users without Google Drive access, you can download from Baidu Wangpan and unzip files correspondingly. (See Option 2)

Option 1: Auto-download

# Make sure this script is run from project root
bash scripts/download-data.sh
cd ..

Option 2: Step-by-step download

Note: curl can be used to download from Google Drive directly according to amit-chahar's Gist. File names and file IDs are available from the above table:

fileid=<file id>
filename=<file name>
curl -c ./cookie -s -L "https://drive.google.com/uc?export=download&id=${fileid}" > /dev/null
curl -Lb ./cookie "https://drive.google.com/uc?export=download&confirm=`awk '/download/ {print $NF}' ./cookie`&id=${fileid}" -o ${filename}
rm cookie

Please cd to the specific folder (from the above table, column Unzipped directory) before execute curl.

1. Download Hysia Decoder dependent libraries and unzip it:

deocder_path=hysia/core/HysiaDecode
mv hysia-decoder-lib-linux-x86-64.tar.gz "${deocder_path}"
cd "${deocder_path}"
tar xvzf hysia-decoder-lib-linux-x86-64.tar.gz
rm -f hysia-decoder-lib-linux-x86-64.tar.gz
cd -

2. Download pretrained model weights and unzip it:

tar xvzf weights.tar.gz
# and remove the weights zip
rm -f weights.tar.gz

3. Download object detection data in third-party library and unzip it:

mv object-detection-data.tar.gz third/object_detection
cd third/object_detection
tar xvzf object-detction-data.tar.gz
rm object-detection-data.tar.gz
cd -

Installation

Requirements:

Conda
Nvidia driver
CUDA = 9*
CUDNN
g++
zlib1g-dev

We recommend to install this V2O platform in a UNIX like system. These scripts are tested on Ubuntu 16.04 x86-64 with CUDA9.0 and CUDNN7.

Option 1: Auto-installation

Run the following script:

# Execute this script at project root
bash ./scripts/install-build.sh
cd ..

Option 2: Docker

See Run with Docker to build and install.

Option 3. Step-by-step installation

# Firstly, make sure that your Conda is setup correctly and have CUDA,
# CUDNN installed on your system.

# Install Conda virtual environment
conda env create -f environment.yml

conda activate V2O

export BASE_DIR=${PWD}

# Compile HysiaDecoder
cd "${BASE_DIR}"/hysia/core/HysiaDecode
make clean
# If nvidia driver is higher than 396, set NV_VERSION=<your nvidia major version>
make NV_VERSION=<your nvidia driver major version>

# Build mmdetect
# ROI align op
cd "${BASE_DIR}"/third/
cd mmdet/ops/roi_align
rm -rf build
python setup.py build_ext --inplace

# ROI pool op
cd ../roi_pool
rm -rf build
python setup.py build_ext --inplace

# NMS op
cd ../nms
make clean
make PYTHON=python

# Initialize Django
# This will prompt some input from you
cd "${BASE_DIR}"/server
python -m grpc_tools.protoc -I . --python_out=. --grpc_python_out=. protos/api2msl.proto

python manage.py makemigrations restapi
python manage.py migrate
python manage.py loaddata dlmodels.json
python manage.py createsuperuser

unset BASE_DIR

* Optional: Rebuild the frontend

You can omit this part as we have provided a pre-built frontend. If the frontend is updated, please run the following:

Option 1: auto-rebuild

cd server/react-build
bash ./build.sh

Option 2: Step-by-step rebuild

cd server/react-front

# Install dependencies
npm i
npm audit fix

# Build static files
npm run-script build

# fix js path
python fix_js_path.py build

# create a copy of build static files
mkdir -p tmp
cp -r build/* tmp/

# move static folder to static common
mv tmp/*html ../templates/
mv tmp/* ../static/
cp -rfl ../static/static/* ../static/
rm -r ../static/static/

# clear temp
rm -r tmp

Configuration

Decode hardware:
Change the configuration here at last line:
```
DECODING_HARDWARE = 'CPU'
```
Value can be CPU or GPU:<number> (e.g. GPU:0)

ML model running hardware: Change the configuration of model servers under this directory:

# Custom request servicer
class Api2MslServicer(api2msl_pb2_grpc.Api2MslServicer):
    def __init__(self):
        ...
        os.environ['CUDA_VISIBLE_DEVICES'] = '0'

A possible value can be your device ID 0, 0,1, ...

Demo

cd server

# Start model server
python start_model_servers.py

# Run Django
python manage.py runserver 0.0.0.0:8000

Then you can go to http://localhost:8000. Use username: admin and password: admin to login.

Some Useful Tools

Large dataset preprocessing
Video/audio decoding
Model profiling
Multimodality data testbed

Credits

Here is a list of models that we used in Hysia-V2O.

Models	GitHub Repo	License
MMDetection
Google Object detection
Scene Recognition
Audio Recognition
Image Retrieval
Face Detection
Face Recognition
Text Detection
Text Recognition

Contribute to Hysia-V2O

You are welcome to pull request. We will credit it in our version 2.0.

Paper Citation

Coming soon!

About Us

Maintainers

Huaizheng Zhang [GitHub]
Yuanming Li yli056@e.ntu.edu.sg [GitHub]
Qiming Ai [GitHub]
Shengsheng Zhou [GitHub]

Previous Contributors

Wenbo Jiang (Now, Shopee) [GitHub]
Ziyuan Liu (Now, Tencent) [GitHub]
Yongjie Wang (Now, NTU PhD) [GitHub]

Name		Name	Last commit message	Last commit date
Latest commit History 43 Commits
cache		cache
docker		docker
docs		docs
hysia		hysia
scripts		scripts
server		server
tests		tests
third		third
.gitattributes		.gitattributes
.gitignore		.gitignore
.travis.yml		.travis.yml
LICENSE		LICENSE
README.md		README.md
environment.yml		environment.yml
pip_environment.yml		pip_environment.yml

License

zhhezhhe/Video-to-Online-Platform

Folders and files

Latest commit

History

Repository files navigation