GitHub - mosnicholas/audio-video-alignment: Computer vision project

#Audio / Video Synchronization

Brian Pugh & Nicholas Moschopoulos
Berkeley Computer Science 280

###Abstract

This project explores techniques for recombining desynchronized audio and video tracks.

###FaceSync

We attempt to reconstruct the algorithm from the Slaney and Covell paper (2001). It is an optimal linear algorithm that measures the correlation over time between the Mel-frequency cepstral coefficients (MFCCs) of the audio and the pixels of the face image.
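To illustrate how a correlation measure can recover a time offset, here is a deliberately simplified sketch. It is not the FaceSync algorithm: it assumes the audio and the face have each already been reduced to one number per frame, and it substitutes plain Pearson correlation over candidate lags for the paper's learned linear operator. The names `pearson` and `best_lag` are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def best_lag(audio, video, max_lag):
    """Return (lag, r) where pairing audio[t + lag] with video[t] gives the highest r."""
    best = (0, -2.0)  # any real correlation beats -2.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, v = audio[lag:], video[:len(video) - lag]
        else:
            a, v = audio[:len(audio) + lag], video[-lag:]
        n = min(len(a), len(v))
        if n < 2:
            continue  # too little overlap to correlate
        r = pearson(a[:n], v[:n])
        if r > best[1]:
            best = (lag, r)
    return best
```

Calling `best_lag(audio_feature, face_feature, max_lag=10)` returns the lag with the strongest correlation. A positive lag means the audio track trails the video by that many frames.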

###Left/Right Video Alignment

We split each frame of a video segment into its left and right halves. One half is then offset in time by anywhere from 0 to 10 frames. A Siamese convolutional neural network is trained to predict whether the two halves are aligned. The ground truth is a binary aligned/unaligned label; the Dataset section gives its encoding.
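The defining property of a Siamese network is that both inputs pass through the same weights before being compared. The toy sketch below shows only that idea and is not the network used in this project. It assumes each half has been reduced to a 1-D signal (for example, one brightness value per frame), and `kernel`, `w`, and `b` are hypothetical parameters that training would normally learn.

```python
import math

def conv1d_relu(signal, kernel):
    """Valid 1-D cross-correlation followed by ReLU."""
    k = len(kernel)
    return [max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)))
            for i in range(len(signal) - k + 1)]

def siamese_score(left, right, kernel, w, b):
    """Encode both halves with the SAME kernel, then squash their squared distance into (0, 1)."""
    emb_left = conv1d_relu(left, kernel)    # branch 1
    emb_right = conv1d_relu(right, kernel)  # branch 2 shares the kernel with branch 1
    dist = sum((a - c) ** 2 for a, c in zip(emb_left, emb_right))
    return 1.0 / (1.0 + math.exp(-(w * dist + b)))
```

With `w > 0`, the score grows as the two embeddings move apart, so identical inputs give the lowest score the head can produce.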

###Dataset

The Left/Right Video Alignment dataset contains X sequences of ten frames each. A label of 1 means the left and right sides of the video are temporally offset: the second half is found by adding a non-zero temporal offset of fewer than ten frames to the first. For example, the left half could be frames 8-17 while the right half is frames 10-19. A label of 0 means there is no offset.

We remove a ten-pixel bar from the middle of each video frame to prevent the network from simply learning to match the brightnesses or gradients of neighboring pixels.
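A minimal sketch of that split, assuming the bar is a vertical strip centered between the two halves and that a frame is a list of pixel rows. The function name `split_frame` and its `gap` parameter are hypothetical and not taken from the repository code.

```python
def split_frame(frame, gap=10):
    """Split a frame (list of pixel rows) into left/right halves, dropping `gap` center columns."""
    width = len(frame[0])
    if width <= gap:
        raise ValueError("frame must be wider than the gap")
    left_end = (width - gap) // 2   # last column (exclusive) of the left half
    right_start = left_end + gap    # first column of the right half
    left = [row[:left_end] for row in frame]
    # If width - gap is odd, the right half ends up one column wider.
    right = [row[right_start:] for row in frame]
    return left, right
```

Because the two halves no longer share a border, the pixels on either side of the cut are at least `gap` columns apart.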

When building the dataset, we create one aligned and one unaligned pair from each twenty-frame source sequence.
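The pairing step can be sketched as follows. This is an illustration under assumptions, not the code in `create_datasets.py`: it assumes the left window always starts at the first frame, that the offset is drawn uniformly from 1 to 9, and that labels follow the Dataset convention above (0 for no offset, 1 for offset). The name `make_pairs` is hypothetical.

```python
import random

def make_pairs(frames, length=10, rng=random):
    """From a source of at least 2*length frames, build one aligned and one unaligned pair.

    Each pair is (left_frames, right_frames, label); label 0 = no offset, 1 = offset.
    """
    if len(frames) < 2 * length:
        raise ValueError("need at least 2 * length frames")
    offset = rng.randint(1, length - 1)  # non-zero and less than `length` (bounds inclusive)
    left = frames[:length]
    aligned = (left, frames[:length], 0)
    unaligned = (left, frames[offset:offset + length], 1)
    return aligned, unaligned
```

In a full pipeline, `left_frames` would supply the left halves and `right_frames` the right halves of the corresponding frames. Passing `rng=random.Random(seed)` makes the sampled offset reproducible.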

Folders and files:

- data
- left_right_align
- video_align
- .gitignore
- README.md
- create_datasets.py
- facesync.py
- moviepy_benchmark.py
- test_data_gen.py
- trainer.py
- video_audio_analysis.py
- video_face_detection.py
