Automatic-Videography-of-Audio-Tracks-of-Songs

This repository contains a prototype system for automatic videography generation. Given any YouTube video of a song, the system automatically retrieves a set of images related to each line of the song and inserts them into an automatically created video track, aiming to align each image with the background audio.
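To make the alignment idea concrete, here is a minimal sketch of how per-line timings could be turned into an image timeline. It is an illustration only, not the project's code: the function name `build_image_schedule`, the shape of the `timings` and `images` dictionaries, and the even split of each line's time span across its images are all assumptions.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def build_image_schedule(
    timings: Dict[str, Interval],
    images: Dict[str, List[str]],
) -> List[Tuple[float, float, str]]:
    """Split each lyric line's time span evenly across its images.

    Both dictionaries are keyed by a lyric-line identifier. This layout is
    hypothetical and chosen only for illustration.
    """
    schedule: List[Tuple[float, float, str]] = []
    # Walk the lines in order of their start time.
    for line, (start, end) in sorted(timings.items(), key=lambda kv: kv[1][0]):
        line_images = images.get(line, [])
        if not line_images or end <= start:
            continue  # nothing to show for this line
        slot = (end - start) / len(line_images)
        for i, image in enumerate(line_images):
            schedule.append((start + i * slot, start + (i + 1) * slot, image))
    return schedule


if __name__ == "__main__":
    # Placeholder data, not taken from a real song.
    timings = {"line 1": (0.0, 4.0), "line 2": (4.0, 7.0)}
    images = {"line 1": ["a.jpg", "b.jpg"], "line 2": ["c.jpg"]}
    for start, end, image in build_image_schedule(timings, images):
        print(f"{start:5.1f}s to {end:5.1f}s  {image}")
```

Each tuple in the result says which image should be on screen between two timestamps, which is the information a video writer would need to lay the images over the audio track.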

Prerequisites

This system was created and tested on a Windows operating system, although it should also permit the use of other operating systems.

Python v3.8.12
- Use a Python version equal to, or greater than, version 3.8.12.
- The installed Python version can be found using the python --version command.
Google Tesseract OCR
- Windows Installer
  - Install either the 32-bit or the 64-bit installer, depending on your system specifications.
  - Allow the installer to run using the default values, and add the install location to your PATH, for example: C:\Program Files\Tesseract-OCR
  - Create an environment variable called "TESSDATA_PREFIX" which contains the path to the "tessdata" folder in the "Tesseract-OCR" program folder, for example: C:\Program Files\Tesseract-OCR\tessdata
- Generic Install Page
  - Follow the install instructions in the link above for your operating system.
  - Add the install location to your PATH.
  - Create an environment variable called "TESSDATA_PREFIX" which contains the path to the "tessdata" folder in the "Tesseract-OCR" program folder.
Chrome
- Must be the most up-to-date version available.
  1. To check, open Chrome and open "More", which appears as three vertical dots at the top right of the window.
  2. Then go to "Help" and open "About Google Chrome".
  3. Under "About Chrome", check whether an update is available and, if so, download it.
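The Python and Tesseract prerequisites above can be checked from a short script before installing anything else. The sketch below is not part of the repository; the function name `check_prerequisites` and its parameters are hypothetical, and it only verifies what the list above asks for (Python version, a `tesseract` executable on the PATH, and a `TESSDATA_PREFIX` variable that points to an existing folder). It does not check the Chrome version.

```python
import os
import shutil
import sys

MIN_PYTHON = (3, 8, 12)  # minimum version stated in the prerequisites


def check_prerequisites(environ=None, which=shutil.which, version=None):
    """Return a list of problems found; an empty list means all checks passed.

    The parameters exist only so the checks can be exercised with fake
    values; by default the real environment is inspected.
    """
    environ = os.environ if environ is None else environ
    version = tuple(sys.version_info[:3]) if version is None else version
    problems = []

    if version < MIN_PYTHON:
        found = ".".join(str(part) for part in version)
        problems.append(f"Python {found} is older than 3.8.12")

    # shutil.which searches the PATH (and PATHEXT on Windows).
    if which("tesseract") is None:
        problems.append("tesseract executable not found on PATH")

    tessdata = environ.get("TESSDATA_PREFIX")
    if not tessdata:
        problems.append("TESSDATA_PREFIX is not set")
    elif not os.path.isdir(tessdata):
        problems.append(f"TESSDATA_PREFIX is not a folder: {tessdata}")

    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    if issues:
        for issue in issues:
            print("Problem:", issue)
    else:
        print("Python and Tesseract prerequisites look fine.")
```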

Installation and Running

Open a bash terminal and navigate to your workspace folder:

cd ~/<path>

git clone https://github.com/AndrewParker770/Automatic-Videography-of-Audio-Tracks-of-Songs.git

cd Automatic-Videography-of-Audio-Tracks-of-Songs

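Create a Python virtual environment, activate it, and download the modules used by the project. The activation command below uses the Windows layout; on Linux or macOS the activation script is venv/bin/activate rather than venv/Scripts/activate.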

python -m venv venv

. venv/Scripts/activate

python -m pip install --upgrade pip

pip install Pillow

pip install -r requirements.txt

python -m pip install --upgrade pytube

pip install git+https://github.com/ssuwani/pytube

python -m pip install --upgrade pytube

cd django_videography_project

python manage.py runserver

The terminal output should prompt you to open a browser and access the local host address (by default, Django's development server listens on http://127.0.0.1:8000/).

Collections

The system may store videos internally during the video generation process; however, these videos (as well as any files made during their creation) are deleted once another video is generated. This is partly to prevent excessive files being stored, but mainly to avoid storing videos that are based on copyrighted content, as much music is.

However, a collection folder has been provided in the system for demonstration purposes. If required, you can manually enable saving videos to the collection by creating an environment variable before starting the Django server, as follows:

export COLLECT_JSON=True
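As a rough sketch of how a server could read such a flag, the snippet below treats the collection as enabled only when COLLECT_JSON holds a truthy string. This is an assumption for illustration: the helper name `collection_enabled` and the set of accepted values are hypothetical, and the project itself may compare against the exact string "True".

```python
import os


def collection_enabled(environ=None) -> bool:
    """Return True only when COLLECT_JSON is set to a truthy string.

    Hypothetical helper: the accepted values here are an assumption,
    not the project's confirmed behaviour.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("COLLECT_JSON", "")
    return value.strip().lower() in {"true", "1", "yes"}


if __name__ == "__main__":
    state = "enabled" if collection_enabled() else "disabled"
    print(f"Saving to the collection is {state}.")
```

Because environment variables are always strings, an explicit comparison like this avoids the common mistake of treating the string "False" as true.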

Example of video generation

Select the forced alignment method in the prototypes menu and enter the following inputs:

YouTube URL: https://www.youtube.com/watch?v=1iiDp6ga_qQ
Artist name: America
Song name: A Horse With No Name

The source video (found at https://www.youtube.com/watch?v=1iiDp6ga_qQ) is a lyrics video, which appears as follows:

This video is then converted into the following:

ocLCLMZO6dc.mp4

The figure below shows, as a timeline, a section of the entries in the timing dictionary that produced each image in the video above:

Citations

C. Gupta, E. Yılmaz and H. Li, "Automatic Lyrics Alignment and Transcription in Polyphonic Music: Does Background Music Help?," ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 496-500, doi: 10.1109/ICASSP40776.2020.9054567.

Project Information

This prototype was created as part of a Level 4 project at the University of Glasgow.

Created by: Andrew Parker & Debasis Ganguly
