Automatic-Videography-of-Audio-Tracks-of-Songs

This repository contains a prototype system for automatic videography generation. Given any YouTube video of a song, the system automatically retrieves a set of images related to each line of the song and inserts them into an automatically created video track, aligning the images with the background audio.

Prerequisites

This system was created and tested on Windows, although it should also permit the use of other operating systems. The following software is required:

  • Python v3.8.12

    • Use a Python version equal to or greater than 3.8.12.
    • The installed version can be checked with the python --version command.
  • Google Tesseract OCR

    • Windows Installer
      • Install either the 32-bit or 64-bit installer, depending on your system specifications.
      • Allow the installer to run using the default values, and add the install location to your path, for example: C:\Program Files\Tesseract-OCR
      • Create an environment variable called "TESSDATA_PREFIX" which contains the path to the "tessdata" folder in the "Tesseract-OCR" program folder, for example: C:\Program Files\Tesseract-OCR\tessdata
    • Generic Install Page
      • Follow install instructions in the link above for your operating system.
      • Add the install location to your path.
      • Create an environment variable called "TESSDATA_PREFIX" which contains the path to the "tessdata" folder in the "Tesseract-OCR" program folder.
  • Chrome

    • Must be the most up-to-date version available.
      1. To check, open Chrome and open "More", which appears as three vertical dots at the top right of the window.
      2. Then go to "Help" and open "About Google Chrome".
      3. Under "About Chrome", check whether an update is available and, if so, download it.
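
The Python and Tesseract prerequisites above can be checked before installation with a short script. The following is a minimal sketch and not part of the repository: the function name check_prerequisites and its parameters are hypothetical, and it assumes the Tesseract executable is named tesseract on your path. It does not check the Chrome version.

```python
import os
import shutil
import sys

# Minimum Python version named in the prerequisites above.
MIN_PYTHON = (3, 8, 12)


def check_prerequisites(environ=None, which=shutil.which, version=None):
    """Return a list of problems found; an empty list means every check passed.

    The parameters exist only so the checks can be exercised with fake values;
    by default the real environment, PATH lookup and interpreter version are used.
    """
    environ = os.environ if environ is None else environ
    version = tuple(sys.version_info[:3]) if version is None else tuple(version)
    problems = []

    # Python must be 3.8.12 or newer.
    if version < MIN_PYTHON:
        problems.append("Python %d.%d.%d is older than 3.8.12" % version)

    # The Tesseract install location must be on the path.
    if which("tesseract") is None:
        problems.append("tesseract executable not found on PATH")

    # TESSDATA_PREFIX must point at the tessdata folder.
    tessdata = environ.get("TESSDATA_PREFIX")
    if not tessdata:
        problems.append("TESSDATA_PREFIX is not set")
    elif not os.path.isdir(tessdata):
        problems.append("TESSDATA_PREFIX is not an existing folder: " + tessdata)

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

Running the script prints one warning per failed check and prints nothing when all checks pass.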

Installation and Running

Open a bash terminal and navigate to your workspace folder:

cd ~/<path>
git clone https://github.com/AndrewParker770/Automatic-Videography-of-Audio-Tracks-of-Songs.git
cd Automatic-Videography-of-Audio-Tracks-of-Songs

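Create a Python virtual environment, activate it, download the modules used by the project, and start the Django server (on Linux or macOS, the activation script is venv/bin/activate rather than venv/Scripts/activate):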

python -m venv venv
. venv/Scripts/activate
python -m pip install --upgrade pip
pip install Pillow
pip install -r requirements.txt
python -m pip install --upgrade pytube
pip install git+https://github.com/ssuwani/pytube
python -m pip install --upgrade pytube
cd django_videography_project
python manage.py runserver

The terminal output should prompt you to open a browser and access localhost.

Collections

The system may store videos internally during the video generation process; however, these videos (as well as any files made during their creation) are deleted once another video is generated. This is partly to prevent excessive files being stored, but mainly to avoid storing videos based on copyrighted content, as much music is.

However, a collection folder has been provided in the system for demonstration purposes. If required, you can manually enable saving videos to the collection by creating an environment variable before starting the Django server, as follows:

export COLLECT_JSON=True
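
As an illustration of how an environment flag such as COLLECT_JSON can be read from Python, here is a minimal sketch. It is an assumption, not the project's actual code: the helper name collection_enabled and the set of accepted truthy spellings are hypothetical, and the repository may parse the variable differently.

```python
import os

# Spellings treated as "enabled" in this sketch (an assumption).
TRUTHY = {"1", "true", "yes", "on"}


def collection_enabled(environ=None):
    """Return True when COLLECT_JSON is set to a truthy value, ignoring case."""
    environ = os.environ if environ is None else environ
    return environ.get("COLLECT_JSON", "").strip().lower() in TRUTHY


if __name__ == "__main__":
    print("Saving to collection:", collection_enabled())
```

Because the variable is read from the process environment, it must be exported in the same terminal session that starts the server.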

Example of video generation

By supplying the forced alignment method in the prototypes menu and inputting the

We can take the video (found at https://www.youtube.com/watch?v=1iiDp6ga_qQ) which appears as a lyrics video as follows:

Horse

This video is then converted into the following:

ocLCLMZO6dc.mp4

The figure below shows a section of the entries in the timing dictionary, laid out on a timeline; each entry produced one of the images in the video above:

Timings
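
To make the idea of a timing dictionary concrete, the sketch below maps start times to images and looks up which image is on screen at a given moment. The structure, the file names and the function image_at are hypothetical; the real dictionary used by the system may hold different keys and values.

```python
from bisect import bisect_right

# Hypothetical timing dictionary: start time in seconds -> image shown from then on.
timings = {
    0.0: "intro.jpg",
    12.5: "horse.jpg",
    18.0: "desert.jpg",
}


def image_at(timings, t):
    """Return the image whose start time is the latest one not after t, or None."""
    starts = sorted(timings)
    index = bisect_right(starts, t) - 1  # position of the last start <= t
    if index < 0:
        return None  # t is before the first entry
    return timings[starts[index]]


if __name__ == "__main__":
    for second in (5, 13, 30):
        print(second, image_at(timings, second))
```

Each entry stays on screen until the next start time, which matches a timeline in which every image occupies the interval up to the following entry.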

Citations

C. Gupta, E. Yılmaz and H. Li, "Automatic Lyrics Alignment and Transcription in Polyphonic Music: Does Background Music Help?," ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 496-500, doi: 10.1109/ICASSP40776.2020.9054567.

Project Information

This prototype was created as part of a Level 4 project at the University of Glasgow.

Created by: Andrew Parker & Debasis Ganguly
