


pdqnguyen/metallyrics


Heavy metal lyrics and reviews

Overview

This repository contains analyses of heavy metal artists and their lyrical content. The core data set combines artist information (including genre labels) and album reviews from The Metal-Archives (MA) with song lyrics from DarkLyrics (DL).

Notebooks

The analyses below provide insights into the history of heavy metal albums and the linguistic properties of metal lyrics.

For just the discussion, see the corresponding blog posts I wrote up on each topic.

Exploration of artists and album reviews

A data-driven discussion of the history and global demographics of the heavy metal music industry and its many genres. This notebook also provides statistical insights on the sentiments of MA users as expressed through online album reviews.

Neural network album review score prediction

Prediction of album review scores using a convolutional neural network and GloVe word embeddings.

Lyrics data exploration

Brief overview of the lyrics data set.

Lexical diversity measures

Comparison of lexical diversity measures and what they tell us about artists and genres.
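
As a rough illustration of what a lexical diversity measure looks like, here is a minimal sketch of two common ones: the type-token ratio (TTR) and Guiraud's root TTR. The notebook may use different or additional measures; the tokenizer and function names below are assumptions made for this example, not code from the repository.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(tokens):
    """Unique words divided by total words; tends to drop as texts get longer."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def root_ttr(tokens):
    """Guiraud's root TTR: unique words divided by the square root of total words."""
    return len(set(tokens)) / math.sqrt(len(tokens)) if tokens else 0.0


# Made-up sample text, not taken from the data set.
tokens = tokenize("the night the fire the night the storm")
print(type_token_ratio(tokens))
print(root_ttr(tokens))
```

Because plain TTR depends strongly on text length, comparing artists with very different amounts of lyrics generally calls for a length-corrected measure such as the root TTR.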

Word clouds

Concise visualizations of song lyrics from different genres.

Network graphs

Processing data for generating network graphs with Gephi.
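
One way to hand data to Gephi is a CSV edge list with `Source`, `Target`, and `Weight` columns, which its spreadsheet import recognizes. The sketch below assumes edges are weighted by the number of shared genre tags; the band names, the weighting, and the helper functions are hypothetical and may not match what the notebook does.

```python
import csv
import io
from itertools import combinations

# Hypothetical band -> genre tags mapping, for illustration only.
band_genres = {
    "Band A": {"death", "black"},
    "Band B": {"death", "thrash"},
    "Band C": {"power"},
}


def build_edges(band_genres):
    """Return (source, target, weight) rows for bands sharing at least one genre."""
    edges = []
    for a, b in combinations(sorted(band_genres), 2):
        shared = len(band_genres[a] & band_genres[b])
        if shared:
            edges.append((a, b, shared))
    return edges


def write_edge_csv(edges, handle):
    """Write an edge list with a Source/Target/Weight header row."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(edges)


edges = build_edges(band_genres)
buffer = io.StringIO()  # swap for open("edges.csv", "w", newline="") to write a file
write_edge_csv(edges, buffer)
print(buffer.getvalue())
```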

Machine learning genre classification

This notebook presents the multi-label problem of genre classification based on lyrics. Different approaches and preprocessing steps are discussed, and various machine learning models are compared via cross-validation to demonstrate possible solutions.
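
To make the multi-label framing concrete: each artist can carry several genre tags at once, so the target is a binary indicator row per artist rather than a single class. The sketch below builds such a matrix with the standard library only; the sample labels and the `multi_hot` helper are assumptions for illustration, not the notebook's code.

```python
def multi_hot(labels_per_item, classes=None):
    """Convert lists of genre tags into a binary indicator matrix.

    Returns the ordered class list and one row of 0/1 flags per item.
    """
    if classes is None:
        classes = sorted({g for labels in labels_per_item for g in labels})
    index = {genre: i for i, genre in enumerate(classes)}
    rows = []
    for labels in labels_per_item:
        row = [0] * len(classes)
        for genre in labels:
            row[index[genre]] = 1
        rows.append(row)
    return classes, rows


# Hypothetical genre tags for three artists.
band_labels = [["death", "black"], ["power"], ["black"]]
classes, matrix = multi_hot(band_labels)
print(classes)
print(matrix)
```

With targets in this form, one common approach is to fit an independent binary classifier per column, which is also what makes per-genre class imbalance visible.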

Word embedding genre classification

An attempt at using GloVe word embeddings with a convolutional neural network, as well as an LSTM, for genre classification.

Machine learning scripts

For the genre classifier tool (see link at the bottom of page), a number of machine learning models were tuned and trained to assign genre tags to text inputs of arbitrary length. As discussed in the machine learning notebook above, these models are incorporated into pipelines that also vectorize (and oversample, when training) the data. The relevant scripts are located in lyrics/scripts/ and are configured by the corresponding .yaml files in lyrics/. The genre_classification_tuning.py script tunes the models using cross-validation to determine optimal hyperparameters. The genre_classification_train.py script trains the models with those optimal hyperparameters, and genre_classification_test.py can be used to check that the pipeline works before deploying it to the genre classifier tool.
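
The oversampling step mentioned above can be pictured with a minimal sketch of plain random oversampling for a single binary genre tag. This README does not say which oversampling method the scripts use, so the `random_oversample` function and the sample data are assumptions for illustration only.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class examples until classes are balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_texts, out_labels = list(texts), list(labels)
    for label, count in counts.items():
        # Draw extra copies only from examples that already carry this label.
        pool = [t for t, l in zip(texts, labels) if l == label]
        extra = [rng.choice(pool) for _ in range(target - count)]
        out_texts.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_texts, out_labels


# Hypothetical training split: one positive example, three negatives.
texts = ["lyrics a", "lyrics b", "lyrics c", "lyrics d"]
labels = [1, 0, 0, 0]
balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))
```

Oversampling belongs on the training split only; applying it before cross-validation splits would leak duplicated examples into the validation folds and inflate the scores.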

Interactive webpages

Source code for these webpages can be found in the pdqnguyen/metallyrics-web repository.

Interactive data dashboard

Explore the lyrics and album reviews data sets through interactive scatter plots and swarm plots.

Network graph of heavy metal bands

See how genre associations and lyrical similarity connect the disparate world of heavy metal artists.

Global and U.S. maps of heavy metal bands

Explore the world of heavy metal through choropleth maps.

Interactive genre classifier tool

Enter any text you want and see what heavy metal genres it fits in best.
