GitHub - Salma-El-Alaoui/lyrics-predictor: A word predictor for song lyrics

# Lyrics Predictor

This is the final project for an Artificial Intelligence class, in which we implemented a word predictor for a corpus of Rock and Pop song lyrics. Our algorithm is a weighted combination of two models: a word N-gram model with discounting and back-off, and an N-gram model over tags.
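As a rough illustration of the weighted combination described above, the sketch below interpolates a word-level probability with a tag-level probability using a single weight. The names `combine`, `predict`, and `alpha`, the toy probabilities, and the reading of "tags" as part-of-speech tags are all assumptions made for this example; the repository's own scripts may combine the models differently.

```python
def combine(word_probs, tag_probs, word_tags, alpha=0.7):
    """Score each candidate as alpha * P_word + (1 - alpha) * P_tag.

    word_probs: candidate word -> probability from the word N-gram model
    tag_probs:  tag -> probability from the tag N-gram model
    word_tags:  candidate word -> its (assumed) tag
    """
    scores = {}
    for word, p_word in word_probs.items():
        # Candidates with an unknown tag get no support from the tag model.
        p_tag = tag_probs.get(word_tags.get(word), 0.0)
        scores[word] = alpha * p_word + (1.0 - alpha) * p_tag
    return scores


def predict(word_probs, tag_probs, word_tags, alpha=0.7):
    """Return the candidate with the highest combined score."""
    scores = combine(word_probs, tag_probs, word_tags, alpha)
    return max(scores, key=scores.get)


# Toy (made-up) distributions for the next word after some history.
word_probs = {"love": 0.5, "you": 0.3, "baby": 0.2}
tag_probs = {"NN": 0.2, "PRP": 0.8}
word_tags = {"love": "NN", "you": "PRP", "baby": "NN"}

print(predict(word_probs, tag_probs, word_tags, alpha=0.5))  # you
print(predict(word_probs, tag_probs, word_tags, alpha=1.0))  # love
```

With `alpha=1.0` the tag model is ignored and the word model alone decides; lowering `alpha` lets a likely tag promote a word the word model ranked lower.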

Our results and analysis can be found in the project presentation (presentation.pdf) and the report (report.pdf).

Prerequisites

To run the project, use Python 3.4.

The following dependencies are needed:

nltk 3
pandas
bokeh

Run the project

Start from zero

in corpus/raw/ and corpus/lyric_corpus/

run cl_client.py using a songlist file to crawl lyrics
run corpus_builder.py to clean the raw data
manually remove empty files, non-English lyrics, etc.
run category.py to generate a category file

Start from corpus

Corpus analysis

in analysis/

run basic_statistics.py for basic measurements on the corpus
run collocation.py for bigram and trigram collocations in POP and ROCK
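To give an idea of what a collocation analysis can look like, here is a minimal sketch that ranks bigrams by pointwise mutual information (PMI) using only the standard library. The function name `bigram_pmi`, the `min_count` threshold, and the choice of PMI as the association measure are assumptions for this example; collocation.py may rely on nltk's collocation tools and a different measure.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=2):
    """Rank bigrams by PMI = log2(P(w1, w2) / (P(w1) * P(w2)))."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        # Rare pairs get unreliable PMI values, so skip them.
        if count < min_count:
            continue
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_unigrams
        p_w2 = unigrams[w2] / n_unigrams
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    # Highest PMI first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy token stream standing in for one genre's lyrics.
tokens = "rock and roll all night rock and roll every day and night".split()
for pair, score in bigram_pmi(tokens):
    print(pair, round(score, 3))
```

Running the same function separately on the POP and ROCK token streams would give one ranked list per genre, which can then be compared.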

Word Prediction and further analysis

in analysis/

run linearCombination.py
run perplexity.py
run predictWord.py
run testSimpleNgram.py
run testSmoothing.py
run tryAlpha.py
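For readers unfamiliar with the metric behind perplexity.py, the sketch below computes perplexity under its conventional definition: the inverse geometric mean of the probabilities a model assigns to each word of a test sequence. The function name and the handling of zero probabilities are assumptions for this example, not a description of the script itself.

```python
import math


def perplexity(probabilities):
    """Perplexity of a test sequence given per-word model probabilities."""
    if not probabilities:
        raise ValueError("need at least one probability")
    log_sum = 0.0
    for p in probabilities:
        # A single zero-probability word makes the perplexity infinite,
        # which is why smoothing or discounting is needed in practice.
        if p <= 0.0:
            return float("inf")
        log_sum += math.log2(p)
    return 2 ** (-log_sum / len(probabilities))


# A model that assigns 1/4 to every word is as uncertain as a fair
# four-way choice, so its perplexity is 4.
print(perplexity([0.25, 0.25, 0.25, 0.25]))  # 4.0
```

Lower perplexity means the model found the test lyrics less surprising, so the value can be used to compare models or weight settings on the same held-out data.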

Models used for NGrams

in nGram/ the following models and taggers can be found:

nGramModel.py
NgramTagModel.py
trainTagger.py
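The idea of discounting with back-off can be sketched with a small bigram model: a fixed discount is subtracted from every observed bigram count, and the probability mass freed this way is shared among unseen words in proportion to their unigram probabilities. The class `BackoffBigram`, the absolute discount of 0.5, and the restriction to bigrams are assumptions for this illustration; nGramModel.py may use a different discounting scheme and higher-order N-grams.

```python
from collections import Counter, defaultdict


class BackoffBigram:
    """Bigram model with absolute discounting and back-off to unigrams."""

    def __init__(self, tokens, discount=0.5):
        self.discount = discount
        self.unigrams = Counter(tokens)
        self.total = len(tokens)
        # followers[prev][word] = number of times `word` followed `prev`.
        self.followers = defaultdict(Counter)
        for prev, word in zip(tokens, tokens[1:]):
            self.followers[prev][word] += 1

    def unigram_prob(self, word):
        return self.unigrams[word] / self.total

    def prob(self, prev, word):
        seen = self.followers.get(prev)
        if not seen:
            # Unknown history: fall back to the unigram estimate.
            return self.unigram_prob(word)
        history_count = sum(seen.values())
        if word in seen:
            # Observed bigram: use the discounted relative frequency.
            return (seen[word] - self.discount) / history_count
        # Mass freed by discounting every observed follower of `prev`.
        reserved = self.discount * len(seen) / history_count
        # Unigram mass of the words never observed after `prev`.
        unseen_mass = 1.0 - sum(self.unigram_prob(w) for w in seen)
        if unseen_mass <= 0.0:
            return 0.0
        return reserved * self.unigram_prob(word) / unseen_mass


model = BackoffBigram("la la la love la la baby".split())
print(model.prob("love", "la"))    # seen bigram, discounted
print(model.prob("love", "baby"))  # unseen bigram, backed off
```

For a history that has at least one unseen follower in the vocabulary, the probabilities over the whole vocabulary still sum to one, which is the property the discount and back-off weight are designed to preserve.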
