c-okelly/movie_script_analytics

Introduction

This project will contain all of the code required for my data analytics project.

It will also contain any extra work that was required, such as presentation materials and the final report.

The data required to run this project is not included at this time because of its size.

Setup / Dependencies

General process overview

Documentation

Special values

There are a number of special values throughout the project. Each is marked with the phrase ##special_key_value## for ease of searching.
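As a convenience, the marker can be located with a small search script. The following is a minimal sketch and not part of the project itself: the function name `find_special_values` and the default `*.py` file pattern are assumptions made for illustration.

```python
from pathlib import Path

# Marker phrase taken from the README.
MARKER = "##special_key_value##"


def find_special_values(root, marker=MARKER, pattern="*.py"):
    """Return (path, line_number, line_text) for every line containing the marker.

    `root` is the directory to search; `pattern` (hypothetical default) limits
    the search to Python source files.
    """
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        # errors="replace" avoids crashing on files with unexpected encodings.
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if marker in line:
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling `find_special_values(".")` from the repository root would list each marked line with its file and line number.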

Future Work

A number of scripts use a different format to lay out their content. This normally results in much of the speech being marked as description.

More refinement is needed in how the sections to be analysed are created.

One future project should look at identifying the level of indentation for each line in the script. This can be used as a marker for different sections.
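One possible starting point for this idea is sketched below. It is an illustration only, not existing project code: the helper names `indent_level` and `indent_histogram` are hypothetical, and the tab width of 4 is an assumption.

```python
from collections import Counter


def indent_level(line, tab_width=4):
    """Return the number of leading spaces in a line, after expanding tabs.

    tab_width=4 is an assumed value; scripts may use a different width.
    """
    expanded = line.expandtabs(tab_width)
    return len(expanded) - len(expanded.lstrip(" "))


def indent_histogram(script_text):
    """Count how many non-blank lines occur at each indentation level."""
    counts = Counter()
    for line in script_text.splitlines():
        if line.strip():  # skip blank lines
            counts[indent_level(line)] += 1
    return counts
```

In a script with a consistent layout, the most common indentation levels in the histogram would likely correspond to distinct section types (for example description, character names and speech), which could then be used as markers.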

A secondary issue occurs where the script format has blank lines between the character name and the speech section. This results in orphaned speech sections, which are then marked as descriptions. The solution to this problem is to join these sections together. The same issue also appears at page breaks, because of the way they are presented in a document of continuous text.
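The joining step could be sketched roughly as follows. This is a hedged illustration, not the project's implementation: it assumes the script has already been split into text blocks on blank lines, and the names `is_character_cue` and `join_orphaned_speech`, the capital-letter heuristic and the 40-character limit are all assumptions. Page breaks are not handled here.

```python
import re

# Assumed heuristic: a character cue is one short line in capitals,
# optionally followed by a parenthetical such as (V.O.).
CUE = re.compile(r"^[A-Z][A-Z0-9 .'\-]*(\s*\([^)]*\))?$")

# Scene headings are also in capitals, so they are excluded explicitly.
SCENE_PREFIXES = ("INT.", "EXT.", "INT ", "EXT ")


def is_character_cue(block):
    """Return True if the block looks like a lone character name."""
    lines = [line.strip() for line in block.splitlines() if line.strip()]
    if len(lines) != 1:
        return False
    text = lines[0]
    if len(text) > 40 or text.startswith(SCENE_PREFIXES):
        return False
    return bool(CUE.match(text))


def join_orphaned_speech(blocks):
    """Merge each lone character cue with the block that follows it."""
    merged = []
    i = 0
    while i < len(blocks):
        block = blocks[i]
        has_next = i + 1 < len(blocks)
        if is_character_cue(block) and has_next and not is_character_cue(blocks[i + 1]):
            # Join the cue and its orphaned speech into one section.
            merged.append(block.rstrip() + "\n" + blocks[i + 1].lstrip("\n"))
            i += 2
        else:
            merged.append(block)
            i += 1
    return merged
```

With this sketch, a lone `JOHN` block followed by a `Hello there.` block would become a single section, so the speech is no longer treated as a description.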

About

Python library to take in a standard style movie script, perform analysis on it, retrieve external information and return data for further analysis.
