Project Fletcher in the Metis Data Science Bootcamp; text processing of arXiv data.

EmmanueleSalvati/Fletcher

Project Fletcher

Particle name statistics in arXiv abstracts. Natural Language Processing exercise.

The goal of this project is to collect all hep-ex abstracts from the arXiv and visualize how often each particle is mentioned over time. As an obvious example: how many papers were about the Higgs boson in the last decade?

Tools used: a small class I wrote to retrieve data from the arXiv API (which uses the OAI-PMH protocol), MongoDB to store all the abstracts in the cloud, and D3 to create the visualization.
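To give a rough idea of what an OAI-PMH harvest looks like, here is a minimal sketch. It is not the helper class from this repository. The base URL, the `physics:hep-ex` set name, the metadata field names, and the function names `build_list_records_url` and `parse_records` are all assumptions made for illustration, so check them against the current arXiv documentation before relying on them.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumed endpoint; verify the current base URL in the arXiv documentation.
OAI_BASE_URL = "http://export.arxiv.org/oai2"

# XML namespaces: the OAI-PMH envelope and the (assumed) arXiv metadata format.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "arxiv": "http://arxiv.org/OAI/arXiv/",
}


def build_list_records_url(set_spec="physics:hep-ex", resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper).

    OAI-PMH pages through large result sets: a follow-up request carries
    only the verb and the resumption token from the previous response.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {
            "verb": "ListRecords",
            "metadataPrefix": "arXiv",
            "set": set_spec,
        }
    return OAI_BASE_URL + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract id, creation date and abstract from one response page.

    Returns (records, resumption_token); the token is None on the last page.
    """
    root = ET.fromstring(xml_text)
    records = []
    for meta in root.iterfind(".//arxiv:arXiv", NS):
        abstract = meta.findtext("arxiv:abstract", default="", namespaces=NS)
        records.append({
            "id": meta.findtext("arxiv:id", default="", namespaces=NS),
            "created": meta.findtext("arxiv:created", default="", namespaces=NS),
            # Collapse the line breaks and indentation inside the abstract.
            "abstract": " ".join(abstract.split()),
        })
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    return records, (token or None)
```

A harvest would fetch `build_list_records_url()`, pass the response body to `parse_records`, store the records (for example in MongoDB), and repeat with the returned token until it is `None`.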

How to create the CSV files

ipython
from arXiv_helper import create_particle_csv
create_particle_csv('muon')
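For readers without access to the database, the following sketch shows one way a per-year count could be turned into a CSV file. It is not the implementation of `create_particle_csv`. The functions `count_mentions_by_year` and `write_counts_csv`, the record layout (`created`, `abstract`), and the CSV column names are hypothetical, and the records come from a plain Python list rather than MongoDB.

```python
import csv
import re
from collections import Counter


def count_mentions_by_year(records, particle):
    """Count abstracts per year that mention `particle` (hypothetical helper).

    Each record is assumed to be a dict with a 'created' date string
    (YYYY-MM-DD) and an 'abstract' string.
    """
    # Whole-word, case-insensitive match with an optional plural "s",
    # so "muon" matches "Muons" but not "dimuon".
    pattern = re.compile(r"\b" + re.escape(particle) + r"s?\b", re.IGNORECASE)
    counts = Counter()
    for rec in records:
        if pattern.search(rec["abstract"]):
            counts[rec["created"][:4]] += 1
    # Sort by year so the CSV rows come out in chronological order.
    return dict(sorted(counts.items()))


def write_counts_csv(counts, out):
    """Write year/count pairs to an open text file object."""
    writer = csv.writer(out)
    writer.writerow(["year", "papers"])
    for year, n in counts.items():
        writer.writerow([year, n])


if __name__ == "__main__":
    sample = [
        {"created": "2012-07-31", "abstract": "Observation of muons in the detector."},
        {"created": "2011-03-02", "abstract": "A muon trigger study."},
    ]
    with open("muon.csv", "w", newline="") as f:
        write_counts_csv(count_mentions_by_year(sample, "muon"), f)
```

Each abstract is counted at most once per particle, so the numbers answer "how many papers mention it" rather than "how many times is it mentioned".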
