Linksuccess

This repository contains material for the article "What Makes a Link Successful on Wikipedia?". First, we provide a parsing framework for Wikipedia. The Python framework extracts link features (e.g., network-topological, visual, and semantic-similarity features) from Wikipedia in order to study human navigation. It can be used in combination with the clickstream dataset by Ellery Wulczyn and Dario Taraborelli from Wikimedia. The corresponding Wikipedia XML dump can be found here. Click here for more recent dumps. Additionally, the repository contains sample data extracted from Wikipedia with this framework and used in the paper. We also provide a notebook with an R kernel that contains the detailed methodological steps and results from the paper.
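As a rough illustration of how clickstream counts might be prepared before joining them with link features, the sketch below reads a tab-separated file and computes, for each source article, the share of its recorded clicks that went to each target. It is not the method used in the paper or in the parsing framework. The column names `prev_title`, `curr_title`, and `n`, and the function name `transition_shares`, are assumptions made for this example; check them against the header of the clickstream release you download.

```python
import csv
from collections import defaultdict


def transition_shares(path):
    """Return {(source, target): share of the source's recorded clicks}.

    Assumes a tab-separated file with a header row containing the
    columns prev_title, curr_title and n (hypothetical names).
    """
    counts = {}
    totals = defaultdict(int)
    with open(path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE: treat quote characters in article titles as plain text.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            pair = (row["prev_title"], row["curr_title"])
            clicks = int(row["n"])
            counts[pair] = counts.get(pair, 0) + clicks
            totals[pair[0]] += clicks
    # Normalise each pair's count by the total clicks leaving its source.
    return {pair: clicks / totals[pair[0]] for pair, clicks in counts.items()}
```

The resulting dictionary is keyed by (source, target) title pairs, so it could in principle be matched against per-link feature rows that use the same pair of titles as their key.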

Modules description

parsingframework

This folder contains all Python scripts needed to set up the database containing all Wikipedia links and their features.

Requirements

MySQL, PyQt4, Xvfb, Graph Tool and a lot of RAM and free hard disk space.
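Before running the scripts, it may help to check which of these requirements are visible on the current machine. The following is a minimal sketch, not part of the framework. The module names `PyQt4` and `graph_tool`, the `MySQLdb` binding, and the `mysql` and `Xvfb` executable names are assumptions about how the listed tools are typically installed. The sketch is written for Python 3, which may differ from the Python version the framework itself targets.

```python
import importlib.util
import shutil

# Assumed names: the requirements list names the tools, not the exact
# Python modules or executables, so adjust these to your installation.
PYTHON_MODULES = ["PyQt4", "graph_tool", "MySQLdb"]
BINARIES = ["mysql", "Xvfb"]


def module_available(name):
    """Return True if a top-level module can be located without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def missing_requirements(modules=PYTHON_MODULES, binaries=BINARIES):
    """Return the modules and executables that could not be found."""
    missing = [m for m in modules if not module_available(m)]
    missing += [b for b in binaries if shutil.which(b) is None]
    return missing


if __name__ == "__main__":
    missing = missing_requirements()
    if missing:
        print("Missing requirements: " + ", ".join(missing))
    else:
        print("All listed requirements were found.")
```

The check only reports whether the modules and executables can be located; it says nothing about versions, available RAM, or disk space.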

notebooks

This folder contains an R notebook with mixed-effects hurdle models.

data

This folder contains a sample of links and their features.

License

This project is published under the MIT License.
