📄 Replication Package: Who Gets a Patch Accepted First? Comparing the Contributions of Employees and Volunteers

fronchetti/CHASE-2018

CHASE-2018

This repository contains all the steps needed to replicate the method described in the paper "Who Gets a Patch Accepted First? Comparing the Contributions of Employees and Volunteers", published at CHASE 2018.

Reproducing the dataset:

To create your own version of the dataset, run the file "script.py" [1] with Python 2.7. When the script finishes, all the files are saved in a folder called "Dataset"; you may need to grant permission for this on your system. A ready-made copy of this folder is already available in this repository [2].

Tips for dataset replication:

  • On line 237, you can add more projects to extract (they must be hosted on GitHub).
  • Between lines 226 and 233, you can choose which dataset files the script extracts. For example, if you only want the contributors of the projects, keep only the R.contributors() call and comment out the remaining lines. Be careful: some files can only be extracted after others have already been collected.

Dataset Structure:

  • Dataset:
      • Project:
          • about.json (General information about the project)
          • contributors.json (All the contributors of the project)
          • externals.csv (All the external contributors of the project)
          • pull_requests.json (All the pull requests of the project)
          • pull_requests_files.json (Files used in each merged/closed pull request)
          • unit_test_files.csv (Pull request files that are probably related to unit tests)
          • merged_pull_requests_summary.csv (General information about each merged pull request)
          • closed_pull_requests_summary.csv (General information about each closed pull request)
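
Because some files can only be extracted after others, it can help to check which files each project folder actually contains before moving on. The following is a minimal sketch, not part of the replication package. It assumes Python 3 and a "Dataset" folder with one subfolder per project, holding the file names listed above. The helper name `missing_files` is hypothetical.

```python
import os

# File names taken from the dataset structure listed above.
EXPECTED_FILES = [
    "about.json",
    "contributors.json",
    "externals.csv",
    "pull_requests.json",
    "pull_requests_files.json",
    "unit_test_files.csv",
    "merged_pull_requests_summary.csv",
    "closed_pull_requests_summary.csv",
]


def missing_files(dataset_dir="Dataset"):
    """Map each project folder to the expected files it does not contain.

    Hypothetical helper: assumes one subfolder per project in dataset_dir.
    """
    report = {}
    for project in sorted(os.listdir(dataset_dir)):
        project_dir = os.path.join(dataset_dir, project)
        if not os.path.isdir(project_dir):
            continue  # skip stray files at the top level
        present = set(os.listdir(project_dir))
        report[project] = [name for name in EXPECTED_FILES if name not in present]
    return report
```

Calling `missing_files("Dataset")` returns a dictionary keyed by project folder name. An empty list means every expected file is present for that project.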

Visualizing the charts:

With the dataset in hand, you can reproduce the charts using "charts.R" [3]. The values in this script were written manually, based on the values we found by generating subsets for each project in the dataset. We created these subsets using conditionals that can be seen in "script.R" [4]. For example, to find merged pull requests created by internals that followed best practice three, we used the conditional "user_type == "Internals" & second_line_is_blank == "True"" on the merged_pull_requests_summary.csv file of each project.
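
For readers who prefer to inspect the subsets outside R, the example conditional above can be approximated in Python. This is a sketch under stated assumptions, not the authors' script. It assumes Python 3, a comma-separated file with a header row, and columns named user_type and second_line_is_blank holding the string values shown in the conditional. The function name `count_matching_rows` is hypothetical.

```python
import csv


def count_matching_rows(csv_path, user_type="Internals", second_line_is_blank="True"):
    """Count rows of a summary CSV that satisfy both conditions.

    Assumes the column names used in the conditional quoted above and
    that values are stored as plain strings such as "Internals" and "True".
    """
    matches = 0
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            if (row.get("user_type") == user_type
                    and row.get("second_line_is_blank") == second_line_is_blank):
                matches += 1
    return matches
```

Pointing the function at the merged_pull_requests_summary.csv file of one project gives a count for that project's subset, which can then be compared with the values written in "charts.R".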

Help?

Send us an e-mail: fronchetti at usp . br