wdunicornpro/GithubCrossRepositoryTeams
This is the dataset and scripts for "Investigating the Cross-Repository Socially Connected Teams in Github".

This dataset is uploaded anonymously for sharing with the reviewers during the double-blind peer review process. The dataset may be made public after acceptance.

File structure of this project:

issuecomment\ -- IssueCommentEvent data from 01/01/2015 to 06/30/2018, organized by year, month, and day
OSLOM\
    Edges.dat -- The edge list of the developer network
    Edges.dat_oslo_files\
        tp -- The list of modules generated by OSLOM
        pajek_file_0.net -- The pajek file generated by OSLOM
        pajek_file_0_new_without_singleton.net -- The pajek file for visualization (top 100 largest teams)
    Edges.link -- The repo context of each edge in the developer network
    Edges_single.dat -- The edge list of the developer network (without cross-repo condition)
    Edges_single.dat_oslo_files\
        tp -- The list of modules generated by OSLOM
    Edges_single.link -- The repo context of each edge in the developer network (without cross-repo condition)
    network.json -- The developer network
    network_time.dat -- The duration of each edge in the developer network
    repo_statistics.txt -- Repo level statistics
    team_tags.txt -- Team level statistics
    team_tags_single.txt -- Team level statistics (without cross-repo condition)
    teams.txt -- Team list
    teams_single.txt -- Team list (without cross-repo condition)
README -- This file
contributors.json -- Contributors of each repo
contributors.py -- Script for generating contributors.json
edgelist.py -- Script for generating OSLOM\Edges.dat and OSLOM\Edges.link
edgelist_single.py -- Script for generating OSLOM\Edges_single.dat and OSLOM\Edges_single.link
network.py -- Script for generating OSLOM\network.json and OSLOM\network_time.dat
pajek_repaint.py -- Script for generating OSLOM\Edges.dat_oslo_files\pajek_file_0_new_without_singleton.net
repo_features.json -- Numeric features of each repo
repo_features.py -- Script for generating repo_features.json
repo_language.json -- Programming language of each repo
repo_language.py -- Script for generating repo_language.json
repo_statistics.py -- Script for generating OSLOM\repo_statistics.txt
repo_topics.json -- Topics of each repo
repo_topics.py -- Script for generating repo_topics.json
repos.json -- List of repos
repos.py -- Script for generating repos.json
team_statistics.py -- Script for generating charts
team_tags.py -- Script for generating OSLOM\team_tags.txt and OSLOM\team_tags_single.txt
teams.py -- Script for generating OSLOM\teams.txt and OSLOM\teams_single.txt
users.json -- List of users
users.py -- Script for generating users.json

Workflow:

1. Store all public IssueCommentEvents from 01/01/2015 to 06/30/2018 under the \issuecomment\ directory.
2. Extract all repos that were active during the above period by running:
    python repos.py
3. Get all the contributors of these repos through the GitHub API by running:
    python contributors.py
4. Get the user list by running:
    python users.py
5. Get all repo languages, topics, and numeric features through the GitHub API by running:
    python repo_features.py
6. Generate the developer network by running:
    python network.py OSLOM\network.json OSLOM\network_time.dat
7. Generate the edge lists by running:
    python edgelist.py OSLOM\network.json OSLOM\Edges.dat OSLOM\Edges_single.dat
8. Run OSLOM2 (www.oslom.org) on the edge lists:
    ./oslom_undir -f Edges.dat -uw -hr 0 -singlet -louvain 1 -t 0.99 -cp 0.01
9. Extract the team lists:
    python teams.py OSLOM\Edges.dat_oslo_files\tp OSLOM\network.json OSLOM\teams.txt
    python teams.py OSLOM\Edges_single.dat_oslo_files\tp OSLOM\network.json OSLOM\teams_single.txt
10. Compute the lifetime of each repo:
    python repo_time.py repos.txt repo_time.txt
11. Compute the properties of each team:
    python team_tags.py OSLOM\Edges.link OSLOM\teams.txt OSLOM\Edges_single.link OSLOM\teams_single.txt repo_features.json OSLOM\network_time.dat contributors.json repo_time.txt OSLOM\team_tags.txt
12. Compute team level statistics and generate charts:
    python team_statistics.py OSLOM\team_tags.txt repo_features.json
13. Compute repo level statistics and generate charts:
    python repo_statistics.py OSLOM\repo_statistics.txt
14. python repo_feature_statistics.py OSLOM\repo_feature_statistics.txt
15. Generate the visualization of the top 100 largest teams:
    python pajek_repaint.py OSLOM\Edges.dat_oslo_files\pajek_file_0.net
16. Open pajek_file_0_new_without_singleton.net in Gephi (www.gephi.org).
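Step 9 turns the OSLOM `tp` output into team lists. The sketch below is not the repository's teams.py; it only illustrates how a `tp` file could be read, assuming the usual OSLOM layout in which a header line of the form `#module <id> size: <n> bs: <score>` is followed by one line of space-separated integer node ids. The function names `parse_oslom_tp` and `largest_modules` and the sample data are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def parse_oslom_tp(lines: Iterable[str]) -> Dict[int, List[int]]:
    """Parse OSLOM `tp` content into {module_id: [node_id, ...]}.

    Assumes each module is a '#module <id> ...' header line followed by
    one line listing its member node ids (an assumption about the format).
    """
    modules: Dict[int, List[int]] = {}
    current = None  # id of the module whose member line we expect next
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#module"):
            # Header looks like: "#module 0 size: 3 bs: 0.01"
            current = int(line.split()[1])
        elif current is not None:
            modules[current] = [int(tok) for tok in line.split()]
            current = None
    return modules


def largest_modules(
    modules: Dict[int, List[int]], k: int
) -> List[Tuple[int, List[int]]]:
    """Return the k largest modules, ties broken by smaller module id."""
    ranked = sorted(modules.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return ranked[:k]


# Made-up sample content, for illustration only.
sample = """#module 0 size: 3 bs: 0.01
1 2 3
#module 1 size: 2 bs: 0.05
4 5
""".splitlines()

teams = parse_oslom_tp(sample)
top = largest_modules(teams, 1)
```

In practice the lines would come from `open(r"OSLOM\Edges.dat_oslo_files\tp")`, and the integer node ids would still need to be mapped back to developers, which is presumably why teams.py also takes OSLOM\network.json as input.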