voatScrape.py|gets all post comments from a voat board
voat.py|gets all ids of posts published after 2015/1/1 from a voat board
redditCommentExtractMultiThreaded.py|gets all post comments from a reddit board (using 5 parallel threads)
reddit.py|gets all post ids of a subreddit from pushshift.io
read.py|splits 4chan archives into semi-annual chunks (extremely inefficient; should use datetime.datetime)
politics.voat|All post ids from the politics subvoat
multiThreadreddit.py|gets all post ids of a subreddit from pushshift.io (using 5 parallel threads)
meta.py|Scraper for the 4chan archive @4archive
commentExtract.py|gets all post comments from a reddit board
barchive_dist.py|Distributed Scraper for the /b/ 4chan board: archive @barchive
barchive.py|Scraper for the /b/ 4chan board: archive @barchive
4threadScrape.py|Distributed 4chan scraper for archive @4plebs (Download archives from https://archive.org/details/4plebs-org-data-dump-2019-01 instead)
4plebWriteFirstThread.py|distributed 4chan scraper for @4plebs that writes as it reads (Download archives from https://archive.org/details/4plebs-org-data-dump-2019-01 instead)
4plebWriteFirst.py|4chan scraper for @4plebs that writes as it reads (Download archives from https://archive.org/details/4plebs-org-data-dump-2019-01 instead)
4plebScrape.py|4chan scraper for archive @4plebs (Download archives from https://archive.org/details/4plebs-org-data-dump-2019-01 instead)
4arc.py|gets all posts from a 4chan id
2itch.py|API and key for Twitch
main.py|preprocesses data dumps
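
The note on read.py suggests using datetime.datetime for the semi-annual split. A minimal sketch of that idea follows. It is not the repository's code: the function names `half_year_key` and `split_semiannual` are hypothetical, and it assumes each post is a dict with a `timestamp` field holding Unix seconds (UTC), which may not match the actual archive layout.

```python
from datetime import datetime, timezone


def half_year_key(unix_ts):
    """Return a label such as '2016-H1' for a Unix timestamp, interpreted as UTC."""
    dt = datetime.fromtimestamp(unix_ts, tz=timezone.utc)
    half = 1 if dt.month <= 6 else 2  # Jan-Jun is H1, Jul-Dec is H2
    return f"{dt.year}-H{half}"


def split_semiannual(posts):
    """Group posts by half-year in one pass over the input.

    `posts` is assumed to be an iterable of dicts with a 'timestamp'
    key (hypothetical field name).
    """
    chunks = {}
    for post in posts:
        chunks.setdefault(half_year_key(post["timestamp"]), []).append(post)
    return chunks
```

Each post is visited once and bucketed by a dictionary lookup, so the cost grows linearly with the number of posts. If the archive stores dates as strings rather than Unix seconds, `datetime.strptime` with the matching format could replace `datetime.fromtimestamp`.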
kstats/CommunityLanguage