Code to gather the dataset for a Stanford CS224n course project.
The source code is available under src.
To gather the dataset run:
python main.py -c -p -s
All arguments are optional and default to False:
usage: main.py [-h] [-c] [-p] [-s]

optional arguments:
  -h, --help     show help message and exit
  -c, --crawl    Whether to crawl the raw data or not. default: False
  -p, --process  Whether to clean and process the raw data or not. default: False
  -s, --stats    Whether to report statistics of the data. default: False
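
For readers who want to see how flags like these behave, here is a minimal sketch of an argparse setup that would produce a similar interface. It is an illustration only and may differ from the actual main.py; the helper names build_parser and planned_steps are hypothetical. Each flag uses action="store_true", so it is False unless passed on the command line.

```python
import argparse


def build_parser():
    """Build a parser with three optional boolean flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="main.py")
    # store_true flags default to False and become True when present.
    parser.add_argument("-c", "--crawl", action="store_true",
                        help="Whether to crawl the raw data or not. default: False")
    parser.add_argument("-p", "--process", action="store_true",
                        help="Whether to clean and process the raw data or not. default: False")
    parser.add_argument("-s", "--stats", action="store_true",
                        help="Whether to report statistics of the data. default: False")
    return parser


def planned_steps(args):
    """Return the names of the stages selected by the flags, in pipeline order."""
    steps = []
    if args.crawl:
        steps.append("crawl")
    if args.process:
        steps.append("process")
    if args.stats:
        steps.append("stats")
    return steps
```

Under these assumptions, planned_steps(build_parser().parse_args(["-p", "-s"])) selects only the process and stats stages, which is how a run could skip crawling when the raw data is already on disk.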
All of the data is also available here.
Please read the documentation for more info about the dataset.