Skip to content

gabgoh/adviceanimals

Repository files navigation

Processing Pipeline

1) First parse the raw reddit data from Jason, and scrape thumbnails into thumbs folder (scrapeThumbs.py)

    INPUT          : data\adviceAnimalsSubmissions
    OUTPUT         : Thumbnails

2) Precompute Thumbail summaries (createSummaries.py)
    This should achieve a compression of the thumbnails into a smalelr 70mb file.
    The python file also provides a means of appending metadata from the raw reddit data into the summaries,
    such as title, date created, and number of upvotes.

    INPUT          : Thumbnails, data\adviceAnimalsSubmissions
    FILE GENERATED : \ThumbsnailSummaryTitles

3) Precompute Template Summaries (processTemplates.py)
    Precompute summaries for templates.

    INPUT          : Template Pictures
    OUTPUT         : \templateData

4) Sort Thumbnails according to distance between template and Summary (sortThumbs.py).
    Precompute dist(template, thumbnail). Store the distances, sorted (from smallest to biggest),
     in a file. (This is done to make the next step smoother)

    INPUT          : \memeproject\templateData, \memeproject\ThumbsnailSummaryTitles
    OUTPUT         : thumbssort\*
    (note that the output file will have an identical name to the template file, including the extension)

5) Manually pick cutoff, and stragglers using python gui. Create JSON Files (thumbSortToJSON.py)

    INPUT         : \memeproject\thumbssort\*
    OUTPUT        : \code\html\*.js

6) Create JSON File of templates for the thumbBar (templatesToJSON.py)

    INPUT         : \code\html\*.js
    OUTPUT        : \code\html\templateData.js

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published