This is still a WIP and we should be officially open-sourcing the codebase in late June/July 2015. For now, please read the report on our prototype that we published for the Tow Center.
- Install newslynx, preferably in a virtual environment.
git clone https://github.com/newslynx/newslynx.git
cd newslynx
python setup.py install
- NOTE: If you're on a Mac, you should use Postgres.app
- (re)create a postgresql database
dropdb newslynx
createdb newslynx
- fill out example_config/config.yaml and move it to ~/.newslynx/config.yaml
- modify default recipes and tags in example_config/defaults/recipes/ and example_config/defaults/tags/, respectively. These tags and recipes will be created every time a new organization is added.
- initialize the database:
newslynx init
- populate with sample data
newslynx gen_random_data
- start the server in debug mode
newslynx runserver -d
- start a production server via gunicorn
./run
- IGNORE THIS ERROR: This is a result of our extensive use of gevent. We haven't yet figured out how to properly suppress this error. See more details here.
Exception KeyError: KeyError(4332017936,) in <module 'threading' from '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.pyc'> ignored
brew reinstall postgresql --build-from-source --with-python
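The database and initialization steps above can be collected into a small helper script. This is only a sketch, not part of newslynx: the function names `setup_commands` and `run_setup` are hypothetical, the `--if-exists` flag on `dropdb` is an addition so that a first run does not fail on a missing database, and it assumes `~/.newslynx/config.yaml` is already in place. It defaults to a dry run that only prints the commands.

```python
import subprocess

DB_NAME = "newslynx"  # database name used in the steps above


def setup_commands(db_name=DB_NAME, sample_data=False):
    """Return the setup commands from the steps above, in order."""
    commands = [
        # --if-exists is an addition here; the README uses plain `dropdb`.
        ["dropdb", "--if-exists", db_name],
        ["createdb", db_name],
        ["newslynx", "init"],
    ]
    if sample_data:
        commands.append(["newslynx", "gen_random_data"])
    return commands


def run_setup(dry_run=True, **kwargs):
    """Print each command; execute it only when dry_run is False."""
    for cmd in setup_commands(**kwargs):
        print("$ " + " ".join(cmd))
        if not dry_run:
            subprocess.check_call(cmd)


if __name__ == "__main__":
    run_setup(dry_run=True, sample_data=True)
```

Calling `run_setup(dry_run=False)` would actually drop and recreate the database, so it is worth checking the printed commands first.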
- Migrate common utilities from other repos into single repo.
- Create Database Schema / Models
- Create Blueprint-based app workflow
- Re-implement OAuth endpoints
- Implement Facebook OAuth
- Re-implement User / Login API
- Implement Org API
- Re-implement Settings API
- Re-implement Events API
  - Implement Postgres-based search
  - Make multiple search vectors
- Re-implement Things API (aka Articles)
  - Implement Postgres-based search
  - Make multiple search vectors
- Re-implement Tags API
- Write out SousChefs JSONSchema
- Write out initial schemas:
  - article
  - twitter-list
  - twitter-user
  - facebook-page
- Write out default recipes + tags:
  - article
  - twitter-list
  - twitter-user
  - facebook-page
  - promotion impact tag
- Update create org endpoint to generate default recipes + tags.
- Implement SousChefs API
- Implement Recipes API
- Implement Thing Creation API
- Implement SQL Query API
- Implement Extraction API
- Implement Event Creation API
- Create thumbnails for images.
  - Add thumbnail worker redis cache.
- Implement Metrics API:
  - Create metrics table which contains information on each metric (name, timeseries agg method, summary agg method, cumulative, metric category, level, etc.)
  - Faceted metrics only need to declare their name, not all their potential facet values.
  - Sous Chefs that create metrics must declare which metrics they create.
  - When a recipe is created for a sous chef that creates metrics, these metrics should be created for the associated organization.
  - Timeseries metrics for things will only be collected for 30 days after publication. After this period an article moves into an "archived" state.
  - Each organization should have the following views/APIs with these respective functionalities:
    - [x] Timeseries Aggregations
      - [x] Thing level
        - [x] By hour + day + month
      - [ ] Subject Tag Level (subsequent aggregations of things)
        - [ ] By day.
      - [x] Impact Tag Level (aggregations of events => non customizable.)
      - [ ] Org Level (This should include summaries of thing-level statistics, tag-level statistics, and event-level statistics)
        - [ ] By day, month
      - [x] optionally return cumulative sums when appropriate
    - [ ] Summary Stats
      - [ ] Impact Tag Level
      - [ ] Subject Tag Level
      - [ ] Impact Tag Level
      - [ ] Organization Level
      - [ ] These should be archived every day, and percent changes should be computed over time periods.
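As a rough illustration of the metrics plan above, the sketch below shows what a metric declaration, the 30-day archive check, and an hour/day/month aggregation with optional cumulative sums might look like. Everything here is hypothetical: the `PAGEVIEWS` record, its field names, and the `is_archived` and `aggregate` helpers are assumptions for illustration, not the actual newslynx schema or API.

```python
from collections import defaultdict
from datetime import datetime, timedelta

ARCHIVE_AFTER = timedelta(days=30)  # collection window described above

# Hypothetical metric declaration; field names are illustrative only.
PAGEVIEWS = {
    "name": "pageviews",
    "timeseries_agg": "sum",
    "summary_agg": "sum",
    "cumulative": False,
    "category": "traffic",
    "level": "thing",
}

# strftime patterns used to bucket timestamps; zero-padded so keys sort in time order.
BUCKET_FORMATS = {"hour": "%Y-%m-%d %H:00", "day": "%Y-%m-%d", "month": "%Y-%m"}


def is_archived(published, now):
    """True once a thing is past the 30-day collection window."""
    return now - published > ARCHIVE_AFTER


def aggregate(points, unit="day", cumulative=False):
    """Sum (timestamp, value) points by hour, day, or month."""
    buckets = defaultdict(float)
    for ts, value in points:
        buckets[ts.strftime(BUCKET_FORMATS[unit])] += value
    rows = sorted(buckets.items())
    if not cumulative:
        return rows
    # Replace each bucket's value with the running total up to that bucket.
    total, out = 0.0, []
    for key, value in rows:
        total += value
        out.append((key, total))
    return out


if __name__ == "__main__":
    points = [
        (datetime(2015, 6, 1, 10), 5),
        (datetime(2015, 6, 1, 11), 7),
        (datetime(2015, 6, 2, 9), 3),
    ]
    print(aggregate(points, "day", cumulative=True))
```

In the plan above these aggregations would presumably live in Postgres views rather than in application code; the Python version is only meant to make the intended behavior concrete.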
- Implement Reports API (Are these just metrics?)
  - reports are json objects
  - reports can be rendered with Jinja templates
  - reports can be rendered as pdfs
    - see: https://pypi.python.org/pypi/pdfkit or http://stackoverflow.com/questions/23359083/how-to-convert-webpage-into-pdf-by-using-python or just force users to "save as pdf"
  - reports can be saved + archived up to X days.
  - reports can o
- Implement Redis Task Queue For Recipe Running
  - Create gevent worker class to avoid reliance on os.fork
  - Figure out how to rate limit requests.
- Implement Modular SousChefs Class
- Figure out how best to use OAuth tokens in SousChefs. Ideally these should not be exposed to users.
- Implement API client
- Re-implement SousChefs
  - RSS Feeds => Thing
  - Google Analytics => Metric
  - Google Alerts => Event
  - Social Shares => Metric
  - Homepage Promotions => Metric
  - Twitter Promotions => Metric
  - Facebook Promotions => Metric
  - Twitter List => Event
  - Twitter User => Event
  - Facebook Page => Event
  - Reddit => Event
  - HackerNews => Event
- Implement New SousChefs
  - IFTTT integrations
  - Wordpress Publish => Thing
  - TK
  - Regex Thing URL => Tag
  - Search Things => Tag
  - Meltwater Emails => Event
  - Newsletter Email Promotions => Metric
  - Calculated Metric? SQL API.
  - IFTTT integrations
- Implement Recipe scheduler
- Implement Admin Panel
- Migrate Core Prototype Users.
- Automate Deployment
- App Integration
- Document, Document, Document
  - [http://stackoverflow.com/questions/346132/postgres-how-to-return-rows-with-0-count-for-missing-data]
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.