A RESTful API to control a web crawler that downloads news from Chronicle and Metro News

YaxinCheng/NewsHub-backend

#NewsHub (backend)
A RESTful API for downloading news from Chronicle and Metro News with a Python crawler.
Not finished yet; more features will be added in the future.

###Features:
Multi-threaded crawler for better crawling speed
Supports both Chronicle and Metro News; more sources will be added
MongoDB stores parsed news and user information
Python Flask and the Python crawler work together seamlessly
RESTful API for easier access
APScheduler ensures regular updates and cleanup of old news
User passwords protected with SHA256 hashing
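The multi-threaded crawling idea can be illustrated with a small sketch. This is not the project's actual crawler: `crawl_all` and `fake_fetch` are hypothetical names, and the downloader is a stand-in so the example runs without network access. It only shows how a thread pool lets several pages download at once.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def crawl_all(urls: Iterable[str],
              fetch: Callable[[str], str],
              max_workers: int = 8) -> Dict[str, str]:
    """Fetch every URL concurrently and return {url: page_text}.

    `fetch` is a hypothetical callable that downloads one page; in a real
    crawler it would wrap an HTTP client. Failed downloads are skipped.
    """
    url_list = list(urls)
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one task per URL, remembering which future belongs to which URL.
        futures = {url: pool.submit(fetch, url) for url in url_list}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception:
                # A single failing page should not abort the whole crawl.
                continue
    return results


def fake_fetch(url: str) -> str:
    """Stand-in downloader used only for this example (no network access)."""
    if "broken" in url:
        raise IOError("simulated download failure")
    return "<html>" + url + "</html>"


pages = crawl_all(["https://example.com/a", "https://example.com/broken"], fake_fetch)
```

Because downloading is I/O-bound, threads spend most of their time waiting on the network, which is why a pool of workers tends to speed up crawling even in Python.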

###API:
All parameters and responses are in JSON format

News from both sources:
Method: GET
Address: https://hubnews.herokuapp.com/api/news
Parameters:
Headers: {'page': <int: number>, 'location': <string: location>}
'page' is the page number (15 news items per page)
Response: {'headlines': [.], 'normal': [.]}
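As a rough illustration of the call described above, the following sketch builds the GET request with Python's standard library, without sending it. `build_news_request` is a hypothetical helper, and the location value `"halifax"` is only a placeholder; valid values would come from the locations endpoint.

```python
import urllib.request

BASE_URL = "https://hubnews.herokuapp.com"


def build_news_request(page: int, location: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for news from both sources."""
    request = urllib.request.Request(BASE_URL + "/api/news", method="GET")
    # 'page' and 'location' are listed under Headers, so they are sent
    # as HTTP headers here rather than as query parameters.
    request.add_header("page", str(page))
    request.add_header("location", location)
    return request


# "halifax" is a placeholder location used only for this example.
request = build_news_request(1, "halifax")
```

To actually perform the call, the request could be passed to `urllib.request.urlopen` and the body parsed with `json.load`, assuming the service is reachable.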

News from specific source:
Method: GET
Address: https://hubnews.herokuapp.com/api/news/<string: source>
source can be 'metro' or 'chronicle'
Parameters:
Headers: {'page': <int: number>, 'location': <string: location>}
'page' is the page number (15 news items per page)
Response: {'headlines': [.], 'normal': [.]}
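Since the source segment only accepts two values, a client may want to validate it before building the address. The helper below is a minimal sketch; `news_url` and `VALID_SOURCES` are hypothetical names, not part of the backend.

```python
from typing import Optional

BASE_URL = "https://hubnews.herokuapp.com"
VALID_SOURCES = ("metro", "chronicle")


def news_url(source: Optional[str] = None) -> str:
    """Return the news address for both sources, or for one specific source."""
    if source is None:
        return BASE_URL + "/api/news"
    if source not in VALID_SOURCES:
        # Fail early on the client instead of sending an unsupported source.
        raise ValueError("source must be one of " + ", ".join(VALID_SOURCES))
    return BASE_URL + "/api/news/" + source


metro_url = news_url("metro")
```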

Content for specific news:
Method: POST
Address: https://hubnews.herokuapp.com/api/details
Parameters: {'url': '', 'source': ''}
Headers:
Response: {'content': '', 'img': '', 'tag': '', 'title': '', 'source': '', '_id': ''}
This response is a News JSON, where '_id' is the URL of the news item and 'img' is a URL for the image
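A request for the details of one news item might be assembled as follows. This is a sketch under assumptions: the parameters are sent as a JSON body with a `Content-Type: application/json` header (the README only says parameters are in JSON format), and `build_details_request` and the example article URL are hypothetical.

```python
import json
import urllib.request

DETAILS_URL = "https://hubnews.herokuapp.com/api/details"


def build_details_request(news_url: str, source: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the content of one news item."""
    # Assumption: the server reads the parameters from a JSON-encoded body.
    body = json.dumps({"url": news_url, "source": source}).encode("utf-8")
    return urllib.request.Request(
        DETAILS_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The article URL below is a placeholder, not a real news link.
details_request = build_details_request("https://example.com/some-article", "metro")
```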

Thumbnail for specific news:
Method: POST
Address: https://hubnews.herokuapp.com/api/thumbnails
Parameters: {'url': ''}
The URL should be the image URL ('img') from the news item
Headers:
Response: {'_id': '', 'img': ''}
The 'img' attribute contains a base64 encoded string which is the binary data of the image
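Turning the thumbnail response back into image bytes only needs the standard `base64` module. The sketch below uses a made-up sample response instead of calling the server; `decode_thumbnail` is a hypothetical helper name.

```python
import base64


def decode_thumbnail(response: dict) -> bytes:
    """Return the raw image bytes from a thumbnail response."""
    # 'img' holds the image's binary data as a base64 string.
    return base64.b64decode(response["img"])


# Made-up sample shaped like the documented response (not real server output).
sample_response = {
    "_id": "https://example.com/picture.png",
    "img": base64.b64encode(b"\x89PNG fake bytes").decode("ascii"),
}
image_bytes = decode_thumbnail(sample_response)
```

The resulting bytes could then be written to a file opened in binary mode or handed to an image library.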

Register:
Method: POST
Address: https://hubnews.herokuapp.com/register
Parameters: {'email': '', 'password': '', 'registerTime': '', 'name': ''}
The password will be hashed (SHA256) on the server side
Headers:
Response: {'ERROR': 'INFO'} or {'SUCCESS': 'INFO'}

Login:
Method: POST
Address: https://hubnews.herokuapp.com/login
Parameters: {'email': '', 'password': ''}
Headers:
Response: {'ERROR': 'INFO'} or {'_id': '', 'name': '', 'status': BOOL, 'activated': BOOL}
The success response is a User JSON
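Because the login endpoint returns either an error object or a User JSON, a client needs to tell the two apart. One possible way to do this is sketched below; `LoginError` and `parse_login_response` are hypothetical names, and the sample payloads are invented for illustration.

```python
class LoginError(Exception):
    """Raised when the login response carries an 'ERROR' key."""


def parse_login_response(payload: dict) -> dict:
    """Return the User JSON fields, or raise LoginError on an error response."""
    if "ERROR" in payload:
        raise LoginError(payload["ERROR"])
    # Keep only the fields documented for the User JSON.
    return {key: payload[key] for key in ("_id", "name", "status", "activated")}


# Invented sample payload shaped like the documented success response.
user = parse_login_response(
    {"_id": "user@example.com", "name": "Example", "status": True, "activated": False}
)
```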

Change password:
Method: POST (login required)
Address: https://hubnews.herokuapp.com/uManage/password
Parameters: {'email': '', 'password': ''}
Headers:
Response: {'ERROR': 'INFO'} or {'SUCCESS': 'INFO'}
Note: The user will be logged out once the password is changed

Log out:
Method: GET (login required)
Address: https://hubnews.herokuapp.com/logout
Parameters:
Headers:
Response: {'SUCCESS': 'INFO'}

Get available locations:
Method: GET
Address: https://hubnews.herokuapp.com/api/locations
Parameters:
Headers:
Response: {'location': []}

Like/Unlike a news item:
Method: GET, PUT, POST
Address: https://hubnews.herokuapp.com/api/likes
Parameters: {'url': ''}
Headers:
Response:
  GET:
    {'SUCCESS': [{'_id': '', 'img': '', 'title': ''}]}
    {'ERROR': 'INFO'}
  PUT:
    {'SUCCESS': 'INFO'} or {'ERROR': 'INFO'}
  POST: (checks whether the news item is liked)
    {'SUCCESS': 'INFO'} or {'ERROR': 'INFO'}
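Reading the GET response for liked news could look like the sketch below. It assumes that the 'SUCCESS' value is a list of objects, each with '_id', 'img' and 'title' keys; the README does not state the exact shape, so treat this as a guess. `liked_titles` is a hypothetical helper.

```python
def liked_titles(payload: dict) -> list:
    """Return the titles of liked news, or an empty list on an error response."""
    if "ERROR" in payload:
        return []
    # Assumption: 'SUCCESS' maps to a list of {'_id', 'img', 'title'} objects.
    return [item["title"] for item in payload.get("SUCCESS", [])]


# Invented sample payload used only for this example.
titles = liked_titles(
    {"SUCCESS": [{"_id": "https://example.com/a", "img": "", "title": "First story"}]}
)
```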

###Package dependencies:
Listed in the file requirements.txt

More features are coming
-- Yaxin Cheng @July 19, 2016
