Skip to content

Sadhanandh/Chat-thumbnailer

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 

Repository files navigation

There are two different implementation->
requests-enabled -> which uses requests (urllib3) to fetch the webpage
urlib2-enabled -> which uses the same old urllib2 (shouldnt give you firewall troubles though.)

The Problem statement was to scan through the readable content of the page and according to relevance of location (The readability plugin which removes unnecessary data like comments,frames,header,footer etc) and fetch the image url if exists.

This is a simple quickfix implementation in python using flask.

Used libraries->

* Readability python lxml 
https://github.com/jcharum/lxml-readability

* lxml (I could have used regex to search img tag but anyway as readability uses lxml i did it too and moreover i always use the fast lxml over BSoup)

Run->
git clone https://github.com/Sadhanandh/Chat-thumbnailer.git
cd Chat-thumbnailer/urllib2-enabled
./flask_app.py

(If you dont have flask then install it in a virtualenv and get going.)

About

A simple Readability based web app on Flask -POC

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages