Viddle

Web video syndicator.

Changelog

v0.6

minor change of scoring algorithm
Slovak docs

v0.5

extended list of supported sites and video players
multiple videos per page functionality

v0.4

results pagination
video templates

v0.3

embedded video output
tags

v0.2

Slovak text search support
CherryPy web framework integration

v0.1

web crawler
data parser
basic Whoosh indexing and search engine

Docs

Further documentation is available in Slovak:
http://vi.ikt.ui.sav.sk/User:marek.hlavac?view=home

Dependencies

To run Viddle in your local environment you will need:
- Python 3.x
- Whoosh module
- CherryPy module
- BeautifulSoup module
- PyMongo module
- MongoDB database

Usage

/conf/db.conf

Should contain a single line with the MongoDB connection string in the format: mongodb://USER:PASS@SITE:PORT/DB_NAME
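
As an illustration of the expected format, the following is a minimal sketch that validates such a line using only the Python standard library. The function name read_db_uri and the sample credentials are hypothetical and not part of Viddle; how Viddle itself reads this file is not shown here.

```python
from urllib.parse import urlsplit


def read_db_uri(text):
    """Validate db.conf content and split the URI into its parts.

    Hypothetical helper for illustration; not taken from the Viddle sources.
    """
    # Ignore blank lines, then require exactly one remaining line.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) != 1:
        raise ValueError("db.conf should contain exactly one line")
    uri = lines[0]
    parts = urlsplit(uri)
    if parts.scheme != "mongodb" or not parts.hostname:
        raise ValueError("expected mongodb://USER:PASS@SITE:PORT/DB_NAME")
    return {
        "uri": uri,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "db_name": parts.path.lstrip("/"),
    }


# Sample values are made up for the example.
info = read_db_uri("mongodb://viddle:secret@localhost:27017/viddle\n")
print(info["host"], info["port"], info["db_name"])
```

A URI in this form can typically be passed directly to pymongo.MongoClient, which accepts MongoDB connection strings.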

/conf/sites.conf

List of sites whose inner links will be crawled, together with additional site information. Each line contains a triplet:
[URL] [INNER_LINKS_FILTER] [NAME]
where:
- URL is the site's URL
- INNER_LINKS_FILTER is used for filtering out cross-domain or other irrelevant inner links
- NAME is used for identifying the site
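
The triplet format can be sketched as follows. This is an illustration under stated assumptions, not Viddle's own parser: the names Site, parse_sites_conf and keep_link are hypothetical, fields are assumed to be separated by whitespace, and INNER_LINKS_FILTER is treated as a plain substring (the actual crawler may interpret it differently, for example as a regular expression).

```python
from collections import namedtuple

# Hypothetical record type for one line of sites.conf.
Site = namedtuple("Site", ["url", "inner_links_filter", "name"])


def parse_sites_conf(text):
    """Parse whitespace-separated [URL] [INNER_LINKS_FILTER] [NAME] lines."""
    sites = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at most twice so that NAME may contain spaces (an assumption).
        fields = line.split(None, 2)
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields, got: {line!r}")
        sites.append(Site(*fields))
    return sites


def keep_link(site, link):
    """Assumed filter semantics: keep links that contain the filter string."""
    return site.inner_links_filter in link


# The sample line is made up for the example.
sites = parse_sites_conf("http://www.example.com/ example.com/videos Example Site\n")
print(sites[0].name)
print(keep_link(sites[0], "http://www.example.com/videos/42"))
print(keep_link(sites[0], "http://other.org/page"))
```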

/conf/regex.conf

List of regular expressions used to find video data. Each line contains a triplet:
[TAG] [URL_REGEX] [PLAYER]
where:
- TAG specifies the tags from which video data is extracted
- URL_REGEX is a regular expression that matches the video identifier
- PLAYER specifies the type of video player

Example line: input http://embed.ted.com/talks/.*\.html ted.com
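
To show how such a line could be interpreted, here is a small sketch that parses the example entry and tests its regular expression against a candidate URL. The names Rule and parse_rule and the candidate URL are hypothetical; the sketch assumes the three fields are separated by whitespace and that the regular expression itself contains no spaces.

```python
import re
from collections import namedtuple

# Hypothetical record type for one line of regex.conf.
Rule = namedtuple("Rule", ["tag", "url_regex", "player"])


def parse_rule(line):
    """Parse a whitespace-separated [TAG] [URL_REGEX] [PLAYER] line."""
    tag, pattern, player = line.split()
    return Rule(tag, re.compile(pattern), player)


# The example line from the README.
rule = parse_rule(r"input http://embed.ted.com/talks/.*\.html ted.com")

# Made-up URL used only to exercise the pattern.
candidate = "http://embed.ted.com/talks/some_talk.html"
match = rule.url_regex.search(candidate)
print(rule.tag, rule.player, bool(match))
```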

Web crawling can be started with the miner.py script:

python crawler/miner.py

Search can be executed through the web GUI or with the query class from the search module.
