trustyou/meetups

meetups

Code samples and materials for tech talks and meetups hosted by TrustYou

Example combining spaCy and Keras for a simple machine learning exercise. See README.md.

In this meetup I showed how a pipeline similar to TrustYou's (crawl, analyze, serve) can be built out of popular Python libraries, and then how it can be scaled up.

$ cd pydata
$ pip install -r requirements.txt # e.g. in a virtualenv
$ ./run_example.sh

The above example crawls meetup.com to discover all meetups (this takes a few hours!), and then builds a Word2Vec model from their descriptions.

In this meetup we shared some insights into the TrustYou big data tech stack, and gave introductions to two tools we've found useful: Apache Pig and Luigi. The examples from the slides are contained in this repo.

Apache Pig

Install Apache Pig, e.g. from its website. No Hadoop necessary! Alternatively, give the Hortonworks sandbox a try if you're planning to try out other Hadoop-related technologies as well. Then, run this:

$ cd big-data/pig
$ ./run_examples.sh

Look in the *.tsv subfolders for the output. When run locally, Apache Pig mimics the folder structure of job output in HDFS, so the data will be in part files.
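
Since each output folder holds several part files rather than one TSV file, a small helper can stitch them back together. The following is a minimal sketch, not part of this repo: the function names, the folder name `example.tsv`, and the sample rows are made up for illustration, and it assumes the part files are plain tab-separated text.

```python
import glob
import os
import tempfile


def read_pig_output(output_dir):
    """Return all rows from the part files of one Pig output folder."""
    rows = []
    # Pig writes one file per task, named e.g. part-m-00000 or part-r-00000.
    # Marker files such as _SUCCESS do not match the pattern and are skipped.
    for path in sorted(glob.glob(os.path.join(output_dir, "part-*"))):
        with open(path) as f:
            for line in f:
                rows.append(line.rstrip("\n").split("\t"))
    return rows


def demo():
    """Build a fake output folder and read it back (all data is made up)."""
    with tempfile.TemporaryDirectory() as tmp:
        out = os.path.join(tmp, "example.tsv")  # hypothetical folder name
        os.mkdir(out)
        with open(os.path.join(out, "part-r-00000"), "w") as f:
            f.write("berlin\t3\n")
        with open(os.path.join(out, "part-r-00001"), "w") as f:
            f.write("munich\t5\n")
        open(os.path.join(out, "_SUCCESS"), "w").close()
        return read_pig_output(out)


if __name__ == "__main__":
    print(demo())
```

To use it on real output, call `read_pig_output` with the path of one of the *.tsv folders.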

Luigi

Install dependencies by running pip install -r requirements.txt from the luigi folder. Then, run:

$ cd big-data/luigi
$ ./run_example.sh

We had a look behind the scenes of CPython, the reference implementation of Python, and its C API, which allows you to extend the Python language in C. Finally, we checked out Cython, which seems to be the sanest way of writing C extensions for Python.

In the examples we focused on benchmarking different implementations of QuickSort in Python, C and Cython. Before trying them, run pip install -r requirements.txt from the python-c folder. I suggest running the following inside a virtualenv:

$ virtualenv venv
$ . venv/bin/activate 
(venv) $ cd python-c
(venv) $ ./run_examples.sh
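
To give a feel for the kind of benchmark involved, here is a minimal pure-Python sketch. It is not the implementation from the python-c folder: the function names, input size, and seed are assumptions chosen for illustration. It times a simple QuickSort against the built-in `sorted()`, which CPython implements in C.

```python
import random
import timeit


def quicksort(items):
    """Return a new sorted list using a simple, non-in-place QuickSort."""
    if len(items) <= 1:
        return list(items)
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)


def benchmark(n=10_000, repeats=5):
    """Time quicksort and sorted() on the same random data (best of repeats)."""
    rng = random.Random(42)  # fixed seed so both runs see identical input
    data = [rng.randint(0, 1_000_000) for _ in range(n)]
    assert quicksort(data) == sorted(data)  # sanity check before timing
    return {
        "quicksort": min(timeit.repeat(lambda: quicksort(data), number=1, repeat=repeats)),
        "sorted": min(timeit.repeat(lambda: sorted(data), number=1, repeat=repeats)),
    }


if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name}: {seconds:.4f} s")
```

Taking the minimum over several repeats reduces noise from other processes. The absolute numbers depend on the machine, so only the ratio between the two timings is meaningful.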
