Kumamon

Project repo for Silicon Valley Team 7

1st Project: Web Crawler via Scrapy

Pacing

[2016/02/08 - 2016/02/14]
First Stage: Create a Scrapy project that crawls the content of the Xiaomi Appstore homepage, or of any other app store homepage.
[2016/02/15 - 2016/02/21]
Second Stage: Save the crawled content in MongoDB[2]. Install the Python MongoDB driver and modify pipelines.py to insert the crawled data into MongoDB.
[2016/02/22 - 2016/02/29]
Third Stage: Crawl more content by following next-page links. Up to this point you have likely crawled only the home page. If the next-page link is generated by JavaScript, use Splash[3] and ScrapyJS[4] to render the page so that the dynamic part becomes static content that Scrapy can parse.
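
As a rough illustration of the second stage, a pipelines.py along the following lines could write each crawled item to MongoDB. This is a minimal sketch, not the project's actual code: the class name `MongoPipeline`, the setting names `MONGO_URI` and `MONGO_DATABASE`, and the database and collection names `appstore` and `apps` are all assumptions, and it presumes the `pymongo` driver is installed.

```python
class MongoPipeline:
    """Scrapy item pipeline that stores each crawled item in MongoDB."""

    def __init__(self, mongo_uri="mongodb://localhost:27017",
                 db_name="appstore", collection_name="apps"):
        # The defaults are assumptions for a local development setup.
        self.mongo_uri = mongo_uri
        self.db_name = db_name
        self.collection_name = collection_name
        self.client = None
        self.collection = None

    @classmethod
    def from_crawler(cls, crawler):
        # MONGO_URI and MONGO_DATABASE are hypothetical names that
        # would have to be defined in settings.py.
        return cls(
            mongo_uri=crawler.settings.get("MONGO_URI", "mongodb://localhost:27017"),
            db_name=crawler.settings.get("MONGO_DATABASE", "appstore"),
        )

    def open_spider(self, spider):
        # Imported here so this module still loads without the driver.
        import pymongo

        self.client = pymongo.MongoClient(self.mongo_uri)
        self.collection = self.client[self.db_name][self.collection_name]

    def close_spider(self, spider):
        if self.client is not None:
            self.client.close()

    def process_item(self, item, spider):
        # Insert a plain-dict copy so the original item is not modified.
        self.collection.insert_one(dict(item))
        return item
```

To enable a pipeline like this, it would be listed under `ITEM_PIPELINES` in settings.py, for example as `'appstore.pipelines.MongoPipeline': 300` if the Scrapy project module is named `appstore`.
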
Bonus Round

Pull results from MongoDB and show them in the browser via Flask
Multiprocessing (TBD)
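
The Flask item above could be approached roughly as follows. This is only a sketch under stated assumptions: the helper `fetch_apps`, the factory `create_app`, and the `/apps` route are hypothetical names, and the collection passed in is assumed to be a `pymongo` collection filled by the crawler.

```python
def fetch_apps(collection, limit=50):
    """Return up to `limit` stored documents as plain dicts."""
    # The projection {"_id": 0} drops MongoDB's ObjectId, which the
    # JSON encoder cannot serialize by default.
    return list(collection.find({}, {"_id": 0}).limit(limit))


def create_app(collection):
    """Build a small Flask app that serves the crawled data as JSON."""
    # Imported here so fetch_apps can be reused without Flask installed.
    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/apps")
    def list_apps():
        # Hypothetical route: wraps the documents in a JSON object.
        return jsonify(apps=fetch_apps(collection))

    return app
```

To try it, pass a collection such as `pymongo.MongoClient()["appstore"]["apps"]` to `create_app`, call `.run()` on the result, and open `/apps` in the browser; the database and collection names are again assumptions.
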

What is next?

1st project - Crawler (Python)
2nd project - Recommender (Python / Spark)
3rd project - Website (Meteor/React)

Learn programming via projects

Nowadays we spend a lot of time getting a good grasp of data structures and algorithms by solving problems on CC, LC and GFG. Even so, we may still not get good results in our job search, because the CS job market is so hot that there are many competitors.

Quality beats quantity. Instead of working through a large number of questions, make the best use of your knowledge to build a product; with some practice you can then extend what you learned to similar problems. Problem-solving ability is what employers look for.

stevenmyan/Kumamon
