Rotaro/JobSearch

#JobAdCollector

A simple program to search for job ads from several sites at once.

Currently JobAdCollector has parsers for indeed.fi, monster.fi, duunitori.fi and oikotie.fi. It gathers job advertisements by running keyword searches on these sites and stores the results in a local SQLite database, which can be viewed as an HTML or CSV table. Users can also classify job ads as relevant or not, which lets JobAdCollector recommend new job ads using a decision tree model.
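
To illustrate the store-then-export flow described above, here is a minimal sketch using only the Python standard library. The table name `ads`, its columns, and the functions `store_ads` and `export_csv` are assumptions made for this example; they are not taken from JobAdCollector's actual schema or code.

```python
import csv
import sqlite3


def store_ads(conn, ads):
    """Insert (site, title, url, date) rows, skipping URLs already stored.

    The schema is hypothetical; dates are kept as ISO strings so they sort.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ads ("
        "site TEXT, title TEXT, url TEXT PRIMARY KEY, date TEXT)"
    )
    conn.executemany("INSERT OR IGNORE INTO ads VALUES (?, ?, ?, ?)", ads)
    conn.commit()


def export_csv(conn, out):
    """Write every stored ad to the file-like object `out` as CSV."""
    writer = csv.writer(out)
    writer.writerow(["site", "title", "url", "date"])
    rows = conn.execute(
        "SELECT site, title, url, date FROM ads ORDER BY date, url"
    )
    writer.writerows(rows)
```

Using the URL as the primary key is one simple way to avoid storing the same ad twice when it is found by several search terms; the real program may deduplicate differently.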

##Dependencies

  aiohttp
  rpy2 (for classification of job ads only)

##Usage

JobAdCollector is operated through a command line interface. To make batch searches easier, it reads search terms from a UTF-8 encoded text file with one search term per line.

Example of search term file:

  dream job
  great job
  ....
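
For reference, a file in this format can be read with a few lines of Python. This is a sketch of the format only, not JobAdCollector's own parser; the function name `read_search_terms` is hypothetical, and skipping blank lines is an assumption.

```python
from pathlib import Path


def read_search_terms(path):
    """Return the non-empty lines of a UTF-8 search term file, stripped."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]
```

Each returned string is one search term, so a multi-word line such as `dream job` stays a single term.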

To access the interface, run JobAdCollector as a script:

  python -m jobadcollector 

The -h flag provides help for each command. All possible commands are listed below.

###Command Line Options

  • search

    Searches sites for job advertisements using keywords in the file <my_search_terms> and saves them in the database <db_name>.

    python -m jobadcollector <db_name> search <my_search_terms>

  • view

    Displays ads dated between <start_date> and <end_date> (format %d-%m-%Y) in the database <db_name> as a table. The table is saved in the file <output_name> as <output_type> (html or csv).

    python -m jobadcollector <db_name> view <start_date> [-end_date] <output_name> [-output_type]

  • classify

    Starts a GUI for classifying job ads in the database <db_name> dated between <start_date> and <end_date> (format %d-%m-%Y).

    python -m jobadcollector <db_name> classify <start_date> [-end_date]

  • Rfunc

    Groups the commands that require R and rpy2. Each of them requires the search terms to be provided in the file <my_search_terms>.

    • detlang

      Attempts to determine the language of job ads in the database <db_name> dated between <start_date> and <end_date> (format %d-%m-%Y).

      python -m jobadcollector <db_name> Rfunc <my_search_terms> detlang <start_date> <end_date>

    • train

      Trains a model on classified ads in the database <db_name> dated between <start_date> and <end_date> (format %d-%m-%Y). Only ads of language <language> are used. The model is saved in the file <output_name>.

      python -m jobadcollector <db_name> Rfunc <my_search_terms> train <start_date> <end_date> <language> <output_name>

    • recomm

      Provides recommendations for job ads in the database <db_name> dated between <start_date> and <end_date> (format %d-%m-%Y) using the model <input_name> of language <language>.

      python -m jobadcollector <db_name> Rfunc <my_search_terms> recomm <language> <input_name> <start_date> <end_date>

    • Rfuncsearch

      Searches sites for job advertisements using keywords in the file <my_search_terms> and saves them in the database <db_name>. Also automatically determines the languages of new job ads and provides recommendations using the designated model <input_name> of language <language>.

      python -m jobadcollector <db_name> Rfunc <my_search_terms> Rfuncsearch <language> <input_name>
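
Several commands above take dates in the format %d-%m-%Y, for example `31-01-2016`. The sketch below shows how such arguments can be validated with Python's `datetime.strptime`. The helper names `parse_date` and `parse_range` are hypothetical, and defaulting a missing end date to today is an assumption, not documented JobAdCollector behavior.

```python
from datetime import date, datetime

DATE_FORMAT = "%d-%m-%Y"  # day-month-year, as used by the commands above


def parse_date(text):
    """Parse a string such as '31-01-2016'; raises ValueError if malformed."""
    return datetime.strptime(text, DATE_FORMAT).date()


def parse_range(start_text, end_text=None):
    """Return (start, end) as dates.

    Assumption: a missing end date means "up to today".
    """
    start = parse_date(start_text)
    end = parse_date(end_text) if end_text else date.today()
    if end < start:
        raise ValueError("end date is before start date")
    return start, end
```

A date written in another order, such as `2016-01-31`, raises `ValueError` instead of being silently misread.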
