eikon_challenge

This is part of the code for the Eikon challenge. Probably only the features.py file and the design diagram are of interest here.

In this challenge, Thomson Reuters was searching for an algorithm to accurately tag incoming news items by relevance to the companies or organizations mentioned within each item. I built a system capable of recognizing alternative company names (using DBpedia data), stock-ticker-based identification (using Bloomberg Symbology data), and country-based discrimination in the text of the news. The system has the following structure:

Lookup tagger: Performs authority-driven mention detection, i.e. extracts possible mentions of company names with high recall.
Candidate generation: For each possible company mention, suggests several candidate companies.
Feature generation: Generates features for each mention-candidate pair.
Classifier: Finds the correct candidates using the features.

One of the greatest challenges was finding data sources, available under the accepted licenses, to augment the information about the list of companies.
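
The four-stage structure described above can be illustrated with a minimal sketch. This is not the repository's code: the toy knowledge base, the function names, and the feature names are all hypothetical, and a hand-written scoring rule stands in for the trained classifier.

```python
import re
from dataclasses import dataclass

# Hypothetical toy knowledge base: company id -> aliases, ticker, country.
KNOWLEDGE_BASE = {
    "apple_inc": {"aliases": {"apple", "apple inc"},
                  "ticker": "AAPL", "country": "united states"},
    "apple_corps": {"aliases": {"apple", "apple corps"},
                    "ticker": None, "country": "united kingdom"},
}


@dataclass
class Mention:
    surface: str  # lowercased alias that matched
    start: int    # character offset in the text


def lookup_tagger(text, kb):
    """High-recall mention detection: every whole-word alias match is a mention."""
    lowered = text.lower()
    aliases = {alias for entry in kb.values() for alias in entry["aliases"]}
    mentions = [
        Mention(alias, match.start())
        for alias in aliases
        for match in re.finditer(r"\b" + re.escape(alias) + r"\b", lowered)
    ]
    return sorted(mentions, key=lambda m: (m.start, m.surface))


def generate_candidates(mention, kb):
    """All companies that list the mention's surface form as an alias."""
    return [cid for cid, entry in kb.items() if mention.surface in entry["aliases"]]


def generate_features(text, candidate_id, kb):
    """Features for one mention-candidate pair (ticker and country evidence)."""
    entry = kb[candidate_id]
    ticker = entry["ticker"]
    ticker_found = bool(ticker) and re.search(
        r"\b" + re.escape(ticker) + r"\b", text) is not None
    return {
        "ticker_in_text": ticker_found,
        "country_in_text": entry["country"] in text.lower(),
    }


def score(features):
    """Stand-in for a trained classifier: a fixed, hand-picked weighting."""
    return 2 * features["ticker_in_text"] + features["country_in_text"]


def tag(text, kb):
    """Run the pipeline and return (mention, company id) pairs with some evidence."""
    results = []
    for mention in lookup_tagger(text, kb):
        scored = [(score(generate_features(text, cid, kb)), cid)
                  for cid in generate_candidates(mention, kb)]
        if not scored:
            continue
        best_score, best_id = max(scored)
        if best_score > 0:
            results.append((mention.surface, best_id))
    return results


print(tag("Apple (AAPL) shares rose in the United States.", KNOWLEDGE_BASE))
```

In this sketch the ambiguous mention "apple" resolves to the candidate whose ticker and country both appear in the text. A real system would replace the fixed weights in `score` with a model trained on labeled mention-candidate pairs.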

