PawelPamula/who-are-you

"Who are you" CERN Webfest 2015 project

We are creating a tool to help you learn more about the people around you and connect with those who share similar interests or backgrounds.

Deliverables

We are developing:

  • a back-end tool to analyse a person's public profile
    • input:
      • person's Twitter and LinkedIn ID
      • (optionally: also Google Scholar, FB, or even the person's name for a web search?)
      • (optionally: the category - professional, hobby, all?)
    • output: tags for that person (+ weights? likelihoods? categories?)
  • a front-end (web) interface to the back-end tool
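
The input and output listed above could be modelled roughly as follows. This is only a sketch of one possible shape, not the project's actual interface: the names `ProfileRequest`, `Tag`, `CATEGORIES` and `tags_to_json` are hypothetical, and whether tags carry weights or categories is still an open question in the list above.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional

# Assumed category values, taken from the "professional, hobby, all?" idea above.
CATEGORIES = ("professional", "hobby", "all")


@dataclass
class ProfileRequest:
    """Hypothetical input to the back-end tool."""
    twitter_id: str
    linkedin_id: str
    category: str = "all"

    def __post_init__(self):
        # Reject categories that are not in the assumed list.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")


@dataclass
class Tag:
    """Hypothetical output item: a tag with an optional weight and category."""
    label: str
    weight: float = 1.0
    category: Optional[str] = None


def tags_to_json(tags: List[Tag]) -> str:
    """Serialise tags for the front-end, heaviest tag first."""
    ordered = sorted(tags, key=lambda t: t.weight, reverse=True)
    return json.dumps([asdict(t) for t in ordered])
```

A JSON list sorted by weight is one convenient format for a web front-end to consume, but any other serialisation would work equally well.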

Tasks/components

  • Data collection – sources -> text

    • Azqa, Maria, Marija, Paweł, Mufutau
    • API (limitations) and library investigation
    • What data and metadata is available?
  • Text processing – text (and metadata) -> tags

    • Sabrina, Konst., Marija, Azqa, Maria
    • How? Which library to use (NLTK, GATE)?
    • dealing with synonyms, categories
  • Information analysis – tags (words, categories, weights, time?) -> tag cloud

    • Sabrina, Sebastian, Paweł, Mufutau, Konst.
    • Testing different thresholds, weights, timing, person vs. the contacts
    • Building tag cloud (profile) of a person
  • Front-end (web) interface

    • Harris
    • taking user id (Twitter handle, LinkedIn profile) as input
    • visualisation of results
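
As a rough illustration of the "text -> tags" and "tags -> tag cloud" steps above, the sketch below counts word frequencies with the Python standard library only. It is not the project's chosen method: a library such as NLTK would likely replace the regex tokenisation and the tiny stop-word list, and the names `extract_tags` and `STOP_WORDS` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be much larger.
STOP_WORDS = {"a", "an", "and", "at", "for", "in", "is", "of", "on", "the", "to", "with"}


def extract_tags(texts, top_n=5, min_count=1):
    """Return up to top_n (word, weight) pairs, where weight is the word's
    share of all counted words across the given texts."""
    counts = Counter()
    for text in texts:
        # Plain regex tokenisation: lower-case words, allowing digits, '+', '#', '-'.
        words = re.findall(r"[a-z][a-z0-9+#-]*", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS and len(w) > 2)
    total = sum(counts.values())
    if total == 0:
        return []
    return [
        (word, count / total)
        for word, count in counts.most_common(top_n)
        if count >= min_count
    ]


if __name__ == "__main__":
    sample = ["Physics at CERN", "Python for physics", "physics and python"]
    for word, weight in extract_tags(sample, top_n=3):
        print(f"{word}: {weight:.2f}")
```

The `top_n` and `min_count` parameters stand in for the thresholds mentioned above; synonyms and categories are not handled here.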

Coding and testing environment

  • Linux virtual machine, log in via SSH (ssh [your CERN login]@tedxapp on Linux, or PuTTY on Windows)
  • Firewall open for a specific port:
      sudo firewall-cmd --zone=public --permanent --add-port=8443/tcp
      sudo systemctl restart firewalld
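
Once the port is open and a service is listening on it, a quick reachability check from another machine might look like the sketch below. The helper name `port_is_open` is hypothetical; the host `tedxapp` and port 8443 are taken from the notes above, and the check only reports success if something is actually accepting connections on that port.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    print(port_is_open("tedxapp", 8443))
```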

NLTK

Some useful links