jaredhirsch/universal-search-recommendation

 
 

universal-search-recommendation

Universal Search recommendation server.

Documentation: Local Development | Methodology | Deployment

Methodology

This recommendation service returns a suggested search result, along with additional metadata for it, for an incomplete search string. What follows is a walkthrough of the methodology using the search term "the mar", an in-progress search by a user looking for information about the film The Martian.

First, we determine the user's intended search from the partial search string by passing it to a search suggestion engine. That engine might respond with the following suggestions:

[
  "the martian",
  "the martian trailer",
  "the martian movie",
  "the martian book",
  "the marion star",
  "the mary sue"
]

We'll take the top suggestion, the martian, and search for it using a full search engine API. The first result might look like this:

{
  "abstract": "Directed by Ridley Scott. With Matt Damon, Jessica Chastain, Kristen Wiig, Kate Mara. During a manned mission to Mars, Astronaut Mark Watney is presumed dead after a ...",
  "clickurl": "http://www.imdb.com/title/tt3659388/",
  "date": "",
  "dispurl": "www.imdb.com/title/tt3659388",
  "title": "<b>The Martian</b> (2015) - IMDb",
  "url": "http://www.imdb.com/title/tt3659388/"
}
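The response shown at the end of this walkthrough carries only the abstract, title, and url of this result, and its title no longer contains the <b> markup. A minimal sketch of that cleanup step follows; the function name clean_result and the regular expression are assumptions made for illustration, not the server's actual code.

```python
import re

# Matches any HTML-style tag, such as the <b>...</b> highlighting in the title.
TAG_PATTERN = re.compile(r"<[^>]+>")


def clean_result(raw):
    """Keep the abstract, title, and URL of a raw result, stripping markup.

    Hypothetical helper: the field names come from the example above, but
    the function itself is illustrative.
    """
    return {
        "abstract": TAG_PATTERN.sub("", raw["abstract"]),
        "title": TAG_PATTERN.sub("", raw["title"]),
        "url": raw["url"],
    }
```

Fields such as clickurl, date, and dispurl are simply dropped, which matches the shape of the final response in this example.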

We then run that search result through a series of classifiers. Each classifier attempts to match the result and, on a match, enhances it with data from an additional source.

An IMDb enhancer might recognize that the above result is a movie from IMDb and add the following data:

{
  "date": "2 October 2015",
  "genre": [
     "Adventure",
     "Drama",
     "Sci-Fi"
  ],
  "rating": {
    "actual": 8.2,
    "max": 10
  },
  "credits": {
    "acting": [
      "Matt Damon",
      "Jessica Chastain",
      "Kristen Wiig"
    ],
    "directing": [
      "Ridley Scott"
    ],
    "writing": [
      "Drew Goddard",
      "Andy Weir"
    ]
  }
}
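The match-then-enhance pattern described above can be sketched roughly as follows. Everything named here (IMDbClassifier, imdb_title_id, run_classifiers, and the injected lookup callable) is hypothetical and stands in for whatever the server actually does; in particular, lookup represents an unspecified metadata source.

```python
from urllib.parse import urlparse


def imdb_title_id(url):
    """Return the IMDb title ID in a URL such as /title/tt3659388/, else None."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host != "imdb.com" and not host.endswith(".imdb.com"):
        return None
    parts = parsed.path.strip("/").split("/")
    if len(parts) >= 2 and parts[0] == "title":
        return parts[1]
    return None


class IMDbClassifier:
    """Hypothetical classifier that recognizes IMDb movie pages."""

    name = "imdb"

    def __init__(self, lookup):
        # lookup: assumed callable mapping a title ID to a metadata dict.
        self.lookup = lookup

    def matches(self, result):
        return imdb_title_id(result["url"]) is not None

    def enhance(self, result):
        return self.lookup(imdb_title_id(result["url"]))


def run_classifiers(result, classifiers):
    """Collect enhancements, keyed by name, from every matching classifier."""
    enhancements = {}
    for classifier in classifiers:
        if classifier.matches(result):
            enhancements[classifier.name] = classifier.enhance(result)
    return enhancements
```

Keying the output by classifier name mirrors the "enhancements" object in the full response, where the IMDb data appears under "imdb".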

Putting it all together, the response from the API would look like this:

{
  "enhancements": {
    "imdb": {
      "date": "2 October 2015",
      "genre": [
         "Adventure",
         "Drama",
         "Sci-Fi"
      ],
      "rating": {
        "actual": 8.2,
        "max": 10
      },
      "credits": {
        "acting": [
          "Matt Damon",
          "Jessica Chastain",
          "Kristen Wiig"
        ],
        "directing": [
          "Ridley Scott"
        ],
        "writing": [
          "Drew Goddard",
          "Andy Weir"
        ]
      }
    }
  },
  "query": {
    "completed": "the martian",
    "original": "the mar"
  },
  "result": {
    "abstract": "Directed by Ridley Scott. With Matt Damon, Jessica Chastain, Kristen Wiig, Kate Mara. During a manned mission to Mars, Astronaut Mark Watney is presumed dead after a ...",
    "title": "The Martian (2015) - IMDb",
    "url": "http://www.imdb.com/title/tt3659388/"
  }
}
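The whole flow (suggest, search, enhance, assemble) can be summarized in a short sketch. The recommend function and its three callable parameters are assumptions made for illustration; they stand in for the suggestion engine, the search API, and the classifier pipeline described above.

```python
def recommend(original, suggest, search, enhance):
    """Build a response shaped like the example above.

    Hypothetical parameters:
      suggest(query) -> list of suggested queries, best first
      search(query)  -> dict for the top search result
      enhance(result) -> dict of enhancements keyed by classifier name
    """
    suggestions = suggest(original)
    if not suggestions:
        # Nothing to recommend for this partial query.
        return None
    completed = suggestions[0]
    result = search(completed)
    return {
        "enhancements": enhance(result),
        "query": {"completed": completed, "original": original},
        "result": result,
    }
```

Passing the three stages in as callables keeps the sketch independent of any particular provider and makes each stage easy to replace with a stub in tests.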

Local development

Prerequisites: memcached v1.4.25, libmemcached 1.0.18, python 3.5.1, virtualenv

To install:

git clone https://github.com/mozilla/universal-search-recommendation.git
cd universal-search-recommendation
mkvirtualenv universal-search-recommendation  # You may need to set --python to point to the python 3.5 executable.
pip install -r requirements.txt

To run the server:

source .env
python server.py

Configuration

Some environment variables are required. Each of them is listed in .env.dist, which should be renamed to .env and populated before developing:

  • YAHOO_OAUTH_KEY: an OAuth key with access to Yahoo's BOSS Search API.
  • YAHOO_OAUTH_SECRET: the secret for the key contained in YAHOO_OAUTH_KEY.

Additional configuration can be done by setting these optional environment variables:

  • RECOMMENDATION_ENV: the environment in which the server is being run. Options: development, staging, testing, production. Default: development.
  • RECOMMENDATION_HOST: the host at which you'd like the server to run. Default: 0.0.0.0
  • RECOMMENDATION_PORT: the port at which you'd like the server to listen. Default: 5000
  • MEMCACHED_HOST: the host at which memcached is listening. Default: 127.0.0.1
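
As a rough illustration of how these variables and their defaults fit together, the following sketch reads them from the environment. The load_config function and the dictionary layout are hypothetical; only the variable names and default values come from the lists above.

```python
import os

# Defaults as documented above.
OPTIONAL_DEFAULTS = {
    "RECOMMENDATION_ENV": "development",
    "RECOMMENDATION_HOST": "0.0.0.0",
    "RECOMMENDATION_PORT": "5000",
    "MEMCACHED_HOST": "127.0.0.1",
}
REQUIRED = ("YAHOO_OAUTH_KEY", "YAHOO_OAUTH_SECRET")


def load_config(environ=None):
    """Return a config dict, failing early if a required variable is unset."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    config = {name: environ[name] for name in REQUIRED}
    for name, default in OPTIONAL_DEFAULTS.items():
        config[name] = environ.get(name, default)
    # The port arrives as a string; convert it once here.
    config["RECOMMENDATION_PORT"] = int(config["RECOMMENDATION_PORT"])
    return config
```

Failing at startup when a required key is absent tends to be easier to debug than a failed request to the search API later on.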

Testing

To run the test suite:

nosetests
