Podcasts are a rapidly growing medium, but Spotify’s current search functionality is limited: it does not search within the actual contents of an episode. The implemented recommendation system enhances Spotify’s search and lets users find a jump-in point within relevant podcast episodes. A query is enriched by extracting its subject, object, and named entities, which are then expanded using knowledge graphs. Latent Dirichlet Allocation (LDA) is used to tag all transcripts and queries with a finite number of topics. The overall coherence score of the LDA model is 0.537. Next, a Vector Space Model (VSM) is implemented to rank and retrieve relevant transcripts. Normalized discounted cumulative gain (nDCG) is the metric used to evaluate the relevance of the retrieved segments. Of the 8 test queries, 5 had an nDCG above 0.73 and 3 had an nDCG below 0.73. On average, the recommendation engine takes 48.623 seconds to search 25,000 podcasts and return the top-10 relevant search results.
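
To make the evaluation metric concrete, the following is a minimal sketch of how nDCG over a top-10 result list can be computed. It is not taken from this repository: the function names `dcg` and `ndcg`, the linear gain (using the relevance grade directly rather than `2**rel - 1`), and the example relevance grades are all assumptions for illustration. The ideal ranking here is built only from the grades of the returned results, which is a simplification when other relevant segments were not retrieved.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k results.

    Assumes linear gain: each graded relevance is divided by
    log2(rank + 1), with ranks starting at 1.
    """
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg(relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the ranking as returned, divided by the DCG of the ideal ordering.

    The ideal ordering is the same grades sorted from most to least
    relevant (an assumption: only returned results are considered).
    Returns 0.0 when no result has any relevance.
    """
    ideal = dcg(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0
    return dcg(relevances, k) / ideal


if __name__ == "__main__":
    # Hypothetical graded judgments (0 = not relevant, 3 = highly relevant)
    # for the top-10 segments returned for one query.
    judged = [3, 2, 0, 1, 0, 0, 2, 0, 0, 0]
    print(f"nDCG@10 = {ndcg(judged, k=10):.3f}")
```

A score of 1.0 means the returned segments are already in the best possible order; lower scores indicate that relevant segments were ranked below less relevant ones.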
Vivian-Ellis/Podcasts-Ad-Hoc-Retrieval
About
Ad-Hoc Retrieval for Podcasts Segments