
scxq/You_Tube_Comments_Analysis_NLP


You_Tube_Comments_Analysis_NLP

This project parses and analyzes YouTube comments: it tokenizes the text, applies stemming, removes stop words, and extracts features with the term frequency-inverse document frequency (TF-IDF) approach.
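The preprocessing and TF-IDF steps can be sketched in plain Python as follows. This is only a minimal illustration, not the project's actual code: the stop-word list, the `naive_stem` suffix stripper, the sample comments, and the unsmoothed weighting (tf = count / document length, idf = log(N / df)) are all assumptions made for the example. A real pipeline would typically use a library tokenizer, stemmer, and TF-IDF implementation.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "my", "and", "so", "to", "of"}


def naive_stem(token):
    """Strip a few common suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(comment):
    """Lowercase, tokenize on letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", comment.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample comments, not taken from the dataset.
comments = ["My dog is barking so loudly", "The cats and the dog", "Cats sleeping"]
docs = [preprocess(c) for c in comments]
weights = tf_idf(docs)
```

In this sketch a term that appears in fewer comments (such as `bark`) receives a higher weight than one that appears in most of them (such as `dog`).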

It builds an ML pipeline that converts categorical features to numeric features and vectorizes the feature columns using Word2Vec.
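As a rough illustration of these two steps, the sketch below indexes a categorical column and turns a tokenized comment into a fixed-length vector by averaging word vectors, which is one common way Word2Vec output is used for whole documents. The `fit_index` and `mean_vector` helpers, the category values, and the hand-written two-dimensional `embeddings` table are hypothetical; in the real pipeline the vectors would come from a trained Word2Vec model.

```python
from collections import Counter


def fit_index(values):
    """Map each distinct category to an integer, most frequent first
    (ties broken alphabetically)."""
    counts = Counter(values)
    ordered = sorted(counts, key=lambda v: (-counts[v], v))
    return {value: i for i, value in enumerate(ordered)}


def mean_vector(tokens, embeddings, dim):
    """Average the vectors of all known tokens; zeros if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Hypothetical categorical column.
categories = ["cat", "dog", "dog", "none"]
index = fit_index(categories)
encoded = [index[c] for c in categories]

# Toy embedding table standing in for a trained Word2Vec model (assumption).
embeddings = {"dog": [1.0, 0.0], "bark": [0.0, 1.0]}
doc_vector = mean_vector(["dog", "bark", "loudly"], embeddings, dim=2)
```

Tokens missing from the embedding table (here `loudly`) are simply skipped, so the document vector is the mean of the remaining word vectors.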

ML models, including Logistic Regression and Random Forest, are trained to classify the video creators.
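To make the classification step concrete, here is a minimal logistic regression trained with batch gradient descent in plain Python. It is a sketch under stated assumptions rather than the project's implementation: the feature vectors, the binary labels, the learning rate, and the epoch count are invented for the example, and Random Forest is not shown. A real project would normally rely on a library implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one sample.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(w, b, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Hypothetical two-dimensional feature vectors and binary labels.
X = [[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]]
y = [0, 0, 1, 1]
w, b = train_logistic_regression(X, y)
predictions = [predict(w, b, x) for x in X]
```

The toy data is linearly separable, so the fitted model separates the two classes; real comment features are far noisier and need held-out evaluation.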

Model performance is evaluated with k-fold cross-validation. NLP techniques are applied to identify top video creators and to provide topic recommendations for cat/dog owners.
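The k-fold strategy mentioned above can be sketched as follows: the samples are split into k folds, each fold serves once as the test set, and the scores are averaged. The helper names `k_fold_indices` and `cross_val_mean`, and the `fit_and_score` callback, are hypothetical names introduced for this example. The split here is not shuffled, whereas in practice the data is usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def cross_val_mean(n_samples, k, fit_and_score):
    """Average the score returned by fit_and_score(train_idx, test_idx)."""
    scores = [fit_and_score(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


folds = list(k_fold_indices(5, 2))
# folds == [([3, 4], [0, 1, 2]), ([0, 1, 2], [3, 4])]

# Dummy scorer that just reports the test-fold fraction, to show the call shape.
mean_score = cross_val_mean(5, 2, lambda train, test: len(test) / 5)
```

Each sample appears in exactly one test fold, so every comment contributes to the evaluation once.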

The Data

The dataset provided for this coding test consists of comments on videos related to animals and/or pets. The file is available at this Google Drive link: https://drive.google.com/file/d/1o3DsS3jN_t2Mw3TsV0i7ySRmh9kyYi1a/view?usp=sharing
