souviksaha97/Data-Science-Lab

 
 

Data Science Laboratory

This repository collects Machine Learning and Deep Learning algorithms. Some come from various MOOCs and some are my own work.

Aim

The goal of this repository is to help aspiring data scientists understand what happens behind the code rather than blindly copy-pasting it. To that end, the code files are extensively commented. I tried to be as detailed as possible, but some googling will surely help too.

How to use

Clone the repository, then set the working directory to the relevant folder in Spyder or your favorite IDE. Each folder contains its datasets, except for Convolutional Neural Networks, where you can download the data from the link provided in the code file. I left the data folder with a few pictures in it as a guide to how the data directory for the CNN should be laid out. These are toy datasets, meant for learning purposes only. I do hope it will make a difference.
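If you prefer to set the working directory from code instead of through the IDE, a minimal sketch could look like the one below. The helper name `use_folder` is hypothetical and not part of this repository, and the sketch assumes the datasets are CSV files, which may not hold for every folder.

```python
import os
from pathlib import Path


def use_folder(folder):
    """Make `folder` the working directory and list the CSV files in it.

    Hypothetical helper for illustration only; it assumes the datasets
    in the folder are stored as .csv files.
    """
    folder = Path(folder).expanduser().resolve()
    if not folder.is_dir():
        raise FileNotFoundError(f"No such folder: {folder}")
    # After this call, relative paths in the scripts resolve against `folder`.
    os.chdir(folder)
    # Return file names only, sorted, so they are easy to scan.
    return sorted(p.name for p in folder.iterdir() if p.suffix.lower() == ".csv")


# Example call (replace the path with wherever you cloned the repository
# and the folder you want to work in):
# print(use_folder("path/to/Data-Science-Lab/some-folder"))
```

Calling the helper once at the top of a session lets the scripts in that folder load their datasets by relative file name.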

Major Updates

27-Jan-2019

  • Separated ML, DL, NLP, and RL algorithms into different folders
  • Added more comments in each code file
  • Extended DL with supervised and unsupervised algorithms
  • Made the code easier to generalize to other datasets
  • Added materials for further reading in Deep Learning

18-Feb-2019

  • Added data visualization part
    • This folder currently includes Plotly, Dash, and Bokeh

17-Mar-2019

  • Added web scraping part
    • Added short scripts for beginners
    • Added Craigslist scraping section

20-Apr-2019

  • Added web scraping code

8-May-2019

  • Added data structures and algorithms part
    • This folder includes data structures and algorithms subfolders

16-May-2019

  • Added data visualization part
    • Seaborn
    • Pandas
    • Plotnine
  • Modified Machine Learning folder
    • Created a Utility Toolbox containing Data Pre-Processing, Model Selection, and Model Explainability folders

To Do

  • Add more intuitions for each algorithm and problem
  • Add some more ML algorithms
  • Give each folder a brief description of its algorithms with further references
  • Add more libraries to the data visualization folder, such as Matplotlib and Altair
  • Add more data structures and algorithms

Contributors are warmly welcome.
