qingkongmengnuan/Spark-in-machine-learning

Spark-in-machine-learning

Machine Learning Library (MLlib) Guide

MLlib is Spark’s machine learning (ML) library. Its goal is to make practical machine learning scalable and easy. At a high level, it provides tools such as:

  • ML Algorithms: common learning algorithms such as classification, regression, clustering, and collaborative filtering
  • Featurization: feature extraction, transformation, dimensionality reduction, and selection
  • Pipelines: tools for constructing, evaluating, and tuning ML Pipelines
  • Persistence: saving and loading algorithms, models, and Pipelines
  • Utilities: linear algebra, statistics, data handling, etc.

Please feel free to contact me if you have any questions about this repo :)

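Several machine learning examples implemented with PySpark:

  1. ALS-based recommendation algorithm [done]
  2. Using Spark's classifiers [done]
  3. Using Spark's regressors [done]
  4. Using Spark's ML library [done]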


About

Use PySpark to do machine learning

Languages

  • Jupyter Notebook 96.4%
  • Python 3.6%