frienwave/pydatamining


An attempt at data mining with Python.

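Current status:
+ Simple recommendation is implemented, using Mahalanobis distance and the Pearson correlation coefficient
+ Please use the MovieLens dataset: http://www.grouplens.org/node/73
+ pickle is used to store intermediate data; this function needs to be improved.
+ Data standardization: (value - mean_value) / normal_deviation = normal_value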
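
As an illustration of the standardization formula and the Pearson correlation coefficient listed above, here is a minimal sketch. It is not the repository's code: the function names `standardize` and `pearson` are hypothetical, ratings are assumed to be plain dicts mapping item to rating, and `normal_deviation` is assumed to mean the population standard deviation.

```python
import math


def standardize(values):
    """Return (value - mean_value) / normal_deviation for each value.

    Assumption: normal_deviation is the population standard deviation.
    """
    mean_value = sum(values) / len(values)
    variance = sum((v - mean_value) ** 2 for v in values) / len(values)
    normal_deviation = math.sqrt(variance)
    if normal_deviation == 0:
        # All values are identical, so every standardized value is 0.
        return [0.0 for _ in values]
    return [(v - mean_value) / normal_deviation for v in values]


def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items that both users have rated.

    ratings_a and ratings_b are dicts of item -> rating (hypothetical layout).
    Returns 0.0 when there are too few shared items or no variance.
    """
    common = [item for item in ratings_a if item in ratings_b]
    n = len(common)
    if n < 2:
        return 0.0
    a = [ratings_a[item] for item in common]
    b = [ratings_b[item] for item in common]
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)
```

Two users whose ratings differ only by a constant factor get a correlation of 1, which is why Pearson is often preferred over a raw distance when users rate on different personal scales.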

Todo List

*****High priority*****
+ Add the Slope One algorithm --> Weighted Slope One --> Bi-Polar Slope One
+ Divide into a training part and a run part
+ Classification
+ User-based filtering
+ Item-based filtering
+ Clustering
+ Add an XML configuration file and the related .py
+ Add support for saving dict/list objects
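
Since Slope One is the first planned item, here is a minimal sketch of the weighted variant, split into a training step and a prediction step in the spirit of the training/run division above. The function names `train_slope_one` and `predict` and the `{user: {item: rating}}` layout are assumptions for illustration, not the repository's API.

```python
from collections import defaultdict


def train_slope_one(ratings):
    """Compute average rating differences between every pair of items.

    ratings: {user: {item: rating}} (hypothetical layout).
    Returns (diffs, counts), where diffs[i][j] is the mean of (r_i - r_j)
    over users who rated both items and counts[i][j] is how many did.
    """
    diffs = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i == j:
                    continue
                diffs[i][j] += r_i - r_j
                counts[i][j] += 1
    # Turn the summed differences into averages.
    for i in diffs:
        for j in diffs[i]:
            diffs[i][j] /= counts[i][j]
    return diffs, counts


def predict(user_ratings, target, diffs, counts):
    """Weighted Slope One prediction of `target` for one user.

    Each rated item j contributes (diffs[target][j] + r_j), weighted by
    the number of users who rated both target and j.
    Returns None when no rated item overlaps with the target.
    """
    numerator = 0.0
    denominator = 0
    for j, r_j in user_ratings.items():
        if j == target:
            continue
        c = counts.get(target, {}).get(j, 0)
        if c == 0:
            continue
        numerator += (diffs[target][j] + r_j) * c
        denominator += c
    return numerator / denominator if denominator else None
```

A small usage example with made-up ratings:

```python
ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}
diffs, counts = train_slope_one(ratings)
print(predict(ratings["lucy"], "A", diffs, counts))  # (2.5 * 2 + 8 * 1) / 3
```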

***Lower priority***
+ Add a logging module
+ Add reverse weighting (reverse_权重)
+ Update the module code

*If I have time*:
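+ Add a database to store intermediate data
+ Add a user interface
+ Add Hadoop support
+ Introduce an evaluation mechanism
+ Adaptively tune parameters when results are unsatisfactory (depends on a reliable evaluation mechanism)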


Note:
+ Store processed data in key-value (KV) form wherever possible
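
One possible way to keep intermediate data in KV form is to hold it in a dict and persist it with pickle. The sketch below is only an assumption about how that could look; `save_object` and `load_object` are hypothetical helper names, not functions from this repository.

```python
import pickle


def save_object(obj, path):
    """Write a dict or list (or any picklable object) to a file at path."""
    with open(path, "wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_object(path):
    """Read back an object written by save_object.

    Only load files you created yourself: unpickling untrusted data
    can execute arbitrary code.
    """
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

For example, `save_object({"user1": {"item1": 4.0}}, "ratings.pkl")` followed by `load_object("ratings.pkl")` returns an equal dict; the file name here is illustrative.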


https://twitter.com/nourlcn
