panatronic-git/orange-plus


orange-plus

Add-on for Orange3 with custom widgets.

Date: Jul 2020

Author: Panagiotis Papadopoulos
E-mail: panatronic[at]outlook[one dot]com.
Institution: Hellenic Open University.

About: This is my Bachelor's thesis project at the Hellenic Open University.

Thesis title: Development of Widgets for the Orange data mining platform.

Abstract

The scope of this thesis is to create three widgets for the Orange platform. The main goal is to study and become familiar with data mining techniques, knowledge discovery, the Python language and its development environment. The first widget implements the SMOTE algorithm, which balances the classes in a dataset by generating synthetic minority-class samples, so that the dataset can be used more effectively to train a machine learning model. The second widget is OPTICS, which clusters unlabeled data based on the varying density of the data. The third widget is KDE-2D, which visualizes data using a two-dimensional kernel density estimate with Gaussian kernels. This visualization helps reveal relationships between two variables in large datasets that are difficult to detect in other graphs, such as the scatter plot. It can also expose hidden clusters and indicate whether the data follow a normal distribution. VS Code, Python and Orange, together with the “imbalanced-learn”, “scikit-learn” and “SciPy” libraries, were used in the development process. Through this work, Orange's potential in data mining was explored and an understanding of Data Science concepts and techniques was gained, providing a skill set that can be expanded and built upon.
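To make two of the ideas in the abstract concrete, here is a minimal, standard-library-only sketch of SMOTE-style interpolation and of a two-dimensional Gaussian kernel density estimate. It is an illustration only and not the widgets' code: the widgets build on the libraries named above, while the function names `smote_oversample` and `kde_2d`, the parameter `k`, the fixed isotropic `bandwidth`, and the toy data are all assumptions made for this example.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbors of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # random position on the segment base -> neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


def kde_2d(points, x, y, bandwidth=1.0):
    """Density estimate at (x, y) using an isotropic Gaussian kernel.

    Assumption: one fixed bandwidth for both axes, a simplification
    compared with data-driven bandwidth rules.
    """
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    total = 0.0
    for px, py in points:
        sq_dist = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return norm * total


# Hypothetical toy minority class with two features
minority = [(1.0, 1.0), (1.5, 1.2), (1.2, 1.8), (2.0, 1.5)]
new_points = smote_oversample(minority, n_new=6)

# The density is higher inside the point cloud than far away from it
density_near = kde_2d(minority, 1.4, 1.4, bandwidth=0.5)
density_far = kde_2d(minority, 5.0, 5.0, bandwidth=0.5)
```

Each synthetic point lies on a line segment between two existing minority samples, which is why SMOTE fills in the minority region instead of duplicating rows. Evaluating `kde_2d` on a grid of `(x, y)` values would give the kind of density surface that a KDE-2D plot displays.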

References

https://orange.biolab.si
Demsar J, Curk T, Erjavec A, Gorup C, Hocevar T, Milutinovic M, Mozina M, Polajnar M, Toplak M, Staric A, Stajdohar M, Umek L, Zagar L, Zbontar J, Zitnik M, Zupan B (2013) Orange: Data Mining Toolbox in Python, Journal of Machine Learning Research 14(Aug): 2349−2353.

https://scikit-learn.org
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12, 2825–2830.

https://imbalanced-learn.org/stable/about.html
Guillaume Lemaitre, Fernando Nogueira, & Christos K. Aridas (2017). Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning, Journal of Machine Learning Research, 18(17), 1-5.

https://www.scipy.org/
Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., van der Walt, S. J., Brett, M., Wilson, J., Millman, K. J., Mayorov, N., Nelson, A. R. J., Jones, E., Kern, R., Larson, E., Carey, C. J., Polat, İ., Feng, Y., Moore, E. W., VanderPlas, J., Laxalde, D., Perktold, J., Cimrman, R., Henriksen, I., Quintero, E. A., Harris, C. R., Archibald, A. M., Ribeiro, A. H., Pedregosa, F., van Mulbregt, P., & SciPy 1.0 Contributors (2020). SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nature Methods, 17, 261–272.
