analytics

# IIT Data Analytics

Developed on the basis of the Open edX data analytics package.

The edX data analytics package consists of the following four parts, plus a result store, which is a MySQL database called 'analytics':

- edx-analytics-pipeline
- edx-analytics-data-api
- edx-analytics-data-api-client
- edx-analytics-dashboard

edx-analytics-pipeline

The edX Analytics Pipeline reads the MySQL database used by the LMS as well as the tracking log files produced by the LMS. A scheduler runs the processing periodically, and the resulting summary data is published to the result store, a MySQL database called 'analytics'. Most of the computation performed by the edX Analytics Pipeline is implemented as MapReduce jobs executed by a Hadoop cluster hosted on the Amazon cloud. The generation of each query's data is implemented as a Luigi task, and these tasks are run by shell commands that a scheduler such as cron or Jenkins executes periodically.
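To make the MapReduce idea concrete, here is a minimal sketch in plain Python of the kind of summary such a job might compute: the number of distinct active users per course per day. It is not the pipeline's actual code. The function names (`map_event`, `reduce_active_users`, `run_job`) are hypothetical, and the sketch assumes each tracking log line is a JSON object with `username`, `time` and `context.course_id` fields.

```python
import json
from itertools import groupby
from operator import itemgetter


def map_event(line):
    """Map step: emit ((course_id, date), username) for one tracking log line."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return  # skip malformed lines
    if not isinstance(event, dict):
        return
    context = event.get("context")
    course_id = context.get("course_id") if isinstance(context, dict) else None
    username = event.get("username")
    time = event.get("time")
    if course_id and username and isinstance(time, str):
        # Assumption: timestamps are ISO 8601, so the first 10 chars are the date.
        yield (course_id, time[:10]), username


def reduce_active_users(key, usernames):
    """Reduce step: count distinct users seen for one (course_id, date) key."""
    course_id, date = key
    return {
        "course_id": course_id,
        "date": date,
        "active_users": len(set(usernames)),
    }


def run_job(lines):
    """Run map, shuffle (sort and group by key) and reduce in one process."""
    pairs = sorted(pair for line in lines for pair in map_event(line))
    return [
        reduce_active_users(key, [username for _, username in group])
        for key, group in groupby(pairs, key=itemgetter(0))
    ]
```

On a real cluster the map and reduce steps run in parallel across many machines and the framework performs the sort-and-group step. The rows returned by `run_job` correspond to what would be written to the result store.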

edx-analytics-data-api

The edX Analytics Data API provides an HTTP interface through which clients access the data in the result store. The data in the result store is updated periodically by the edX Analytics Pipeline.

edx-analytics-data-api-client

This is the client program that, on behalf of edx-analytics-dashboard, calls the data API at the URL of the edx-analytics-data-api server and returns the data to edx-analytics-dashboard.
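The request a client makes to the data API can be sketched with the Python standard library alone. This is an illustration, not the edx-analytics-data-api-client code. The helper names (`build_activity_request`, `fetch_activity`) are hypothetical, and the endpoint path and token-based `Authorization` header are assumptions that should be checked against the API version actually deployed.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


def build_activity_request(base_url, course_id, token):
    """Build a GET request for a course's activity summary.

    Assumptions: the path layout below and the "Token <key>" header format.
    """
    # Keep "/", ":" and "+" unescaped so course IDs pass through unchanged.
    path = "/api/v0/courses/{}/activity/".format(quote(course_id, safe="/:+"))
    request = Request(base_url.rstrip("/") + path)
    request.add_header("Authorization", "Token " + token)
    request.add_header("Accept", "application/json")
    return request


def fetch_activity(base_url, course_id, token, timeout=10):
    """Send the request and decode the JSON body returned by the API server."""
    request = build_activity_request(base_url, course_id, token)
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The dashboard would call something like `fetch_activity("http://localhost:8100", course_id, api_key)` and pass the decoded data to its views. The host, port and key here are placeholders.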

edx-analytics-dashboard

This is the front end, which renders visualizations of the summary data in the result store. Its menu structure divides the analysis of student data into three areas: 'enrollment analysis', 'engagement' and 'performance'.

IIT data analysis

IIT data analysis keeps the essence of the Open edX edx-analytics architecture: the listed query results are computed at the back end as a batch process, and the resulting data is kept in the 'analytics' MySQL database. As in edX, the big data storage is Hive. It differs from the original edX package in the following ways:

- Like edX data analytics, it keeps the big data (all logs) on a Hadoop cluster. Unlike edX data analytics, this cluster is not hosted on the Amazon cloud but is installed and managed locally.
- Instead of pushing the raw log files, it cleans the logs first and sends only the cleaned data to the Hadoop cluster. This avoids re-reading the entire log files each time query results are generated.
- edX data analytics runs its MapReduce jobs on Hadoop. Here the processing is done with Spark, which is generally much faster because it keeps intermediate data in memory.
- The front end uses the results of data analysis done in R and R visualization libraries.
- The edX front end is also available as an option.
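The log-cleaning step in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual cleaning code. The names `KEPT_FIELDS`, `clean_line` and `clean_log` are hypothetical, and the choice of which fields to keep is an assumption about what the downstream queries need.

```python
import json

# Assumption: only these top-level fields are needed by later queries.
KEPT_FIELDS = ("username", "event_type", "time")


def clean_line(line):
    """Return a trimmed event dict, or None if the line should be dropped."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return None  # malformed line
    if not isinstance(event, dict):
        return None
    context = event.get("context")
    course_id = context.get("course_id") if isinstance(context, dict) else None
    # Drop events that cannot be tied to both a course and a user.
    if not course_id or not event.get("username"):
        return None
    cleaned = {field: event.get(field) for field in KEPT_FIELDS}
    cleaned["course_id"] = course_id
    return cleaned


def clean_log(lines):
    """Yield one compact JSON line per usable event."""
    for line in lines:
        cleaned = clean_line(line)
        if cleaned is not None:
            yield json.dumps(cleaned, sort_keys=True)
```

Because `clean_log` is a generator, a large log file can be streamed line by line and the cleaned output written to the cluster without loading the whole file into memory.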
