DSGeek24/Comparative-study-of-Big-Data-file-formats

The following tasks were completed for the document similarity application, which builds an inverted index and computes a similarity matrix:

Implemented an uncompressed Avro version.
Implemented an uncompressed Parquet version.
Implemented a Snappy-compressed version (either Parquet or Avro).
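
As a rough illustration of what the document similarity application computes, the sketch below builds an inverted index and a pairwise similarity matrix in plain Python. It is not the repository's code: the function names `build_inverted_index` and `similarity_matrix` are hypothetical, whitespace tokenization is an assumption, and cosine similarity over raw term counts is assumed because the README does not name the similarity measure.

```python
import math
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: term count} for the documents containing it.

    `docs` is a dict of doc_id -> text. Lowercasing and whitespace splitting
    are assumptions made for this sketch.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][doc_id] = count
    return dict(index)


def similarity_matrix(index, doc_ids):
    """Pairwise cosine similarity (assumed measure) from the index postings."""
    dot = defaultdict(float)      # (doc_a, doc_b) -> dot product of term counts
    norm_sq = defaultdict(float)  # doc_id -> squared vector length
    for postings in index.values():
        for doc_id, count in postings.items():
            norm_sq[doc_id] += count * count
        # Only documents sharing this term contribute to each other's score.
        for a, count_a in postings.items():
            for b, count_b in postings.items():
                dot[(a, b)] += count_a * count_b
    return {
        (a, b): dot[(a, b)] / math.sqrt(norm_sq[a] * norm_sq[b])
        if norm_sq[a] and norm_sq[b] else 0.0
        for a in doc_ids
        for b in doc_ids
    }
```

Walking the postings lists means that only document pairs sharing at least one term are ever multiplied, which is the usual reason for building the inverted index before the matrix.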

Compared file sizes and execution times across all file formats to understand how storage size affects execution time.
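
One way such a size and timing comparison could be set up is sketched below. To stay within the Python standard library it uses plain and gzip-compressed JSON lines as stand-ins; it does not read or write Avro, Parquet, or Snappy. All function names and file names are hypothetical, and a real comparison would substitute writers and readers for the actual formats.

```python
import gzip
import json
import os
import tempfile
import time


def write_plain(path, records):
    # Stand-in for an uncompressed format (NOT Avro or Parquet).
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")


def write_gzip(path, records):
    # Stand-in for a compressed format (gzip here, NOT Snappy).
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")


def read_plain(path):
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]


def read_gzip(path):
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return [json.loads(line) for line in f]


def benchmark(name, writer, reader, records, workdir):
    """Write the records, then report on-disk size and read time."""
    path = os.path.join(workdir, name)
    writer(path, records)
    size = os.path.getsize(path)
    start = time.perf_counter()
    loaded = reader(path)
    elapsed = time.perf_counter() - start
    assert loaded == records  # round-trip check so timings compare equal work
    return {"format": name, "bytes": size, "read_seconds": elapsed}


def compare(records):
    """Run every stand-in format on the same records."""
    with tempfile.TemporaryDirectory() as workdir:
        return [
            benchmark("plain.jsonl", write_plain, read_plain, records, workdir),
            benchmark("gzip.jsonl.gz", write_gzip, read_gzip, records, workdir),
        ]
```

Calling `compare` on a list of record dicts returns one result row per format, so size and read time can be placed side by side. Timings from a single run are noisy, so repeating the measurement and averaging is advisable before drawing conclusions.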
