forked from hexforge/pulp_db
Capture and query large data
License
pombredanne/pulp_db
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
pulp_db AIM: To capture and query large data. Realtime capture. Offline compression. pulp_db is a NoSQL database. It was written for realtime, jagged (hard to schema) data streams. Days of immutable data. It indexes the main data. Also supports field index and quering. It supports indexing. Optimised for high throughput. It uses a two stage write process to achive this. The offline one automatcially optimises the indexes. Most data is immutable. We can use more specialised data structures if we don't need to support insertion once the db has been written. Optimised for low memory usage in read mode. Can also be used to index and markup a flat file of data. Supports python and pypy query front end. NoSQL db for: * jagged * ordered/realtime * immutable data Supports indexing. Python query api. Much nicer SQL imo. ( DB[:200](fld=selector_func) & DB.idx['time'](time_range_func)(un_indexed) ) | DB(some_selector)(some_thing)[::2](some_othering) SELECT * FROM TableTimes INNER JOIN TableTypes ON TableTimes.id = TableTypes.id vs DB.idx['time'] & DB.idx['type'] SELECT * FROM TableA FULL OUTER JOIN TableB ON TableA.name = TableB.name vs DB.idx['time'] | DB.idx['type'] Far more expressive and general. Can just chain. DB(selector)(selector)[::2](selector)(selector)(selector)[:1000] Can combine ( DB[:200](fld=selector_func) & DB.idx['time'](time_range_func)(un_indexed) ) | DB(some_selector)(some_thing)[::2](some_othering) Targeting: High write throughput: Two stage process to acchive this. 1) Capture and do little. 2) Offline optimise [lots more todo here]. Super fast read queries. [More todo here, pushing more to c] Low memory use for read mode. Streaming results. BUILDING todo TESTING todo INSTALLING todo USING todo
About
Capture and query large data
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published
Languages
- C 45.9%
- Shell 16.9%
- HTML 13.5%
- Python 11.9%
- C++ 9.5%
- PureBasic 1.2%
- Other 1.1%