
nagyistge/Datawake

 
 


This project is deprecated. For the latest work, see https://github.com/Sotera/DatawakeDepot

DataWake

The DataWake project consists of server and database technologies that aggregate user browsing data, captured via a browser plug-in, using domain-specific searches. This captured, or extracted, data is organized into browse paths and elements of interest. This information, in the form of Trails, can be shared among or expanded by teams of individuals. Elements of interest, whether extracted automatically or manually by the user, are given weighted values. Extracted elements that are not of interest, or that might be confused with an element that is of interest (e.g., an organization with a similar name that has no meaningful association with the one being researched), can be manually removed from the extracted data list.
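As a rough illustration of the concepts described above (browse paths, weighted elements of interest, and manual removal of look-alike elements), here is a minimal Python sketch. The class names, fields, and methods (`Element`, `Trail`, `remove_element`, `ranked`) are hypothetical and chosen for illustration only; they are not DataWake's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Element:
    """An extracted element of interest (hypothetical shape, not DataWake's schema)."""
    kind: str              # e.g. "organization", "email"
    value: str
    weight: float = 1.0    # assumed: a higher weight means more relevant
    manual: bool = False   # True if the user extracted it by hand


@dataclass
class Trail:
    """A named collection of visited pages and the elements extracted from them."""
    name: str
    browse_path: List[str] = field(default_factory=list)
    elements: List[Element] = field(default_factory=list)

    def visit(self, url: str) -> None:
        """Record a visited page at the end of the browse path."""
        self.browse_path.append(url)

    def add_element(self, element: Element) -> None:
        """Attach an automatically or manually extracted element."""
        self.elements.append(element)

    def remove_element(self, kind: str, value: str) -> int:
        """Drop every element matching (kind, value); return how many were removed."""
        before = len(self.elements)
        self.elements = [
            e for e in self.elements if (e.kind, e.value) != (kind, value)
        ]
        return before - len(self.elements)

    def ranked(self) -> List[Element]:
        """Return elements ordered from highest to lowest weight."""
        return sorted(self.elements, key=lambda e: e.weight, reverse=True)
```

In this sketch, a researcher who finds that a similarly named organization is unrelated to the one being researched would call `remove_element("organization", ...)` to drop it from the trail, leaving the remaining elements ranked by weight.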

Additionally, the application can be configured to export all page contents and extracted information to RESTful services, Elasticsearch, or Kafka.

Companion projects

Necessary for building

Other projects

  • DataWake Prefetch: Streaming search with scraping and entity extraction of all results.
  • Firmament: Provides a simplified configuration of interconnected Docker containers.

More information, including build instructions, can be found on our GitHub page.

DataWake is part of the DARPA Memex Open Catalog.

For more information, please email memex@soteradefense.com.

About

Browser add-on and web server to support collection and analysis of web browsing data.

Languages

  • JavaScript 44.0%
  • Python 40.2%
  • HTML 9.1%
  • CSS 5.9%
  • Other 0.8%