kluster

Development Items: Trello Board

A distributed multibeam processing system built on the Pangeo ecosystem (https://pangeo.io/).

Kluster provides a fully open source hydrographic processing package to produce accessible bathymetry products in support of ocean mapping.

Scalable - uses Dask to provide distributed parallel processing on everything from a laptop to a cloud service (AWS Fargate for example)
Cloud ready - uses Zarr as a cloud ready storage format for converted multibeam records and processed soundings
Open - data are presented using Xarray objects for easy interactivity and stored with Zarr, all open formats
Scriptable - provides a GUI for visualization and processing, but can be run from the command line or scripted easily
Extensible - from data conversion to sound velocity correction, kluster is built from modules that can be replaced, enhanced, or exchanged as needed

Kluster has been tested on:

EK60, EK80 (Using Kluster amplitude detection, see Requirements in Documentation)
Reson 7125, T20, T51
EM2040/2040c/2040p
EM2040 dual tx/dual rx
EM710/712
EM3002
EM302/304
EM122
ME70 Bathy Module

Kluster is built from the ground up in Python, and was developed using Python 3.8. Kluster includes modules developed by the hydrographic community such as (see drivers):

kmall - Kongsberg .kmall file reader
par3 - Kongsberg .all file reader
prr3 - Reson .s7k file reader
raw - Kongsberg .raw file reader
sbet - POSPac sbet/rms file reader

Kluster is a work in progress that has been in development since November 2019 by a small 'team', and is by no means feature complete. If you are interested in contributing or have questions, please contact Eric Younkin (eric.g.younkin@noaa.gov).

Why Kluster?

There are three principal motivations behind kluster:

Build a multibeam processing sandbox for scientists/engineers

The hydrographic community is continuously innovating. Often we want to experiment with an algorithm or technique, but the data are inaccessible or rely on intermediate products that are locked within the software. How do you get attitude corrected beam vectors into a numpy array? How do you test a new gridding algorithm without exporting soundings to text first?
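As an illustration of the kind of experiment meant here, the following is a minimal sketch of a mean-depth gridding routine in plain Python. It is not Kluster's gridding method; the function name `grid_mean_depth` and the sample soundings are hypothetical, and in practice the soundings would come from array objects rather than a hand-written list.

```python
from collections import defaultdict
from math import floor


def grid_mean_depth(soundings, resolution):
    """Bin (x, y, z) soundings into square cells and average z per cell.

    Returns a dict mapping (column, row) cell indices to the mean depth.
    """
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [sum of z, count]
    for x, y, z in soundings:
        cell = (floor(x / resolution), floor(y / resolution))
        sums[cell][0] += z
        sums[cell][1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}


# Hypothetical soundings: easting (m), northing (m), depth (m)
soundings = [(0.2, 0.3, 10.0), (0.8, 0.9, 12.0), (1.5, 0.1, 20.0)]
grid = grid_mean_depth(soundings, resolution=1.0)
print(grid)  # {(0, 0): 11.0, (1, 0): 20.0}
```

Swapping the averaging step for a different estimator (shoalest depth, median, and so on) only requires changing the accumulation logic, which is the sort of quick iteration a sandbox is meant to allow.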

Build a multibeam cloud processing system for field use/production

Cloud data storage and processing are quickly becoming a reality, as the advantages of not owning your own infrastructure become apparent. Where does this leave processing software and our traditional workflow? Kluster is designed from the ground up to address this issue by providing processing that can be tailored and deployed in different ways depending on the application. In addition, by using the parallel processing capabilities of Dask, kluster provides a powerful tool that can compete with existing software packages in terms of performance.

Evaluate the latest in open source scientific software

Much of the existing open source software related to multibeam processing has been in development for decades. Meanwhile, there has been an explosion in scientific libraries that could benefit the hydrographic community as a whole, but they have not been seriously evaluated. Kluster relies on the state of the art in Python libraries to provide a sophisticated and modern software package.

Installation

We recommend that users run Kluster using the release attached to this GitHub repository (see releases).

Kluster has been tested on Windows 10 and Ubuntu 20.04.

Kluster is not on PyPI, but it can be installed using pip alongside the required HSTB-drivers and HSTB-shared modules.

(For Windows Users) Download and install Visual Studio Build Tools 2019 (If you have not already): MSVC Build Tools

Download and install conda (If you have not already): conda installation

Download and install git (If you have not already): git installation

Some dependencies need to be installed from the conda-forge channel. The example below shows how to build this environment using conda.

Perform these in order:

conda create -n kluster_test -c conda-forge python=3.8.12 qgis=3.18.3 vispy=0.9.4 pyside2=5.13.2 gdal=3.3.1 h5py python-geohash

conda activate kluster_test

pip install git+https://github.com/noaa-ocs-hydrography/kluster.git#egg=hstb.kluster

Start the GUI by activating the new environment and running Kluster as a module:

(kluster_test) C:>python -m HSTB.kluster
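If the launch command fails, it can help to confirm that the active environment can actually locate the package. The following is a minimal sketch using only the Python standard library; the helper name `module_available` is hypothetical and not part of Kluster, and the module name `HSTB.kluster` is taken from the launch command above.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located on the current import path."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError (an ImportError) when a
        # parent package, for example HSTB, is itself missing.
        return False


if __name__ == "__main__":
    # HSTB.kluster is the module name used by `python -m HSTB.kluster`.
    if module_available("HSTB.kluster"):
        print("HSTB.kluster found; start the GUI with: python -m HSTB.kluster")
    else:
        print("HSTB.kluster not found; check that kluster_test is activated")
```

Run this with the Python interpreter from the kluster_test environment, since the result depends on which environment is active.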

Docker

Build the Docker image using the provided dockerfile:

C:\Pydro21_Dev\NOAA\site-packages\Python38\git_repos\hstb_kluster>docker build -t kluster/ubuntu .

C:\Pydro21_Dev\NOAA\site-packages\Python38\git_repos\hstb_kluster>docker run -it kluster/ubuntu

(base) eyou102@faaec62a4c1c:~/kluster$ conda deactivate

eyou102@faaec62a4c1c:~/kluster$ conda activate kluster_test

(kluster_test) eyou102@faaec62a4c1c:~/kluster$ python

Quickstart

See documentation for the new quick start guide

Kluster on the console

See examples or notebooks for examples on how to use Kluster in the console.
