maeager/Agilent2Dicom

Notes for modifying source

GUI development

  1. Use Qt 4.8 Designer. Label line edits and checkboxes correctly and add tooltips where appropriate.
  2. Convert the .ui file to Python: pyuic4 --output Agilent2DicomQt.py Agilent2DicomQt.ui
  3. Modify Agilent2DicomAppQt.py

New filters/methods

  1. Create the new method in a separate file, e.g. cplxfilter.py or kspace_filter.py
  2. Add a new argument to fid2dicom.py
  3. Add the corresponding arguments to fid2dcm.sh
  4. Add code to the GUI backend Agilent2DicomAppQt.py (a rough sketch of steps 1 and 2 follows this list)
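
As a rough illustration of steps 1 and 2 only: the snippet below sketches how a new filter function could be exposed as a command-line option, assuming fid2dicom.py uses argparse. The option name, new_filter() and the placeholder volume are hypothetical and are not the actual fid2dicom.py interface.

# Hypothetical sketch only; the option name and new_filter() are placeholders,
# not the real fid2dicom.py API.
import argparse
import numpy as np

def new_filter(img):
    # stands in for the filter written in step 1 (e.g. in cplxfilter.py)
    return img

parser = argparse.ArgumentParser(description='fid2dicom sketch')
parser.add_argument('--new-filter', action='store_true',
                    help='apply the new filter to the reconstructed image')
args, _ = parser.parse_known_args()

image = np.zeros((8, 8, 8), dtype=np.complex64)  # stands in for the reconstructed volume
if args.new_filter:
    image = new_filter(image)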

Global constants

agilent2dicom_globalvars.py is read by both the shell scripts and the Python scripts. Make sure there are no spaces around the = between the variable name and the quoted string value.

AGILENT2DICOM_VERSION="1.6.4"
AGILENT2DICOM_APP_VERSION="1.8"
FDF2DCMVERSION="1.2"
FID2DCMVERSION="1.4"
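
The shell scripts can simply source this file; below is a minimal sketch of reading the same constants from Python, assuming every entry has the KEY="value" form shown above.

# Minimal sketch: parse agilent2dicom_globalvars.py as KEY="value" pairs
# (assumes the file contains only lines in the form shown above).
globalvars = {}
with open('agilent2dicom_globalvars.py') as f:
    for line in f:
        line = line.strip()
        if not line or line.startswith('#') or '=' not in line:
            continue
        key, value = line.split('=', 1)
        globalvars[key] = value.strip('"')

print(globalvars['AGILENT2DICOM_VERSION'])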

GUI update

After changing the Qt GUI in Designer, use the following command to regenerate the Python module:

pyuic4 --output Agilent2DicomQt.py Agilent2DicomQt.ui

Mercurial keywords [DEPRECATED]

Mercurial keyword expansion is used to set some handy variables. The hgrc file in the development .hg folder should contain the following:

[paths]
default = ssh://hg@bitbucket.org/mbi-image/agilent2dicom
[extensions]
keyword =
[keyword]
agilent2dicom.py =
agilentFDF2dicom.py =
fdf2dcm.sh =
agilent2dicom_globalvars.py =
fid2dicom.py =
fid2dcm.sh =
Agilent2DicomApp.py =
Agilent2DicomAppQt.py =
[keywordmaps]
Author = {author|user}
Date = {date|utcdate}
Header = {root}/{file},v {node|short} {date|utcdate} {author|user}
Id = {file|basename},v {node|short} {date|utcdate} {author|user}
Revision = {node|short}

Use the kwexpand command to expand the keyword tags in the configured files:

hg kwexpand

Git keywords

Keyword expansion under Git uses the git-rsc-keywords package.

Sorting Procpar files for easy diff

sortpp alphabetises the procpar file for easier comparison with diff. The awk script pastes each parameter's label and value lines together, stores the combined string in an array, and sorts the output before printing. A rough Python equivalent is sketched after the examples below.

diff -y <(./sortpp <../ExampleAgilentData/kidney512iso_01.fid/procpar) <(./sortpp <../example_data/s_2014072901/T2-cor_01.fid/procpar)

procdiff implements the above command as a shell script.
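
For reference, here is a rough Python equivalent of what the sortpp awk script does. This is a sketch only, assuming each procpar parameter starts on a line beginning with a letter and that the lines up to the next such line belong to that parameter.

# Sketch of a sortpp-like sort in Python; reads a procpar file on stdin and
# prints the parameters in alphabetical order, one parameter per line.
import sys

params = {}
name = None
for line in sys.stdin:
    line = line.rstrip('\n')
    if line and line[0].isalpha():      # parameter header line
        name = line.split()[0]
        params[name] = line
    elif name is not None:              # value/enumeration line of the current parameter
        params[name] += ' ' + line

for name in sorted(params):
    print(params[name])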

K space

Use IPython for interactive sessions with kspace_filter:

ipython -i ./kspace_filter.py -- -v -i ../ExampleAgilentData/kidney512iso_01.fid/ -o ../ExampleAgilentData/kidney512iso_01.dcm/ 

Standard Gaussian

# Gaussian-filter the k-space data and reconstruct the image
kspgauss = kspacegaussian_filter2(ksp, 256/0.707)
image_filtered = fftshift(ifftn(ifftshift(kspgauss)))
# print "Saving Gaussian image"
save_nifti(normalise(np.abs(image_filtered)), 'gauss_kspimage')

Gaussian with sub-band 1.5

print "Computing Gaussian sub-band 1.5 image from Original image"
# fouriergausssubband15 returns a k-space filter, so apply it to the original k-space data
kspsubband = fouriergausssubband15(ksp.shape, 0.707/256)
image_filtered = fftshift(ifftn(ifftshift(ksp * kspsubband)))
# print "Saving Gaussian image"
save_nifti(normalise(np.abs(image_filtered)), 'gauss_subband')

Inhomogeneous bias correction

image_corr = inhomogeneousCorrection(ksp, ksp.shape, 3.0/60.0)
save_nifti(np.abs(image_filtered / image_corr), 'image_inhCorr3')

Enhanced Laplacian

F{ D^2 f(x,y,z) } = -(2 pi k)^2 F(k)

print "Computing Laplacian enhanced image"
laplacian = fftshift(ifftn(ifftshift(kspgauss * fourierlaplace(ksp.shape))))
alpha=ndimage.mean(np.abs(image_filtered))/ndimage.mean(np.abs(laplacian))
image_lfiltered = image_filtered - 0.5*alpha*laplacian
save_nifti(np.abs(image_lfiltered),'laplacian_enhanced')

LoG: second-order edge/blob detector

print "Computing Gaussian Laplace image from Smoothed image"
ksplog=kspacelaplacegaussian_filter(ksp,256.0/0.707)
image_filtered = fftshift(ifftn(ifftshift(ksplog)))
image_filtered= (np.abs(image_filtered))
image_filtered= (image_filtered-ndimage.minimum(image_filtered))/(ndimage.maximum(image_filtered) - ndimage.minimum(image_filtered))
save_nifti(np.abs(image_filtered),'Log_image')

Smoothed LoG

Flaplace = fourierlaplace(ksp.shape)
# Flaplace = (Flaplace - ndimage.minimum(Flaplace)) / (ndimage.maximum(Flaplace) - ndimage.minimum(Flaplace))
Fsmooth = fouriergauss(ksp.shape, (4.0 * np.sqrt(2.0 * np.log(2.0))) / 512.0)
Fgauss = fouriergauss(ksp.shape, 0.707/512)
laplacian = fftshift(ifftn(ifftshift(ksp * (Fgauss / ndimage.maximum(Fgauss)) * Flaplace * (Fsmooth / ndimage.maximum(Fsmooth)))))
# rescale to [0, 1]
smoothlaplacian = (laplacian - ndimage.minimum(laplacian)) / (ndimage.maximum(laplacian) - ndimage.minimum(laplacian))
print "Saving Smoothed Gauss Laplacian"
save_nifti(np.abs(smoothlaplacian), 'Log_smoothed')
image_filtered = fftshift(ifftn(ifftshift(ksplog * Fsmooth)))
save_nifti(np.abs(image_filtered), 'Log_smoothed2')

Double resolution

test_double_resolution(kspgauss,'Gauss')
test_double_resolution(ksplog,'LoG')

Depth algorithm

test_depth_algorithm(fftshift(ifftn(ifftshift(kspgauss))),'gauss_kspdepth')
test_double_resolution_depth(kspgauss,'gauss_depth_large')

See Examples

https://scipy-lectures.github.io/advanced/image_processing/
https://scipy-lectures.github.io/packages/scikit-image/

Using PIL: http://nbviewer.ipython.org/github/mroberts3000/GpuComputing/blob/master/IPython/GaussianBlur.ipynb

http://www.lancs.ac.uk/~struijke/density/kernel.html
http://sfb649.wiwi.hu-berlin.de/fedc_homepage/xplore/ebooks/html/anr/anrhtmlframe33.html

CYTHON

kspace filter

kspace_filter.py has many bespoke ndarray operations that would be suitable for optimisation with Cython.

  1. kspace_filter.py was copied to filterkspace.pyx
  2. Define data types in all the functions, as per http://docs.cython.org/src/userguide/numpy_tutorial.html#numpy-tutorial:
from __future__ import division
import numpy as np
cimport numpy as np
...
DTYPE = np.int
NP_FLOAT = np.float32
NP_CMPLX = np.complex64
...
ctypedef np.int_t DTYPE_t
ctypedef np.float32_t NP_FLOAT_t
ctypedef np.complex64_t NP_CMPLX_t

...
def fouriercoords(np.ndarray siz):
...
    cdef np.ndarray sz = np.ceil(np.array(siz) / 2.0)
    cdef np.ndarray xx = np.array(range(-int(sz[0]), int(sz[0])))
    cdef np.ndarray yy = np.array(range(-int(sz[1]), int(sz[1])))
    cdef np.ndarray zz
    cdef DTYPE_t maxlen = ndimage.maximum(np.array(siz))
    cdef np.ndarray mult_fact, uu, vv, ww
  3. Include type and size checks in each function
if siz.ndim != 1 or siz.size not in (2, 3):
    raise ValueError("Only 2D or 3D dimensions on filter supported")
assert siz.dtype == DTYPE
  4. Create a setup script (setup.py); see the sketch after this list
  5. Include the NumPy headers for MASSIVE: _numpyconfig.h was not found when building filterkspace, even with numpy.get_include() in setup.py, so the NumPy build paths were put at the top of filterkspace.pyx:
# distutils: include_dirs = /usr/local/src/PYTHON/NUMPY/numpy-1.7.0/numpy/core/include/ /usr/local/src/PYTHON/NUMPY/numpy-1.7.0/build/src.linux-x86_64-2.7/numpy/core/include/numpy/
  6. Compile:
python setup.py build_ext --inplace

Note: for MASSIVE users the default gcc produces errors. Use module load gcc/4.7.2 mpfr/3.1.2 gmp/5.1.1
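
For reference, a minimal setup.py for this build might look like the following. This is a sketch only (it assumes Cython and NumPy are installed and names the extension filterkspace); it is not the exact script used in the repository.

# Sketch of a minimal setup.py for building filterkspace.pyx
# (assumes Cython and NumPy are installed).
from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize
import numpy

extensions = [
    Extension("filterkspace", ["filterkspace.pyx"],
              include_dirs=[numpy.get_include()]),
]

setup(name="filterkspace", ext_modules=cythonize(extensions))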

Profiling

At the top of filterkspace.pyx, place this:

# cython: profile=True
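
The compiled functions can then be profiled from Python with cProfile. A sketch only; the test values for siz and sigma below are placeholders.

# Sketch: profile a compiled filterkspace function with cProfile
# (siz and sigma are placeholder test values).
import cProfile
import pstats
import numpy as np
import filterkspace

siz = np.array([128, 128, 128])
sigma = np.array([1.0, 1.0, 1.0], dtype=np.float32)

cProfile.runctx("filterkspace.fouriergauss(siz, sigma)",
                globals(), locals(), "filterkspace.prof")
pstats.Stats("filterkspace.prof").sort_stats("cumulative").print_stats(10)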

Speed-up: type definitions in function arguments

...
def fouriergauss(np.ndarray[DTYPE_t, ndim=1] siz, np.ndarray[NP_FLOAT_t, ndim=1] sigma):
...
    cdef np.ndarray[DTYPE_t, ndim=1] sz = np.ceil(np.array(siz) / 2.0)
    cdef np.ndarray[DTYPE_t, ndim=1] xx = np.array(range(-int(sz[0]), int(sz[0])))
    cdef np.ndarray[DTYPE_t, ndim=1] yy = np.array(range(-int(sz[1]), int(sz[1])))
    cdef np.ndarray[DTYPE_t, ndim=1] zz
    cdef DTYPE_t maxlen = ndimage.maximum(np.array(siz))
    cdef np.ndarray[NP_FLOAT_t, ndim=3] mult_fact, uu, vv, ww

Disable numpy specific bounds checking

http://docs.cython.org/src/tutorial/numpy.html

cimport cython
@cython.boundscheck(False)  # turn off bounds-checking for the entire function
def fouriercoords(np.ndarray[DTYPE_t, ndim=1] siz):
...
@cython.boundscheck(False)  # turn off bounds-checking for the entire function
def fouriergauss(np.ndarray[DTYPE_t, ndim=1] siz, np.ndarray[NP_FLOAT_t, ndim=1] sigma):

GPU processing: CUDA and OpenCL

Notes from the Dr. Dobb's article on OpenCL with Python: building and deploying a kernel.

To build and deploy a basic OpenCL kernel, you usually need to follow these steps in a typical OpenCL C++ host program:

  1. Obtain an OpenCL platform.
  2. Obtain a device id for at least one device (accelerator).
  3. Create a context for the selected device or devices.
  4. Create the accelerator program from source code.
  5. Build the program.
  6. Create one or more kernels from the program functions.
  7. Create a command queue for the target device.
  8. Allocate device memory and move input data from the host to the device memory.
  9. Associate the arguments to the kernel with the kernel object.
  10. Deploy the kernel for device execution.
  11. Move the kernel's output data to host memory.
  12. Release the context, program, kernels and memory.

These steps represent a simplified version of the tasks that your host program must perform (each step is a bit more complex in real life). For example, the first step (obtain an OpenCL platform) usually requires checking the properties for the platforms, as I explained in the previous section. In addition, each step requires error checking. Because you can work with PyOpenCL from any Python console, you can execute the different steps with an interactive environment that makes it easy for you to learn both OpenCL and the way PyOpenCL exposes the features in the API.
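
A minimal PyOpenCL sketch of these steps; the scale() kernel and the test array are illustrative only and are not part of Agilent2Dicom.

# Minimal PyOpenCL sketch of the steps above; the kernel and data are illustrative only.
import numpy as np
import pyopencl as cl
import pyopencl.array as cl_array

ctx = cl.create_some_context(interactive=False)   # platform, device and context
queue = cl.CommandQueue(ctx)                      # command queue for the device
program = cl.Program(ctx, """
__kernel void scale(__global float *a, const float factor)
{
    int gid = get_global_id(0);
    a[gid] = a[gid] * factor;
}
""").build()                                      # build the program from source

a = np.arange(16, dtype=np.float32)
a_gpu = cl_array.to_device(queue, a)              # move input data to device memory
program.scale(queue, a.shape, None,               # deploy the kernel
              a_gpu.data, np.float32(2.0))
print(a_gpu.get())                                # move the output back to host memory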

MASSIVE

CUDA 5.5 and Python 2.7.8 are available on MASSIVE:

module load gcc/4.7.2 mpfr/3.1.2 gmp/5.1.1 cuda/5.5

See cuda directory.

CUDA device properties on MASSIVE (see http://en.wikipedia.org/wiki/Nvidia_Tesla for comparative properties):

Property                Value
Name                    Tesla M2070-Q
ComputeCapability       2
SupportsDouble          Yes
DriverVersion           5.5
ToolkitVersion          5
MaxThreadsPerBlock      1024
MaxShmemPerBlock        49152
MaxThreadBlockSize      [1024 1024 64]
MaxGridSize             [65535 65535 65535]
SIMDWidth               32
TotalMemory             5.6366e+09
FreeMemory              5.5217e+09
MultiprocessorCount     4
ClockRateKHz            1147000
GPUOverlapsTransfers    1
KernelExecutionTimeout  1
CanMapHostMemory        1
ComputeMode             Default

PyFFT cuda

pyfft3d.py
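
pyfft3d.py is not reproduced here; the core of a 3D transform with pyfft's CUDA backend looks roughly like the sketch below (based on the pyfft documentation, with placeholder data; not the actual script).

# Sketch of a 3D transform with pyfft's CUDA backend (placeholder data, not pyfft3d.py itself).
import numpy as np
import pycuda.driver as cuda
from pycuda.tools import make_default_context
import pycuda.gpuarray as gpuarray
from pyfft.cuda import Plan

cuda.init()
context = make_default_context()
stream = cuda.Stream()

# 256**3 here to keep the example light; the timings below use 512**3 volumes
data = (np.random.rand(256, 256, 256) + 1j * np.random.rand(256, 256, 256)).astype(np.complex64)
plan = Plan((256, 256, 256), normalize=True, stream=stream)

gpu_data = gpuarray.to_gpu(data)              # write
plan.execute(gpu_data.gpudata, inverse=True)  # exec (inverse transform)
result = gpu_data.get()                       # get

context.pop()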

The resulting normalised error should be around 3e-7, given the nature of single-precision floating point numbers.

512x512x512 3D images (user time):

       CPU (s)   GPU write+exec+get (ms)
fftn   19.9      278+107+281
fftn   20.2      278+75+280
ifftn  32.6      285+78+272
total  72.7 s    1935 ms (97.3% speedup)

Second run (total runtime):

       CPU (s)   GPU write+exec+get (ms)
fftn   22.7      294+130+682
fftn   25.9      292+130+681
ifftn  42.8      286+129+681
total  91.4 s    3305 ms (96.4% speedup)

1024x1024x1024 3D images: Does not work - memory overflow errors.

256x256x256 3D images:

       CPU (s)   GPU write+exec+get (ms)
fftn   2.65      37+15.9+84.3
fftn   2.56      37+15.5+84.8
ifftn  4.68      38.5+15.4+205
total  9.89 s    533.4 ms (94.6% speedup)

pyfft2dslice.py

Cutting down large volumes using 2D slices or slabs of 3D images.

Four-slab 3D FFT, 1024/4 = 256, i.e. (1024x1024x256) x 4. Reference matrix multiplication on CPU: 1 min 59 s (cold), 13.3 s (hot).

(1024x1024x256) x 4 3D images:

       CPU, 1024x1024x1024   GPU (write+exec+get) x 4 slabs (s), (1024x1024x256) x 4
fftn   3 min 49 s            1.78+.262+2.44 + 1.76+.261+2.47 + 1.71+.26+1.89 + 1.75+.26+1.87 = 16.713
fftn   3 min 53 s            1.75+.261+1.85 + 1.76+.261+1.85 + 1.69+.261+1.85 + 1.75+.26+1.84 = 15.383
ifftn  6 min 13 s            1.77+.261+1.9 + 1.79+.261+1.89 + 1.71+.261+1.9 + 1.77+.26+1.88 = 15.653
total                        48.2 s (92.7% speedup)

Run 2, (1024x1024x256) x 4 3D images:

       CPU, 1024x1024x1024   GPU (write+exec+get) x 4 slabs (s), (1024x1024x256) x 4
fftn   3 min 37 s            2.23+.263+1.83 + 2.22+.261+1.92 + 2.19+.262+1.94 + 2.23+.261+1.92 = 17.527
fftn   3 min 39 s            2.24+.261+1.83 + 2.24+.261+1.92 + 2.19+.261+1.94 + 2.22+.261+1.92 = 17.544
ifftn  6 min 20 s            2.54+.261+1.83 + 2.08+.261+1.92 + 2.04+.261+1.94 + 2.09+.261+1.91 = 17.394
total  13 min 36 s (835 s)   52.465 s (93.7% speedup)
Norm error: see the note below.

Note: the actual FT output has not been compared to the standard FT output.

Scikits.cuda

pip install --user scikits.cuda

Example: cufft.py
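
The tic() and toc() calls in the following snippets are local timing helpers rather than part of scikits.cuda; a minimal sketch consistent with the transcript output below:

# Assumed tic()/toc() timing helpers used throughout these snippets (a sketch).
import time

_t0 = [0.0]

def tic():
    _t0[0] = time.time()

def toc():
    elapsed = time.time() - _t0[0]
    print('Elapsed time is %s seconds.' % elapsed)
    return str(elapsed)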

Reconstruction from complex kspace

import numpy as np
import pycuda.autoinit
import pycuda.gpuarray as gpuarray
import scikits.cuda.fft as cu_fft

N = 512
batch_size = 512
print 'Testing recon ifft (in-place)..'
tic()
# random complex volume stands in for the k-space data
x = np.asarray(np.random.rand(batch_size, N, N), np.complex64)
x.imag = np.random.rand(batch_size, N, N)
x_gpu = gpuarray.to_gpu(x)
# xf_gpu = gpuarray.empty((batch_size, N, N), np.complex64)
plan = cu_fft.Plan((batch_size, N, N), np.complex64, np.complex64, 1)
toc()
tic()
cu_fft.ifft(x_gpu, x_gpu, plan, True)
toc()
tic()
y = np.fft.ifftn(x)
toc()
print 'Success status: ', np.allclose(y, x_gpu.get(), atol=1e-6)
In [48]: plan = cu_fft.Plan((batch_size, N, N), np.complex64, np.complex64, 1)

In [49]: toc()
Elapsed time is 14.3354101181 seconds.
Out[49]: '14.3354740143'

In [50]: tic()

In [51]: cu_fft.ifft(x_gpu, x_gpu, plan, True)

In [52]: toc()
Elapsed time is 3.01940393448 seconds.
Out[52]: '3.0195069313'

In [53]: tic()

In [54]: y = numpy.fft.ifftn(x)

In [55]: toc()
Elapsed time is 35.2300841808 seconds.
Out[55]: '35.2301981449'

In [56]: print 'Success status: ', np.allclose(y, x_gpu.get(), atol=1e-6)
Success status:  True

Speed test

Speed test 2: exp(-(a^2 + b^2 + c^2)), blocks = 6400, block_size = 128
Using nbr_values == 819200
Calculating 1000 iterations
SourceModule time and first three results: 0.053058s, [ 0.54412651 0.54412651 0.54412651]
Elementwise time and first three results: 0.204747s, [ 0.54412651 0.54412651 0.54412651]
Elementwise Python looping time and first three results: 0.300244s, [ 0.54412651 0.54412651 0.54412651]
GPUArray time and first three results: 3.665145s, [ 0.54412651 0.54412651 0.54412651]
CPU time and first three results: 31.908086s, [ 0.54412651 0.54412651 0.54412651]

Speed test 3: exp(-2*pi^2(a^2 + b^2 + c^2))

FFT

Reikna CUDA

Reikna IFFT time and first three results: 12.443403s, [ 3.83110513e-05 6.72008329e-05 9.83252550e-05]
Numpy IFFTN time and first three results: 43.439137s, [ 3.83110483e-05 6.72008469e-05 9.83252633e-05]
Norm difference: 2.22041709388e-07

Reikna OpenCL

import numpy as np
## Reikna (ksp is the complex k-space volume loaded by kspace_filter)
import pycuda.driver as drv
import pycuda.tools
import pycuda.autoinit

from reikna.cluda import dtypes, any_api
from reikna.fft import FFT
from reikna.core import Annotation, Type, Transformation, Parameter

# create two PyCUDA event timers so we can speed-test each approach
start = drv.Event()
end = drv.Event()
api = any_api()
thr = api.Thread.create()
N = 512

start.record()
data_dev = thr.to_device(ksp)
ifft = FFT(data_dev)
cifft = ifft.compile(thr)
# a forward FFT plus scaling, reversal and roll is used to obtain the inverse transform
cifft(data_dev, data_dev, inverse=0)
result = np.fft.fftshift(data_dev.get() / N**3)
result = result[::-1, ::-1, ::-1]
result = np.roll(np.roll(np.roll(result, 1, axis=2), 1, axis=1), 1, axis=0)
end.record()
end.synchronize()
secs = start.time_till(end) * 1e-3
print "Reikna IFFT time and first three results:"
print "%fs, %s" % (secs, str(np.abs(result[:3, 0, 0])))

start.record()
reference = np.fft.fftshift(np.fft.ifftn(ksp))
end.record()
end.synchronize()
secs = start.time_till(end) * 1e-3
print "Numpy IFFTN time and first three results:"
print "%fs, %s" % (secs, str(np.abs(reference[:3, 0, 0])))

print "Norm difference", np.linalg.norm(result - reference) / np.linalg.norm(reference)

Reikna IFFT time and first three results: 11.012042s, [ 3.83110513e-05 6.72008329e-05 9.83252550e-05]
Numpy IFFTN time and first three results: 43.887570s, [ 3.83110483e-05 6.72008469e-05 9.83252633e-05]

Norm difference 2.22041709388e-07

PyFFT OpenCL

from pyfft.cl import Plan
import pyopencl as cl
import pyopencl.array as cl_array

tic()
ctx = cl.create_some_context(interactive=False)
queue = cl.CommandQueue(ctx)
w = h = k = 512
plan = Plan((w,h,k), normalize=True, queue=queue)
gpu_data = cl_array.to_device(queue, ksp)
plan.execute(gpu_data.data, inverse=True) 
result = gpu_data.get()
result = np.fft.fftshift(result)
print "PyFFT OpenCL IFFT time and first three results:"
print "%s sec, %s" % (toc(), str(np.abs(result[:3,0,0])))

tic()
reference = np.fft.fftshift(np.fft.ifftn(ksp))
print "Numpy IFFTN time and first three results:"
print "%s sec, %s" % (toc(), str(np.abs(reference[:3,0,0])))


print "Calulating L1 norm "
print np.linalg.norm(result - reference) / np.linalg.norm(reference)

PyFFT OpenCL IFFT time and first three results: 4.00547480583 sec, [ 3.83110491e-05 6.72008246e-05 9.83252612e-05]

Numpy IFFTN time and first three results: 46.8406469822 sec, [ 3.83110483e-05 6.72008469e-05 9.83252633e-05]

Calculating L1 norm 4.21388290199e-07

Scikits CUDA

import pycuda.autoinit
import pycuda.gpuarray as gpuarray
import numpy as np

import scikits.cuda.fft as cu_fft

N = 512
batch_size = 512
tic()
x_gpu = gpuarray.to_gpu(ksp)
# batched 2D plan: 512 slices of NxN, not a single 3D transform
plan_inverse = cu_fft.Plan((N, N), np.complex64, np.complex64, batch_size)
cu_fft.ifft(x_gpu, x_gpu, plan_inverse, True)
result = np.fft.fftshift(x_gpu.get())
# result = np.fft.fftshift(data_dev.get() / N**3)
# result = result[::-1,::-1,::-1]
# result = np.roll(np.roll(np.roll(result,1,axis=2),1,axis=1),1,axis=0)
print "Scikits CUDA IFFT time and first three results:"
print "%s sec, %s" % (toc(), str(np.abs(result[:3,0,0])))

Scikits CUDA IFFT time and first three results: 4.06088709831 sec, [ 0.00219029 0.00188084 0.00227454]

Denoising in MATLAB

Non-local means denoising

See the README file in matlab/NLmeans.

About

Agilent 9.4T MR Image Processing and DICOM Converter App
