Python CUDA_Kernels.get_functions Examples

Programming Language: Python

Namespace/Package Name: kernels

Class/Type: CUDA_Kernels

Method/Function: get_functions

Examples at hotexamples.com: 2

Python CUDA_Kernels.get_functions - 2 examples found. These are real-world Python examples of kernels.CUDA_Kernels.get_functions extracted from open source projects.

Frequently Used Methods

get_function(3)

get_functions(1)

Example #1

File: cuda_functions.py Project: brodyh/dpmix

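import numpy as np
from pycuda.gpuarray import GPUArray, to_gpu

# Assumed import, based on the Namespace/Class listed above.
from kernels import CUDA_Kernels
# `devinfo` is a device-info object defined elsewhere in the project.


def gpu_sweep_row_mult(X, y):
    """ X * y = X down the rows """
    # Move inputs to the GPU as float32 unless they are already GPUArrays.
    if isinstance(X, GPUArray):
        gX = X
    else:
        gX = to_gpu(np.asarray(X, dtype=np.float32))

    if isinstance(y, GPUArray):
        gy = y
    else:
        gy = to_gpu(np.asarray(y, dtype=np.float32))

    # Scalar kernel arguments are passed as fixed-width NumPy integers.
    dims = np.asarray(gX.shape, dtype=np.int32)
    blocksize = 32 if devinfo.max_block_threads >= 1024 else 16

    gridsize = int(dims[0] // blocksize) + 1
    shared = int(4 * dims[1])  # bytes: one float32 per column

    # Pick the kernel that matches the memory layout of gX.
    # Note: the second branch calls get_functions (plural), as in the source.
    if gX.flags.c_contiguous:
        func = CUDA_Kernels.get_function("sweep_rows_mult")
    else:
        func = CUDA_Kernels.get_functions("sweep_rows_mult_cm")

    func(gX, gy, dims[0], dims[1], block=(blocksize, blocksize, 1),
         grid=(gridsize, 1), shared=shared)

    # Copy the result back to the host only if X came from the host.
    if not isinstance(X, GPUArray):
        X = gX.get()
    return X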
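
To check the GPU routine above against a known answer, a CPU reference can help. The following is a minimal sketch, not part of the dpmix project: it assumes, from the docstring and from the shared-memory size of 4 bytes per column, that `y` holds one float32 per column of `X` and that every row of `X` is multiplied elementwise by `y`. The name `sweep_row_mult_reference` is hypothetical.

```python
import numpy as np


def sweep_row_mult_reference(X, y):
    """CPU reference: multiply every row of X elementwise by the vector y.

    Assumption (not confirmed by the source): y has one entry per column of X.
    """
    X = np.asarray(X, dtype=np.float32)
    y = np.asarray(y, dtype=np.float32)
    if X.ndim != 2 or y.shape != (X.shape[1],):
        raise ValueError("expected a 2-D X and a y with one entry per column")
    # Broadcasting stretches y across the rows of X.
    return X * y


X_demo = np.arange(6, dtype=np.float32).reshape(2, 3)
y_demo = np.array([1.0, 10.0, 100.0], dtype=np.float32)
result = sweep_row_mult_reference(X_demo, y_demo)
print(result)
```

If the assumption holds, the output of `gpu_sweep_row_mult` on the same inputs could be compared with `np.allclose` against this reference.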

Example #2

def gpu_sweep_row_mult(X, y):
    """ X * y = X down the rows """
    # Move inputs to the GPU as float32 unless they are already GPUArrays.
    if isinstance(X, GPUArray):
        gX = X
    else:
        gX = to_gpu(np.asarray(X, dtype=np.float32))

    if isinstance(y, GPUArray):
        gy = y
    else:
        gy = to_gpu(np.asarray(y, dtype=np.float32))

    # Scalar kernel arguments are passed as fixed-width NumPy integers.
    dims = np.asarray(gX.shape, dtype=np.int32)
    blocksize = 32 if devinfo.max_block_threads >= 1024 else 16

    gridsize = int(dims[0] // blocksize) + 1
    shared = int(4 * dims[1])  # bytes: one float32 per column

    # Pick the kernel that matches the memory layout of gX.
    if gX.flags.c_contiguous:
        func = CUDA_Kernels.get_function("sweep_rows_mult")
    else:
        func = CUDA_Kernels.get_functions("sweep_rows_mult_cm")

    func(gX, gy, dims[0], dims[1], block=(blocksize, blocksize, 1),
         grid=(gridsize, 1), shared=shared)

    # Copy the result back to the host only if X came from the host.
    if not isinstance(X, GPUArray):
        X = gX.get()
    return X
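
The launch arithmetic used in the examples can be isolated and checked without a GPU. This is a sketch under stated assumptions: `launch_geometry` is a hypothetical helper, and `max_block_threads` stands in for the `devinfo.max_block_threads` attribute used above.

```python
def launch_geometry(nrows, ncols, max_block_threads):
    """Reproduce the block, grid and shared-memory sizes from the examples.

    Hypothetical helper; max_block_threads mirrors devinfo.max_block_threads.
    """
    # A 32x32 block needs 1024 threads; otherwise fall back to 16x16.
    blocksize = 32 if max_block_threads >= 1024 else 16
    # One extra block covers the remainder rows (and is spare when
    # nrows is an exact multiple of blocksize).
    gridsize = nrows // blocksize + 1
    # One float32 (4 bytes) per column.
    shared_bytes = 4 * ncols
    return (blocksize, blocksize, 1), (gridsize, 1), shared_bytes


block, grid, shared_bytes = launch_geometry(1000, 8, 1024)
print(block, grid, shared_bytes)
```

Because `nrows // blocksize + 1` launches a spare block whenever `nrows` divides evenly, the kernels presumably bounds-check the row index. A ceiling division, `(nrows + blocksize - 1) // blocksize`, would avoid the spare block.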