Python Tesseract.clear Examples

Programming language: Python

Namespace/Package Name: tesserwrap

Class/Type: Tesseract

Method/Function: clear

Examples at hotexamples.com: 4

Python Tesseract.clear - 4 examples found. These are the top-rated real-world Python examples of tesserwrap.Tesseract.clear, extracted from open source projects.

Frequently used methods

Tesseract(9)

ocr_image(7)

clear(4)

get_text(3)

set_page_seg_mode(3)

set_variable(3)

get_mean_confidence(2)

get_utf8_text(2)

set_image(2)

get_words(1)

Example #1

File: tesseract.py Project: CodeForAfrica/aleph

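from io import BytesIO

from PIL import Image
from PIL.Image import DecompressionBombWarning
from tesserwrap import Tesseract, PageSegMode

# get_config, IngestorException, get_languages_iso3, Cache and log are not
# defined in this excerpt; they presumably come from elsewhere in the project.


def extract_image_data(data, languages=None):
    """Extract text from a binary string of data."""
    tessdata_prefix = get_config('TESSDATA_PREFIX')
    if tessdata_prefix is None:
        raise IngestorException("TESSDATA_PREFIX is not set, OCR won't work.")
    languages = get_languages_iso3(languages)
    text = Cache.get_ocr(data, languages)
    if text is not None:
        return text
    try:
        # data is binary, so wrap it in a bytes buffer for PIL.
        img = Image.open(BytesIO(data))
    except DecompressionBombWarning as dce:
        log.debug("Image too large: %r", dce)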
        return None
    except IOError as ioe:
        log.info("Unknown image format: %r", ioe)
        return None
    # TODO: play with contrast and sharpening the images.
    extractor = Tesseract(tessdata_prefix, lang=languages)
    extractor.set_page_seg_mode(PageSegMode.PSM_AUTO_OSD)
    text = extractor.ocr_image(img)
    extractor.clear()
    log.debug('OCR done: %s, %s characters extracted',
              languages, len(text))
    Cache.set_ocr(data, languages, text)
    return text
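
In the example above, clear() is skipped if ocr_image() raises. One possible way to guarantee the call is a small context manager. The sketch below is only an illustration: cleared_after and FakeEngine are hypothetical names that are not part of tesserwrap, and FakeEngine merely stands in for a Tesseract instance so the snippet runs without the library installed.

```python
from contextlib import contextmanager


@contextmanager
def cleared_after(engine):
    """Yield engine and call engine.clear() on exit, even if an error occurs."""
    try:
        yield engine
    finally:
        engine.clear()


class FakeEngine(object):
    """Hypothetical stand-in for tesserwrap.Tesseract (illustration only)."""

    def __init__(self):
        self.cleared = False

    def ocr_image(self, img):
        # Pretend to run OCR; a real engine would return the recognized text.
        return "text from %s" % img

    def clear(self):
        self.cleared = True


fake = FakeEngine()
with cleared_after(fake) as extractor:
    result = extractor.ocr_image("page.png")
```

Any object that exposes a clear() method could be passed to the helper in the same way, assuming clear() is safe to call after a failed OCR run.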

Example #2

File: books.py Project: haf/making-the-computer-see-ndc-2014

def ocr_text(img):
    '''Perform OCR on the image.'''
    tr = Tesseract(lang='eng')
    tr.clear()
    pil_image = pil.Image.fromarray(img)
    tr.set_image(pil_image)
    utf8_text = tr.get_text()
    return utf8_text

Example #3

File: ocr.py Project: amnet04/ALECMAPREADER1

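def ocr(img, idioma):
    # Convert the array image to a PIL image for tesserwrap.
    ocr_img = Image.fromarray(img)
    # Use a name that does not shadow the enclosing function.
    engine = Tesseract(lang=idioma)
    engine.set_image(ocr_img)
    pattern = re.compile('[a-zA-Z0-9]')
    text = engine.get_utf8_text()
    # Keep only lines that contain at least one ASCII letter or digit;
    # this also drops empty lines.
    lines = [x for x in text.splitlines() if pattern.search(x)]
    engine.clear()
    return lines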
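
The line filtering in this example can be tried on its own, without an OCR engine. The following is a minimal sketch under that assumption: keep_alphanumeric_lines and the sample string are hypothetical and only mimic what OCR output might look like.

```python
import re

# Matches any ASCII letter or digit, as in the example above.
ALNUM = re.compile(r'[a-zA-Z0-9]')


def keep_alphanumeric_lines(text):
    """Return the lines of text that contain at least one ASCII letter or digit."""
    return [line for line in text.splitlines() if ALNUM.search(line)]


# Made-up sample with an empty line and a line of pure noise.
sample = "Bogota 1954\n\n~~ --\nmapa 12\n"
filtered = keep_alphanumeric_lines(sample)
```

Because an empty string never matches the pattern, a separate check for empty lines is not needed.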

Example #4

File: scratchpad.py Project: haf/making-the-computer-see-ndc-2014

def ocr_text(img):
    tr = Tesseract(lang='eng')
    tr.clear()
    pil_image = pil.Image.fromarray(img)
    # Turn off OCR word dictionaries
    tr.set_variable('load_system_dawg', "F")
    tr.set_variable('load_freq_dawg', "F")
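    # '-psm' is a command-line flag, not a variable name; the matching
    # Tesseract variable is tessedit_pageseg_mode.
    tr.set_variable('tessedit_pageseg_mode', "7") # treat image as single line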
    tr.set_variable('tessedit_char_whitelist', "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")
    tr.set_image(pil_image)
    utf8_text = tr.get_text()
    return unicode(utf8_text)