Python OutputGenerator.writeTextractOutputs Examples

Programming language: Python

Namespace/package name: og

Class/type: OutputGenerator

Method/function: writeTextractOutputs

Examples on hotexamples.com: 2

Python OutputGenerator.writeTextractOutputs - 2 examples found. These are top-rated real-world Python examples of og.OutputGenerator.writeTextractOutputs, extracted from open source projects.

Frequently used methods

OutputGenerator(15)

run(12)

indexDocument(5)

generateInsights(2)

writeTextractOutputs(2)

structurePageForm(1)

structurePageTable(1)

structurePageText(1)

Example #1

File: textract_processor.py Project: keshava/jarvis-be

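from og import OutputGenerator  # package name as listed above


# callTextract, textractBucketName and lineage_client are defined or imported
# elsewhere in the module and are not shown in this excerpt.
def processImage(documentId, bucketName, objectName, callerId):
    # Call the project's Textract helper on the source object.
    response = callTextract(bucketName, objectName)

    print("Generating output for documentId: {}".format(documentId))

    # Build the generator with forms and tables switched off.
    opg = OutputGenerator(documentId=documentId,
                          response=response,
                          bucketName=textractBucketName,
                          objectName=objectName,
                          forms=False,
                          tables=False)
    # Tagging string handed to writeTextractOutputs.
    tagging = "documentId={}".format(documentId)
    opg.writeTextractOutputs(taggingStr=tagging)

    # Record the source and target locations for this document.
    lineage_client.recordLineage({
        "documentId": documentId,
        "callerId": callerId,
        "sourceBucketName": bucketName,
        "targetBucketName": textractBucketName,
        "sourceFileName": objectName,
        "targetFileName": objectName
    })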
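
Both examples build the tagging string with a plain format call. The source does not show what writeTextractOutputs does with taggingStr; assuming it ends up as an S3 object tagging string (which S3 expects in URL query parameter form, such as Key1=Value1), a document ID containing characters like & or = would produce an ambiguous string. The following is a minimal sketch of one way to encode it safely. The build_tagging helper is hypothetical and not part of the og package.

```python
from urllib.parse import urlencode, parse_qs


def build_tagging(tags):
    """Encode a dict of tag keys and values as a URL query string.

    Hypothetical helper for illustration; not part of the og package.
    """
    return urlencode(tags)


# A simple ID encodes to the same string the examples build by hand.
tagging = build_tagging({"documentId": "doc-123"})
print(tagging)

# An ID with reserved characters is escaped and decodes back unchanged.
tricky = build_tagging({"documentId": "a b&c"})
print(parse_qs(tricky))
```

The result could then be passed as taggingStr=tagging in place of the hand-formatted string, provided the assumption about how the string is consumed holds.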

Example #2

File: textract_processor.py Project: keshava/jarvis-be

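# pipeline_client, lineage_client, getJobResults, textractBucketName and
# PIPELINE_STAGE are defined or imported elsewhere in the module and are
# not shown in this excerpt.
def processRequest(request):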

    output = ""
    status = request['jobStatus']
    jobId = request['jobId']
    jobTag = request['jobTag']
    jobAPI = request['jobAPI']
    bucketName = request['bucketName']
    objectName = request['objectName']

    pipeline_client.body = {
        "documentId": jobTag,
        "bucketName": bucketName,
        "objectName": objectName,
        "stage": PIPELINE_STAGE
    }
    if status == 'FAILED':
        pipeline_client.stageFailed(
            "Textract Analysis didn't complete successfully")
        raise Exception(
            "Textract job for document ID {}; bucketName {} fileName {}; failed during Textract analysis. Please double check the document quality"
            .format(jobTag, bucketName, objectName))

    pipeline_client.stageInProgress()
    try:
        pages = getJobResults(jobAPI, jobId)
    except Exception:
        # Mark the stage as failed, then re-raise the original exception.
        pipeline_client.stageFailed()
        raise

    print("Result pages received: {}".format(len(pages)))

    # Forms and tables are enabled only for StartDocumentAnalysis jobs.
    detectForms = detectTables = (jobAPI == "StartDocumentAnalysis")

    try:
        opg = OutputGenerator(documentId=jobTag,
                              response=pages,
                              bucketName=textractBucketName,
                              objectName=objectName,
                              forms=detectForms,
                              tables=detectTables)
    except Exception:
        pipeline_client.stageFailed(
            "Could not convert results from Textract into processable object. Try uploading again."
        )
        raise

    tagging = "documentId={}".format(jobTag)
    opg.writeTextractOutputs(taggingStr=tagging)

    lineage_client.recordLineage({
        "documentId": jobTag,
        "callerId": request["callerId"],
        "sourceBucketName": bucketName,
        "targetBucketName": textractBucketName,
        "sourceFileName": objectName,
        "targetFileName": objectName
    })

    output = "Processed -> Document: {}, Object: {}/{} processed.".format(
        jobTag, bucketName, objectName)
    pipeline_client.stageSucceeded()
    print(output)
    return {'statusCode': 200, 'body': output}
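
The second example repeats one pattern twice: report the failure to the pipeline client, then re-raise. It also derives the forms and tables flags from the job API name. The sketch below isolates both ideas so they can be run without AWS. StubPipelineClient, select_features and run_stage are hypothetical names introduced here for illustration; only the method names stageInProgress, stageFailed and stageSucceeded come from the example above, and their real behavior is not shown in the source.

```python
class StubPipelineClient:
    """Hypothetical stand-in for pipeline_client that records stage calls."""

    def __init__(self):
        self.events = []

    def stageInProgress(self):
        self.events.append("in_progress")

    def stageFailed(self, message=None):
        self.events.append(("failed", message))

    def stageSucceeded(self):
        self.events.append("succeeded")


def select_features(job_api):
    """Return (forms, tables) flags for a Textract job API name."""
    is_analysis = job_api == "StartDocumentAnalysis"
    return is_analysis, is_analysis


def run_stage(client, fetch_results):
    """Run one fetch step, reporting progress and failure to the client."""
    client.stageInProgress()
    try:
        pages = fetch_results()
    except Exception:
        client.stageFailed("could not fetch results")
        raise  # bare raise re-raises the active exception unchanged
    client.stageSucceeded()
    return pages


client = StubPipelineClient()
pages = run_stage(client, lambda: ["page-1", "page-2"])
print(len(pages), client.events)
print(select_features("StartDocumentAnalysis"))
print(select_features("StartDocumentTextDetection"))
```

Unlike the original, this sketch marks the stage as succeeded right after the fetch; in the example above that call happens only after the outputs are written and lineage is recorded.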