Python PDFPageInterpreter.process_page Examples

Programming language: Python

Namespace/package name: pdflib.pdfinterp

Class/type: PDFPageInterpreter

Method/function: process_page

Examples on hotexamples.com: 4

Python PDFPageInterpreter.process_page - 4 code examples found. These are real-world uses of pdflib.pdfinterp.PDFPageInterpreter.process_page in Python, extracted from open source projects and selected from the top-rated examples. Rating the examples helps higher-quality ones surface.

Example #1

    def convert(self, data):
        # Wrap the binary PDF data in a file-like object
        pdfdata = StringIO(data)

        # Point CMapDB at the 'CMap' and 'CDBCMap' directories; this is
        # apparently required before fonts that rely on CMaps can be decoded
        CMapDB.initialize('CMap', 'CDBCMap')

        # Create the resource manager and the text converter (output device)
        rsrc = PDFResourceManager()
        converter = TextConverter(rsrc)

        # Set up the parser; constructing it presumably links the document
        # to the data stream, since 'parser' itself is not used again
        doc = PDFDocument()
        parser = PDFParser(doc, pdfdata)

        # Initialize the document with an empty password
        try:
            doc.initialize('')
        except PDFPasswordIncorrect:
            return ''

        # Give up if the document does not permit text extraction
        if not doc.is_extractable:
            return ''

        # Feed every page through the interpreter, which drives the converter
        interpreter = PDFPageInterpreter(rsrc, converter)
        for page in doc.get_pages():
            interpreter.process_page(page)

        converter.close()
        pdfdata.close()

        return converter.get_text()

Example #2

File: pdfparser.py Project: Big-Data/pypes

    def convert(self, data):
        # Wrap the binary PDF data in a file-like object
        pdfdata = StringIO(data)

        # Point CMapDB at the 'CMap' and 'CDBCMap' directories; this is
        # apparently required before fonts that rely on CMaps can be decoded
        CMapDB.initialize('CMap', 'CDBCMap')

        # Create the resource manager and the text converter (output device)
        rsrc = PDFResourceManager()
        converter = TextConverter(rsrc)

        # Set up the parser; constructing it presumably links the document
        # to the data stream, since 'parser' itself is not used again
        doc = PDFDocument()
        parser = PDFParser(doc, pdfdata)

        # Initialize the document with an empty password
        try:
            doc.initialize('')
        except PDFPasswordIncorrect:
            return ''

        # Give up if the document does not permit text extraction
        if not doc.is_extractable:
            return ''

        # Feed every page through the interpreter, which drives the converter
        interpreter = PDFPageInterpreter(rsrc, converter)
        for page in doc.get_pages():
            interpreter.process_page(page)

        converter.close()
        pdfdata.close()

        return converter.get_text()
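
Both early returns in the example above leave the method before the close() calls at the end are reached. The following is a minimal sketch of one way to guarantee cleanup, assuming only that the objects involved expose a close() method. RecordingDevice and convert_with_cleanup are hypothetical names used for illustration and are not part of pdflib; the page-processing step is replaced by a placeholder.

```python
from contextlib import closing
from io import BytesIO


class RecordingDevice(object):
    """Hypothetical stand-in for a converter/device; not part of pdflib."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def convert_with_cleanup(data, device, extractable=True):
    """Mimic the early-return flow of convert() while always closing resources."""
    # closing() calls .close() when the block exits, including on early return.
    with closing(BytesIO(data)) as pdfdata, closing(device):
        if not extractable:
            return ''
        # Placeholder for the real per-page processing step.
        return pdfdata.read().decode('ascii')
```

With this shape, the result is computed inside the with block, so nothing needs to be read from the device after it has been closed.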

Example #3

File: pdf2txt.py Project: Big-Data/pypes

def convert(rsrc, device, fname, pagenos=None, maxpages=0, password=''):
  doc = PDFDocument()
  # open() replaces the Python 2-only file() builtin
  fp = open(fname, 'rb')
  # Constructing the parser presumably links doc to fp; 'parser' is not used again
  parser = PDFParser(doc, fp)
  try:
    doc.initialize(password)
  except PDFPasswordIncorrect:
    raise TextExtractionNotAllowed('Incorrect password')
  if not doc.is_extractable:
    raise TextExtractionNotAllowed('Text extraction is not allowed: %r' % fname)
  interpreter = PDFPageInterpreter(rsrc, device)
  for (pageno,page) in enumerate(doc.get_pages()):
    # Skip pages that were not requested (pagenos holds zero-based indices)
    if pagenos and (pageno not in pagenos):
      continue
    interpreter.process_page(page)
    # Stop once the page index reaches maxpages (0 means no limit)
    if maxpages and maxpages <= pageno+1:
      break
  device.close()
  fp.close()
  return

Example #4

def convert(rsrc, device, fname, pagenos=None, maxpages=0, password=''):
    doc = PDFDocument()
    # open() replaces the Python 2-only file() builtin
    fp = open(fname, 'rb')
    # Constructing the parser presumably links doc to fp; 'parser' is not
    # used again
    parser = PDFParser(doc, fp)
    try:
        doc.initialize(password)
    except PDFPasswordIncorrect:
        raise TextExtractionNotAllowed('Incorrect password')
    if not doc.is_extractable:
        raise TextExtractionNotAllowed('Text extraction is not allowed: %r' %
                                       fname)
    interpreter = PDFPageInterpreter(rsrc, device)
    for (pageno, page) in enumerate(doc.get_pages()):
        # Skip pages that were not requested (zero-based indices)
        if pagenos and (pageno not in pagenos):
            continue
        interpreter.process_page(page)
        # Stop once the page index reaches maxpages (0 means no limit)
        if maxpages and maxpages <= pageno + 1:
            break
    device.close()
    fp.close()
    return
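
The page-selection logic inside the loop of the last two examples can be looked at in isolation. The sketch below reproduces it as a generator over any iterable of pages; iter_selected_pages is a hypothetical helper name, not a pdflib function. One detail worth noticing is that maxpages is compared against the page index rather than the number of pages actually processed, and the check only runs after a selected page.

```python
def iter_selected_pages(pages, pagenos=None, maxpages=0):
    """Yield (pageno, page) pairs using the same rules as the convert() loop.

    pagenos:  optional collection of zero-based page indices to keep;
              an empty or missing collection keeps every page.
    maxpages: stop after a selected page whose index + 1 reaches this
              value; 0 means no limit.
    """
    for pageno, page in enumerate(pages):
        # Skip pages that were not requested.
        if pagenos and pageno not in pagenos:
            continue
        yield pageno, page
        # Same stop condition as the original loop, checked after the page
        # has been handed to the caller.
        if maxpages and maxpages <= pageno + 1:
            break
```

A caller could then write `for pageno, page in iter_selected_pages(doc.get_pages(), pagenos, maxpages): interpreter.process_page(page)`, assuming `doc` and `interpreter` are set up as in the examples.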