Python process_file examples

Programming language: Python

Namespace/package name: pdfannots

Method/function: process_file

Examples at hotexamples.com: 4

Python process_file - 4 examples found. These are top-rated real-world Python examples of pdfannots.process_file, extracted from open source projects.

Example #1

File: tests.py Project: DenisSouth/pdfannots

    def setUp(self):
        pdfannots.COLUMNS_PER_PAGE = self.columns_per_page

        path = pathlib.Path(__file__).parent / 'tests' / self.filename
        with path.open('rb') as f:
            (annots, outlines) = pdfannots.process_file(f)
            self.annots = annots
            self.outlines = outlines

Example #2

def get_annots(p: Path) -> List[Annotation]:
    b = time.time()
    with p.open('rb') as fo:
        doc = pdfannots.process_file(fo, emit_progress_to=None)
        annots = list(doc.iter_annots())
        # the document also has outlines (a kind of TOC); they are not needed here
    a = time.time()
    took = a - b
    tooks = f'took {took:0.1f} seconds'
    if took > 5:
        tooks = tooks.upper()
    logger.debug('extracting %s %s: %d annotations', tooks, p, len(annots))
    return [_as_annotation(raw=a, path=str(p)) for a in annots]
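
The examples on this page show two different return shapes for process_file: Examples #1, #3 and #4 unpack an (annots, outlines) tuple, while Example #2 calls iter_annots() on a returned document object. The following is a minimal sketch of a helper that accepts either shape. The names collect_annots and FakeDocument are hypothetical and exist only for illustration; FakeDocument is a stand-in so the sketch runs without a PDF, and it is not part of pdfannots.

```python
from typing import Any, Iterable, List


def collect_annots(result: Any) -> List[Any]:
    """Return a list of annotations from either return shape seen above."""
    # Document-style result: an object exposing iter_annots() (as in Example #2).
    if hasattr(result, "iter_annots"):
        return list(result.iter_annots())
    # Tuple-style result: (annots, outlines), as unpacked in the other examples.
    annots, _outlines = result
    return list(annots)


class FakeDocument:
    """Hypothetical stand-in for the returned document; not a pdfannots class."""

    def __init__(self, annots: Iterable[Any]) -> None:
        self._annots = list(annots)

    def iter_annots(self):
        return iter(self._annots)


# Both shapes yield the same list of annotations.
from_document = collect_annots(FakeDocument(["a1", "a2"]))
from_tuple = collect_annots((["a1", "a2"], ["outline"]))
print(from_document == from_tuple)
```

In real use the argument would be whatever pdfannots.process_file returns for the installed version; which shape that is depends on the version, so check it before relying on either branch.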

Example #3

File: pdfs.py Project: seanbreckenridge/HPI-fork

def get_annots(p: Path) -> List[Annotation]:
    b = time.time()
    with p.open('rb') as fo:
        f = io.StringIO()
        with redirect_stderr(f):
            # FIXME
            (annots, outlines) = pdfannots.process_file(fo, emit_progress=False)
            # outlines are kinda like TOC, I don't really need them
    a = time.time()
    took = a - b
    tooks = f'took {took:0.1f} seconds'
    if took > 5:
        tooks = tooks.upper()
    logger.debug('extracting %s %s: %d annotations', tooks, p, len(annots))
    return [as_annotation(raw_ann=a, path=str(p)) for a in annots]
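
The pattern in this example (time a call while capturing anything it writes to stderr) can be tried without pdfannots. The following is a minimal sketch using only the standard library. The functions noisy_step and run_quietly are hypothetical; noisy_step merely stands in for a call that reports progress on stderr.

```python
import io
import sys
import time
from contextlib import redirect_stderr


def noisy_step() -> int:
    """Hypothetical stand-in for a call that reports progress on stderr."""
    print("page 1 of 1", file=sys.stderr)
    return 3


def run_quietly(func):
    """Run func; return (result, captured stderr text, elapsed seconds)."""
    buffer = io.StringIO()
    start = time.perf_counter()
    # Anything written to sys.stderr inside this block goes to the buffer.
    with redirect_stderr(buffer):
        result = func()
    elapsed = time.perf_counter() - start
    return result, buffer.getvalue(), elapsed


count, captured, elapsed = run_quietly(noisy_step)
print(f"{count} items, took {elapsed:0.1f} seconds")
```

Note that redirect_stderr only swaps the Python-level sys.stderr object, so output written directly to the underlying file descriptor (for example by a C extension) would not be captured by this approach.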

Example #4

File: using_annots.py Project: DenisSouth/pdfannots

from pdfannots import process_file, PrettyPrinter
from colr import color as term_color
from collections import Counter

input_path = r"tests\hotos17.pdf"
# input_path = r"tests\issue9.pdf"
# input_path = r"tests\issue13.pdf"
# input_path = r"tests\pr24.pdf"

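# Open the PDF in a context manager so the file handle is closed afterwards.
with open(input_path, 'rb') as pdf_file:
    annots, outlines = process_file(pdf_file, emit_progress=True)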

pp = PrettyPrinter(outlines, wrapcol=None, condense=True)
data = pp.return_all(annots)

all_ct = list(tuple([item.tagname] + item.selection_colour) for item in data)

classes = {}
for index, data_ in enumerate(Counter(all_ct).most_common()):
    rgb = data_[0][1:]
    class_n = data_[0]
    cnt = data_[1]
    classes[class_n] = index
    print("class:", index, "count:", cnt,
          term_color(class_n, fore=(0, 0, 0), back=rgb))
print()

for d in data:
    key = tuple([d.tagname] + d.selection_colour)
    to_print = f"class:     :  {classes[key]}" \
           f"\ntext       :{d.text}" \
           f"\ncomment    :{d.comment}" \