Python Document Examples

Programming language: Python

Namespace/package name: bibim.document.document

Class/type: Document

Number of code examples on hotexamples.com: 6

Python Document - 6 code examples found. These are the top-rated real-world Python examples of bibim.document.document.Document, all extracted from open source projects.

Frequently used methods

Document(3)

content(1)

get_metadata_field(1)

set_metadata_field(1)
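
The methods listed above can be tried without installing the project. The following is a minimal sketch, assuming Document simply stores its text content and keeps metadata in a dictionary. The class below is a hypothetical stand-in inferred from the tests on this page, not the actual bibim implementation.

```python
class Document(object):
    """Hypothetical stand-in for bibim.document.document.Document."""

    def __init__(self):
        # Plain attribute holding the extracted text.
        self.content = ''
        # Assumption: metadata is kept as a name -> value mapping.
        self._metadata = {}

    def set_metadata_field(self, name, value):
        self._metadata[name] = value

    def get_metadata_field(self, name):
        # Returns None for unknown fields (an assumption of this sketch).
        return self._metadata.get(name)

    @property
    def available_metadata(self):
        # A list, so callers can use len() and count() as the tests do.
        return list(self._metadata.keys())


document = Document()
document.set_metadata_field('Name', 'Document name')
document.content = 'Some text content'
```

Exposing available_metadata as a property matches how the tests on this page read it (without calling it), but the real class may be organised differently.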

Code example #1

File: extraction.py Project: Alex-Linhares/bibtexIndexMaker

    def extract(self, input_file):
        input_file = self._check_input_file(input_file)

        document = Document()
        # Read the whole file; the with block closes it even if read() fails.
        with open(input_file) as source:
            document.content = source.read()

        return document

Code example #2

import unittest

from bibim.document.document import Document


class TestPDFTextExtractor(unittest.TestCase):
    def setUp(self):
        self.document = Document()

    def tearDown(self):
        pass

    def test_metadata_fields(self):
        self.document.set_metadata_field('Name', 'Document name')
        # assertEqual replaces the deprecated failUnless alias.
        self.assertEqual(
            self.document.get_metadata_field('Name'), 'Document name')

    def test_available_metadata(self):
        self.document.set_metadata_field('Name', 'Document name')
        self.document.set_metadata_field('CreationDate', 'Today')
        fields = self.document.available_metadata
        self.assertEqual(len(fields), 2)
        self.assertEqual(fields.count('Name'), 1)
        self.assertEqual(fields.count('CreationDate'), 1)

    def test_content(self):
        self.document.content = "Some text content"
        self.assertEqual(self.document.content, "Some text content")

Code example #3

File: test_document.py Project: rxuriguera/bibtexIndexMaker

import unittest

from bibim.document.document import Document


class TestPDFTextExtractor(unittest.TestCase):

    def setUp(self):
        self.document = Document()

    def tearDown(self):
        pass

    def test_metadata_fields(self):
        self.document.set_metadata_field('Name', 'Document name')
        # assertEqual replaces the deprecated failUnless alias.
        self.assertEqual(self.document.get_metadata_field('Name'),
                         'Document name')

    def test_available_metadata(self):
        self.document.set_metadata_field('Name', 'Document name')
        self.document.set_metadata_field('CreationDate', 'Today')
        fields = self.document.available_metadata
        self.assertEqual(len(fields), 2)
        self.assertEqual(fields.count('Name'), 1)
        self.assertEqual(fields.count('CreationDate'), 1)

    def test_content(self):
        self.document.content = "Some text content"
        self.assertEqual(self.document.content, "Some text content")

Code example #4

File: extraction.py Project: Alex-Linhares/bibtexIndexMaker

    def extract(self, input_file):
        input_file = self._check_input_file(input_file)
        # Extraction command and its options. They may be parametrized in the
        # future
        command = [self._pdf_extraction_tool, '-q', '-f', '1', '-l', '2',
                   '-enc', 'ASCII7', '-htmlmeta', input_file, '-']
        try:
            pop = subprocess.Popen(command, stdout=subprocess.PIPE)
        except OSError:
            # Popen raises OSError when the tool cannot be started. Stop here,
            # because there is no process to read from.
            log.error('PDF extraction tool not found')
            raise ExtractionError('PDF extraction tool not found')

        stdout = pop.communicate()[0]
        # Popen never raises CalledProcessError, so check the exit status
        # once the process has finished.
        if pop.returncode != 0:
            log.error('Error executing PDF text extraction tool. Return code: '
                      + repr(pop.returncode))
        if not stdout:
            raise ExtractionError('Corrupted file')

        parser = BeautifulSoup(stdout)
        document = Document()
        self._extract_metadata(parser, document)
        self._extract_content(parser, document)

        return document
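
The pattern used in the example above (start an external tool, capture its standard output, and fail cleanly when the tool is missing or prints nothing) can be sketched on its own. This is a minimal illustration under stated assumptions: run_tool is a hypothetical helper, ExtractionError is a stand-in for the project's exception class, and the Python interpreter is used in place of the PDF extraction tool so the snippet runs anywhere.

```python
import subprocess
import sys


class ExtractionError(Exception):
    """Hypothetical stand-in for the project's ExtractionError."""


def run_tool(command):
    """Run command and return its standard output as bytes."""
    try:
        process = subprocess.Popen(command, stdout=subprocess.PIPE)
    except OSError:
        # Raised when the executable is missing or cannot be started.
        raise ExtractionError('Extraction tool not found')
    # communicate() waits for the process and sets process.returncode.
    stdout = process.communicate()[0]
    if process.returncode != 0 or not stdout:
        raise ExtractionError('Tool failed or produced no output')
    return stdout


# The interpreter stands in for the real extraction tool here.
output = run_tool([sys.executable, '-c', 'print("hello")'])
```

Raising inside the except block means the caller never reaches communicate() with an undefined process object, which is the main hazard when the error is only logged.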

Code example #5

File: test_document.py Project: rxuriguera/bibtexIndexMaker

    def setUp(self):
        self.document = Document()

Code example #6

    def setUp(self):
        self.document = Document()