Python get_filepaths Examples

Programming language: Python

Namespace/package name: textacy.io

Method/function: get_filepaths

Examples on hotexamples.com: 6

Python get_filepaths - 6 examples found. These are the top-rated real-world Python examples of textacy.io.get_filepaths, extracted from open source projects.
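To make the keyword arguments used in the examples below easier to follow, here is a minimal standard-library sketch of a function with similar filtering behavior. It is not textacy's source code: the names `list_filepaths` and `make_demo_dir` are hypothetical, and the sketch assumes that both regexes are applied to the bare file name with `re.search` and that `extension` is compared against the final suffix including the dot.

```python
import os
import re
import tempfile
from typing import Iterator, Optional


def list_filepaths(
    dirpath: str,
    *,
    match_regex: Optional[str] = None,
    ignore_regex: Optional[str] = None,
    extension: Optional[str] = None,
    ignore_invisible: bool = True,
    recursive: bool = False,
) -> Iterator[str]:
    """Hypothetical stand-in: yield paths of files under dirpath that pass every filter."""
    keep = re.compile(match_regex) if match_regex else None
    drop = re.compile(ignore_regex) if ignore_regex else None

    def wanted(fname: str) -> bool:
        if ignore_invisible and fname.startswith("."):
            return False
        if keep and not keep.search(fname):
            return False
        if drop and drop.search(fname):
            return False
        if extension and os.path.splitext(fname)[1] != extension:
            return False
        return True

    if recursive:
        for root, dirnames, fnames in os.walk(dirpath):
            if ignore_invisible:
                # Prune hidden directories in place so os.walk does not descend into them.
                dirnames[:] = [d for d in dirnames if not d.startswith(".")]
            for fname in fnames:
                if wanted(fname):
                    yield os.path.join(root, fname)
    else:
        for entry in os.scandir(dirpath):
            if entry.is_file() and wanted(entry.name):
                yield entry.path


def make_demo_dir() -> str:
    """Create a throwaway directory with a few illustrative (made-up) files."""
    root = tempfile.mkdtemp()
    rel_paths = (
        "test_io.py",
        "test_utils.py",
        "notes.txt",
        ".hidden",
        os.path.join("sub", "ud-train.txt"),
    )
    for rel in rel_paths:
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w", encoding="utf-8") as f:
            f.write("x")
    return root
```

Like the function in the examples, the sketch is a generator, so callers wrap it in `list(...)` or `sorted(...)` when they need a length or a stable order.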

Code example #1

File: _datasets.py Project: dbragdon1/textacy

    def load(self, langs: Set[str], min_len: int = 25) -> List[Tuple[str, str]]:
        """
        Args:
            langs: Language codes to load; files for any other language are skipped.
            min_len: Minimum text length in *chars* for a given example to be included.

        Returns:
            Sequence of (text, lang) examples.
        """
        data = []
        match_regex = r"ud-(train|test|dev)\.txt"
        for fpath in tio.get_filepaths(
            self.data_dir, match_regex=match_regex, recursive=True
        ):
            fname = pathlib.Path(fpath).name
            lang, _ = fname.split("_", maxsplit=1)
            if lang not in langs:
                continue

            with open(fpath, mode="rt") as f:
                text = f.read()
            if "\n" in text:
                data.extend(
                    (text_segment, lang)
                    for text_segment in re.split(r"\n+", text)
                    if len(text_segment) >= min_len
                )
            else:
                data.extend(
                    (text_segment, lang)
                    for text_segment in _randomly_segment_text(text, (50, 1000))
                    if len(text_segment) >= min_len
                )
        LOGGER.info("loaded TatoebaDataset data:\n%s ...", data[:3])
        return data
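The loader in example #1 relies on two small steps: reading the language code from the part of the file name before the first underscore, and splitting the text on runs of newlines while dropping short segments. The sketch below isolates both steps. The helper names and the sample file name are hypothetical and are not part of textacy.

```python
import pathlib
import re
from typing import List, Tuple


def lang_from_filename(fpath: str) -> str:
    """Return the text before the first underscore in the file name."""
    fname = pathlib.Path(fpath).name
    # Raises ValueError if the name contains no underscore, as the original unpacking would.
    lang, _ = fname.split("_", maxsplit=1)
    return lang


def segment_by_newlines(text: str, lang: str, min_len: int = 25) -> List[Tuple[str, str]]:
    """Split on one or more newlines and keep segments of at least min_len chars."""
    return [
        (segment, lang)
        for segment in re.split(r"\n+", text)
        if len(segment) >= min_len
    ]


# Illustrative file name only; real data files may be named differently.
example_lang = lang_from_filename("/data/en_ewt-ud-train.txt")
```

Because the length filter runs after splitting, blank lines and short headings are dropped without any extra handling.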

Code example #2

File: test_io.py Project: psds01/textacy

    def test_get_filepaths(self):
        expected = sorted(
            os.path.join(TESTS_DIR, fname)
            for fname in os.listdir(TESTS_DIR)
            if os.path.isfile(os.path.join(TESTS_DIR, fname))
        )
        observed = sorted(
            io.get_filepaths(TESTS_DIR, ignore_invisible=False, recursive=False)
        )
        assert observed == expected

Code example #3

File: test_io.py Project: neilaconway/textacy

    def test_get_filepaths_ignore_regex(self):
        paths = list(
            io.get_filepaths(TESTS_DIR, ignore_regex="test_", ignore_invisible=True)
        )
        assert len(paths) == 0

Code example #4

File: test_io.py Project: psds01/textacy

    def test_get_filepaths_match_regex(self):
        paths = list(io.get_filepaths(TESTS_DIR, match_regex="io", extension=".py"))
        assert len(paths) == 1

Code example #5

File: test_io.py Project: psds01/textacy

    def test_get_filepaths_ignore_invisible(self):
        dirpath = os.path.dirname(os.path.abspath(__file__))
        visible = list(io.get_filepaths(dirpath, ignore_invisible=True))
        everything = list(io.get_filepaths(dirpath, ignore_invisible=False))
        assert len(visible) <= len(everything)

Code example #6

File: test_io.py Project: dbragdon1/textacy

    def test_get_filepaths_match_regex(self):
        result = list(io.get_filepaths(TESTS_DIR, match_regex="_io", extension=".py"))
        assert len(result) == 1
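Examples #4 and #6 differ only in the pattern, "io" versus "_io". Assuming the pattern is applied to each file name as an unanchored search (an assumption about the library's behavior, not something the examples themselves confirm), the shorter pattern can match more files than intended. The sketch below shows the difference with `re.search` on hypothetical file names.

```python
import re
from typing import List

# Hypothetical file names, chosen only to illustrate the difference.
fnames = ["test_io.py", "test_vectorization.py", "test_utils.py"]


def matching(pattern: str, names: List[str]) -> List[str]:
    """Return the names in which the pattern occurs anywhere."""
    return [name for name in names if re.search(pattern, name)]


loose = matching("io", fnames)    # also hits the "io" inside "vectorization"
strict = matching("_io", fnames)  # requires the underscore before "io"
```

Under that assumption, a test asserting exactly one match stays stable with the stricter pattern even when a later file name happens to contain "io".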