def tokenize_all_sentences_in_directory(self, directory) -> list[str]:
    """Read every file in the directory and split its contents into sentences."""
    sentences = []
    for file in FileUtil.get_files_in_directory(directory):
        text = FileUtil.read_textfile_into_string(file, self._dataset.encoding())
        # sent_tokenize defaults to its English model; switch to the Italian
        # model when the dataset is Italian.
        if self._italian:
            sentences += sent_tokenize(text, language="italian")
        else:
            sentences += sent_tokenize(text)
    return sentences
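The sentence splitting above delegates to NLTK's sent_tokenize, which loads a pretrained Punkt model per language. A minimal standalone sketch, with the text inlined here instead of read via FileUtil:

import nltk
from nltk.tokenize import sent_tokenize

nltk.download("punkt")  # one-time download of the Punkt sentence models

text = "Il gatto dorme. Il cane abbaia."
print(sent_tokenize(text, language="italian"))
# -> ['Il gatto dorme.', 'Il cane abbaia.']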
def iterate_files(tokenizer, preprocessor, folder):
    for file in FileUtil.get_files_in_directory(folder, True):
        file_representation = tokenizer.tokenize(file)
        file_representation.preprocess(preprocessor)
        for word in file_representation.token_list:
            # spaCy may split a word into several tokens, each with its own
            # lemma; join them into a single lemma string in that case.
            lemma = [token.lemma_ for token in spacy_lemmatizer(word)]
            if len(lemma) > 1:
                log.info(f"More than one lemma {lemma} for \"{word}\". "
                         f"Using \"{''.join(lemma)}\" as lemma")
            lemma = "".join(lemma)
            if word in word_to_lemma_map:
                if word_to_lemma_map[word] != lemma:
                    log.info(f"Different duplicate lemma for {word}: "
                             f"{word_to_lemma_map[word]} <-> {lemma}")
            else:
                word_to_lemma_map[word] = lemma
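Here spacy_lemmatizer is presumably a loaded spaCy pipeline. A small sketch of how the word-to-lemma map is built; the model name it_core_news_sm is an assumption (any spaCy pipeline with a lemmatizer works), and the printed lemmas are illustrative:

import spacy

spacy_lemmatizer = spacy.load("it_core_news_sm")  # assumed model

word_to_lemma_map = {}
for word in ["andando", "case"]:
    # Each word is run through the pipeline on its own, mirroring the loop above.
    lemma = "".join(token.lemma_ for token in spacy_lemmatizer(word))
    word_to_lemma_map.setdefault(word, lemma)

print(word_to_lemma_map)  # e.g. {'andando': 'andare', 'case': 'casa'}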
def embedd_all_files_in_directory(self, directory):
    all_filenames = FileUtil.get_files_in_directory(directory)
    all_embeddings = []
    for filename in all_filenames:
        try:
            file_representation = self._tokenize_and_preprocess(filename)
        except (FileNotFoundError, IsADirectoryError, PermissionError, UnicodeDecodeError) as e:
            log.info(f"SKIPPED: Error on reading or tokenizing {filename}: {e}")
            continue
        except JavaSyntaxError as j:
            log.info(f"SKIPPED: JavaSyntaxError on tokenizing {filename} "
                     f"(Note: code files need to be compilable): {j.at}")
            continue
        except (JavaParserError, LexerError) as j:
            log.info(f"SKIPPED: Error on tokenizing {filename} "
                     f"(Note: code files need to be compilable): {j}")
            continue
        file_embedding = self._create_embeddings(file_representation)
        if file_embedding:
            all_embeddings.append(file_embedding)
        else:
            log.info(f"No embedding for {filename}")
    return all_embeddings
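The Java-specific exceptions match those of the javalang library: its parser raises JavaSyntaxError (which carries the offending position in .at) and JavaParserError, and its tokenizer raises LexerError. Assuming that is the parser behind _tokenize_and_preprocess, a sketch of triggering and catching them:

import javalang
from javalang.parser import JavaParserError, JavaSyntaxError
from javalang.tokenizer import LexerError

broken_source = "class Broken { int x = ; }"  # invalid initializer
try:
    javalang.parse.parse(broken_source)
except JavaSyntaxError as j:
    print(f"Syntax error at {j.at}")  # position of the unexpected token
except (JavaParserError, LexerError) as j:
    print(f"Parse/lex error: {j}")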