def get_training_data_matrix(self, normalize, ablation_features=(), toExclude=()):
    """Build the training feature matrix from the ML data file.

    Args:
        normalize: if True, the feature matrix is normalized.
        ablation_features: features to exclude during the ablation study.
        toExclude: paradigms that must not enter the training data
            (used for cross-validation).

    Returns:
        A tuple ``(headlines, matrix, targets)``:
        - headlines: the feature headlines;
        - matrix: a sparse scipy training matrix;
        - targets: a list of target values, one per training sample.

    Raises:
        TypeError: if ``normalize`` is not a bool.
    """
    # `assert` is stripped under -O, so validate input explicitly.
    if not isinstance(normalize, bool):
        raise TypeError("normalize must be a bool, got %r" % type(normalize))
    self._check_if_ablation_appropriate(ablation_features)

    # Additional data initialization:
    self._read_category_val_alternations(self.categoryPath)  # sets self.categoryDescription
    self._read_paradigm_lengths()  # sets self.pLengths

    excluded = set(toExclude)  # O(1) membership tests inside the loop
    seenParadigms = set()
    with codecs.open(self.MLDataPath, 'r', 'utf-8-sig') as f:
        data = json.load(f)

    processedData = []
    targets = []
    for lexeme in data:
        if lexeme["paradigm"] in excluded:
            continue
        seenParadigms.add(lexeme["paradigm"])
        processedData.append(
            self._convert_lexeme_to_feature_dic(lexeme, ablation_features))
        targets.append(FeatureExtractor.is_positive_example(lexeme))

    headlines, matrix = self._dic_list_to_matrix(processedData, normalize)
    if seenParadigms:
        # Sort for deterministic, reproducible log output.
        logging.info("Training set paradigms: %s", u" ".join(sorted(seenParadigms)))
    else:
        logging.critical("Training set is empty.")
    return headlines, matrix, targets
def _category_entropy_variance(self, lexeme):
    """Thin wrapper: compute category-entropy variance for *lexeme*
    using this instance's category description."""
    description = self.categoryDescription
    return FeatureExtractor.category_entropy_variance(lexeme, description)
def _number_of_one_value_categories(self, lexeme):
    """Delegate to FeatureExtractor.number_of_one_value_categories,
    supplying the instance's category description."""
    return FeatureExtractor.number_of_one_value_categories(
        lexeme, self.categoryDescription)
def _entropy_to_paradigm_length(self, lexeme):
    """Thin wrapper: entropy-to-paradigm-length feature for *lexeme*,
    based on the instance's paradigm lengths table."""
    lengths = self.pLengths
    return FeatureExtractor.entropy_to_paradigm_length(lexeme, lengths)
def _part_of_found_gramm(self, lexeme):
    """Delegate to FeatureExtractor.part_of_found_grammars with the
    instance's paradigm lengths table.

    NOTE(review): the method name says "gramm" while the target is
    "grammars" — presumably intentional shorthand; confirm against callers.
    """
    lengths = self.pLengths
    return FeatureExtractor.part_of_found_grammars(lexeme, lengths)
def _part_of_found_flex(self, lexeme):
    """Thin wrapper: part-of-found-flex feature for *lexeme*, using the
    instance's paradigm lengths table."""
    return FeatureExtractor.part_of_found_flex(lexeme, self.pLengths)
def _min_category_entropy(self, lexeme):
    """Delegate to FeatureExtractor.min_category_entropy, supplying the
    instance's category description."""
    description = self.categoryDescription
    return FeatureExtractor.min_category_entropy(lexeme, description)