Python Corpus.set_vector_matrix Examples

Programming Language: Python

Namespace/Package Name: convokit

Class/Type: Corpus

Method/Function: set_vector_matrix

Examples at hotexamples.com: 2

Python Corpus.set_vector_matrix - 2 examples found. These are the top-rated real-world Python examples of convokit.Corpus.set_vector_matrix extracted from open source projects.

Example #1

File: bow_transformer.py Project: Nuri22/csDetector

    def transform(
        self,
        corpus: Corpus,
        selector: Callable[[CorpusComponent],
                           bool] = lambda x: True) -> Corpus:
        """
        Computes the vector matrix for the Corpus component objects and then stores it in a ConvoKitMatrix object,
        which is saved in the Corpus as `vector_name`.

        :param corpus: the target Corpus
        :param selector: a (lambda) function that takes a Corpus component object and returns True or False
            (i.e. include / exclude). By default, the selector includes all objects of the specified type in the Corpus.

        :return: the target Corpus annotated
        """
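        # Collect the selected objects, their ids (matrix rows), and their text.
        objs = list(corpus.iter_objs(self.obj_type, selector))
        ids = [obj.id for obj in objs]
        docs = [self.text_func(obj) for obj in objs]

        matrix = self.vectorizer.transform(docs)
        # Use the vectorizer's feature names as column labels when available;
        # otherwise fall back to integer column indices.
        try:
            column_names = self.vectorizer.get_feature_names()
        except AttributeError:
            column_names = np.arange(matrix.shape[1])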
        corpus.set_vector_matrix(self.vector_name,
                                 matrix=matrix,
                                 ids=ids,
                                 columns=column_names)

        for obj in objs:
            obj.add_vector(self.vector_name)

        return corpus
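
The try/except around `get_feature_names()` above deserves a note. Assuming `self.vectorizer` is a scikit-learn vectorizer (the snippet does not say so), that method was deprecated in scikit-learn 1.0 and removed in 1.2 in favor of `get_feature_names_out()`, so on newer versions the code would quietly fall back to integer column names. A minimal sketch of a more tolerant lookup might look like the following; `resolve_column_names` and the three stub classes are hypothetical names introduced for illustration and are not part of ConvoKit or scikit-learn.

```python
from typing import Any, List


def resolve_column_names(vectorizer: Any, n_columns: int) -> List[Any]:
    """Return feature names from a vectorizer, or integer indices as a fallback.

    Hypothetical helper: tries the newer accessor name first, then the older one.
    """
    for method_name in ("get_feature_names_out", "get_feature_names"):
        method = getattr(vectorizer, method_name, None)
        if callable(method):
            return list(method())
    # No name accessor at all: label the columns 0..n_columns-1.
    return list(range(n_columns))


# Stand-in vectorizers (assumptions for this demo, not real library classes).
class NewStyleVectorizer:
    def get_feature_names_out(self):
        return ["hello", "world"]


class OldStyleVectorizer:
    def get_feature_names(self):
        return ["hello", "world"]


class NamelessVectorizer:
    pass


new_names = resolve_column_names(NewStyleVectorizer(), 2)
old_names = resolve_column_names(OldStyleVectorizer(), 2)
fallback_names = resolve_column_names(NamelessVectorizer(), 3)
```

In the `transform` method above, the result of such a helper could be passed as `columns=` in place of `column_names`, with `matrix.shape[1]` as the fallback width.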

Example #2

File: bow_transformer.py Project: clesanbar/Cornell-Conversational-Analysis-Toolkit

    def transform(self, corpus: Corpus, selector: Callable[[CorpusComponent], bool] = lambda x: True) -> Corpus:
        """
        Annotate the corpus objects with the vectorized representation of the object's text, with an optional
        selector that filters for objects to be transformed. In the code below, objects that are not selected
        are skipped: they receive no row in the matrix and no vector annotation.

        :param corpus: the target Corpus
        :param selector: a (lambda) function that takes a Corpus object and returns True or False
            (i.e. include / exclude). By default, the selector includes all objects of the specified type in the Corpus.

        :return: the target Corpus annotated
        """
        objs = list(corpus.iter_objs(self.obj_type, selector))
        ids = [obj.id for obj in objs]
        docs = [self.text_func(obj) for obj in objs]

        matrix = self.vectorizer.transform(docs)
        try:
            column_names = self.vectorizer.get_feature_names()
        except AttributeError:
            column_names = np.arange(matrix.shape[1])
        corpus.set_vector_matrix(self.vector_name, matrix=matrix, ids=ids, columns=column_names)

        for obj in objs:
            obj.add_vector(self.vector_name)

        return corpus
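
Both examples hand `set_vector_matrix` three aligned pieces: a matrix, one id per row, and one column name per column. The sketch below builds such a triple by hand with a toy bag-of-words count, using only the standard library. `build_bow` and the sample utterance ids are hypothetical, and a plain list of lists stands in for the matrix a real vectorizer would return.

```python
from typing import Dict, List, Tuple


def build_bow(texts_by_id: Dict[str, str]) -> Tuple[List[List[int]], List[str], List[str]]:
    """Build a toy count matrix plus the row ids and column names that describe it."""
    ids = list(texts_by_id)  # row order follows insertion order of the dict
    tokenized = [texts_by_id[i].lower().split() for i in ids]
    columns = sorted({tok for doc in tokenized for tok in doc})  # column order
    col_index = {tok: j for j, tok in enumerate(columns)}

    # One row per id, one column per vocabulary token, filled with token counts.
    matrix = [[0] * len(columns) for _ in ids]
    for row, doc in enumerate(tokenized):
        for tok in doc:
            matrix[row][col_index[tok]] += 1
    return matrix, ids, columns


matrix, ids, columns = build_bow({"utt_1": "hello world", "utt_2": "hello hello there"})
```

The point of the sketch is the alignment: row `k` of the matrix belongs to `ids[k]`, and column `j` counts occurrences of `columns[j]`. This is the same correspondence the `ids=` and `columns=` arguments carry in the calls shown above.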