Python FeatureGenerator Examples

Programming language: Python

Namespace/package name: sandbox.data.FeatureGenerator

Class/type: FeatureGenerator

Examples at hotexamples.com: 2

Python FeatureGenerator - 2 examples found. These are the top-rated real-world Python examples of sandbox.data.FeatureGenerator.FeatureGenerator, extracted from open source projects.

Frequently used methods

FeatureGenerator(1)

categoricalToIndicator(1)

Example #1

File: FeatureGeneratorTest.py Project: rezaarmand/sandbox

    def testCategoricalToIndicator(self):
        X = numpy.zeros((5,5))
        X[:, 0] = numpy.array([1, 1, 2, 4, 6])
        X[:, 1] = numpy.array([2, 1, 2, 4, 6])
        X[:, 2] = numpy.array([1, 1, 2, 4, 2])
        X[:, 3] = numpy.array([1, 2, 3, 4, 2])
        X[:, 4] = numpy.array([1.1, 2.1, 4.5, 6.2, 1.1])

        logging.debug(X)

        generator = FeatureGenerator()
        inds = [0, 1]
        X2 = generator.categoricalToIndicator(X, inds)

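        # Expected result: columns 0-3 are indicators for X[:, 0] over the values
        # {1, 2, 4, 6}, columns 4-7 are indicators for X[:, 1], and the last three
        # columns are the untouched columns X[:, 2:]
        X3 = numpy.zeros((5, 11))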
        X3[0, :] = numpy.array([[ 1,   0,   0,   0,   0,   1,   0,   0,   1,   1,   1.1]])
        X3[1, :] = numpy.array([[ 1,   0,   0,   0,   1,   0,   0,   0,   1,   2,   2.1]])
        X3[2, :] = numpy.array([[ 0,   1,   0,   0,   0,   1,   0,   0,   2,   3,   4.5]])
        X3[3, :] = numpy.array([[ 0,   0,   1,   0,   0,   0,   1,   0,   4,   4,   6.2]])
        X3[4, :] = numpy.array([[ 0,   0,   0,   1,   0,   0,   0,   1,   2,   2,   1.1]])

        self.assertTrue(numpy.linalg.norm(X3-X2) < 10**-6)

        # Test case where no indices are given: the input should come back unchanged
        inds = []
        X2 = generator.categoricalToIndicator(X, inds)

        self.assertTrue(numpy.linalg.norm(X-X2) < 10**-6)
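
The test above pins down the behaviour it expects from categoricalToIndicator without showing how the method works. The following is a minimal standalone sketch, not the sandbox library's implementation, that reproduces the expected matrix X3. The function name categorical_to_indicator and the column ordering (indicator blocks first, untouched columns last) are assumptions inferred from the test data only.

```python
import numpy


def categorical_to_indicator(X, inds):
    """Hypothetical helper: replace each column in inds by one 0/1 column per distinct value."""
    X = numpy.asarray(X, dtype=float)
    blocks = []
    for j in inds:
        values = numpy.unique(X[:, j])  # sorted distinct categories in column j
        # Compare every row with every category: result has shape (n_rows, n_categories)
        blocks.append((X[:, [j]] == values[None, :]).astype(float))
    # Columns that are not converted are appended at the end in their original order
    rest = [j for j in range(X.shape[1]) if j not in inds]
    blocks.append(X[:, rest])
    return numpy.hstack(blocks)


# Same data as the test above, written row by row
X = numpy.array([
    [1, 2, 1, 1, 1.1],
    [1, 1, 1, 2, 2.1],
    [2, 2, 2, 3, 4.5],
    [4, 4, 4, 4, 6.2],
    [6, 6, 2, 2, 1.1],
])
X2 = categorical_to_indicator(X, [0, 1])
print(X2.shape)  # (5, 11)
```

Each of the two categorical columns has four distinct values, so they expand to 4 + 4 indicator columns, and the three remaining columns bring the total to 11. With an empty index list the sketch returns the input unchanged, which matches the second assertion in the test.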

Example #2

File: HIVGraphReader.py Project: charanpald/wallhack

    def readHIVGraph(self, undirected=True, indicators=True):
        """
        We will use pacdate5389.csv which contains the data of infection. The undirected
        parameter specifies whether to create an undirected graph. If indicators
        is true then categorical variables are turned into collections of indicator
        variables.
        """
        converters = {1: CsvConverters.dateConv, 3:CsvConverters.dateConv, 5:CsvConverters.detectionConv, 6:CsvConverters.provConv, 8: CsvConverters.dateConv }
        converters[9] = CsvConverters.genderConv
        converters[10] = CsvConverters.orientConv
        converters[11] = CsvConverters.numContactsConv
        converters[12] = CsvConverters.numContactsConv
        converters[13] = CsvConverters.numContactsConv

        def nanProcessor(X):
            # Replace missing values (NaN) in each column with that column's mean
            means = numpy.zeros(X.shape[1])
            for i in range(X.shape[1]):
                missing = numpy.isnan(X[:, i])
                if numpy.sum(missing) > 0:
                    logging.info("No. missing values in " + str(i) + "th column: " + str(numpy.sum(missing)))
                means[i] = numpy.mean(X[:, i][~missing])
                X[missing, i] = means[i]
            return X

        idIndex = 0
        # list() keeps this a plain list on Python 3, where keys() returns a view
        featureIndices = list(converters.keys())
        multiGraphCsvReader = MultiGraphCsvReader(idIndex, featureIndices, converters, nanProcessor)

        dataDir = PathDefaults.getDataDir()
        vertexFileName = dataDir + "HIV/alldata.csv"
        edgeFileNames = [dataDir + "HIV/grafdet2.csv", dataDir + "HIV/infect2.csv"]

        sparseMultiGraph = multiGraphCsvReader.readGraph(vertexFileName, edgeFileNames, undirected, delimiter="\t")

        # For learning purposes we will convert categorical variables into a set of
        # indicator features
        if indicators:
            logging.info("Converting categorical features")
            vList = sparseMultiGraph.getVertexList()
            V = vList.getVertices(list(range(vList.getNumVertices())))
            catInds = [2, 3]
            generator = FeatureGenerator()
            V = generator.categoricalToIndicator(V, catInds)
            vList.replaceVertices(V)

        logging.info("Created " + str(sparseMultiGraph.getNumVertices()) + " examples with " + str(sparseMultiGraph.getVertexList().getNumFeatures()) + " features")

        return sparseMultiGraph
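
The nanProcessor helper in this example fills missing values with column means before the features are used. Below is a minimal standalone sketch of the same idea using numpy.nanmean. The function name fill_nan_with_column_means is hypothetical, and unlike nanProcessor this version works on a copy rather than modifying its argument in place.

```python
import numpy


def fill_nan_with_column_means(X):
    """Hypothetical helper: return a copy of X with each NaN replaced by its column mean."""
    X = numpy.array(X, dtype=float)  # copy, so the caller's array is left untouched
    means = numpy.nanmean(X, axis=0)  # per-column mean that ignores NaN entries
    rows, cols = numpy.nonzero(numpy.isnan(X))  # positions of the missing values
    X[rows, cols] = means[cols]
    return X


X = numpy.array([
    [1.0, numpy.nan],
    [3.0, 4.0],
    [numpy.nan, 8.0],
])
filled = fill_nan_with_column_means(X)
print(filled)
```

Here the first column has mean 2.0 and the second has mean 6.0 once the missing entries are ignored, so those values fill the gaps. A column that is entirely NaN has no defined mean and stays NaN, so that case needs separate handling.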