Python UnicodeReader Examples

Programming language: Python

Namespace/package name: utf8_utils

Class/type: UnicodeReader

Examples on hotexamples.com: 6

Python UnicodeReader: 6 examples found. These are the top-rated real-world Python examples of utf8_utils.UnicodeReader, extracted from open source projects.
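The examples below never show UnicodeReader itself, only how it is called: it wraps an open file, accepts an optional delimiter keyword, and yields one list of strings per line. The following is a minimal Python 3 sketch of a class with that behavior. It is a hypothetical stand-in, not the utf8_utils source; in particular, the tab default for delimiter is an assumption inferred from Example #6, which passes delimiter=',' explicitly.

```python
import csv
import io


class UnicodeReader:
    """Hypothetical stand-in for utf8_utils.UnicodeReader (not the original code).

    Assumes the wrapped file was opened in text mode, so decoding is already
    handled by open(..., encoding="utf-8") rather than by the reader.
    """

    def __init__(self, f, delimiter="\t", **kwargs):
        # The tab default is an assumption; the original default is not shown.
        self.reader = csv.reader(f, delimiter=delimiter, **kwargs)

    def __iter__(self):
        return self

    def __next__(self):
        # Each row is a list of str, one item per delimited field.
        return next(self.reader)


# In-memory sample standing in for a UTF-8 file of (topic, word, weight) rows.
sample = io.StringIO("0\tcafé\t0.5\n1\tnaïve\t0.25\n")
rows = list(UnicodeReader(sample))
```

With a real file, the equivalent call would open it with encoding="utf-8" and newline="" before passing it to the reader.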


Example #1

File: import_mallet.py Project: riyachanduka/termite_test

	def extractTopicWordWeights( self, model_path ):
		data = {}
		words = []
		topics = []
		
		# Read in content of file (sparse matrix representation)
		filename = '{}/{}'.format( model_path, ImportMallet.TOPIC_WORD_WEIGHTS )
		with open( filename, 'r' ) as f:
			lines = UnicodeReader( f )
			for (topic, word, value) in lines:
				topic = int(topic)
				if topic not in data:
					data[ topic ] = {}
				data[ topic ][ word ] = float(value)
				words.append( word )
				topics.append( topic )
		
		# Get list of terms and topic indexes
		term_index = sorted( frozenset( words ) )
		topic_index = sorted( frozenset( topics ) )
		
		# Build dense matrix representation
		matrix = []
		for term in term_index :
			row = []
			for topic in topic_index :
				# Missing (topic, term) pairs default to 0.0 instead of raising KeyError
				row.append( data[ topic ].get( term, 0.0 ) )
			matrix.append( row )
		
		# Generate topic labels
		topic_str_index = [ 'Topic {}'.format(d) for d in topic_index ]
		
		self.model.term_topic_matrix = matrix
		self.model.term_index = term_index
		self.model.topic_index = topic_str_index
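The sparse-to-dense step in Example #1 can be sketched on its own, without the ImportMallet class or a file on disk. The function name sparse_to_dense and the sample triples are made up for illustration, and filling absent (topic, term) pairs with a default weight is an assumption rather than documented behavior of the original project.

```python
def sparse_to_dense(triples, default=0.0):
    """Convert (topic, word, value) string triples to a term-by-topic matrix.

    Hypothetical helper for illustration; `default` fills pairs that never
    appear in the sparse input.
    """
    data = {}
    for topic, word, value in triples:
        data.setdefault(int(topic), {})[word] = float(value)

    # Sorted indexes give a stable row and column order.
    topic_index = sorted(data)
    term_index = sorted({word for row in data.values() for word in row})

    # One row per term, one column per topic.
    matrix = [
        [data[topic].get(term, default) for topic in topic_index]
        for term in term_index
    ]
    return term_index, topic_index, matrix


terms, topics, matrix = sparse_to_dense([
    ("0", "apple", "0.5"),
    ("1", "apple", "0.25"),
    ("1", "pear", "1.5"),
])
```

Here "pear" has no weight under topic 0, so that cell takes the default value.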

Example #2

 def read( self ):
     self.data = {}
     filename = self.path + TokensAPI.TOKENS
     with open( filename, 'r' ) as f:
         lines = UnicodeReader( f )
         for ( docID, docTokens ) in lines:
             self.data[ docID ] = docTokens.split( ' ' )

Example #3

def ReadAsSparseMatrix( filename ):
	matrix = {}
	with open( filename, 'r' ) as f:
		lines = UnicodeReader( f )
		for ( aKey, bKey, value ) in lines:
			matrix[ (aKey, bKey) ] = float( value )
	return matrix

Example #4

def ReadAsSparseVector( filename ):
	vector = {}
	with open( filename, 'r' ) as f:
		lines = UnicodeReader( f )
		for ( key, value ) in lines:
			vector[ key ] = float( value )
	return vector

Example #5

def ReadAsMatrix( filename ):
	matrix = []
	with open( filename, 'r' ) as f:
		lines = UnicodeReader( f )
		for line in lines:
			# Build a real list; in Python 3, map() returns a lazy iterator
			matrix.append( [ float( v ) for v in line ] )
	return matrix
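The snippets on this page appear to target Python 2, where map returns a list. A rough Python 3 equivalent of ReadAsMatrix that uses only the standard csv module might look like the sketch below. The name read_as_matrix, the tab default for the delimiter, and the temporary demo file are all assumptions made for illustration.

```python
import csv
import os
import tempfile


def read_as_matrix(filename, delimiter="\t"):
    """Read a delimited UTF-8 text file into a list of rows of floats."""
    # newline="" lets the csv module handle line endings itself.
    with open(filename, "r", encoding="utf-8", newline="") as f:
        return [[float(cell) for cell in row]
                for row in csv.reader(f, delimiter=delimiter)]


# Write a small tab-separated file, read it back, then clean up.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8", newline="") as tmp:
    tmp.write("1.0\t2.5\n3.0\t4.0\n")
    path = tmp.name
try:
    matrix = read_as_matrix(path)
finally:
    os.remove(path)
```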

Example #6

    def readCsvAsMatrixStr(self, model_path, filename):
        """
        Return a matrix (list of lists) of string values.
        Each row corresponds to a line of the input file.
        Each cell (in a row) corresponds to a comma-separated value (in each line).
        """
        data = []
        filename = '{}/{}'.format(model_path, filename)
        with open(filename, 'r') as f:
            lines = UnicodeReader(f, delimiter=',')
            data = list(lines)
        return data