import numpy as np
import smote_utility, smote
fileName_ = "13_NonZeroDataset_Aggolo.csv"
the_data_set = smote_utility.readCSVAsArray(fileName_)
#print the_data_set
# get the distribution per class
counter_dict = smote_utility.getCountPerClass(the_data_set)
print counter_dict

#{0.0: 393, 1.0: 106, 2.0: 2, 3.0: 3, 4.0: 6, 5.0: 3, 6.0: 7, 7.0: 6, 8.0: 6, 9.0: 7, 10.0: 5, 11.0: 4, 12.0: 1}
print "smoting time for level 1 "
## smoting time for level 1
# get the records for this class
classVal = float(1)
records_per_class_1 = smote_utility.getRecordsPeClass(classVal, the_data_set)
#print records_per_class_1
array_shaped_record = np.array(records_per_class_1)
print "original dataset", array_shaped_record.shape
count_extra_synthetic_samples = 300
nearest_neighbors = 10  ### Expected n_neighbors <= n_samples
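# Guard for the constraint noted above (an added sanity check, assuming the
# underlying nearest-neighbor search fails when n_neighbors exceeds the
# number of records in this class):
assert nearest_neighbors <= array_shaped_record.shape[0], "n_neighbors must not exceed the number of samples in this class"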
smoted_dataset_1 = smote.SMOTE(array_shaped_record.shape, array_shaped_record, count_extra_synthetic_samples, nearest_neighbors)
print "smoted dataset shape: level-1::", smoted_dataset_1.shape 
print "-----"


print "smoting time for level 2 "
## smoting time for level 2
# get the records for this class
classVal = float(2)
records_per_class_2 = smote_utility.getRecordsPeClass(classVal, the_data_set)
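# A sketch of how this level-2 block could be completed, following the level-1
# pattern above. The parameter values here are assumptions, not from the
# original; class 2 has only 2 records in the printed distribution, so
# n_neighbors is kept at 2 to respect n_neighbors <= n_samples.
array_shaped_record = np.array(records_per_class_2)
print "original dataset", array_shaped_record.shape
count_extra_synthetic_samples = 300
nearest_neighbors = 2  ### Expected n_neighbors <= n_samples; class 2 has only 2 records
smoted_dataset_2 = smote.SMOTE(array_shaped_record.shape, array_shaped_record, count_extra_synthetic_samples, nearest_neighbors)
print "smoted dataset shape: level-2::", smoted_dataset_2.shape
print "-----"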
Example #2
fileName_="5_NonZeroDataset_Aggolo.csv"
the_data_set = smote_utility.readCSVAsArray(fileName_) 
#print the_data_set
# get the distribution per clss 
counter_dict = smote_utility.getCountPerClass(the_data_set)
print counter_dict
#{0.0: 10, 1.0: 417, 2.0: 7, 3.0: 9, 4.0: 106 }
'''
formula to fix the number of samples: no_of_samples_you_want = x * no. of neighbors / 100
you have to provide x; x must be < 100 or a multiple of 100
'''
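# A quick worked example of the formula above (a sketch; expected_extra_samples
# is a hypothetical helper, not part of smote_utility or smote):
def expected_extra_samples(x, n_neighbors):
    # no_of_samples_you_want = x * no. of neighbors / 100, as stated above
    return x * n_neighbors / 100

# For level 0 below: x = 3200 with 10 neighbors -> 3200 * 10 / 100 = 320 extra samples
print "expected extra samples for level 0:", expected_extra_samples(3200, 10)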
print "smoting time for level 0 "
## smoting time for level 0  
# get the records for this class
classVal = float(0)
records_per_class_0 = smote_utility.getRecordsPeClass(classVal, the_data_set)
#print records_per_class_0
array_shaped_record = np.array(records_per_class_0)
print "original dataset", array_shaped_record.shape
count_extra_synthetic_samples = 3200  ## fix samples based on the number of neighbors
nearest_neighbors = 10  ### Expected n_neighbors <= n_samples; level 0 has 10 samples
smoted_dataset_0 = smote.SMOTE(array_shaped_record.shape, array_shaped_record, count_extra_synthetic_samples, nearest_neighbors)
print "smoted dataset shape: level-0::", smoted_dataset_0.shape 
print "-----"


print "smoting time for level 1 "
## smoting time for level 1
# get the records for this class
classVal = float(1)
records_per_class_1 = smote_utility.getRecordsPeClass(classVal, the_data_set)
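# A sketch of how this level-1 block could be completed, mirroring the level-0
# block above. The x value (300) is an assumption, not from the original; class 1
# has 417 records in this dataset, so 10 neighbors satisfies n_neighbors <= n_samples.
array_shaped_record = np.array(records_per_class_1)
print "original dataset", array_shaped_record.shape
count_extra_synthetic_samples = 300
nearest_neighbors = 10  ### Expected n_neighbors <= n_samples
smoted_dataset_1 = smote.SMOTE(array_shaped_record.shape, array_shaped_record, count_extra_synthetic_samples, nearest_neighbors)
print "smoted dataset shape: level-1::", smoted_dataset_1.shape
print "-----"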