def generateKaggleSubmission(weights, outfilename):
    """Write a Kaggle submission CSV of predictions for the test and dev sets.

    Test-set rows feed the private leaderboard; dev-set rows feed the public
    leaderboard, so the dev accuracy computed here should match the public
    leaderboard score for this submission.

    Parameters
    ----------
    weights : dict
        Classifier weights, passed through to predict().
    outfilename : str
        Path of the CSV file to create (columns: Id, Prediction).
    """
    # Hoist label -> index once instead of calling list.index() per row (O(n) each).
    labelIndex = {label: i for i, label in enumerate(ALL_LABELS)}
    with open(outfilename, 'w') as f:
        writer = csv.DictWriter(f, fieldnames=['Id', 'Prediction'])
        writer.writeheader()
        # Test data is used for private leaderboard
        testData = dataIterator(TESTKEY, test_mode=True)
        for i, (counts, _) in enumerate(testData):
            predictedLabel, _ = predict(counts, weights, ALL_LABELS)
            writer.writerow({'Id': 'test-{}'.format(i),
                             'Prediction': labelIndex[predictedLabel]})
        # Dev data is used for public leaderboard
        devData = dataIterator(DEVKEY, test_mode=False)
        devCorrect = 0
        devTotal = 0
        for i, (counts, label) in enumerate(devData):
            devTotal += 1
            predictedLabel, _ = predict(counts, weights, ALL_LABELS)
            devCorrect += (predictedLabel == label)
            writer.writerow({'Id': 'dev-{}'.format(i),
                             'Prediction': labelIndex[predictedLabel]})
    # Guard against an empty dev set so we report 0.0 instead of crashing.
    devAccuracy = float(devCorrect) / devTotal if devTotal else 0.0
    # Single-argument print(...) behaves identically under Python 2 and 3.
    print('Dev accuracy is {} ({} correct of {})'.format(
        devAccuracy, devCorrect, devTotal))
    print('Kaggle submission saved to {}. Sanity check: public leaderboard '
          'accuracy should be {} on submission.'.format(outfilename, devAccuracy))
def generateKaggleSubmission(weights,outfilename): with open(outfilename, 'w') as f: writer = csv.DictWriter(f, fieldnames=['Id', 'Prediction']) writer.writeheader() # Test data is used for private leaderboard testData = dataIterator(TESTKEY,test_mode=True) for i,(counts,_) in enumerate(testData): predictedLabel,_ = predict(counts,weights,ALL_LABELS) predictedIndex = ALL_LABELS.index(predictedLabel) writer.writerow({ 'Id': 'test-{}'.format(i), 'Prediction': predictedIndex}) # Dev data is used for public leaderboard devData = dataIterator(DEVKEY,test_mode=False) devCorrect = 0 devTotal = 0 for i,(counts,label) in enumerate(devData): devTotal += 1 predictedLabel,_ = predict(counts,weights,ALL_LABELS) devCorrect += (predictedLabel == label) predictedIndex = ALL_LABELS.index(predictedLabel) writer.writerow({ 'Id': 'dev-{}'.format(i), 'Prediction': predictedIndex}) devAccuracy = float(devCorrect) / devTotal print 'Dev accuracy is ', devAccuracy, '({} correct of {})'.format(devCorrect, devTotal) print 'Kaggle submission saved to', outfilename, ('. Sanity check: ' 'public leaderboard accuracy should be '), devAccuracy, 'on submission.'
def setup_module():
    """Regenerate the BOW files and load module-level word counts.

    dataIterator() reads the bag-of-words files, so docsToBOWs() must run
    for each key before the counts can be collected.
    """
    global ac_train, ac_dev
    # Rebuild both BOW files first; the iterators below depend on them.
    for key in (TRAINKEY, DEVKEY):
        docsToBOWs(key)
    ac_train = getAllCounts(dataIterator(TRAINKEY))
    ac_dev = getAllCounts(dataIterator(DEVKEY))
def setup_module():
    """Prepare the module-level count tables used by the tests.

    The BOW files must exist before dataIterator() can be used, hence the
    docsToBOWs() calls come first.
    """
    global ac_train
    global ac_dev
    docsToBOWs(TRAINKEY)
    docsToBOWs(DEVKEY)
    # Tuple assignment evaluates left to right: train counts, then dev counts.
    ac_train, ac_dev = (getAllCounts(dataIterator(TRAINKEY)),
                        getAllCounts(dataIterator(DEVKEY)))
def evalClassifier(weights, outfilename, testfile, test_mode=False):
    """Write one predicted label per line for *testfile*, then score it.

    Parameters
    ----------
    weights : dict
        Classifier weights, passed through to predict().
    outfilename : str
        Path of the prediction file to create.
    testfile : str
        Key of the dataset to evaluate.
    test_mode : bool
        When True the data is unlabeled; no scoring is done and None is
        returned.

    Returns
    -------
    The result of gtnlplib.scorer.getConfusion on the prediction file, or
    None in test mode.
    """
    with open(outfilename, 'w') as outfile:
        # Iterate through the eval set, printing one prediction per line.
        # write() instead of Python-2 ``print >>`` keeps this portable.
        for counts, label in dataIterator(testfile, test_mode):
            outfile.write('{}\n'.format(predict(counts, weights, ALL_LABELS)[0]))
    if test_mode:
        return None  # no gold labels available, nothing to score
    # Score outside the ``with`` so the file is closed and flushed before
    # the scorer reads it back.
    return gtnlplib.scorer.getConfusion(testfile, outfilename)
def evalClassifier(weights, outfilename, testfile, test_mode=False):
    """Predict a label for every instance in testfile and score the output.

    Writes one prediction per line to outfilename. In test mode there are no
    gold labels, so None is returned; otherwise the scorer's confusion
    counts for the prediction file are returned.
    """
    with open(outfilename, 'w') as outfile:
        # Iterate through the eval set and print each prediction to the file.
        for counts, _gold in dataIterator(testfile, test_mode):
            print >>outfile, predict(counts, weights, ALL_LABELS)[0]
        if test_mode:
            # Unlabeled data: nothing to score.
            return
        # Run the scorer on the prediction file.
        return gtnlplib.scorer.getConfusion(testfile, outfilename)