Example #1
    print "Results have been determined"
    try:
        print "Percentage correct in test run: %f" % round(
            float(ncorrect) / float(tot_phon) * 100, 2)
    except ZeroDivisionError:
        print "Percentage correct in test run: 0.00"

    return results


########
# MAIN #
########

# Read in corpus
(corpus, suffixes) = objects.readCorpus(constants.corpus_file)

# Determine corpus size from this
corpus_size = len(corpus)

if constants.vectors == "Binary":
    root_size = int(ceil(log(corpus_size, 2)))
else:
    root_size = corpus_size
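# Illustrative arithmetic (not from the source): with, say, 100 lemmas in the
# corpus, a binary root code needs ceil(log2(100)) = 7 input nodes, whereas a
# localist (one-hot) code needs one node per lemma, i.e. all 100.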

# # Create suffix dictionary
# if constants.vectors == 'binary':
#         suffix_size = int(ceil(log(len(suffixes), 2)))          # 6
#         suffix_dict = functions.binaryDict(suffixes)
# else:
#         suffix_size = len(suffixes)
Example #2
                print form.lemmacase, form.parent.declension, form.parent.gender, form.phonology, new_suffix

                # Set output change once we figure out how to deal with the phonology
                form.output_change[generation] = new_suffix

        print "Results have been determined"
        print "Percentage correct in test run: %f" % round(float(ncorrect)/float(len(previous_output))*100, 2)

        return results

########
# MAIN #
########

# Read in corpus
(corpus, suffixes) = objects.readCorpus(constants.corpus_file)
# Determine corpus size from this
corpus_size = len(corpus)

# Create suffix dictionary
if constants.vectors == 'binary':
        suffix_size = int(ceil(log(len(suffixes), 2)))          # 6
        suffix_dict = constants.binaryDict(suffixes)
else:
        suffix_size = len(suffixes)
        suffix_dict = dict(zip(suffixes, map(tuple, identity(suffix_size))))

# Make the suffix dictionary bidirectional: suffix -> vector and vector -> suffix
inv_suffix = constants.invert(suffix_dict)
suffix_dict.update(inv_suffix)
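constants.binaryDict and constants.invert are not defined in these excerpts. As a rough, self-contained sketch of what the binary encoding and the bidirectional lookup might look like (names and details here are assumptions, not taken from the project):

# Illustrative sketch only -- the project's binaryDict/invert may differ.
from math import ceil, log
from itertools import product

def binary_dict(items):
    """Map each item to a fixed-width tuple of 0/1 bits."""
    width = int(ceil(log(len(items), 2)))
    codes = product((0, 1), repeat=width)        # 2**width codewords
    return dict(zip(items, codes))

def invert(mapping):
    """Swap keys and values so vectors can be looked up too."""
    return dict((v, k) for k, v in mapping.iteritems())

suffix_dict = binary_dict(['us', 'i', 'o', 'um', 'a', 'ae'])
suffix_dict.update(invert(suffix_dict))          # bidirectional lookup
print suffix_dict['um'], suffix_dict[suffix_dict['um']]   # code, then back to 'um'

The localist alternative (the else branch in the snippet above) instead pairs each suffix with a row of the identity matrix, one node per suffix.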

##########
Example #3
        Number of Hidden Nodes: %d
        Number of Output Nodes: %d
        Token Frequency taken into account: %s
        Case Hierarchy taken into account: %s
        Genitive Case to be dropped: %s \n''' % ( 
                constants.epochs, 
                constants.input_nodes,
                constants.hidden_nodes,
                constants.output_nodes,
                constants.token_freq, 
                constants.hierarchy, 
                constants.gnvdrop_generation < constants.total_generations
                )

# Read in corpus
corpus = objects.readCorpus(constants.corpus_file)
# Determine corpus size from this
corpus_size = len(corpus)

# Initialize dictionary mapping from forms to Latin noun info, to be updated each generation
expected_outputs = {}

# Iterate over tokens
for lemma in corpus:
        # Iterate over cases
        for case, form in lemma.cases.iteritems():
                # Take Latin phonology of suffix as first set of expected outputs
                ending = ''.join(form.phonsuf)

                expected_output = ()
                for phoneme in ending:
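The excerpt above stops inside the phoneme loop. Purely as an illustration of the shape it appears to be building (the project's actual feature table and attribute names are not shown here), the expected output for a suffix could be a flat tuple of per-phoneme feature bits:

# Hypothetical feature table -- phoneme inventory and bit patterns invented here.
phon_to_tup = {'u': (1, 0, 1), 's': (0, 1, 0), 'm': (0, 1, 1)}

ending = 'um'
expected_output = ()
for phoneme in ending:
    expected_output += phon_to_tup[phoneme]      # concatenate feature tuples

print expected_output                            # (1, 0, 1, 0, 1, 1)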
Example #4
                num = constants.tup_to_num[tuple(result[constants.num_b:])]
                output = form.parent_lemma.cases[case+num].phonology

                # Set output change once we figure out how to deal with the phonology
                form.output_change[generation] = (gender, dec, case, num, output)

        print "Results have been determined"

        return results

########
# MAIN #
########

# Read in corpus
corpus = readCorpus(constants.corpus_file)

print '''Training on %d Epochs
        Token Frequency taken into account: %s
        Case Hierarchy taken into account: %s
        Genitive Case to be dropped: %s \n''' % ( 
                constants.epochs, 
                constants.token_freq, 
                constants.hierarchy, 
                constants.gnvdrop_generation < constants.total_generations
                )

# Initialize dictionary mapping from forms to Latin noun info, to be updated each generation
expected_outputs = {}

# Iterate over tokens
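Example #4 turns the trailing slice of the network's result vector back into a grammatical number via constants.tup_to_num before looking up the target phonology. A minimal sketch of that decoding step, with the offset and codes invented for illustration:

# Illustrative only -- the real offset (constants.num_b) and codes belong to the project.
num_b = 4                                    # position where the number bits start
tup_to_num = {(1, 0): 'sg', (0, 1): 'pl'}    # code -> grammatical number

result = [0, 1, 1, 0, 0, 1]                  # pretend (thresholded) network output
num = tup_to_num[tuple(result[num_b:])]
print num                                    # 'pl'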
Example #5
        # Set output change once we figure out how to deal with the phonology
        form.output_change[generation] = (gender, dec, casenum[0:3],
                                          casenum[3:], output)

    print "Results have been determined"

    return results


########
# MAIN #
########

# Read in corpus
corpus = readCorpus(constants.corpus_file)

print '''Training on %d Epochs
        Token Frequency taken into account: %s
        Case Hierarchy taken into account: %s
        Genitive Case to be dropped: %s \n''' % (
    constants.epochs, constants.token_freq, constants.hierarchy,
    constants.gnvdrop_generation < constants.total_generations)

# Initialize dictionary mapping from forms to Latin noun info, to be updated each generation
expected_outputs = {}

# Iterate over tokens
for lemma in corpus:
    # Iterate over cases
    for case, form in lemma.cases.iteritems():
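In this version the case and number arrive packed into a single key, which the slices casenum[0:3] and casenum[3:] near the top of the example pull apart; that implies three-letter case codes. A tiny illustration (the exact key format is an assumption):

# Assumed key format: three-letter case code followed by the number, e.g. 'nomsg'.
casenum = 'nomsg'
case, num = casenum[0:3], casenum[3:]
print case, num                              # nom sg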
Example #6
                # Set output change once we figure out how to deal with the phonology
                form.output_change[generation] = new_phonology.replace('-', '')

        print "Results have been determined"

        print "Percentage correct in test run: {:.2f}".format(float(ncorrect)/float(tot_phon)*100)

        return results

########
# MAIN #
########

# Read in corpus
corpus = objects.readCorpus(constants.corpus_file)
# Determine corpus size from this
corpus_size = len(corpus)

# Determine how long the root vector should be based on the length of the corpus
root_size = int(ceil(log(corpus_size, 2)))

# Total size of input layer determined here
input_nodes = sum([root_size, constants.human_size, constants.dec_size,
                   constants.gen_size, constants.case_size, constants.num_size])
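# Illustrative only: the input layer concatenates the root code with the human,
# declension, gender, case and number blocks; with hypothetical widths
# 7 + 1 + 5 + 3 + 6 + 2 this would give 24 input nodes.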

print '''Training on %d Epochs
        Number of Input Nodes: %d
        Number of Hidden Nodes: %d
        Number of Output Nodes: %d
        Token Frequency taken into account: %s
        Case Hierarchy taken into account: %s