cluster_paragraphs examples in Python

Programming language: Python

Namespace/Package Name: vectorizer

Method / Function: cluster_paragraphs

Examples at hotexamples.com: 6

Python cluster_paragraphs: 6 examples found. These are the top-rated real-world Python examples of vectorizer.cluster_paragraphs, extracted from open source projects.

Example #1

File: test_vectorizer.py Project: 201722130162/project

    def setUp(self):
        self.text1 = 'A study on the effectiveness of milk and micronutrients.'
        self.text2 = 'A study on the effectiveness of milk.'
        self.text3 = 'Something completely unrelated'

        self.returned = cluster_paragraphs(
            [self.text1, self.text2, self.text3])

Example #2

File: test_vectorizer.py Project: sergeio/text_clustering

    def setUp(self):
        self.text1 = 'A study on the effectiveness of milk and micronutrients.'
        self.text2 = 'A study on the effectiveness of milk.'
        self.text3 = 'Something completely unrelated'

        self.returned = cluster_paragraphs(
            [self.text1, self.text2, self.text3])

Example #3

File: Backup views.py Project: piinalpin/sentiment-analisys

def Dataframe():
    if request.method == 'POST':
        dbmodel = database.DBModel()
        token = dbmodel.get_data_all("DataTA", "datanya")

        # Lowercase the second field of each record and strip stopwords.
        data_s = []
        for i in token:
            isi = list(i.values())  # list() keeps indexing valid on Python 3
            isi_judul = isi[1]
            data_baru2 = isi_judul.lower()
            data_s.append(stopword.remove(data_baru2))

        data = data_s
        shuffle(data)

        # Cluster the shuffled texts into two groups.
        clusters = cluster_paragraphs(data, num_clusters=2)
        data = pd.DataFrame(clusters)

    return render_template('dataframe.html', tables=[data.to_html(classes='table table-bordered')])

Example #4

from vectorizer import cluster_paragraphs
from random import shuffle

text1 = """Type theory is closely related to (and in some cases overlaps with) type systems, which are a programming language feature used to reduce bugs. The types of type theory were created to avoid paradoxes in a variety of formal logics and rewrite systems and sometimes "type theory" is used to refer to this broader application."""
text2 = """The types of type theory were invented by Bertrand Russell in response to his discovery that Gottlob Frege's version of naive set theory was afflicted with Russell's paradox. This theory of types features prominently in Whitehead and Russell's Principia Mathematica. It avoids Russell's paradox by first creating a hierarchy of types, then assigning each mathematical (and possibly other) entity to a type. Objects of a given type are built exclusively from objects of preceding types (those lower in the hierarchy), thus preventing loops."""
text3 = """The common usage of "type theory" is when those types are used with a term rewrite system. The most famous early example is Alonzo Church's lambda calculus. Church's Theory of Types[1] helped the formal system avoid the Kleene-Rosser paradox that afflicted the original untyped lambda calculus. Church demonstrated that it could serve as a foundation of mathematics and it was referred to as a higher-order logic."""
text4 = """A pulsar (short for pulsating radio star) is a highly magnetized, rotating neutron star that emits a beam of electromagnetic radiation. This radiation can only be observed when the beam of emission is pointing toward the Earth, much the way a lighthouse can only be seen when the light is pointed in the direction of an observer, and is responsible for the pulsed appearance of emission. Neutron stars are very dense, and have short, regular rotational periods. This produces a very precise interval between pulses that range from roughly milliseconds to seconds for an individual pulsar."""
text5 = """The precise periods of pulsars make them useful tools. Observations of a pulsar in a binary neutron star system were used to indirectly confirm the existence of gravitational radiation."""
texts = [text1, text2, text3, text4, text5]
shuffle(texts)

clusters = cluster_paragraphs(texts, num_clusters=2)

print()
print('Group 1:')
print('========\n')
print('\n-----\n'.join(clusters[0]))
print()
print('Group 2:')
print('========\n')
print('\n-----\n'.join(clusters[1]))
print()

Example #5

import os
import time

from vectorizer import cluster_paragraphs

start_time = time.time()
# Collect the names of all .txt files in the directory
dataset = []
path = './txt'
filecontent = [f for f in os.listdir(path) if f.endswith('.txt')]

# Read the content of each .txt file so it can be analyzed
for f in filecontent:
    with open(os.path.join(path, f), 'r') as myfile:
        dataset.append(myfile.read().replace('\n', ''))

# Once each document's text is collected, it is turned into a vector used to analyze similarity
num_clusters = 4
clusters = cluster_paragraphs(dataset, num_clusters, filecontent)

for cont, group in enumerate(clusters):
    print('\nGroup {0}'.format(cont))
    print('\n'.join(group))
print('\n\n\n')

print("The execution time was %s seconds" % (time.time() - start_time))

Example #6

slither_filename = 'scanner_res/sl_visibility_not_set.txt'

buglist1 = mythril_process(mythril_filename)
mythril_buglist = listToString(buglist1)
# print "mythril_buglist:",mythril_buglist
# print("mythril_buglist_length:",len(mythril_buglist))

buglist2 = oyente_process(oyente_filename)
oyente_buglist = listToString(buglist2)
# print "oyente_buglist:",oyente_buglist
# print("oyente_buglist_length:",len(oyente_buglist))

buglist3 = slither_process(slither_filename)
slither_buglist = listToString(buglist3)
# print "slither_buglist:",slither_buglist
# print("slither_buglist_length:",len(slither_buglist))

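# Named combined_buglist to avoid shadowing the built-in list type
combined_buglist = mythril_buglist + oyente_buglist + slither_buglist
# print("list_length:",len(combined_buglist))
shuffle(combined_buglist)

clusters = cluster_paragraphs(combined_buglist)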
k = len(clusters)
print("k:", len(clusters))

for i in range(k):
    print("\n")
    print('Group {}:'.format(i))
    print('========\n')
    print('\n-----\n'.join(t for t in clusters[i]))