Python getDataSet Examples

Programming language: Python

Namespace/Package Name: main

Method/Function: getDataSet

Examples at hotexamples.com: 4

Python getDataSet - 4 examples found. These are the top-rated real-world Python examples of main.getDataSet extracted from open source projects.

Example #1

File: featureMappers_test.py Project: rohanraja/tweetsclassifier

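import main  # project module that provides getDataSet


def test_tifd_vector():
    # TifdVector is defined elsewhere in the project (its import is not shown in this excerpt).
    data = main.getDataSet()  # ['My name is Rohan', 'My name is Sachin', 'rohan']
    t = TifdVector()
    print(t.getFeatureVector_fromset(data))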
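
For readers without the project at hand, the sketch below illustrates the kind of transformation this test exercises: turning a list of short texts into TF-IDF feature vectors. `get_data_set` is a hypothetical stand-in for `main.getDataSet` that uses the sample strings from the comment in the test, and the weighting formula (raw term count times a smoothed IDF, no normalization) is an assumption, not necessarily what `TifdVector` computes.

```python
import math
from collections import Counter


def get_data_set():
    # Hypothetical stand-in for main.getDataSet(); uses the sample
    # strings from the comment in the test.
    return ['My name is Rohan', 'My name is Sachin', 'rohan']


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) with one TF-IDF vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF (an assumed variant): ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocabulary}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[t] * idf[t] for t in vocabulary])
    return vocabulary, vectors


vocabulary, vectors = tfidf_vectors(get_data_set())
print(vocabulary)  # ['is', 'my', 'name', 'rohan', 'sachin']
for row in vectors:
    print([round(x, 3) for x in row])
```

Terms that occur in fewer documents (here `sachin`) receive a larger weight than terms shared by several documents (here `rohan`), which is the point of the IDF factor.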

Example #2

File: twitter_on_scikit.py Project: rohanraja/tweetsclassifier

        vectorizer = make_pipeline(hasher, TfidfTransformer())
    else:
        vectorizer = HashingVectorizer(n_features=opts.n_features,
                                       stop_words='english',
                                       non_negative=False,
                                       norm='l2',
                                       binary=False)
else:
    vectorizer = TfidfVectorizer(max_df=0.5,
                                 max_features=opts.n_features,
                                 min_df=2,
                                 stop_words='english',
                                 use_idf=opts.use_idf)
import main

tweets = main.getDataSet()
X = vectorizer.fit_transform(tweets)
# X = vectorizer.fit_transform(dataset.data)

print("done in %fs" % (time() - t0))
print("n_samples: %d, n_features: %d" % X.shape)
print()

if opts.n_components:
    print("Performing dimensionality reduction using LSA")
    t0 = time()
    # Vectorizer results are normalized, which makes KMeans behave as
    # spherical k-means for better results. Since LSA/SVD results are
    # not normalized, we have to redo the normalization.
    svd = TruncatedSVD(opts.n_components)
    lsa = make_pipeline(svd, Normalizer(copy=False))
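
The comment at the end of this excerpt relies on a property of unit-length vectors: their squared Euclidean distance depends only on their cosine similarity, which is why k-means on L2-normalized rows behaves like spherical k-means. A minimal pure-Python check of that identity follows; the helper names and sample vectors are illustrative and not taken from the project.

```python
import math


def l2_normalize(vec):
    """Scale vec to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


a = [3.0, 4.0, 0.0]
b = [1.0, 0.0, 2.0]
ua, ub = l2_normalize(a), l2_normalize(b)

# For unit vectors: ||ua - ub||^2 == 2 - 2 * cos(a, b)
lhs = squared_distance(ua, ub)
rhs = 2 - 2 * cosine_similarity(a, b)
print(abs(lhs - rhs) < 1e-12)
```

Because the output of an SVD projection is generally not unit length, the identity stops holding after dimensionality reduction unless the rows are normalized again, which is the role of `Normalizer` in the pipeline.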

Example #3

File: twitter_on_scikit.py Project: rohanraja/tweetsclassifier

        hasher = HashingVectorizer(n_features=opts.n_features,
                                   stop_words='english', non_negative=True,
                                   norm=None, binary=False)
        vectorizer = make_pipeline(hasher, TfidfTransformer())
    else:
        vectorizer = HashingVectorizer(n_features=opts.n_features,
                                       stop_words='english',
                                       non_negative=False, norm='l2',
                                       binary=False)
else:
    vectorizer = TfidfVectorizer(max_df=0.5, max_features=opts.n_features,
                                 min_df=2, stop_words='english',
                                 use_idf=opts.use_idf)
import main

tweets = main.getDataSet()
X = vectorizer.fit_transform(tweets)
# X = vectorizer.fit_transform(dataset.data)

print("done in %fs" % (time() - t0))
print("n_samples: %d, n_features: %d" % X.shape)
print()

if opts.n_components:
    print("Performing dimensionality reduction using LSA")
    t0 = time()
    # Vectorizer results are normalized, which makes KMeans behave as
    # spherical k-means for better results. Since LSA/SVD results are
    # not normalized, we have to redo the normalization.
    svd = TruncatedSVD(opts.n_components)
    lsa = make_pipeline(svd, Normalizer(copy=False))
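
The branches that use `HashingVectorizer` depend on the hashing trick: each token is mapped to one of `n_features` columns by a hash function, so no vocabulary has to be stored. The sketch below shows the idea in pure Python under simplifying assumptions (whitespace tokenization, `zlib.crc32` as the hash, plain counts with no sign alternation or normalization). scikit-learn uses a different hash function and tokenizer, so the column indices will not match its output.

```python
import zlib


def hashed_counts(text, n_features=8):
    """Map tokens to a fixed-length count vector via the hashing trick."""
    vec = [0] * n_features
    for token in text.lower().split():
        # crc32 is deterministic across runs, unlike the built-in hash() on str.
        index = zlib.crc32(token.encode('utf-8')) % n_features
        vec[index] += 1
    return vec


print(hashed_counts('my name is rohan'))
```

Different tokens can land in the same column (a collision), which is the price paid for a fixed memory footprint; a larger `n_features` makes collisions less likely.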

Example #4

File: scikit-tutorial.py Project: rohanraja/tweetsclassifier

                                   stop_words='english', non_negative=True,
                                   norm=None, binary=False)
        vectorizer = make_pipeline(hasher, TfidfTransformer())
    else:
        vectorizer = HashingVectorizer(n_features=opts.n_features,
                                       stop_words='english',
                                       non_negative=False, norm='l2',
                                       binary=False)
else:
    vectorizer = TfidfVectorizer(max_df=0.5, max_features=opts.n_features,
                                 min_df=2, stop_words='english',
                                 use_idf=opts.use_idf)
import main


X = vectorizer.fit_transform(main.getDataSet())
# X = vectorizer.fit_transform(dataset.data)

print("done in %fs" % (time() - t0))
print("n_samples: %d, n_features: %d" % X.shape)
print()

if opts.n_components:
    print("Performing dimensionality reduction using LSA")
    t0 = time()
    # Vectorizer results are normalized, which makes KMeans behave as
    # spherical k-means for better results. Since LSA/SVD results are
    # not normalized, we have to redo the normalization.
    svd = TruncatedSVD(opts.n_components)
    lsa = make_pipeline(svd, Normalizer(copy=False))