  recall_std = recall.std()
  accuracy_std = accuracy.std()

  print('Writing log file\n')
  # Report each metric as mean +/- two standard deviations
  file.write('Recall Macro: ' + str(recall_mean) + ' (+/-) ' + str(recall_std * 2) + '\n')
  file.write('Precision Macro: ' + str(precision_mean) + ' (+/-) ' + str(precision_std * 2) + '\n')
  file.write('F1 Macro: ' + str(f1_mean) + ' (+/-) ' + str(f1_std * 2) + '\n')
  file.write('Accuracy: ' + str(accuracy_mean) + ' (+/-) ' + str(accuracy_std * 2) + '\n')

  file.write('\n\n#############################################\n\n')
  file.close() 



## READING DATASET       ######################
train, test = rd.read()
categories = ['fake', 'real']

train_text = rd.get_text(train)
train_target = rd.get_target(train)

# test_text = rd.get_text(test)
# test_target = rd.get_target(test)
#################################################

combinations = get_combinations()
# combinations = use_custom()


for combination in combinations:
  # NOTE: the original call was truncated here; the remaining arguments
  # (train_target and the current feature combination) are an assumed,
  # plausible completion based on the surrounding code.
  analisar_features(train_text, train_target, combination)


Example 2

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import print_function
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.datasets import fetch_20newsgroups

import read_dataset as rd
import numpy as np

train, test = rd.read(percent_train=.9)

categories = ['fake', 'real']

# Tokenize the text into a bag-of-words count matrix
# (CountVectorizer is already imported above)

count_vect = CountVectorizer()
X_train_counts = count_vect.fit_transform([data['text'] for data in train])
print(X_train_counts.shape)
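
# (Illustrative addition, not in the original snippet.) The fitted vectorizer
# exposes its learned vocabulary: vocabulary_ maps each token to its column
# index in X_train_counts. 'news' is just a hypothetical example token and may
# not appear in this corpus, in which case .get() returns None.
print(count_vect.vocabulary_.get('news'))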

# From occurrence counts to frequencies
# Occurrence counts are a good start, but there is an issue: longer documents
# will have higher average count values than shorter documents, even though
# they might talk about the same topics. To avoid these potential
# discrepancies, it suffices to divide the number of occurrences of each word
# in a document by the total number of words in that document: these new
# features are called tf, for Term Frequencies.
# Another refinement on top of tf is to downscale the weights of words that
# occur in many documents in the corpus and are therefore less informative
# than those that occur in only a smaller portion of the corpus.

from sklearn.feature_extraction.text import TfidfTransformer

# use_idf=False keeps raw term frequencies (tf) without the idf downscaling
tf_transformer = TfidfTransformer(use_idf=False).fit(X_train_counts)
X_train_tf = tf_transformer.transform(X_train_counts)
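
# (Illustrative addition, not in the original snippet.) The comment above also
# mentions the idf refinement: downscaling words that occur in many documents.
# TfidfTransformer enables it with use_idf=True (its default), and
# fit_transform combines the fit and transform steps in a single call.
tfidf_transformer = TfidfTransformer(use_idf=True)
X_train_tfidf = tfidf_transformer.fit_transform(X_train_counts)
print(X_train_tfidf.shape)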