Python doc_guid Examples

Programming language: Python

Namespace / package name: yatiri.hashing

Method / function: doc_guid

Examples on hotexamples.com: 4

Python doc_guid - 4 examples found. These are the top-rated Python examples of yatiri.hashing.doc_guid, extracted from open source projects.

Example #1

File: load.py Project: rmax/yatiri

from yatiri.hashing import doc_guid


def preprocess(doc):
    """Add additional fields before storing document.

    >>> doc = preprocess({'url': 'http://foo'})
    >>> 'guid' in doc
    True
    >>> 'url' in doc
    True

    """
    doc['guid'] = doc_guid(doc)
    return doc
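
The page does not show how doc_guid itself is implemented. As a rough sketch only, assuming the identifier is a hash of the document's URL, a stand-in could look like the following. The name doc_guid_sketch and the choice of SHA-1 are hypothetical and not taken from yatiri; the real function may use other fields or another hash.

```python
import hashlib


def doc_guid_sketch(doc):
    """Hypothetical stand-in for yatiri.hashing.doc_guid.

    Assumption: the GUID is derived from the document's 'url' field.
    """
    url = doc['url']
    return hashlib.sha1(url.encode('utf-8')).hexdigest()


def preprocess(doc):
    """Add a 'guid' field, mirroring the example above."""
    doc['guid'] = doc_guid_sketch(doc)
    return doc


doc = preprocess({'url': 'http://foo'})
print(len(doc['guid']))  # SHA-1 hex digests are 40 characters long
```

Hashing a stable field such as the URL yields the same identifier every time the same document is seen, which is what makes it usable as a key or file name.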

Example #2

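import ast

from yatiri.hashing import doc_guid


def get_classified_items(filepath, db):
    """Yield (key, doc) pairs for each classified entry that has all required fields."""
    # Each line holds a document key and a category literal separated by a tab.
    with open(filepath, 'r') as fp:
        for line in fp:
            key, cat = line.strip().split('\t')
            # literal_eval parses the literal without executing arbitrary code.
            cat = ast.literal_eval(cat)
            if isinstance(cat, list):
                # Keep only the first category when several are listed.
                cat = cat[0]
            doc = db[key]
            doc['guid'] = doc_guid(doc)
            doc['category'] = cat
            # Skip documents that lack any required field.
            if any((f not in doc) for f in ('headline', 'datetime', 'body', 'url')):
                continue
            yield key, doc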
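
The loop above appears to expect one entry per line: a document key and a Python literal separated by a tab, where the literal is either a single category or a list of categories. The sketch below isolates that parsing step. The helper name parse_classified_line and the sample lines are hypothetical, and the line format is inferred from the example rather than documented. It uses ast.literal_eval, which accepts only literals, in place of eval.

```python
import ast


def parse_classified_line(line):
    """Split 'key<TAB>literal' and return (key, first_category).

    Hypothetical helper; the line format is inferred from the example above.
    """
    key, raw_cat = line.strip().split('\t')
    cat = ast.literal_eval(raw_cat)
    if isinstance(cat, list):
        # Keep only the first category when several are listed.
        cat = cat[0]
    return key, cat


print(parse_classified_line("doc-1\t['sports', 'local']\n"))  # ('doc-1', 'sports')
print(parse_classified_line("doc-2\t'politics'\n"))           # ('doc-2', 'politics')
```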

Example #3

File: run_classifier.py Project: rmax/yatiri

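import json
import os

from yatiri.hashing import doc_guid


# Excerpt: a method of a class in run_classifier.py; it relies on
# self.train_path, self.current_doc and self.do_next.
def do_label(self, line):
    """Update label for current document storing it
    in the train_path.
    """
    label = line.strip()
    # Each label gets its own directory under train_path.
    path = os.path.join(self.train_path, label)
    if not os.path.exists(path):
        os.mkdir(path)
    # The document GUID serves as the file name.
    filepath = os.path.join(path, doc_guid(self.current_doc) + '.json')
    # Text mode, since json.dump writes str.
    with open(filepath, 'w') as fp:
        json.dump(self.current_doc, fp)
    print("Document stored in train category {}".format(label))
    print("Moving to next document")
    return self.do_next('')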
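
The storage step can also be tried outside the class. The following is a minimal sketch under stated assumptions: store_labeled is a hypothetical helper, the GUID is passed in as a plain string so that the code does not depend on yatiri, and a temporary directory stands in for train_path.

```python
import json
import os
import tempfile


def store_labeled(doc, guid, label, train_path):
    """Write doc as JSON to <train_path>/<label>/<guid>.json and return the path.

    Hypothetical helper, not part of yatiri.
    """
    path = os.path.join(train_path, label)
    if not os.path.exists(path):
        os.makedirs(path)
    filepath = os.path.join(path, guid + '.json')
    with open(filepath, 'w') as fp:
        json.dump(doc, fp)
    return filepath


# A temporary directory stands in for the training directory.
train_path = tempfile.mkdtemp()
stored = store_labeled({'url': 'http://foo'}, 'abc123', 'sports', train_path)
print(os.path.relpath(stored, train_path))
```

Naming the file after the GUID means that labelling the same document twice under one label overwrites the earlier file instead of creating a second copy.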

Example #4

File: run_classifier.py Project: rolando-archive/yatiri

The code is identical to Example #3.