Python Embedding.from_dict Examples

Programming Language: Python

Namespace/Package Name: polyglot.mapping

Class/Type: Embedding

Method/Function: from_dict

Examples at hotexamples.com: 1

Python Embedding.from_dict - 1 example found. This is the top-rated real-world Python example of polyglot.mapping.Embedding.from_dict extracted from open source projects.

Frequently Used Methods


load(23)

from_glove(7)

Embedding(1)

from_dict(1)

from_gensim(1)

from_word2vec(1)

get(1)

zero_vector(1)

Example #1


File: embeddings.py Project: EMBEDDIA/event-detection

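import pickle

from gensim.models import KeyedVectors
# Import path assumed from the namespace listed above; the original file's imports are not shown.
from polyglot.mapping import Embedding


def load_embedding(fname,
                   format="word2vec_bin",
                   normalize=True,
                   lower=False,
                   clean_words=False,
                   load_kwargs=None):
    """
    Loads embeddings from file

    Parameters
    ----------
    fname: string
      Path to file containing embedding

    format: string
      Format of the embedding. Possible values are:
      'word2vec_bin', 'word2vec', 'glove', 'dict'

    normalize: bool, default: True
      If true, will normalize all vectors to unit length

    lower: bool, default: False
      If true, will convert all words to lowercase

    clean_words: bool, default: False
      If true, will only keep alphanumeric characters and "_", "-"
      Warning: shouldn't be applied to embeddings with non-ascii characters

    load_kwargs: dict, optional
      Additional parameters passed to load function. Mostly useful for 'glove' format where you
      should pass vocab_size and dim.

    Note: normalize, lower and clean_words are currently ignored, because the
    post-processing step at the end of the function is commented out.
    """
    # Avoid a shared mutable default argument.
    load_kwargs = load_kwargs or {}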
    # Raise explicitly; an assert would be skipped when Python runs with -O.
    if format not in ('word2vec_bin', 'word2vec', 'glove', 'dict'):
        raise ValueError("Unrecognized format: {!r}".format(format))
    if format == "word2vec_bin":
        # Binary word2vec files are read with gensim, so this branch returns a
        # KeyedVectors object rather than an Embedding.
        w = KeyedVectors.load_word2vec_format(fname, binary=True)
    elif format == "word2vec":
        w = Embedding.from_word2vec(fname, binary=False)
    elif format == "glove":
        w = Embedding.from_glove(fname, **load_kwargs)
    elif format == "dict":
        # Use a context manager so the file handle is closed after unpickling.
        with open(fname, "rb") as f:
            d = pickle.load(f, encoding='latin1')
        w = Embedding.from_dict(d)


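    # Post-processing is disabled in the original source, so normalize, lower
    # and clean_words currently have no effect.
    # if normalize:
    #     w.normalize_words(inplace=True)
    # if lower or clean_words:
    #     w.standardize_words(lower=lower, clean_words=clean_words, inplace=True)
    return w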
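
The 'dict' branch above unpickles a file and hands the result to Embedding.from_dict. The page does not show what that file contains, so the following is only a minimal sketch under the assumption that it is a pickled mapping from word to vector. It uses the standard library alone; the helper names (save_dict_embedding, load_dict_embedding, normalize_to_unit_length) and the toy data are hypothetical and are not part of polyglot or the original project. The last helper illustrates what "normalize all vectors to unit length" means in the docstring.

```python
import math
import os
import pickle
import tempfile


def save_dict_embedding(vectors, path):
    """Pickle a {word: vector} mapping (the assumed layout of a 'dict' file)."""
    with open(path, "wb") as f:
        pickle.dump(vectors, f)


def load_dict_embedding(path):
    """Read the mapping back the same way the 'dict' branch does."""
    with open(path, "rb") as f:
        # encoding='latin1' lets Python 3 read pickles written by Python 2.
        return pickle.load(f, encoding="latin1")


def normalize_to_unit_length(vectors):
    """Return a copy in which every non-zero vector has Euclidean norm 1."""
    normalized = {}
    for word, vec in vectors.items():
        norm = math.sqrt(sum(x * x for x in vec))
        normalized[word] = [x / norm for x in vec] if norm > 0 else list(vec)
    return normalized


# Hypothetical toy data, not taken from any real embedding file.
toy_vectors = {"cat": [3.0, 4.0], "dog": [0.0, 2.0]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_embedding.pkl")
    save_dict_embedding(toy_vectors, path)
    loaded = load_dict_embedding(path)

unit_vectors = normalize_to_unit_length(loaded)
print(unit_vectors["cat"])
```

A mapping loaded this way is what the 'dict' branch would pass on to Embedding.from_dict. Only unpickle files from sources you trust, since pickle.load can execute arbitrary code.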