Example #1
from colors import ColorsCorpusReader
import os
import pandas as pd
from sklearn.model_selection import train_test_split
import torch
from torch_color_describer import (ContextualColorDescriber,
                                   create_example_dataset)
import utils
from utils import START_SYMBOL, END_SYMBOL, UNK_SYMBOL

# Create a tiny artificial dataset for sanity-checking the modeling code:
tiny_contexts, tiny_words, tiny_vocab = create_example_dataset(
    group_size=3, vec_dim=2)

toy_mod = ContextualColorDescriber(
    tiny_vocab,
    embedding=None,  # Option to supply a pretrained matrix as an `np.array`.
    embed_dim=10,
    hidden_dim=20,
    max_iter=100,
    eta=0.01,
    optimizer=torch.optim.Adam,
    batch_size=128,
    l2_strength=0.0,
    warm_start=False,
    device=None)

_ = toy_mod.fit(tiny_contexts, tiny_words)

metric = toy_mod.listener_accuracy(tiny_contexts, tiny_words)
print("listener_accuracy:", metric)
Example #2
dev_mod = ContextualColorDescriber(dev_vocab,
                                   embed_dim=10,
                                   hidden_dim=10,
                                   max_iter=5,
                                   batch_size=128)

# In[ ]:

_ = dev_mod.fit(dev_cols_train, dev_seqs_train)

# As discussed in [colors_overview.ipynb](colors_overview.ipynb), our primary metric is `listener_accuracy`:

# In[ ]:

dev_mod.listener_accuracy(dev_cols_test, dev_seqs_test)

# We can also see the model's predicted sequences given color context inputs:

# In[ ]:

dev_mod.predict(dev_cols_test[:1])

# In[ ]:

dev_seqs_test[:1]

# ## Question 3: GloVe embeddings [1 point]
#
# The above model uses a randomly initialized embedding, as configured by the decoder used by `ContextualColorDescriber`. This homework question asks you to consider using GloVe representations as the inputs instead. A sketch of one approach follows.
#
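# A minimal sketch of one approach, not the required solution: build an embedding matrix aligned with the vocabulary and pass it in via the `embedding` parameter. This assumes the course `utils` provides `glove2dict` and `randvec` helpers (as in the CS224u utilities); the GloVe file path is an assumption.

# In[ ]:

import numpy as np
import utils

def create_glove_embedding(vocab, glove_lookup, embed_dim=50):
    """One row per vocab item: the GloVe vector where available,
    otherwise a small random vector of the same dimensionality.
    `embed_dim` must match the dimensionality of the GloVe file."""
    return np.array([
        glove_lookup[w] if w in glove_lookup else utils.randvec(embed_dim)
        for w in vocab])

# glove_lookup = utils.glove2dict('glove.6B.50d.txt')  # assumed local path
# glove_embedding = create_glove_embedding(dev_vocab, glove_lookup)
# glove_mod = ContextualColorDescriber(dev_vocab, embedding=glove_embedding)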
Example #3
# ### Listener-based evaluation

# `ContextualColorDescriber` implements a method `listener_accuracy` that we will use for our primary evaluations in the assignment and bake-off. The essence of the method is that we can calculate
#
# $$
# c^{*} = \text{argmax}_{c \in C} P_S(\text{utterance} \mid c)
# $$
#
#
# where $P_S$ is our describer model and $C$ is the set of all permutations of the three colors in the color context. We count $c^{*}$ as a correct prediction if it places the target in the privileged final position. (There are two such contexts, since the two distractors can be ordered either way; we try both in case distractor order influences the predictions, and the model is counted correct if either of them has the highest probability.)
#
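# A minimal sketch of that logic, assuming a hypothetical `log_likelihood(context, utterance)` scorer; the real computation is encapsulated in `ContextualColorDescriber.listener_accuracy`.

# In[ ]:

from itertools import permutations

def listener_correct(log_likelihood, context, utterance):
    """`context` is a list of three color vectors with the target last.
    Scores the utterance under all six orderings of the context and
    returns True if a target-final ordering scores highest."""
    scores = {perm: log_likelihood([context[i] for i in perm], utterance)
              for perm in permutations(range(3))}
    best = max(scores, key=scores.get)
    # Correct if the top-scoring permutation keeps the target
    # (original index 2) in the final position:
    return best[2] == 2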
# Here's the listener accuracy of our toy model:

# In[35]:

toy_mod.listener_accuracy(toy_color_seqs_test, toy_word_seqs_test)

# ### Other prediction and evaluation methods

# You can get the perplexities for test examples with `perplexities`:

# In[36]:

toy_perp = toy_mod.perplexities(toy_color_seqs_test, toy_word_seqs_test)

# In[37]:

toy_perp[0]

# You can use `predict_proba` to see the full probability distributions assigned to test examples:
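# A sketch of inspecting those distributions (assuming, as in the course code, that `predict_proba` takes both the color contexts and the gold sequences and returns one vocabulary distribution per timestep):

# In[38]:

toy_proba = toy_mod.predict_proba(toy_color_seqs_test, toy_word_seqs_test)

# Shape for the first example: (sequence length, vocab size); each row is
# a probability distribution, so it should sum to approximately 1:
toy_proba[0].shape, toy_proba[0].sum(axis=1)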