Text2Vis

Text2Vis is a family of neural network models that learn a mapping from short textual descriptions to visual features, so that one can retrieve images simply by providing a short textual description of what they depict.
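
As a rough illustration of the retrieval setting this enables (this is not code from the repository; `text2vis`, `image_features` and `image_ids` are hypothetical placeholders for a trained text-to-visual-feature model, a precomputed feature matrix and the matching image identifiers), searching amounts to a nearest-neighbour lookup in the visual feature space:

```python
import numpy as np

def retrieve(text2vis, description, image_features, image_ids, k=5):
    """Rank images by cosine similarity between the predicted visual
    feature vector and a matrix of precomputed image features.

    All arguments are hypothetical placeholders used only for this sketch:
    a callable mapping text to a visual vector, an array of shape
    (n_images, feature_dim), and the image ids in the same order.
    """
    query = text2vis(description)                     # (feature_dim,)
    query = query / np.linalg.norm(query)
    feats = image_features / np.linalg.norm(image_features, axis=1, keepdims=True)
    scores = feats @ query                            # cosine similarities
    top = np.argsort(-scores)[:k]
    return [(image_ids[i], float(scores[i])) for i in top]
```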

Text2Vis includes: (i) a sparse version, which takes as input a one-hot vector representing the textual description; (ii) a dense version, in which words are embedded and fed to an LSTM whose last memory state is mapped onto the visual space; and (iii) a Wide & Deep model, which combines the sparse and dense representations. We also include our reimplementation of the Word2VisualVec model trained with the MSE loss.
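
As a minimal sketch of the dense variant only (written in current tf.keras rather than the repository's original TensorFlow code, with made-up sizes for the vocabulary, embeddings, LSTM state and visual features), the caption is embedded, run through an LSTM, and its final state is regressed onto the visual space; the MSE loss shown here is the one used by the Word2VisualVec baseline, not the StochasticLoss criterion:

```python
import tensorflow as tf

VOCAB_SIZE = 20000   # hypothetical vocabulary size
EMBED_DIM = 300      # hypothetical word-embedding size
LSTM_DIM = 512       # hypothetical LSTM state size
VISUAL_DIM = 4096    # size of an fc6/fc7 feature vector

# Dense variant: word ids -> embeddings -> LSTM -> linear map to visual space.
text_in = tf.keras.Input(shape=(None,), dtype="int32")
x = tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM, mask_zero=True)(text_in)
x = tf.keras.layers.LSTM(LSTM_DIM)(x)           # last state summarises the caption
visual_pred = tf.keras.layers.Dense(VISUAL_DIM)(x)

dense_model = tf.keras.Model(text_in, visual_pred)
dense_model.compile(optimizer="adam", loss="mse")  # MSE as in the Word2VisualVec baseline
```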

Note: to train the model, you need the visual features associated with the MsCOCO image collection. In our experiments, we used the fc6 and fc7 layers of the Hybrid CNN. These feature files are too large to include in the repository, but if you need them we will be very happy to share ours with you!
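
Purely as an assumption about how such precomputed features could be paired with the MsCOCO captions (the file names, array shapes and layout below are hypothetical; the repository does not ship these files), a preparation step might look like:

```python
import json
import numpy as np

# Hypothetical file layout, for illustration only: one row of fc7 features
# per image, plus the list of image ids in the same order.
features = np.load("mscoco_fc7.npy")               # (n_images, 4096)
image_ids = json.load(open("mscoco_ids.json"))     # n_images image ids

captions = json.load(open("annotations/captions_train2014.json"))
id_to_row = {img_id: row for row, img_id in enumerate(image_ids)}

# Pair each caption with the fc7 vector of the image it describes.
pairs = [(ann["caption"], features[id_to_row[ann["image_id"]]])
         for ann in captions["annotations"]
         if ann["image_id"] in id_to_row]
```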

About

A TensorFlow implementation of the Text2Vis neural network, which applies a new StochasticLoss criterion to learn a mapping from textual descriptions to images.
