Document Classification Comparisons featuring Hierarchical Attention Network

The Hierarchical Attention Network (HAN) is a deep learning architecture that exploits the hierarchical structure of documents to construct a detailed document representation. Words form sentences and sentences form the document, and the HAN mirrors this hierarchy to determine which sentences, and which words within those sentences, matter most when classifying the document as a whole.

Figure 1: Hierarchical Attention Network architecture, from Yang et al. (1)

The model uses two levels of LSTM encoders, one at the word level and one at the sentence level, to build the word- and sentence-level representations of the document. An attention mechanism attributes importance at each of the two levels.

There are two applications of the attention mechanism, one attending over the outputs of the word-level encoder and one over the outputs of the sentence-level encoder. Together they allow the model to construct a document representation that assigns greater importance to key words and sentences throughout the document.

IMDB Dataset

All experiments were performed on the Stanford IMDB dataset, a natural language dataset in which each movie review is labeled with the sentiment of the review. Reviews carry one of 8 star ratings, from 1-4 for negative sentiment to 7-10 for positive sentiment; these are mapped down to two classes, negative (0) and positive (1).
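The binary mapping above can be sketched as a small helper. In the downloaded IMDB dataset each review file encodes its star rating in the filename (e.g. `123_4.txt`); the helper name and the exact preprocessing in dataProcessing.py are assumptions for illustration.

```python
def label_from_filename(filename):
    """Map an IMDB review filename like '123_4.txt' to a binary label.

    Ratings 1-4 collapse to 0 (negative), ratings 7-10 to 1 (positive);
    the dataset contains no labeled reviews rated 5 or 6.
    """
    rating = int(filename.rsplit("_", 1)[1].split(".")[0])
    return 0 if rating <= 4 else 1

examples = ["123_4.txt", "456_10.txt", "7_1.txt", "0_8.txt"]
labels = [label_from_filename(f) for f in examples]  # [0, 1, 0, 1]
```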

Files in this repo

  • Hierarchical Attention Networks: han.py
  • IMDB data preprocessing: dataProcessing.py — other scripts call this to break down the downloaded IMDB dataset
  • Paths shared throughout files: utils.py

To run the experiments contained in this repo

  • HAN can be trained with: python han_master.py --run_type train
  • Evaluation can be performed on a trained HAN model with: python han_master.py --run_type test

References

  1. Yang, Zichao. Hierarchical Attention Networks for Document Classification. Accessed 25 Aug. 2017.
  2. Jozefowicz, Rafal. An Empirical Exploration of Recurrent Network Architectures. Accessed 25 Aug. 2017.
  3. Sutskever, Ilya. Sequence to Sequence Learning with Neural Networks. Accessed 25 Aug. 2017.
  4. Kim, Yoon. Convolutional Neural Networks for Sentence Classification. Accessed 25 Aug. 2017.
  5. Zhou, Peng. Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification. Accessed 25 Aug. 2017.
  6. Mohandas, Goku. Interpretability via attentional and memory-based interfaces, using TensorFlow. Accessed 25 Aug. 2017.
  7. Pappas, Nikolaos. Multilingual Hierarchical Attention Networks for Document Classification. Accessed 25 Aug. 2017.
  8. Wang, Yilin. Hierarchical Attention Network for Action Recognition in Videos. Accessed 25 Aug. 2017.
  9. Seo, Paul Hongsuck. Progressive Attention Networks for Visual Attribute Prediction. Accessed 25 Aug. 2017.
