Generating English Sentences Using LSTMs

Through the use of Long Short-Term Memory (LSTM) networks, we aim to model the English language as closely as possible, such that the resulting model is able to generate coherent sequences of words, namely sentences. To achieve this, every word is transformed into a feature vector, which is used as input; the corresponding output is the most probable next word. The network learns these probabilities from a large collection of human-generated sentences, specifically the book "Twelve Short Mystery Stories" by Sapper (Ronald Standish) [1].
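
A minimal Keras sketch of this setup is shown below, assuming a learned embedding for the word feature vectors and the repository's default sequence length and window (the tokenization and layer sizes are illustrative assumptions, not the project's actual code):

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Embedding, LSTM, Dense

    # Tokenize the corpus into words and index the vocabulary.
    text = open('sample_data_short.txt').read().lower()
    words = text.split()
    vocab = sorted(set(words))
    word_to_idx = {w: i for i, w in enumerate(vocab)}

    # Cut (input sequence, next word) pairs from the text with a sliding
    # window: sequences of slen words, stepping win words at a time.
    slen, win = 15, 3
    seqs, next_words = [], []
    for i in range(0, len(words) - slen, win):
        seqs.append([word_to_idx[w] for w in words[i:i + slen]])
        next_words.append(word_to_idx[words[i + slen]])
    X = np.array(seqs)
    y = np.eye(len(vocab))[next_words]  # one-hot next-word targets

    model = Sequential()
    model.add(Embedding(len(vocab), 100, input_length=slen))  # word -> feature vector
    model.add(LSTM(128))                                      # sequence model
    model.add(Dense(len(vocab), activation='softmax'))        # next-word probabilities
    model.compile(loss='categorical_crossentropy', optimizer='adam')
    model.fit(X, y, batch_size=128, epochs=50)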


Running the Code

Dependencies

  • Python 3.6.1
  • TensorFlow 1.3.0
  • Keras 2.1.2
  • matplotlib 2.0.2
  • numpy 1.12.1
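
All of these are available from PyPI; with a matching Python 3.6 interpreter they can be installed in one step (a convenience command, not part of the repository's instructions):

    pip install tensorflow==1.3.0 keras==2.1.2 matplotlib==2.0.2 numpy==1.12.1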

Command-line Execution

Directory Path: /src/scripts/

  • Train the model: python train.py [-h] [-v] [--data DATA] [--epochs EPOCHS] [--temp TEMP] [--slen SLEN] [-win WIN] (see the example invocations after this list)

    Optional arguments:

    -h, --help       show help message and exit
    -v, --version    show program's version number and exit
    --data DATA      path to training data (default: sample_data_short.txt)
    --epochs EPOCHS  number of epochs to train for (default: 50)
    --temp TEMP      select temperature value for prediction diversity (default: 1.0)
    --slen SLEN      maximum length for a training sequence (default: 15)
    -win WIN         select sliding window for text iteration and data collection for training (default: 3)
  • Make predictions: python predict.py [-h] [-v] [--data DATA] [--seed SEED] [--nwords NWORDS] [--temp TEMP] [--slen SLEN] [-win WIN]

    Optional arguments:

    -h, --help       show help message and exit
    -v, --version    show program's version number and exit
    --data DATA      path to training data (default: sample_data_short.txt)
    --seed SEED      provide a word or sentence to seed the program with
    --nwords NWORDS  number of words to generate (default: 400)
    --temp TEMP      select temperature value for prediction diversity (default: 1.0)
    --slen SLEN      maximum length for a training sequence (default: 15)
    -win WIN         select sliding window for text iteration and data collection for training (default: 3)
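
For example, the following invocations exercise the documented flags (the values are illustrative; sample_data_short.txt is the documented default):

    python train.py --data sample_data_short.txt --epochs 100 --slen 15 -win 3
    python predict.py --seed "the detective" --nwords 200 --temp 0.8

The --temp flag rescales the model's output probabilities before each word is sampled: temperatures below 1.0 favour the most likely words, while higher values flatten the distribution and increase diversity. Temperature sampling is commonly implemented like the sketch below (an assumption about how the flag works internally, not code taken from this repository):

    import numpy as np

    def sample(preds, temperature=1.0):
        # Rescale log-probabilities by the temperature and renormalize.
        preds = np.asarray(preds).astype('float64')
        preds = np.log(preds + 1e-8) / temperature
        exp_preds = np.exp(preds)
        preds = exp_preds / np.sum(exp_preds)
        # Draw one word index from the adjusted distribution.
        return int(np.argmax(np.random.multinomial(1, preds, 1)))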



[1] This book is available on the Project Gutenberg website.
