This repository is a fork of "Language Modelling Makes Sense" for word-sense disambiguation (WSD). It provides optimized code, packaged as a Flask application, that disambiguates one or more words in a given sentence using a PyTorch BERT model and pre-trained contextual word-sense embeddings.
Key features added:
- Development of an end-to-end WSD API that returns detailed sense information for a sentence and its words given as input
- Conversion of the BERT model weights from TensorFlow to PyTorch
- Removal of the burdensome client/server architecture of bert-as-service for retrieving the pre-trained BERT embeddings
- Optimization of the code so that the WSD API responds in less than 2 seconds on CPU (for up to three words in a single request)
$ cd lmms_app
$ pip3 install -r requirements.txt
Manual Link: .npz (0.3GB)
Terminal:
$ pip3 install gdown
$ gdown "https://drive.google.com/uc?id=1kuwkTkSBz5Gv9CB_hfaBh1DQyC2ffKq0&export=download"
The PyTorch BERT model weights were converted from the original model used in the paper: cased_L-24_H-1024_A-16
Converted Manual Link: pytorch-bert-model
Terminal:
$ gdown "https://drive.google.com/uc?id=1kuwkTkSBz5Gv9CB_hfaBh1DQyC2ffKq0&export=download"
$ unzip bert_torch_model.zip
Run the following command to start the Flask application:
$ python3 WSD_updated.py
Once the API is up and running, invoke the API through the following curl command:
$ curl -X POST -d '{"sentence":"you were right that turning right was a better way", "word": ["right", "turning"]}' http://127.0.0.1:5000/synset_processing -H "Content-Type: application/json" -w 'Total: %{time_total}s\n'
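The same request can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes the endpoint and JSON fields shown in the curl command, and the helper names `build_request` and `disambiguate` are illustrative, not part of this repository.

```python
import json
import urllib.request

# Endpoint copied from the curl example; change host/port if your deployment differs.
API_URL = "http://127.0.0.1:5000/synset_processing"


def build_request(sentence, words, url=API_URL):
    """Build a POST request carrying the same JSON body as the curl example.

    Hypothetical helper: the field names "sentence" and "word" are taken
    from the curl command, nothing else about the API is assumed.
    """
    body = json.dumps({"sentence": sentence, "word": list(words)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def disambiguate(sentence, words, url=API_URL, timeout=10):
    """Send the request to a running server and return the decoded JSON response."""
    request = build_request(sentence, words, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `disambiguate("you were right that turning right was a better way", ["right", "turning"])` should return a dictionary with a `bert_WSD` list.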
Sample output:
{
  "bert_WSD": [
    {
      "definition": "free from error; especially conforming to fact or truth",
      "offset": 631391,
      "synset": "Synset('correct.a.01')",
      "synset_key": "right%3:00:02::",
      "word": "right"
    },
    {
      "definition": "toward or on the right; also used figuratively",
      "offset": 387828,
      "synset": "Synset('right.r.04')",
      "synset_key": "right%4:02:03::",
      "word": "right"
    },
    {
      "definition": "a turn toward the side of the body that is on the north when the person is facing east",
      "offset": 351168,
      "synset": "Synset('left.n.05')",
      "synset_key": "left%1:04:00::",
      "word": "turning"
    }
  ]
}
Total: 1.363022s
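The `synset_key` values in the sample output are WordNet sense keys of the form `lemma%ss_type:lex_filenum:lex_id:head_word:head_id`. As a rough sketch of how a client might post-process the response, the snippet below splits each key into its fields and groups the predicted parts of speech by input word. The function names `parse_sense_key` and `senses_by_word` are hypothetical and do not come from this repository.

```python
from collections import defaultdict

# ss_type codes as defined by the WordNet sense key format.
SS_TYPES = {
    1: "noun",
    2: "verb",
    3: "adjective",
    4: "adverb",
    5: "adjective satellite",
}


def parse_sense_key(sense_key):
    """Split a sense key such as 'right%3:00:02::' into named fields."""
    lemma, _, lex_sense = sense_key.partition("%")
    ss_type, lex_filenum, lex_id, head_word, head_id = lex_sense.split(":")
    return {
        "lemma": lemma,
        "pos": SS_TYPES[int(ss_type)],
        "lex_filenum": int(lex_filenum),
        "lex_id": int(lex_id),
        "head_word": head_word,  # empty unless the sense is an adjective satellite
        "head_id": head_id,
    }


def senses_by_word(response):
    """Group the part of speech of each predicted sense by the input word."""
    grouped = defaultdict(list)
    for entry in response["bert_WSD"]:
        grouped[entry["word"]].append(parse_sense_key(entry["synset_key"])["pos"])
    return dict(grouped)
```

Applied to the sample output, this would report one adjective and one adverb sense for "right" and one noun sense for "turning".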