FqQuAD Question/Context

Goal: given a randomly selected question, return the matching context

Dataset used: FQuAD (https://fquad.illuin.tech/)

Here is an overview of possible approaches:

Tokenizing words and sentences

Use of the "NLTK" and "gensim" packages
Use of the "gensim" Dictionary and Similarity functions
main.py returns the most appropriate context for a randomly selected question, i.e. the context with the best gensim similarity score
if that is not the right context, it returns the next 2 candidates
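The ranking step listed above can be sketched without external dependencies. This is not the code in main.py: it swaps NLTK tokenization for a regular expression and gensim's Dictionary and Similarity for a hand-written TF-IDF cosine similarity, and the TF-IDF weighting itself is an assumption. All function names (tokenize, to_vector, build_index, rank_contexts) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokenizer (a simple stand-in for an NLTK tokenizer)."""
    return re.findall(r"\w+", text.lower())


def to_vector(tokens, idf):
    """Build a sparse TF-IDF vector; tokens unseen in the corpus are dropped."""
    counts = Counter(tok for tok in tokens if tok in idf)
    return {tok: n * idf[tok] for tok, n in counts.items()}


def build_index(contexts):
    """Return one TF-IDF vector per context, plus the IDF table."""
    tokenized = [tokenize(c) for c in contexts]
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    n_docs = len(contexts)
    # Smoothed IDF so that every known token keeps a positive weight.
    idf = {tok: math.log((1 + n_docs) / (1 + df)) + 1.0
           for tok, df in doc_freq.items()}
    return [to_vector(toks, idf) for toks in tokenized], idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_contexts(question, contexts, top_k=3):
    """Return the indices of the top_k most similar contexts, best first."""
    vectors, idf = build_index(contexts)
    query = to_vector(tokenize(question), idf)
    scored = [(cosine(query, vec), i) for i, vec in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:top_k]]


# Toy data for illustration only (not taken from the dataset).
contexts = [
    "La tour Eiffel est située à Paris et mesure 330 mètres.",
    "Le Rhône est un fleuve qui traverse Lyon.",
    "Marie Curie a reçu le prix Nobel de physique en 1903.",
]
question = "Qui a reçu le prix Nobel de physique ?"
print(rank_contexts(question, contexts))
```

The first index is the best candidate; the two that follow play the role of the "next 2 candidates" mentioned above.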

Metrics used

To find out whether the returned context is the right one:

a function checks in the train.json file whether it is the appropriate context.

if yes: return 1
else: return 0

If 0 is returned, we look at the top 3 candidate contexts
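A check of this kind might look like the sketch below. It assumes train.json follows the SQuAD-style layout (data, then paragraphs, each with a context and a list of qas holding a question); that layout and every function name here are assumptions, not the contents of metrics.py.

```python
import json


def index_questions(dataset):
    """Map each question text to the set of contexts it is attached to."""
    expected = {}
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                expected.setdefault(qa["question"], set()).add(paragraph["context"])
    return expected


def load_expected_contexts(path="train.json"):
    """Read the JSON file and build the question -> contexts mapping."""
    with open(path, encoding="utf-8") as handle:
        return index_questions(json.load(handle))


def is_right_context(question, context, expected):
    """Return 1 if `context` is a context stored for `question`, else 0."""
    return 1 if context in expected.get(question, set()) else 0


def in_top_k(question, ranked_contexts, expected, k=3):
    """Return 1 if any of the first k candidates is a stored context, else 0."""
    valid = expected.get(question, set())
    return 1 if any(c in valid for c in ranked_contexts[:k]) else 0
```

Storing a set of contexts per question is a design choice in this sketch: it avoids silently overwriting an entry when the same question text appears under more than one paragraph.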

Performance over a set of 100 random questions:

46% of questions were given the right context
for 53% of questions, one of the top 3 most similar contexts returned was adequate
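Rates like these could be computed with a loop along the following lines. This is a hedged sketch: evaluate, pairs and rank_fn are hypothetical names, and it counts a top-1 hit as a top-k hit too, which may or may not match how the 53% figure was obtained.

```python
import random


def evaluate(pairs, rank_fn, n_questions=100, k=3, seed=0):
    """Return (top-1 rate, top-k rate) over a random sample of questions.

    pairs:   list of (question, expected_context) tuples.
    rank_fn: callable taking a question and returning contexts, best first.
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = rng.sample(pairs, min(n_questions, len(pairs)))
    top1_hits = 0
    topk_hits = 0
    for question, expected in sample:
        ranked = rank_fn(question)
        top1_hits += int(ranked[:1] == [expected])
        topk_hits += int(expected in ranked[:k])
    return top1_hits / len(sample), topk_hits / len(sample)
```

Fixing the seed keeps runs comparable; with only 100 questions, the rates can shift noticeably from one random sample to the next.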

CLI execution: python main.py -path "train.json"

CLI argument: -path (location of the FQuAD JSON file), default "train.json"
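For reference, an option of this shape can be declared with the standard argparse module, which accepts single-dash long options such as -path. This is a minimal sketch of one way to do it, not the parsing code in main.py; build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    """Create a parser exposing a single -path option."""
    parser = argparse.ArgumentParser(
        description="Return the best-matching context for a random question."
    )
    parser.add_argument(
        "-path",
        default="train.json",
        help="location of the dataset JSON file (default: train.json)",
    )
    return parser


# Parsing an explicit argument list, as if run with: python main.py -path "train.json"
args = build_parser().parse_args(["-path", "train.json"])
print(args.path)
```

When -path is omitted, args.path falls back to the default value.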

Other possible approaches

More efficient approaches

BERT tokenizing + classification
Use an LSTM

MarionSauvage/FqQuAD_Question_Context
