Skip to content

mriosb08/palodiem-QE

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

21 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

#University of Leeds ##Palodiem Project

Transfer-based Quality Estimation (QE).

Scripts to test different trasnfer learning methods on QE.

*Please update paths accordingly to your system.

##Directories

Scripts: (src/) Transfer-based QE scripts.

Data: (data/) QuEst features and HTER labels frm Autodesk data.

##Features

QuEst: *Dependency: http://www.quest.dcs.shef.ac.uk/

*config for QueEst example: config_en-es.autodesk.properties

Extract QuEst features:

sh extract_quest.sh

Bicvm, bilingual embeddings: *dependency: https://github.com/karlmoritz/bicvm *train bilingual embeddings example:

/data/mrios/workspace/sw/bicvm/bin/dbltrain --input1 autodesk.pt-es.clean.pt --input2 autodesk.pt-es.clean.es --tree plain --type additive --method adagrad --word-width 100 --hinge_loss_margin 128 --model1-out modelA.pt-es.autodesk.100 --model2-out modelB.pt-es.autodesk.100 --noise 10 --batches 20 --eta 0.05 --lambdaD 1 --calc_bi_error1 true --calc_bi_error2 true --iterations 10 &>dblog.pt-es.log

extract bilingual embeddings features:

sh bicvm.sh

##Available transfer models

Self-Taugh Learning (STL):

python src/stlSVR.py --training-examples autodesk.training.en-es.feat --training-labels autodesk.training.en-es.hter --unlabelled-examples autodesk.training.en-pt.feat --test autodesk.test.en-pt.feat --output autodesk.en-pt.pred --epsilon 41.06 --c 0.232 --hidden-layer 50

Spectral Learning: usage:python ccaQEpy [training-features] [training-label] [test-features] [test-labels] [U-file] [hid-size]

python /data/mrios/workspace/palodiem-qe/ccaQE.py sent.training.autodesk.en-ru.vec autodesk.training.en-ru.hter sent.test.autodesk.en-pl.vec autodesk.test.en-pl.hter sent.training.autodesk.en-pl.vec 10

SRV Baseline:

usage:python SVR.py [training-features] [training-label] [test-features] [test-labels] [C] [epsilon]

python /data/mrios/workspace/palodiem-qe/SVR.py sent.training.autodesk.en-es.pe.quest autodesk.training.en-es.hter sent.test.autodesk.en-pt-es.quest autodesk.test.en-pt-es.clean.goo.hter 41.06 0.232

Latent variable Gaussian Process (LVGP):

*We use the LVGP instead of autoencoder to perfom STL.

usage:python GPReg.3.py [training-features] [training-label] [test-features] [test-label] [u-file] [hid-dim] [output]

python /data/mrios/workspace/palodiem-qe/GPReg.3.py ../autodesk/sent.training.autodesk.en-es.pe.quest ../autodesk/autodesk.training.en-es.hter wmt12.test.en-es.quest wmt12.test.en-es.goo.hter wmt12.training.en-es.quest 10 deepGP.pred

Transfer SVM (classification only):

sh src/tsvm.sh

About

University of Leeds, Palodiem project for Quality Estimation

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published