Alea4jacta6est/transformers-for-lawyers

 
 

transformers-for-lawyers

AI apps/benchmark for legaltech

Searching for better search options for lawyers

DEMO with huggingface, Jina and European Court of Justice judgments


This site contains 113 judgments of the European Court of Justice from 2019 and 2020 concerning tax issues.

All sentences from the judgments have been encoded with a BERT model (bert-base-uncased, provided by Huggingface's transformers library), a powerful and widely used pretrained NLP model.

The search infrastructure is built on Jina, a wonderful, scalable library for designing neural search engines based on recent deep learning techniques.

The entire concept - as well as Jina and Huggingface - has a great future in legal tech, because lawyers need to use a lot of documents, and searching among them is highly challenging...


...That's why law is so compelling and hard


How does it work?
  1. Write a phrase / sentence
  2. Press Enter
  3. You get the most similar sentences (the lower the score, the better)
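
The ranking step above can be illustrated with a minimal sketch. It assumes the score is a distance between the query embedding and each indexed sentence embedding, which would explain why lower is better; the metric actually used by the demo is not stated here, so Euclidean distance is an assumption, and the three-dimensional vectors are made-up stand-ins for real BERT embeddings. The helper names `euclidean` and `rank` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank(query_vec, indexed, top_k=5):
    """Return up to top_k (score, sentence) pairs, lowest distance first."""
    scored = [(euclidean(query_vec, vec), sentence) for sentence, vec in indexed]
    return sorted(scored, key=lambda pair: pair[0])[:top_k]


# Toy vectors (assumption): real BERT embeddings have hundreds of dimensions.
indexed = [
    ("That is unfair and unlawful.", [0.0, 0.2, 0.9]),
    ("That request was rejected.", [0.9, 0.1, 0.0]),
    ("That argument cannot be accepted.", [0.6, 0.4, 0.1]),
]
query = [1.0, 0.0, 0.0]  # hypothetical embedding of "that complaint was rejected"

for score, sentence in rank(query, indexed):
    print(f"{score:.3f}  {sentence}")
```

Sorting in ascending order puts the closest sentence first, which matches the "lower is better" reading of the score.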


Enjoy!...

... and be aware that this is a playground. Sometimes BERT doesn't give proper hits, but sometimes analogies are pretty impressive, like:

QUERY: that complaint was rejected
RESULTS:

  1. That request was rejected.
  2. Its application was rejected, as was the objection that it subsequently lodged.
  3. That request was rejected.
  4. That is unfair and unlawful.
  5. That argument cannot be accepted.

Remarks

I am aiming to test other approaches (such as other transformer architectures) and to fine-tune the model, in order to prepare an ultimate benchmark of AI solutions for legaltech, so stay tuned and follow me on LinkedIn, Twitter and on my blog at inteliLex.



If you would like to test it on more documents and play with the code, please clone this git repository and contribute to it.



Feel free to contact me: artur.tanona@gmail.com.

Works on Ubuntu 18.04 and Docker

The steps below describe how to launch it on Ubuntu. If you are working on another system (for example, Windows 10), you need a running Docker engine and can skip to the last part, "Run on Docker".

1. Upload

Upload documents in *.txt format to search_engine/data and frontendApp/src/assets.
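
Since the search works at sentence level, it can help to preview how an uploaded file breaks into sentences before indexing. The following is only a rough sketch: the regular expression is a naive splitter chosen for illustration, not the segmentation used by this repository, and the helper names `split_sentences` and `load_sentences` are hypothetical.

```python
import re
from pathlib import Path

# Naive rule (assumption): a sentence ends at ., ! or ? followed by
# whitespace and an uppercase letter. Legal abbreviations may break it.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def split_sentences(text):
    """Collapse whitespace, then split the text into sentences."""
    normalized = " ".join(text.split())
    return [s for s in _SENTENCE_END.split(normalized) if s]


def load_sentences(data_dir):
    """Yield (file name, sentence) pairs for every *.txt file in data_dir."""
    for path in sorted(Path(data_dir).glob("*.txt")):
        for sentence in split_sentences(path.read_text(encoding="utf-8")):
            yield path.name, sentence


sample = "That request was rejected. That argument cannot be accepted."
print(split_sentences(sample))
```

Calling `load_sentences("search_engine/data")` would walk the uploaded files in the same way.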

2. Set global environment variables

export JINA_PARALLEL=1
export JINA_SHARDS=1
export CLIENT_PORT=80
export JINA_WORKSPACE=test_index
export JINA_MAX_DOCS=100
export JINA_PORT=65481

And put them in the .env/.env file:

JINA_PARALLEL=1
JINA_SHARDS=1
CLIENT_PORT=80
JINA_WORKSPACE=test_index
JINA_MAX_DOCS=100
JINA_PORT=65481
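
A quick sanity check of the variables can catch a missing or mistyped entry before anything is started. The sketch below is an illustration only: `parse_env` and `check_env` are hypothetical helpers, not part of this repository, and treating the port and count variables as integers is an assumption based on the values shown above.

```python
def parse_env(text):
    """Parse KEY=VALUE lines; a later assignment overrides an earlier one."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=VALUE line: {raw!r}")
        values[key.strip()] = value.strip()
    return values


REQUIRED = ("JINA_PORT", "JINA_PARALLEL", "JINA_SHARDS",
            "CLIENT_PORT", "JINA_WORKSPACE", "JINA_MAX_DOCS")
# Assumption: every variable except JINA_WORKSPACE holds an integer.
NUMERIC = tuple(k for k in REQUIRED if k != "JINA_WORKSPACE")


def check_env(values):
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = [f"missing {k}" for k in REQUIRED if k not in values]
    problems += [f"{k} is not an integer: {values[k]!r}"
                 for k in NUMERIC if k in values and not values[k].isdigit()]
    return problems


sample = """\
JINA_PARALLEL=1
JINA_SHARDS=1
CLIENT_PORT=80
JINA_WORKSPACE=test_index
JINA_MAX_DOCS=100
JINA_PORT=65481
"""
print(check_env(parse_env(sample)))
```

To check the real file, pass the contents of .env/.env to `parse_env` instead of the inline sample.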

3. Run locally

Create a virtual environment and activate it. Then build the index from the search_engine directory and start the containers from the repository root:

cd search_engine
pip install -r requirements.txt
python app.py -t index
cd ..
sudo docker-compose up
If you are running on Google Cloud, update the environment.prod.ts file with your host and add the required firewall rules.

You can then open the website at http://localhost:4200

4. Caveat

The demo does not yet include PolTaxBERT and StateAid Bert; they will be included soon.
