FASoC Datasheet-Scrubber

The FASoC Datasheet Scrubber is a utility that scans large sets of PDF datasheets and documents to extract key circuit information. The extracted information is used to build a database of commercial off-the-shelf (COTS) IP, which can in turn be used to build larger SoCs in the FASoC design flow. More information here.

Setup instructions

Ensure your machine has the correct Python version and all of the Python modules required to run the datasheet scrubber.
- Requirements: Python 3.6 (packages pandas, scipy, matplot, matplotlib, pdfminer.six, pypdf2, request, lxml, tabula-py, sklearn, regex, keras, tensorflow, pdf2image, pillow, pytesseract, numpy, opencv-python, gensim, nltk). Python versions below 3.6 are not supported.
Ensure you have SSH keys set up for GitHub. Instructions for generating and adding SSH keys can be found here.
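
The Python requirements above can be checked before running the scrubber. The following is a minimal sketch, not part of the repository: the helper names `python_ok` and `missing_modules` are hypothetical, and the import names are assumed to match the listed pip packages (for example, pillow is imported as `PIL` and opencv-python as `cv2`). The `matplot` and `request` entries are left out because their intended import names are not clear from the list.

```python
import importlib.util
import sys

# Import names ASSUMED to correspond to the pip packages in the requirements
# list (pdfminer.six -> pdfminer, tabula-py -> tabula, pillow -> PIL,
# opencv-python -> cv2). Adjust these if your environment differs.
IMPORT_NAMES = [
    "pandas", "scipy", "matplotlib", "pdfminer", "PyPDF2", "lxml",
    "tabula", "sklearn", "regex", "keras", "tensorflow", "pdf2image",
    "PIL", "pytesseract", "numpy", "cv2", "gensim", "nltk",
]


def python_ok(minimum=(3, 6)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info[:2] >= minimum


def missing_modules(names):
    """Return the top-level module names that cannot be found, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.6 or later is required; found", sys.version.split()[0])
    missing = missing_modules(IMPORT_NAMES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All checked modules were found.")
```

Any module reported as missing can then be installed with pip under its package name from the requirements list.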

Clone the Datasheet Scrubber repository

git clone git@github.com:idea-fasoc/datasheet-scrubber.git

Database

The FASoC database contains more than 700,000 records of integrated circuit (IC) components collected from Digi-Key.

Database Web Application

To access a sample of this collection, visit our web application or proceed here.

Raw Database

To access the entire collection of components, please visit here.

Datasheet-Scrubber

The datasheet scrubber works in three steps: category recognition, table extraction, and text extraction.

Test

An example of how to use the table extractor can be found here.
