Scrape datasheets with optical AI #5

Open
fl4p opened this issue Aug 13, 2024 · 6 comments

Comments

@fl4p
Owner

fl4p commented Aug 13, 2024

charts and test conditions

CSD19506KCS copy.pdf
sample

Qgd - V chart

image

Characteristics

  • Dynamic Characteristics
  • Diode Characteristics
  • Thermal Information
    image
@fl4p
Owner Author

fl4p commented Aug 13, 2024

Infineon IQD016N08NM5 data sheet:

image

image

@fl4p
Owner Author

fl4p commented Aug 20, 2024

@fl4p
Owner Author

fl4p commented Aug 20, 2024

image
onsemi NVMTS6D0N15MC

@fl4p
Owner Author

fl4p commented Aug 20, 2024

image

https://toshiba.semicon-storage.com/eu/semiconductor/knowledge/faq/mosfet/electrical-characteristics-of-mosfetscharge-characteristic-qg-qg.html

Note that there are two definitions of Qgs2:

  • the remaining charge after the Miller plateau (NVMTS6D0N15MC)
  • the charge from the threshold until the plateau (FDP027N08B)
    image
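
Since the same symbol carries two meanings, a scraper probably needs to record which one a datasheet uses. A minimal sketch, assuming the usual gate-charge segmentation (Qgs is the charge until the start of the Miller plateau, Qgd spans the plateau, Qg is the total at the rated Vgs). The names `Qgs2Meaning`, `GateCharge` and its fields are hypothetical, the numbers in the usage note are made up, and which reading a given datasheet follows still has to be checked against its gate-charge figure:

```python
from dataclasses import dataclass
from enum import Enum


class Qgs2Meaning(Enum):
    # Charge from the end of the Miller plateau up to the rated Vgs.
    AFTER_PLATEAU = "after_plateau"
    # Charge from Vgs(th) up to the start of the Miller plateau.
    THRESHOLD_TO_PLATEAU = "threshold_to_plateau"


@dataclass
class GateCharge:
    """Gate-charge values in nC (hypothetical container, not part of the project)."""
    qg_total: float  # total gate charge at the rated Vgs
    qgs: float       # charge until the start of the Miller plateau
    qgd: float       # charge across the Miller plateau
    qg_th: float     # charge until Vgs reaches the threshold voltage

    def qgs2(self, meaning: Qgs2Meaning) -> float:
        """Return Qgs2 under the given reading, derived from the other fields."""
        if meaning is Qgs2Meaning.AFTER_PLATEAU:
            return self.qg_total - self.qgs - self.qgd
        return self.qgs - self.qg_th
```

With made-up values `GateCharge(qg_total=100.0, qgs=30.0, qgd=20.0, qg_th=18.0)`, the two readings give 50.0 nC and 12.0 nC, so mixing them up would noticeably skew any comparison between parts.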

@fl4p
Owner Author

fl4p commented Aug 20, 2024

The TPH5R60APL datasheet lists Qgs1, which is the charge until the start of the Miller plateau.

@fl4p
Owner Author

fl4p commented Sep 24, 2024

Update:

  • ocrmypdf delivers good results
  • want to try pix2text, which generates markdown tables
  • tesseract with tables (TODO link)
  • train tesseract or other custom models (pix2text)

OCR issues:

  • most of the missing fields are because OCR struggles to read symbols properly (Qg, Vsd, ...)
  • tabular gets the table wrong
  • symbols are not words, so the dictionary gives no support
  • tesseract supports a user word list; this helps with Vplateau but not with Qgs or Qgd
  • numerical OCR can be improved
  • table structure sometimes generates spurious characters
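
One way to work around misread symbols is to fix them after OCR instead of relying on the OCR dictionary. A minimal sketch using fuzzy matching from Python's standard `difflib`. The symbol list, the `correct_symbol` helper and the 0.6 cutoff are assumptions for illustration, not something the project uses, and the misreads in the usage note are invented examples rather than observed OCR output:

```python
import difflib
from typing import Optional

# Hypothetical list of symbols expected in a MOSFET datasheet table.
KNOWN_SYMBOLS = ["Qg", "Qgs", "Qgd", "Qgs1", "Qgs2", "Qrr", "Vsd", "Vplateau"]


def correct_symbol(token: str, cutoff: float = 0.6) -> Optional[str]:
    """Map an OCR token to the closest known symbol, or None if nothing is close."""
    # A case-insensitive exact hit wins over any fuzzy match.
    by_lower = {s.lower(): s for s in KNOWN_SYMBOLS}
    if token.lower() in by_lower:
        return by_lower[token.lower()]
    # Otherwise take the best candidate whose similarity ratio reaches the cutoff.
    matches = difflib.get_close_matches(token, KNOWN_SYMBOLS, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

For example, `correct_symbol("Qqd")` and `correct_symbol("0gd")` both resolve to `"Qgd"`, while an unrelated token such as `"xyz"` returns `None`. Short symbols differ by only one character, so a cutoff that is too low would merge Qgs and Qgd; the value likely needs tuning on real OCR output.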
