Public Notes on histre
Improved file parsing for LLMs. Contribute to Filimoa/open-parse development by creating an account on GitHub.
#llm #parser #text #convert #markdown #split #extraction #content #python #library #pdf #ocr
Layout-Parser/layout-parser: A Unified Toolkit for Deep Learning Based Document Image Analysis
github.com
A Unified Toolkit for Deep Learning Based Document Image Analysis - Layout-Parser/layout-parser
#pdf #layout #parser #llm #python #image #ocr
OCR Software, Data Extraction Tool - Amazon Textract - AWS
aws.amazon.com
Amazon Textract is a machine learning (ML) service that uses optical character recognition (OCR) to automatically extract text, handwriting, and data from scanned PDF documents, forms, and tables.
#content #extraction #ocr #pdf #parser #api
Document AI | Google Cloud
cloud.google.com
Community maintained fork of pdfminer - we fathom PDF - pdfminer/pdfminer.six
#python #pdf #content #extraction #parser #library
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents. - pymupdf/PyMuPDF
#python #pdf #content #extraction #parser #library
A library and language for building parsers, interpreters, compilers, etc. - ohmjs/ohm
#dsl #lexical #language #parser #javascript
CLI tool and Python library that converts the output of popular command-line tools, file types, and common strings to JSON, YAML, or dictionaries. This allows piping of output to tools like jq and simplifies automation scripts. - kellyjonbrazil/jc
tatatap-com/sowhat
github.com
Contribute to tatatap-com/sowhat development by creating an account on GitHub.
#javascript #command #parser #plain-text #pub
miyuchina/mistletoe: A fast, extensible and spec-compliant Markdown parser in pure Python.
github.com
A fast, extensible and spec-compliant Markdown parser in pure Python. - miyuchina/mistletoe
#python #markdown #parser #library #pub