Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/ocrmypdf/OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
https://github.com/ocrmypdf/OCRmyPDF
Bump astral-sh/setup-uv from 4 to 5
dependabot[bot] opened this pull request 11 days ago
graft: fix invisible text appearing after strip_invisible_text
pajowu opened this pull request 23 days ago
[Feature]: Aggressive image optimization without color quantization
user1823 opened this issue 23 days ago
hocr: only add space if boxwidth is positive
pajowu opened this pull request 23 days ago
[Bug]: scanned pdf containing electronics schematic
saadb opened this issue 26 days ago
ocrmypdf -v 2 fails with log messages interpreted as tags
fernandoherreradelasheras opened this issue 26 days ago
Update intersphinx mapping to current format
QuLogic opened this pull request 29 days ago
Fix "Scanning contents" progress bar with --redo-ocr
aliemjay opened this pull request 29 days ago
fix minor grammar mistake
joskezelensky opened this pull request about 1 month ago
[Bug]: OCR Output Quality Regression on Ubuntu 24.04
guilhermebferreira opened this issue about 1 month ago
[Bug]: deskew results in "empty" output file
hatl opened this issue about 1 month ago
Documentation for "ocrmypdf.ocr()" not found
fatsciock opened this issue about 1 month ago
Bump astral-sh/setup-uv from 3 to 4
dependabot[bot] opened this pull request about 1 month ago
[Feature]: Option to remove OCR
user1823 opened this issue about 1 month ago
[Feature]: Feature Request - Use Google Document AI or VIsion AI instead of Tesseract
epatels opened this issue about 2 months ago
Bump codecov/codecov-action from 4 to 5
dependabot[bot] opened this pull request about 2 months ago
[Bug]: pikepdf PdfMatrix module unavailable
IsaacSugden opened this issue about 2 months ago
Facing issue while applying ocrmypdf to document which different layouts at each page
prashanthkolaneru opened this issue about 2 months ago
[Feature]: Add drop caps support
4F2E4A2E opened this issue about 2 months ago
ocrmypdf isn't installing on termux
eelalzep opened this issue about 2 months ago
[Bug]: HOCRResult.from_json() not unpickling correctly
hoblins opened this issue about 2 months ago
[Bug]: Docker container entry point
sneakpodbob opened this issue about 2 months ago
[3rdparty]: paperless-ngx
Checole opened this issue about 2 months ago
[Bug]: test_malformed_docinfo fails with spectacular INTERNALERROR
mcepl opened this issue about 2 months ago
[Feature]: Show page numbers when detecting rotation
tsoernes opened this issue about 2 months ago
[Feature]: Show page number in PriorOcrFoundError
tsoernes opened this issue about 2 months ago
[Bug]: '_idat' object has no attribute 'fileno' // No space left on device
kkduke opened this issue 2 months ago
[Bug]: Example docker-compose.yml not working anymore
ckagerer opened this issue 2 months ago
[Bug]: There was an error in an annotation | Setting Overprint Mode to 1 not permitted in PDF/A-2, overprint mode not set
tsoernes opened this issue 2 months ago
[3rdparty]: paperless-ngx PDF Fails to Process with InputFileError: PDF content stream is corrupt
singlatushar07 opened this issue 2 months ago
[Bug]: "remove-background is temporarily not implemented" error on linux
dimyself opened this issue 2 months ago
[Bug]: Unable to proceed with a custom language lacking a dictionary
vchgan opened this issue 2 months ago
[Bug]: Unpaper Not Found: "Warning: using insecure memory!"
vfilby opened this issue 2 months ago
Data privacy when using OCRmyPDF
etroci opened this issue 3 months ago
[Bug]: cannot import name 'PdfMatrix' from 'pikepdf'
kdbreck opened this issue 3 months ago
[Feature]: support for Apple vision framework
santiagozky opened this issue 3 months ago
Doc: new infix for temp files; snap temp files folder
mayeulk opened this pull request 3 months ago
[Bug]: Refuses to process old book with existing OCR
themaster567 opened this issue 3 months ago
[Bug]: File generated by OCRmyPDF doesn't open in all PDF editors
sklart opened this issue 3 months ago
[Bug]: Highlights/annotations repeated on all pages
Jmuccigr opened this issue 3 months ago
[Bug]: pikepdf cropbox/mediabox/trimbox as list can return strings in the list
jozuas opened this issue 3 months ago
[Bug]: Cannot create a file when that file already exists
user1823 opened this issue 4 months ago
[Bug]: Tesseract fails on Alpine 3.20.3
pschichtel opened this issue 4 months ago
[Feature]: Align pages to text baseline
swxxii opened this issue 4 months ago
How to remove the image-with-text from the PDF
SurinameClubcard opened this issue 4 months ago
Bump sigstore/gh-action-sigstore-python from 2.1.1 to 3.0.0
dependabot[bot] opened this pull request 4 months ago
When the PDF input to ocrmypdf is in Chinese, text copied from the resulting PDF contains extra spaces
deict opened this issue 4 months ago
[3rdparty]: When the PDF input to ocrmypdf is in Chinese, text copied from the resulting PDF contains extra spaces
deict opened this issue 4 months ago
[Feature]: Add a flag to enable ocrmypdf to write "last-modified attribute" to the OCR'ed file.
ashrockd opened this issue 4 months ago
[Feature]: decrypt file if qpdf is installed (EncryptedPdfError: Input PDF is encrypted. The encryption must be removed to perform OCR.)
JoKalliauer opened this issue 4 months ago
[Bug]: "AttributeError: module 'numpy.typing' has no attribute 'NDArray'" after Homebrew installation
tillboehringer opened this issue 4 months ago
Recommended way of running ocrmypdf with memory limits
andersfylling opened this issue 4 months ago
Add mdate preservation
ferdiga opened this pull request 4 months ago
Fix broken test_rotate_page_level
QuLogic opened this pull request 5 months ago
[Bug]: Scan time regression in 16.4.3 with `--redo-ocr`
aliemjay opened this issue 5 months ago
[Bug/Feature]: a way to disable Ghostscript requirement & broken plugin_manager option
nikitar opened this issue 5 months ago
[Bug]: Scan time increases quadratically with page count
aliemjay opened this issue 5 months ago
[Bug]: Regression in 16.4
gringus opened this issue 5 months ago
[Bug]: NotImplementedError in colorspace
macdeport opened this issue 5 months ago
[Bug]: ocrmypdf: error: unrecognized arguments: input.pdf output.pdf
KNDaniel opened this issue 5 months ago
[Feature]: Result Improvement with OpenCV + Pillow Preprocessing
vishaldwdi opened this issue 5 months ago
does not ocr 90° rotated texts
stfnx opened this issue 5 months ago
[Bug]: Output file is okay but is not PDF/A
tcurdt opened this issue 5 months ago
[Query]: docker watched folder environment variables, optimize how?
jaxjexjox opened this issue 5 months ago
[Bug]: Large file size increases due to PDF/A font substitution
ferdiga opened this issue 5 months ago
[Bug]: maximum recursion depth exceeded
you-healthtap opened this issue 5 months ago
[Bug]: The generated PDF is INVALID
user1823 opened this issue 5 months ago
[Bug]: Output PDF is too large
user1823 opened this issue 5 months ago
[Bug]: The width is not correct for detected words
you-healthtap opened this issue 5 months ago
[Bug]: cannot add non-opaque RGBA color to RGB palette
jozuas opened this issue 5 months ago
[Bug]: subprocess.CalledProcessError: Command '['D:\\latex\\texlive\\2020\\bin\\win32\\jbig2.EXE', '--version']' returned non-zero exit status 3.
459737087 opened this issue 5 months ago
[Bug]: Ghostscript rasterizing failed
user1823 opened this issue 5 months ago
[Bug]: pdfminer.pdfexceptions.PDFTypeError: invalid length: 6
user1823 opened this issue 5 months ago
ocrmypdf produces wrong page size
femifrak opened this issue 5 months ago
[Bug]: with the latest version of Ghostscript 10.03.1, ocrmypdf is passing file names to Ghostscript in the wrong order
alan-sandollar opened this issue 6 months ago
[Bug]: FileNotFoundError: [Errno 2] No such file or directory: 'gs'
459737087 opened this issue 6 months ago
Update installation.rst "python -m venv .venv"
JoKalliauer opened this pull request 6 months ago
Add '--needed' flag to arch base-devel install command
mersenne-twister opened this pull request 6 months ago
--sidecar writes text content and messages to file
gerritgriebel opened this issue 6 months ago
[Bug]: files signed with a-trust are not recognised as digitally signed and hence processed
ferdiga opened this issue 6 months ago
[Bug]: Ghostscript rasterizing failed
JoKalliauer opened this issue 6 months ago
[Bug]: KeyError: '/Subtype'
user1823 opened this issue 6 months ago
[Bug]: Ghostscript can't create a PDF/A-file (Page object was reserved for an Annotation destination)
JoKalliauer opened this issue 6 months ago
[Bug]: problem with tif "DPI is not credible". Estimate dpi
drnicolas opened this issue 6 months ago
[Bug]: OSError: [Errno 28] No space left on device
Salvodif opened this issue 6 months ago
Output file images are corrupted
robmclear opened this issue 6 months ago
[Bug]: doesn't always parse Latin with diacritics
arsinclair opened this issue 6 months ago
[Feature]: Enable execution on GPU
danielfcastro opened this issue 6 months ago
[Request]: Please make rich logging library an optional dependency
lucasgadams opened this issue 6 months ago
[Bug]: Existing text is completely replaced with other characters
david-sledge opened this issue 7 months ago
[Bug]: ocrmypdf (16.3.1) and Tesseract 5.4.1
Johnnie390 opened this issue 7 months ago
[Bug]: `lots of diacritics - possibly poor OCR` but using standalone tesseract works perfectly
KAGEYAM4 opened this issue 7 months ago
[Bug]: No errors and no output for large DPI files
dan-ryan opened this issue 7 months ago
[Bug]: MetadataProgress does not respect progress_bar=False argument
DavidMChan opened this issue 7 months ago
[Bug]: Paperless-ngx Release 2.9.0 Ghostscript rasterizing failed
Johnnie390 opened this issue 7 months ago
[Feature]: Alternative AI OCR "surya" as opposed to EasyOCR, Just found it today and it dominated the accuracy and speed of Tesseract & EasyOCR
abclution opened this issue 7 months ago
[Bug]: ocrmypdf 16.3.1 fails on a file on Arch that 13.4.0 on Ubuntu handles well
Fifis opened this issue 7 months ago
[Bug]: crashes with tesseract 5.4.0
mplx opened this issue 7 months ago
Update docker.rst
omidraha opened this pull request 7 months ago
Incorrect behavior of text color setting in hocrtransform
ep0p opened this issue 7 months ago