Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/ocrmypdf/OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
https://github.com/ocrmypdf/OCRmyPDF
[Bug]: --tesseract-pagesegmode is not sufficiently documented
thomas2net opened this issue 7 months ago
Error occurred while consuming document out1.pdf: SubprocessOutputError: Ghostscript rasterizing failed.
dekoenpi opened this issue 7 months ago
[Bug]: OCR not complete. Parts of all pages are ignored
0lm opened this issue 7 months ago
[Bug]: multiple spaces not supported for delimitation of bbox parameters
Tehgg opened this issue 7 months ago
[Bug]: Flood of "Recursion depth exceeded in _find_image_xrefs_page"
user1584 opened this issue 7 months ago
[Bug]:
Firestar-Reimu opened this issue 8 months ago
Pushed docker image is always Ubuntu instead of alpine
vihtap opened this issue 8 months ago
[Bug]: test_semfree fails with ghostscript 10.03.0+
gringus opened this issue 8 months ago
[Bug]: NotImplementedError: not sure how to get colorspace
macdeport opened this issue 8 months ago
[Feature]: If page has text, force OCR and rasterize page
mikejokic opened this issue 8 months ago
Show progress during postprocessing
user1823 opened this issue 8 months ago
[Bug]: Crash on multiple .pdf files
olafure opened this issue 8 months ago
Indian Numbers on Arabic text
MedoHamdani opened this issue 8 months ago
Make usage of --rotate-pages-threshold clearer
stegl83 opened this issue 8 months ago
[Bug]: cannot import name 'PDFTextSeq' from 'pdfminer.pdfdevice'
user1823 opened this issue 8 months ago
[Bug]: No longer works - macos-11.7 x86_64 Python 3.10
atanasj opened this issue 8 months ago
[Bug]: File size increased
user1823 opened this issue 8 months ago
[Bug]: conda installation
kevinkaw opened this issue 8 months ago
[Bug]: ValueError: ObjectList must have 6 elements
macdeport opened this issue 8 months ago
not user friendly
abood-az opened this issue 8 months ago
[Feature]: JPEG XL support
Lyapsus opened this issue 8 months ago
Fix wrong env var for GS path in Snap
helkaluin opened this pull request 8 months ago
[Feature]: Change demo format to VHS
jbarlow83 opened this issue 9 months ago
[Bug]: real text replaced by � � (visually unchanged, only by copying)
JoKalliauer opened this issue 9 months ago
Adding language install docs for archlinux
ahmedsbytes opened this pull request 9 months ago
Release notes don't include the latest versions
user1823 opened this issue 9 months ago
[Bug]: watcher.py requires the "ARCHIVE" folder to be assigned, even if the option is disabled
clodobox opened this issue 9 months ago
[Bug]: Warning: "xref 473: While extracting this image, an error occurred"
macdeport opened this issue 9 months ago
[Bug]: Memory Error
user1823 opened this issue 9 months ago
[Bug]: DecompressionBombWarning
user1823 opened this issue 9 months ago
Update the typer[all] dependency to typer-slim[standard]
musicinmybrain opened this pull request 9 months ago
added Macports install information
akierig opened this pull request 9 months ago
[Feature]: Could watcher.py be enhanced to support the conversion of single or multi TIF and JPG files to PDF?
EvilQoo opened this issue 9 months ago
max_workers must be greater than 0
nope999 opened this issue 9 months ago
[Feature]: Choose between NFKC and NFC normalization for Unicode characters so copy-pasting works
sfllaw opened this issue 10 months ago
[Bug] SubprocessOutputError
user1823 opened this issue 10 months ago
Allow resuming OCR after DecompressionBombError
user1823 opened this issue 10 months ago
[Bug]: The file size increases significantly by OCR even without image recompression
ybeltukov opened this issue 10 months ago
batch example: added archive, small corrections and optimizations
NilsRo opened this pull request 10 months ago
Fix Broken Documentation Links
danloveg opened this pull request 10 months ago
Recommended settings for dealing with text superimposed on clipart?
MBYlt opened this issue 10 months ago
[Bug]: Missing support for certain unicode characters
vera-bernhard opened this issue 10 months ago
[Bug]: AttributeError: 'NoneType' object has no attribute 'get'
nikitar opened this issue 10 months ago
[Bug]: "Corrupt JPEG data: premature end of data segment" with some files
macdeport opened this issue 10 months ago
Update Dockerfile.alpine
emielmolenaar opened this pull request 10 months ago
[Bug]: Ghostscript PDF/A rendering failed
davide125 opened this issue 10 months ago
[Bug]: dpi-problem with rasterizing text
JoKalliauer opened this issue 10 months ago
[Bug]: OCRmyPDF Docker Hot Folder Option OCR_ON_SUCCESS_ARCHIVE OCR_ON_SUCCESS_DELETE doesnt work
mazi19 opened this issue 10 months ago
Error: jbig2 not found on path, even though installed
anaxonda opened this issue 11 months ago
[Bug]: OCRmyPDF succeeded with warning(s): InputFileError: pdfminer could not process page 0
Markoise opened this issue 11 months ago
Fix entrypoint for docker commands
SirRegion opened this pull request 11 months ago
[Bug]: version confusion
branko623 opened this issue 11 months ago
[Bug]: Watcher doesnt notice changes after update
Major2828 opened this issue 11 months ago
Handle PermissionError when finding tools
grembo opened this pull request 11 months ago
Trying to debug OCR_ON_SUCCESS_DELETE flag not being executed - add exit code to watcher.py?
wabarkley opened this issue 11 months ago
PDF-A produces lossy result
YutMarma opened this issue 11 months ago
[Feature]: Support RapidOCR engine
saccohuo opened this issue 11 months ago
[Feature]: sidecar Support Text Output to io.StringIO()
MAbdElRaouf opened this issue 11 months ago
[Bug]: OCRmyPDF not adding any text to document v 1.4
maxi07 opened this issue 11 months ago
[Feature]: Integrations with other backends via hOcr (naive implementation of easyOcr backend inside)
coffepowered opened this issue 11 months ago
[Documentation]: Upgrade via pip after system install needs a different command
dajare opened this issue 11 months ago
Update README.md
rudolphos opened this pull request 11 months ago
Bump codecov/codecov-action from 3 to 4
dependabot[bot] opened this pull request 11 months ago
[Feature]: convert grayscale PDF to jbig monochrome while doing OCR
callegar opened this issue 11 months ago
[Bug]: installation failed due to ghostcript in-compatible version and can not upgraded ghostscript in Ubuntu 20.04
rohan-paul opened this issue 11 months ago
[Bug]: OCR on .pdf isn't the same as tesseract but the format is correct on .txt file
matsumurae opened this issue 11 months ago
[Feature]: Add support for docTR as alternate OCR backend?
victorhooi opened this issue 11 months ago
[Bug]: Unknown tesseract error, returns non-zero
nepomuc opened this issue 11 months ago
[Bug]: Memory access error if using a German terminal
Pete1976 opened this issue 11 months ago
Doc suggestion: also great for just removing the text layer!
hmijail opened this issue 12 months ago
[Feature]: More Accessible Via Consistently connecting words to form sentences.
PiggiesGoSqueal opened this issue 12 months ago
[Feature]: Explain on the docs how to change the language of OCR on watcher.py
iohann95 opened this issue 12 months ago
[Bug]: Conda - pikepdf is unavailable
kielbowicz opened this issue 12 months ago
[Bug]: 'File not found' error in latest versions
templeman opened this issue 12 months ago
Add autotools automake libtool and leptonica requirements
maxi07 opened this pull request 12 months ago
Minor english correction in Docs
Sapkotaanish opened this pull request 12 months ago
Update gs dependency & instructions for RHEL
nisbet-hubbard opened this pull request about 1 year ago
[Bug]: Bunch of incomprehensible OCR content to delete
nicolas-75 opened this issue about 1 year ago
[Feature]: Only optimise file, skip OCR completely
Atrate opened this issue about 1 year ago
[Bug]: RHEL 9 requires ghostscript 9.54 to work
nisbet-hubbard opened this issue about 1 year ago
[Bug]: PDF graphics stack overflowed spec limit
Gedankenleser opened this issue about 1 year ago
fixed a spelling mistake
Anthony-Nabil opened this pull request about 1 year ago
Thank you!
zWhdmB5T opened this issue about 1 year ago
[Bug]: OCRmyPDF does not preserve existing XMP metadata
jkorinth opened this issue about 1 year ago
[Bug]: Package 'pngquant' not found, exists on PATH
dumoulinalex opened this issue about 1 year ago
[Feature]: Are tesseract scripts supported?
eightfiftytwo opened this issue about 1 year ago
allow resolution over ride that might improve text recognition etc
john-peterson opened this pull request about 1 year ago
[Bug]: Every PDF I OCR has the text misaligned with the image
advert665 opened this issue about 1 year ago
[Bug]: Every PDF I OCR has the text misaligned with the image.
advert665 opened this issue about 1 year ago
[Bug]: Persian rendering and text positioning errors in 16.0.1 with new renderer
Rosti2022 opened this issue about 1 year ago
[Bug]: OCRmyPDF does not preserve existing XMP metadata
jkorinth opened this issue about 1 year ago
[Bug]: OCR gets stuck
philmas opened this issue about 1 year ago
[Bug]: NotImplementedError
gdandersson opened this issue about 1 year ago
I scan the documents with my Brother MFC-L8690CDW and it worked until v15.4.4
knabed opened this issue about 1 year ago
[Bug]: complete letter salad
knabed opened this issue about 1 year ago
Bump actions/upload-artifact from 3 to 4
dependabot[bot] opened this pull request about 1 year ago
Bump actions/download-artifact from 3 to 4
dependabot[bot] opened this pull request about 1 year ago
Fix performance advice to match --fast-web-view documentation
Androbin opened this pull request about 1 year ago
Bump actions/setup-python from 4 to 5
dependabot[bot] opened this pull request about 1 year ago
[Bug]: Accented characters not correct in PDF/A output
stumpylog opened this issue about 1 year ago