Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/ocrmypdf/OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
https://github.com/ocrmypdf/OCRmyPDF
[Feature]: Not to save images and opaque text
edwvee opened this issue about 1 year ago
[Bug]: ocrmypdf invoked oom-killer
munzirtaha opened this issue about 1 year ago
[Feature]: More details for exception ColorConversionNeededError
noseshimself opened this issue about 1 year ago
[Bug]: Progress Bar is missing when running in Google Colab
Warborn123 opened this issue about 1 year ago
[Bug]: Unable to install GhostScript using Winget
xd003 opened this issue about 1 year ago
[Bug]: No module named 'lxml'
tcurdt opened this issue about 1 year ago
[Bug]: ImportError: cannot import name 'PDFTextSeq' from 'pdfminer.pdfdevice'
MimoGraphix opened this issue about 1 year ago
[Bug]: incorrect file mode
svenha opened this issue about 1 year ago
[Bug]: InputFileError
JoKalliauer opened this issue about 1 year ago
[Bug]: sandwich renders differently than hocr
femifrak opened this issue about 1 year ago
[Bug]: Centos7 Install OCRmyPdf run command failed [Help]
huxinghai opened this issue about 1 year ago
[Bug]: User warning: missing specialized decoders (probably JBIG2)
femifrak opened this issue about 1 year ago
[Feature]: enable use of Ghostscript glyph-level Unicode map generation
jbarlow83 opened this issue about 1 year ago
[Bug]: "Adobe Acrobat Reader" isn't able to open outputfile any more
JoKalliauer opened this issue about 1 year ago
[Feature]: Support Azure Recognition Service
hcoona opened this issue about 1 year ago
[Bug]: OCR_JSON_SETTINGS does not accept JSON string anymore
JohnDoe2991 opened this issue about 1 year ago
[Bug]: [WinError 2] The system cannot find the file specified
MvCast opened this issue about 1 year ago
[Bug]: watcher.py: execute_ocrmypdf() takes 0 positional arguments but 1 positional argument were given
Major2828 opened this issue about 1 year ago
Does OCRmyPDF break macOS "Live Text" feature?
rennefJ opened this issue about 1 year ago
TrimBox and CropBox not retained when "force OCR" is used
jbarlow83 opened this issue about 1 year ago
[Question]: How to pass --skip-text to watcher.py in docker container?
dolorosus opened this issue about 1 year ago
[Bug]: "inplace" + --skip-text on PDF with only text modifies / outputs a file
jrz opened this issue about 1 year ago
Is building OCRmyPDF and it's dependencies in a docker environment less performant?
wadeflash12 opened this issue about 1 year ago
[Bug]: --version gives 0.0.0 on Ubuntu snap
pseudomonas opened this issue about 1 year ago
How can I pass tesseract argument " -c preserve_interword_spaces 1" ?
languagemaniac opened this issue about 1 year ago
[Bug]: Ghostscript process fails when running OCRmyPDF on all PDFs
eoinosullivan opened this issue about 1 year ago
Correct the archive dir name in `Watched folders with Docker`
mflagg2814 opened this pull request about 1 year ago
Handle OSError exceptions in watcher.py
mflagg2814 opened this pull request about 1 year ago
[Bug]: PDF expands to 1.5G from 14M
alephpiece opened this issue about 1 year ago
[Bug] [Help Needed] Command "ocrmypdf" not found [Windows 11]
Alaiya opened this issue about 1 year ago
[Bug]: ocrmypdf v15.1.0+git8.2b0e1498 (snap): GPL Ghostscript 9.55.0: Can't find initialization file gs_init.ps. ghostscript.py:118
zWhdmB5T opened this issue about 1 year ago
How can I remove extra space between every characters
hhiyorimi opened this issue about 1 year ago
[Bug]: Some PDFs are blank in macOS Safari and Preview
rezafouladian opened this issue about 1 year ago
Delete or prune StructTreeRoot for --force-ocr/--redo-ocr/--skip-text and post warnings
jbarlow83 opened this issue about 1 year ago
[Bug]: Error: /typecheck in --runpdf--
muramasatheninja opened this issue about 1 year ago
Add custom deskew and page rotation logic before OCR
wadeflash12 opened this issue over 1 year ago
[Bug]: MissingDependencyError: tesseract on Heroku despite setting environment variables
troublesprouter opened this issue over 1 year ago
[Feature]: Manually correcting OCR errors
tslivnik opened this issue over 1 year ago
OCR-Generated Text Layers Not Readable by PDF Readers for RTL Languages Like Persian
PSEUDO-SAPPHO opened this issue over 1 year ago
[Bug]: Always get "FileNoFoundError on input fiel
drnicolas opened this issue over 1 year ago
[Feature]: Add parameter to ignore "Invalid rotation" errors from img2pdf
iohann95 opened this issue over 1 year ago
[Bug]: --remove-background it is showing an error
wellingto198 opened this issue over 1 year ago
Bump docker/setup-buildx-action from 2 to 3
dependabot[bot] opened this pull request over 1 year ago
Bump docker/login-action from 2 to 3
dependabot[bot] opened this pull request over 1 year ago
Bump docker/setup-qemu-action from 2 to 3
dependabot[bot] opened this pull request over 1 year ago
[Bug]: JBIG2 corruption of scanned pages & some pages overwriting other pages
gwern opened this issue over 1 year ago
Bump actions/checkout from 3 to 4
dependabot[bot] opened this pull request over 1 year ago
[Feature]: language translation
vdun opened this issue over 1 year ago
[Bug]: AttributeError: '_idat' object has no attribute 'fileno'
875d opened this issue over 1 year ago
Change skip-ocr to skip-text for fish completion
ss8931 opened this pull request over 1 year ago
[Bug]: JBIG2 - 2 colors
zvezdochiot opened this issue over 1 year ago
[Bug]: `ocrmypdf: error: unrecognized arguments: output.pdf` when running the docker image through the docker java client
lumalav opened this issue over 1 year ago
[Feature]: Make Ghostscript Colour Conversion Configurable
marcules opened this issue over 1 year ago
[Bug]: It's not a bug. Please tell me what's wrong.
zhengqianmaifang opened this issue over 1 year ago
[Bug]: No error, but 0-byte PDF produced
matt-cassinelli opened this issue over 1 year ago
[Bug]: PIL.Image.DecompressionBombError: Image size (434275956 pixels) exceeds limit of 256000000 pixels, could be decompression bomb DOS attack.
asadfellowpro opened this issue over 1 year ago
[Bug]: SubprocessOutputError when scanning a specific PDF
AgustinOrdonez opened this issue over 1 year ago
[Bug]: `--jpeg-quality` does nothing useful and is extremely confusing
Atemu opened this issue over 1 year ago
Complete train wreck of a PDF, trying to OCR rotated.
pinballelectronica opened this issue over 1 year ago
[Bug]: Docker build fails
zaphoodb opened this issue over 1 year ago
Remove Unused Dependency: Deprecation
gdrosos opened this pull request over 1 year ago
Add installation instructions for Gentoo Linux to README.md
fonic opened this pull request over 1 year ago
[Bug]: Is jbig2 encoder being called?
gtusr opened this issue over 1 year ago
Enables creation of a release and uploading the build assets to it
stumpylog opened this pull request over 1 year ago
OCRmyPDF appends a space to each text element at the end of the line
gowallasnewpony opened this issue over 1 year ago
[Bug]: When image quality is low or for skewed pdf's Ocrmypdf is not Ocring proper text. Missing some paragraphs.
goginenir6 opened this issue over 1 year ago
[Feature]: Add Gentoo Linux to section 'Installation' of README.md
fonic opened this issue over 1 year ago
[Feature]: Distribute on Scoop?
ShadowCreator250 opened this issue over 1 year ago
[Feature]: Switch to remove images?
pinballelectronica opened this issue over 1 year ago
[Feature]: Remove images of text recognized
Kimi-Arthur opened this issue over 1 year ago
[Feature]: test
jbarlow83 opened this issue over 1 year ago
[Bug]: `pdfa-image-compression=auto` behaviour violates the principle of least surprise w.r.t. lossy/lossless optimisations
Atemu opened this issue over 1 year ago
does ocrmypdf create an invisible text layer?
lbr991 opened this issue over 1 year ago
Confused about --unpaper-args
al1coch opened this issue over 1 year ago
[Feature]: Parameter to automatically remove blank pages
GrabbenD opened this issue over 1 year ago
orcmypdf not working in HTML/browser
Prabal1902 opened this issue over 1 year ago
[Bug]: Can not transfer image into editable text in pdf
ericosmic opened this issue over 1 year ago
[Bug]: PDF/A-3B files generated with a widely used commercial encoder generate garbage OCR content
jce-zz opened this issue over 1 year ago
Allow title, subject, author, and keywords to be unset with an empty string argument
f-hansen opened this pull request over 1 year ago
[Bug]: Problem when OCR heavy PDFs - freezes at 0%
dariofilipe opened this issue over 1 year ago
Problem when OCR heavy PDFs - freezes at 0%
dariofilipe opened this issue over 1 year ago
do OCR if text boxs of minimum 15
pkrsreddy opened this pull request over 1 year ago
Fix randomly ordered languages from set()
abwiersma opened this pull request over 1 year ago
[Bug]: Inconsistent language order in tesseract calls
abwiersma opened this issue over 1 year ago
[Feature]: just curious/wondering about Tesseract 5 support
alejohern opened this issue over 1 year ago
[Feature]: OCR on pages with multiple text rotations
matthuszagh opened this issue over 1 year ago
Since many users cannot configure the environment, we bundled the required environment on top of OCRmyPDF and built a desktop app with Electron [Electron version of OCRmyPDF]
FanQinFred opened this issue over 1 year ago
[BUG] Frequently seeing `Syntax Error (91811): Too few (2) args to 'cm' operator`
deexpabada opened this issue over 1 year ago
Would be nice to be able to choose the temporary directory
al1coch opened this issue over 1 year ago
Support for PDF-A/4
rafaelfcmaria opened this issue over 1 year ago
OCRmyPDF not rotating the file correctly using the version 14.2.1
gilsonbergamine opened this issue over 1 year ago
[BUG] 'DecompressionBombError' on a ACM PDF - need resolution limit on high DPI
gwern opened this issue over 1 year ago
[BUG] Bold font in PDF is replaced by black bars
tobox opened this issue over 1 year ago
[BUG] ghostscript fails due to small resolution value
neurolabs opened this issue over 1 year ago
How to get the deskew angle
GoN49 opened this issue over 1 year ago
Replace text from original PDF with OCR'd Text
FrancisBaileyH opened this issue over 1 year ago
Unknown .defaultpapersize: (A4). / Unrecoverable error: rangecheck in .putdeviceprops / SubprocessOutputError: Ghostscript PDF/A rendering failed
klartext opened this issue over 1 year ago
Remove image layer after OCR?
Frooodle opened this issue over 1 year ago
WSL support
pinballelectronica opened this issue over 1 year ago
[BUG] AttributeError: module 'PIL.Image' has no attribute 'Resampling' on running script
acarl123 opened this issue over 1 year ago