Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ocrmypdf/OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
https://github.com/ocrmypdf/OCRmyPDF

subprocess: use more mypy-friendly syntax

5172dbde8d0a4140e694ee8d7b2d051fc60213b7 authored about 4 years ago
api: parse cmdline in more type friendly way

0b7e52fb5eab704f57dc6cc57f5a443c32a10d3e authored about 4 years ago
hocrtransform: fix exception if no div ocr_page object

997bf7578d8a3bf2f61bc9ec8a3dd34ade669a8e authored about 4 years ago
Declare ocrmypdf as typed

a5feef07d0821f6feea04ebf6c1cfcee5bc6a58b authored about 4 years ago
hocrtransform: trivial typing

043258242cce31a9ab46a9ad987872b81347b571 authored about 4 years ago
Change prefix of temporary folders

Shouldn't really use a name that suggests a connection to GitHub.

f11bb53e61d539c2427ba81a6a4df567c6a55aee authored about 4 years ago
Add feature to generate hocr-pdf with visible debug text

68a57a7839f866a0c57ac763edfaa65ee7a4d14a authored about 4 years ago
Begin next release notes

4194430dc16b58b7c89570b5e2b5a282b3660667 authored about 4 years ago
windows: look in registry for Tesseract and Ghostscript

3cba50bfbd6f63b29ab9592f8da855586444598b authored about 4 years ago
docs: improve windows instructions

a707c56fae2f86141a2b71d3d3d7f08fbfd75f52 authored about 4 years ago
completions: consider *.PDF and some images too

ed5e17d0a40c2429da9f694eb011adc41a9c56c6 authored about 4 years ago
Decouple tqdm from progressbar setup

ce0e0ecd4d8018d361d3eac33cab51e0c959cf7b authored about 4 years ago
ghostscript: add output tracing

7e1223c12c295152ddb245e8e51a870f1f5659f0 authored about 4 years ago
subprocess: refactor and add run_polling_stderr

b83d7f6d1aa9c9cc27697a1e05e7d9573aefda84 authored about 4 years ago
tesseract: fix run call with logs_errors_to_stdout

80e957908a3ebb8f627a8a0e6949e801f348f83c authored about 4 years ago
docs: remove redundant statement

f0e7bea8ba7b579592342c6c95d0c28011650a86 authored about 4 years ago
docs: remove description of how OMP_THREAD_LIMIT is managed

0cdb9bd04a5ba11bd6b2bfb3c08b695773450c17 authored about 4 years ago
v11.3.4 release notes

8224d89bc6a225fcefae2f1f039d462a0fe37bd2 authored about 4 years ago
v11.3.4 release notes

a2bbbe2a26421fcf4c1326be2e084811fbfc55bd authored about 4 years ago
check_pdf: document how we handle linearization

43f41863fa55a4708815552cbcd76e8bdad36983 authored about 4 years ago
Fix "readLinearizationData for file that is not linearized"

pikepdf 2.1.0 throws wrong type of exception in this case, so special-case it.

Closes #680
Clos...

d71e50e83d95892b111cded61eea380deb83901b authored about 4 years ago
ghostscript: better docs and comments

1f598da3c168c6dcfd39318b1082ab883d23abbb authored about 4 years ago
watcher: include uppercase .PDF too

d0cdbd5e1c5cda68ba6dac11effc03449d00cb5a authored over 4 years ago
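
Matching both `.pdf` and `.PDF` usually comes down to comparing the file suffix case-insensitively. The sketch below shows that general idea only; `looks_like_pdf` is a hypothetical helper and is not taken from the OCRmyPDF watcher script.

```python
from pathlib import PurePath


def looks_like_pdf(filename: str) -> bool:
    """Return True if the name ends in .pdf, ignoring case (hypothetical helper)."""
    return PurePath(filename).suffix.lower() == ".pdf"


# Example file names, invented for illustration.
names = ["scan.pdf", "SCAN.PDF", "notes.txt"]
matches = [name for name in names if looks_like_pdf(name)]
```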
unpaper: type hints

5c56f6120923547b35947601f085357499dc297c authored over 4 years ago
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF

9bec85470af8ef12777703f07a9c0007231c0c80 authored over 4 years ago
docs: fix link to docker image

a03863a17d449eee7843ffbd77663213a98575aa authored over 4 years ago
docs: fix csv-table errors

22cd9b236485e3004f40fdaaac81c919f5de8431 authored over 4 years ago
fix typo "charcter" -> "character" (#673)

4fc7d6d93e75eda67258b7a4f3f839050552e8bf authored over 4 years ago
v11.3.3 release notes

71f0e7f545f754cdc37c0551be703c98dbd05abf authored over 4 years ago
Replace most uses of universal_newlines with text

The parameters are equivalent but the latter is better named. Since
Python 3.6 doesn't support t...

895fddd85e449273709417a14677ec68d119f09a authored over 4 years ago
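
The equivalence this commit message describes can be shown with a small sketch. `text` was added to `subprocess.run` in Python 3.7 as an alias for `universal_newlines`. The child command below is purely illustrative and is not taken from OCRmyPDF.

```python
import subprocess
import sys

# Illustrative child process: the current interpreter printing one line.
cmd = [sys.executable, "-c", "print('hello')"]

# Older spelling: universal_newlines=True decodes the output to str.
old_style = subprocess.run(
    cmd, stdout=subprocess.PIPE, universal_newlines=True, check=True
)

# Newer spelling (Python 3.7+): text=True has the same effect.
new_style = subprocess.run(cmd, stdout=subprocess.PIPE, text=True, check=True)

# Both calls return decoded str output rather than bytes.
same_output = old_style.stdout == new_style.stdout
```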
unpaper: don't use universal_newlines=True

There's no specific reason to do this. We can log binary output equally
well.

5a59e4d5432e40537846c3bfb1f81f8aed9e3598 authored over 4 years ago
azure: Fix indentation mistake

b51abf22495aa0710f66b05d33b69958713b4022 authored over 4 years ago
api: rework ocr() slightly to simplify variable handling

6d3f9ff15af22cc927f4eb9754bf6d3e52482f79 authored over 4 years ago
docs: more details about macOS API changes

Due to the fork -> spawn change in the default multiprocessing start method.

5d1d1a712be22872adfee274de6832871a271195 authored over 4 years ago
docs: show ifmain guard in example

6d5f8133e0fbc513b853eb65203ba98b1ac04f2f authored over 4 years ago
ci: Extend test matrix to Python 3.9

13018d3d5c5dc049110b753c6bcc2b803e39d4f3 authored over 4 years ago
Fix pinned dependencies

14a85f9473eff26abdea3b756c42846694d8d338 authored over 4 years ago
v11.3.2 release notes (2)

Since we never tagged it, fix other things.

d22a1b3367566258b0d4cb79d7075d9a9d7c59c2 authored over 4 years ago
ghostscript: don't repeat log in debug

Subprocess already does this for us.

b913e5dfefa6e5be17857edd49abab6946de0bf9 authored over 4 years ago
Fix log domain names

ocrmypdf.subprocess.subprocess.ghostscript -> ocrmypdf.subprocess.ghostscript

dd8a5a4c7230552219885955b2ca62d158b886f9 authored over 4 years ago
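
Logger names matter because Python's `logging` module builds a hierarchy from the dotted name. The sketch below uses the corrected name from the commit message and shows only standard-library behaviour; it makes no claim about how OCRmyPDF constructs its loggers.

```python
import logging

# Create the package-level logger first, then the more specific one.
package = logging.getLogger("ocrmypdf.subprocess")
ghostscript = logging.getLogger("ocrmypdf.subprocess.ghostscript")

# Dotted names form a hierarchy: the package logger is the direct parent,
# so handlers and levels configured on it also apply to the child.
is_child = ghostscript.parent is package
```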
Remove extraneous page rotation

This was added in commit b5ccbfd but seems to have been ill-advised.

36e9a54f02905788899916609fd6ccd488cc841e authored over 4 years ago
Change pdf.root to pdf.Root

3707af3b74d64bb56eab5f5ac88655c1a259e429 authored over 4 years ago
unpaper: round off DPI

ced7ad9164e195fedb7609b0771c2cf2eba5ac5e authored over 4 years ago
Fix UnboundLocalError when considering ImageMasks for optimization

Uncovered by test file in issue 667, although unrelated to that issue.

54bbbfdeb3b6853f7aaa67082f73f6dbfcc9d00d authored over 4 years ago
Some Python 3.9 fixes

7f73a6ed1ee9b2a4c32854db8cd243709dbaec8c authored over 4 years ago
Fix pre-commit for Py3.9

dce206d3dc01c2a7f74912dd3ab317f51df46c77 authored over 4 years ago
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF

9304c856cf57462832e16a899163833efbf3986e authored over 4 years ago
v11.3.2 release notes

e5df98cbdfc55bc72b0b8b669940cbdf03ef8136 authored over 4 years ago
api: improve typing

19bf3aeb00c41392853f71c05ff0297b446bb7ae authored over 4 years ago
unpaper: fix process output handling

With the ocrmypdf.subprocess wrapper, logging the output here
is redundant and loses the page nu...

e86be0031c6b41f38390dbdbdcfa78389561ef1e authored over 4 years ago
unpaper: use pnm instead of png

Some users reported problems with PNG recently; try PNM.

Fixes #665
Fixes #667

6425977998f257b23bfe559efb7b19c3231ff854 authored over 4 years ago
subprocess: support programs that write their messages to stdout

d57df2d980e4531b3f73d88839d22be28d6d3d61 authored over 4 years ago
Document configure_debug_logging

664d0c7969d6b7e53fd554c7c342ee6a10547b43 authored over 4 years ago
Fix typo in API documentation

a354663ee172706374979a5b8cfc57df9f66b4a5 authored over 4 years ago
Add macOS brew language support (#615)

Note `brew` command for installing additional languages on macOS.

b21b048ec4f68067de41aa2aa266c57ef9c2fc53 authored over 4 years ago
v11.3.1 release notes

709c65b41aa21ba6d00b4c4cdc0cadff7ffd4b0f authored over 4 years ago
Endorse pdfminer.six 20201018

67f99c5bb730805b60bc5cc46af875af44b6926e authored over 4 years ago
Fix warning about --pdfa-image-compression argument at wrong times

Closes #663

d55e673d9cbcdb714cbf385820d265df4aaaef79 authored over 4 years ago
Endorse pikepdf 2.x

21b90d2d147fe570aab5416a92ed85d998ee0128 authored over 4 years ago
Use % for percentage in string format (#643)

2def7e3392874d5249823dc948eca266d5310fbf authored over 4 years ago
v11.3.0 release notes

b0dcaa7512ede6dd1130fd88ba21d8acfb364b15 authored over 4 years ago
Add test to confirm rasterize_pdf_page rotates correctly

e8285b1d10446cf026e08ca840536d1e698e155a authored over 4 years ago
Fix page rotation issue (again)

Commit 1327ab3 introduced a fix for a regression, which was reported
in #581, #634. It appears t...

5ba56adb5338db331d1356ddd69759a4e791871e authored over 4 years ago
setup: Version pluggy better

ca735278e02f06d8e5ea731924f20667bb1f0464 authored over 4 years ago
Fix hookspec of rasterize_pdf_page to remove default parameters

b5ccbfdf25188e166f0aecae00045b327e341151 authored over 4 years ago
Fix debug log messages being suppressed from child processes

8c35d6e6e41600cd8852352853dafdb254fc78b5 authored over 4 years ago
Ensure worker_pdf is closed after gathering info in a thread

This is hacky, uses global state, but it does improve the situation for now.

d1e0c81edab4744d975811c87413537e046de663 authored over 4 years ago
Only create debug.log when running from command line

When used as a library ocrmypdf shouldn't make policy decisions, like where to
put a log file. U...

10c8e4f8b405dcb84269cbe0af54dbf4fb6e610d authored over 4 years ago
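
The principle in this commit message, that a library should not decide where a log file goes, matches a common `logging` pattern: the library attaches only a `NullHandler`, and the command line front end opts in to a file. The sketch below assumes a hypothetical package name `mylib` and a hypothetical helper `configure_cli_logging`; it is not OCRmyPDF's implementation.

```python
import logging
import tempfile
from pathlib import Path

# Library side: emit records, but install no handler that writes anywhere.
log = logging.getLogger("mylib")  # hypothetical package name
log.addHandler(logging.NullHandler())


def do_work() -> None:
    log.debug("working")


def configure_cli_logging(log_path: Path) -> logging.Handler:
    """Application side: only the command line entry point chooses a file."""
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setLevel(logging.DEBUG)
    log.setLevel(logging.DEBUG)
    log.addHandler(handler)
    return handler


# Used as a library: nothing is written to disk.
do_work()

# Used from a command line front end: the caller opts in to a log file.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "debug.log"
    handler = configure_cli_logging(path)
    do_work()
    handler.close()
    log.removeHandler(handler)
    logged = path.read_text(encoding="utf-8")
```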
Describe "OCR" step as "Image processing" when --tesseract-timeout=0

Fixes #647

6be2242c215db0f6067d865fb7d30c3709c98d43 authored over 4 years ago
Fix inverted colors during JBIG2 optimization on paletted images

Fixes #640

204c9d6ae19851c182e31da34e41f1bcd75ea6b3 authored over 4 years ago
v11.2.0 release notes

Change v11.1.3 to v11.2.0 since it contains functional changes.

6eb393590b4124e7daab8391a4f7bacd27e75562 authored over 4 years ago
v11.1.3 release notes

07c6654057a9f05207158e74567d0343f28a3ef8 authored over 4 years ago
Fix image optimization discarding image masks and soft masks associated with PNGs

Fixes #648

4e15eb8d14e9fca921129aa5cfa160a14d758656 authored over 4 years ago
Better type checking on ocrmypdf.ocr(plugins=...)

8b01ab8ad293b93bada51ba0b3bd5f5b2099d268 authored over 4 years ago
Document the example plugin

e0a522ad506b5252e395d864f9dcb64d4b89d302 authored over 4 years ago
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF

a1a8788c5a98ddfc6a3ad2ffd7a6125492e6d4f4 authored over 4 years ago
v11.1.2 release notes

cccdc178c37223d6df5b30594c1ecdb4bc338e0e authored over 4 years ago
hOCR: write text in correct order

Fixes #642

4eacb3454f142dc8b3f590d592140a0ed5e506f7 authored over 4 years ago
docs: Add 'unpaper' optional dependency for Ubuntu 18.04 (#639)

82b8b41e80916ae70b321d69807b397f3318cbd1 authored over 4 years ago
v11.1.1 release notes

581c5020ab8fc6fe9d9225a8d201da28b1a5008c authored over 4 years ago
pngquant driver: refactor, use streams instead of temporary files

3ef8872a1ea369a9c445477f6b8bc844e5142ca7 authored over 4 years ago
Tighten unpaper-args validation to exclude . and ..

Just in case

28eec73eedd80be56adf23e666e5d99e25d4631e authored over 4 years ago
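
A minimal sketch of this kind of check, assuming the extra arguments arrive as a single string that is split shell-style. `validate_extra_args` is a hypothetical name, and the only rule shown is the one the commit title mentions (rejecting `.` and `..`); the real validation in OCRmyPDF may differ.

```python
import shlex
from typing import List


def validate_extra_args(arg_string: str) -> List[str]:
    """Split an argument string and reject '.' and '..' (hypothetical helper)."""
    args = shlex.split(arg_string)
    for arg in args:
        if arg in (".", ".."):
            raise ValueError(f"argument not allowed: {arg!r}")
    return args


# Example argument strings, invented for illustration.
accepted = validate_extra_args("--layout double")

try:
    validate_extra_args("--layout ..")
    rejected = False
except ValueError:
    rejected = True
```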
Tidy a log message

bfe4a5b329b069f8e493e80c18b4091b2b65fba2 authored over 4 years ago
Release notes typo

29097837d614bf8d9d9346023bb9df1927638269 authored over 4 years ago
Remove unpaper from macOS build

Homebrew seems to be having issues with its deps?

a40361db3c2dce069e2aa8dfbd081a584147271e authored over 4 years ago
Merge commit '9a6cd95e5fe2826d40861229aaa0431b76e302e7'

8b29e3cbaba652fd3692b9f46bae0dcdf3b29c71 authored over 4 years ago
v11.1.0 release notes

b170be120b7315b988527afb39163f8dad07dc01 authored over 4 years ago
load zlib before liblept on windows (#633)

fixes #631

9a6cd95e5fe2826d40861229aaa0431b76e302e7 authored over 4 years ago
Use img2pdf to create optimized PNG images

Fixes #629, #620

d464d3122e94205250de868e7e2f09f7cb7e9b9c authored over 4 years ago
Fix page rotation regression

Fixes #634, #581

1327ab37d4c267ccdb313052040d5547e6485ead authored over 4 years ago
Display page numbers in log messages when grafting

67553fc5c6b4e2f05cd7bb171da378c4e54dddbf authored over 4 years ago
Remove unused function log_page_orientations

306a903854a614fd21ee9c7749e66142ff22303d authored over 4 years ago
Disable pikepdf mmap

Infrequently we can reproduce this error:

terminating with uncaught exception of type std::runt...

b93cf51c0fa9b99bbbbc6203a255ca71c2b45cee authored over 4 years ago
Remove Python 3.7 from build since homebrew removed it

6b994221c615dd928c43d923bd9a41d835b79a7c authored over 4 years ago
Expand documentation of filter_page_image

8b5b02e0d8008267c9095bcdcd94e88892cf28f7 authored over 4 years ago
Extend example plugin with example of mono conversion

624df9bb23e15b2c60e2820099c65b634e4be854 authored over 4 years ago
v11.0.2 release notes

fa06ea360001ee85da62938a2b832e572c99c8ba authored over 4 years ago
metadata fixup: don't try to update original PDF's metadata with docinfo

31994258fb48dbbd689a7bbd2a839c479879577c authored over 4 years ago
Add "Postprocessing" message as a hint for long Ghostscript runs

1f15ecbca54038470d3ff4887bfbe493da7a3db1 authored over 4 years ago
Reorganize issue templates

bcf5657e5c77225f5f1e1ed5672af7e669bd3477 authored over 4 years ago