Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ocrmypdf/OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
https://github.com/ocrmypdf/OCRmyPDF

Don't generate PDF/A-1b with object streams

Acrobat insists that PDF/A-1b should not have object streams.
Other programs like veraPDF disagr...

4124889f360446bdfdab3ba0aaef1b154579c63c authored almost 4 years ago
helpers: tidy check_pdf

a23c22b0e8b373cd421f11aa6df2ce9b81ed8621 authored almost 4 years ago
pyproject: black doesn't like py39 yet

dd1f5f7215ed6cd12fa1c9065a4226befc41e1ca authored almost 4 years ago
Allow --sidecar along --pages (#735)

5e2206bae76fe4b98f353676fdb37d8092a305e4 authored almost 4 years ago
pyproject: also target py39

079ee86d4354470616051abaf07acb48d54bd1a7 authored almost 4 years ago
v11.6.2 release notes

3692868004c2c4255603b3f75b4b1d872ffd1bec authored almost 4 years ago
Fix page rotation regression

Page size fixes in commit b26749 accounted for a "kept" rotation,
but not a corrected rotati...

064f935699e9763045f89f8128d18e6ba9b26635 authored almost 4 years ago
tests: remove unreliable/incomplete test

8770fff96885dd16b2bc95e5815592bb18d7ef94 authored almost 4 years ago
v11.6.1 release notes

82de78b6b0ec35b038a9e9db560bd3e99f70d650 authored almost 4 years ago
optimize: skip images with unusually small dimensions

They're unlikely to be handled well by our recompressors. It seems
that JBIG2 cannot handle very...

2a52c6dec2ea58a7ba0a108b5dd60b14691c8d7b authored almost 4 years ago
docker-compose: fix typo

2898879be7bdf4b526c3924626ea673aa6768a56 authored almost 4 years ago
docker-compose: fix typo

18e613657cac71e7c12b0a3aa950747a181299b4 authored almost 4 years ago
Add filter_pdf_page hook

a48ca556c7f568d3245096219c0a1c34be184803 authored almost 4 years ago
Remove deprecated code

9cba738b485fec161c50768c0eaa8aacc1783995 authored about 4 years ago
Package OCR in Form XObject

Should improve results in some situations where the initial content
stream is messy or not well-...

390fdf8c05f5a07f25748add31d3775d7a7cf1fb authored about 4 years ago
Stricter parameter checking for many public functions

bccf2f423f43b61425ccdc10d3001d8e2a0c278a authored about 4 years ago
Merge branch 'feature/colorstrategy'

166de3086b333fc2625fda59ffcccb4c414bdb52 authored about 4 years ago
docs: api

206c675df68bbca06a7358b6c7a137d5f27d3032 authored about 4 years ago
Update awslambda to new pluginspec

6c8f9223e9e3e18f95ccd9cd60c77e0b6a6d86d0 authored about 4 years ago
Fix calls to hook.get_executor

85c6a974ca4ef8335d6e413ad008d959535d951a authored about 4 years ago
leptonica: tidy

dccdcfaa913db3c4f4258927a806f7eb36a8ed36 authored about 4 years ago
Add plugin for setting logging console

So that we are not tied to tqdm.

b1da09f141fe1c9af0aa6c7e3b4cc8b39b5c31ed authored about 4 years ago
optimize: rewrite JPEG optimize to avoid use of tqdm and parallelize

For some reason JPEG optimization was not done in parallel, and was
perhaps never done in parall...

42c84531e42d909b1e1a295483f32bceca7b8d37 authored about 4 years ago
optimize: Remove shim for unsupported pikepdf version

a9ad805347e48df3cd6ed53be46bc3519b46a429 authored about 4 years ago
Refactor - decouple progressbar from executor

16bda74974df2802f44f421402ed969b3f119e3b authored about 4 years ago
Refactor to eliminate global state in _concurrent

d274d88929d7d87b0e41313325ca56b1e4739b87 authored about 4 years ago
Use ColorConversionStrategy "LeaveColorUnchanged"

Faster, still produces PDF/A

327df5cbbc78ce160e877eb27bae603791fe397b authored about 4 years ago
v11.6.0 release notes

46d0632fe27f75e583e957edcc21beec63a2223e authored about 4 years ago
Delinting

ef1e7a814ec7578c81589fa5e34042f3f7a7e9ae authored about 4 years ago
docs: improve API docs

108472493762cb529b158e238432e8a1f1e44cbf authored about 4 years ago
docs: fix rst formatting error

ecb0109d79fcbfc015a3a329e113aadc8605dac3 authored about 4 years ago
Make progress pool common rather than plugin-specific

386cabff001dabfb8aa3e8644fc02ba6ce7ba083 authored about 4 years ago
lambda: move to extra_plugins folder

3bd5054634446cec9fa2152e7dc6fa66ed2c4fa8 authored about 4 years ago
lambda: more issues related to new executor semantics

Now all tests pass, except for:
-tests that check the progress bar
-tests where xdist may or may...

6a8dd65aa28ac87ef8ec38167991de29948829fb authored about 4 years ago
lambda: don't overrun number of workers needed

6083b4f0a7e71c7547136aaa91b14c0a0309e87e authored about 4 years ago
lambda: Don't be paranoid about exception marshalling

It works

1a3ce59476df8f77a9a7fb297963358833516b1d authored about 4 years ago
Temporary move into package

c6a2716cdbe85ca94cc69216ba94b5b04f29cfcf authored about 4 years ago
lambda: tidying, special casing use_threads

c395436ba30bc78d529c147bb75aaf18c741fe92 authored about 4 years ago
Operational lambda executor

8d23d0b4414ac955483d433edc568c6b7b0c7804 authored about 4 years ago
tests: fix concurrency

7bccb8c74844af7b1b7f1376a2abc3bdce7bb466 authored about 4 years ago
lambda_plugin.py: doesn't work since entry point needs to be in package

5545bae76f986f8d965c0aa0c4787c71e77d8b30 authored about 4 years ago
concurrency: lock progress pool

For API sanity and to communicate expectations. One progress pool at
a time is plenty of complex...

173c0d12740cfc9a6731a427670e1a9590effa28 authored about 4 years ago
pdfinfo: remove some messy concurrency handling

We can cut down on the use of global variables and save opening
an extra copy of the Pdf when th...

6953f324653ea4b5903e6137a142578733df7c70 authored about 4 years ago
Refactor concurrency so that it is pluggable

However, this may not be the best idea because it involves global
state that could be overridden...

26b4d9bb4b4bf508ed8d22419eff8a8e64512c67 authored about 4 years ago
Use queue.Queue instead of multiprocessing.Queue in threaded mode

34e564cd7de9847c63093b8d3601968341b22148 authored about 4 years ago
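The commit title above names only the change, not the code. As a rough sketch of the general idea, and not OCRmyPDF's actual implementation, a helper can pick the queue type from the worker type: `queue.Queue` is enough when workers are threads in the same process, while `multiprocessing.Queue` is needed only across process boundaries. The names `make_queue`, `worker`, and `run_threaded` are hypothetical.

```python
import multiprocessing
import queue
import threading


def make_queue(use_threads: bool):
    """Return a queue suited to the worker type (hypothetical helper)."""
    if use_threads:
        # Threads share memory, so the lightweight in-process queue suffices.
        return queue.Queue()
    # Separate processes need a queue that pickles items across a pipe.
    return multiprocessing.Queue()


def worker(q, n):
    """Toy task: put the square of n on the queue."""
    q.put(n * n)


def run_threaded(values):
    """Run one thread per value and collect the results in sorted order."""
    q = make_queue(use_threads=True)
    threads = [threading.Thread(target=worker, args=(q, v)) for v in values]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(q.get() for _ in values)
```

Calling `run_threaded([1, 2, 3])` collects one result per thread without creating any inter-process pipe.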
Refactor plugin manager to eliminate callback

504d5776d2118bfea6bddcc2b7407604378c43fb authored about 4 years ago
Re-sequence plugin installation

ee23976858418a464956fcd48c719d18f36816d4 authored about 4 years ago
Insert setuptools plugins with ocrmypdf prefix

f559316881befc67d389530599d0bbd10f31019e authored about 4 years ago
Automate insertion of builtin modules

084610c242be607d5cf586146f0829c71b3a7951 authored about 4 years ago
Update pre-commit

9ff627472b24dfab4ef33a4cc06faccc98298dc3 authored about 4 years ago
Import PageContext, PdfContext since they are referenced in pluginspec

956310d1ecc69c9eca121560aa35533daf0c55d6 authored about 4 years ago
tests: confirm that we produce pdf when optimization is off

1a982da442a91bea8a50b0d67b4de60820be1b9e authored about 4 years ago
docs: no MS Store Python

4879a1f0ded5e1e4cbeb98e6f6f1b4f4943ee4be authored about 4 years ago
github: Ask how ocrmypdf was installed

ce66bcc9c84b224bb557a8265f6332b317771ab5 authored about 4 years ago
v11.5.0 release notes

1ebf3144afd0ba454551e043d71523fb7f46182f authored about 4 years ago
Fallback to LeptonicaErrorTrap_Redirect if ffi.callback fails

Might fix issue #709, Apple silicon support.

7a1cccbc4e098762fb0a6303cf5193b94a503985 authored about 4 years ago
tests: Fix debug logging test

ebacff1b3915435365b9391e768f4547b558081d authored about 4 years ago
Add test for configure_debug_logging

Since we can't directly test it

c7c447be66cb51b9a389901e66a56d738512fad8 authored about 4 years ago
Consider text when determining page raster DPI

Previously if we found vectors of any sort on a page, we would bump
the DPI up to 400. We did no...

91aa175602e2b3a878dc4bc2bda8c8865e8a5e4b authored about 4 years ago
Create raster PDF pages to match input page size

Previously we produced a raster image, then multiplied image width
by DPI to get the page size. ...

b267494e4a38e694178f8f49bba7d0a760a8a847 authored about 4 years ago
tests: tidy pdfinfo

f687180ecc3561ee129ee4991755f7d6b3bea02f authored about 4 years ago
ghostscript: tidy comments

6f4b38b103d8481986bf870257185f5875f775ab authored about 4 years ago
v11.4.5 release notes

d32324859ca54cc0ef225f3724ea3296438442c7 authored about 4 years ago
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF

48222b87b5eafbe054178501b72a68436b7656a4 authored about 4 years ago
fix unclosed file warnings. (#710)

Co-authored-by: Jonas Winkler

62e5edc72bfbff8640f1589355a89c1d30d46b7c authored about 4 years ago
Remove .coveragerc and fold into setup.cfg

2846d46bb83ea846fb71dfafab46ef03067943ec authored about 4 years ago
v11.4.4 release notes

47ef1914d491f9a6652c554aa6c93099410f2439 authored about 4 years ago
Make ocrmypdf.ocr take a threading lock

df157552f3040773eaac84a2d6dee807e75a1b7e authored about 4 years ago
Partial fix crash on 'userunit' None (#700)

Our method of getting data from pdfminer would silently consume a StopIteration
if pdfminer retu...

0b3a526049f10032a29939a82dcd5113cd895935 authored about 4 years ago
tesseract: fix typing of some optional arguments

1e80d412fa983e96d30128770fbaba366132cc58 authored about 4 years ago
concurrent: simplify results loop

df6e1062033003ca3dcf5463b95d04b74d00481b authored about 4 years ago
tests: tag tests that need pngquant, jbig2enc

bd0f00586147795993629c8d362134213fae77ff authored about 4 years ago
ci: temporarily disable pngquant on Windows

Looks like a packaging error, choco complains of bad hashes.

6ba4b7b3f3a2dbbec9f974df97f551d23fff025d authored about 4 years ago
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF

2c11349ee83475d85a21ab53cbfa972cec55e5bb authored about 4 years ago
v11.4.3 release notes

b0afef09efe23124adbdef95e625739719286f14 authored about 4 years ago
tests: skip metadata test for two pikepdf versions that warn incorrectly

72fa347c38319613516dc0b6a2004fbfc0367b39 authored about 4 years ago
pipeline: refactor metadata_fixup

96d68c2413672dd6b37c2aef8b5365b75d76cd28 authored about 4 years ago
tests: assert that most patched functions are called

We were not actually checking if functions we patched were called when
expected.

babc76fa740a282240fada0620088c439a1cc12d authored about 4 years ago
docs: fix simple typo, instsalled -> installed (#704)

There is a small typo in docs/installation.rst.

Should read `installed` rather than `instsall...

dc06990e5d67f4b89e4f342437384c3ad41925ae authored about 4 years ago
Remove PDF/A overprint debug message

Since we currently log all of a process's output at debug it's
redundant to log this separate me...

0ff0d2f8d16a38bb0216cc848cff7baed6f08994 authored about 4 years ago
Fix test not patching properly after Ghostscript polling change

81602cf420e8e79609097c39b62d85cdf5828604 authored about 4 years ago
v11.4.2 release notes

607e2d7e8172da0534744ff89b2c7a930d9eb794 authored about 4 years ago
Deal with missing pthread_sigmask on Cygwin

Closes #701

b01d9e07e80435f755429c978018b850015ee60d authored about 4 years ago
watcher: fix OCR_LOGLEVEL env var not processed

Closes #702

91db94cf2ec166b7a30bdf01391f2ed5c771f84c authored about 4 years ago
pdfinfo: stricter typing

416df803d46e39235fd05d61ab0a4e34f57efd67 authored about 4 years ago
pdfinfo: refactor to eliminate RawPageInfo

037b96ca16217bb91f30e0f838bdab3eee74c5d8 authored about 4 years ago
pdfinfo: Refactor pageinfo dictionary into a class

bb258fc99c3d3676977d708dded660d3802bc283 authored about 4 years ago
v11.4.1 release notes

4b8ccbe8cb76480b03ab42b0c61814acd1c59a60 authored about 4 years ago
misc: synology fix

Accept user-contributed fix. Not testable.

Close #690.

ab1ff3331b3d9c42ec4b8920fedac5834f7e10aa authored about 4 years ago
Fix certain invalid page ranges causing exception

Closes #686

3675ae918cf4e91b4fed9a237350c2c0f7a5d1aa authored about 4 years ago
Revert "v11.4.0 release notes - remove change not actually implemented"

This reverts commit ad202693b3dcf905e180a665a54f349d00d8dfba.
Temporary folder prefix was actual...

0ba32b96b71bc683139b1c52ed75ec37fb36028e authored about 4 years ago
docs: com.github.ocrmypdf -> ocrmypdf.io

add64e4fa2e8887be42759b8117b0feec62aa320 authored about 4 years ago
Change wheel tag to py36, update package_data to include py.typed

7fe2954edef586c93cb9a1b9aaef666b10d96d65 authored about 4 years ago
v11.4.0 release notes - remove change not actually implemented

Remove a change that was pushed back to a future release.

ad202693b3dcf905e180a665a54f349d00d8dfba authored about 4 years ago
v11.4.0 release notes

594ef83551be9233c6fe12a4b958ed793c7b51ba authored about 4 years ago
Fix BufferedReader TypeError

78b71618c1ac8244f2fd96bb37d185f32c69186f authored about 4 years ago
Fix log message queue flooding on certain files

Fixes #692

b8aa89e1ece7fbd58910e390057415dab96fea62 authored about 4 years ago
cli: typing

156d5d9a9c2739a4c5b1837a99f7ec5efb5b818b authored about 4 years ago
typing: tidy up

b4c1f66bc1d9b39811644f33ce1e3da80c0d0a53 authored about 4 years ago
pdfa: help mypy figure out a type

d2908640c61480011091ddd20e1245e72507f038 authored about 4 years ago