Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Collective - Host: opensource - https://opencollective.com/ocrmypdf - Code: https://github.com/jbarlow83/OCRmyPDF

Refactor setup_pipeline to decouple manage_work_folder

github.com/ocrmypdf/OCRmyPDF - 6f82097d141cb7d2ae76487aec654e7787b2a89a authored about 1 year ago by James R. Barlow <[email protected]>
Eliminate api= kwarg and implicit creation of pluginmanager

github.com/ocrmypdf/OCRmyPDF - e8ae370ceb9c22798955c0db33f46ed30f07cd65 authored about 1 year ago by James R. Barlow <[email protected]>
Working HOCR folder to PDF converter

github.com/ocrmypdf/OCRmyPDF - 23951c9e380b31719503fc01d43625049f537478 authored about 1 year ago by James R. Barlow <[email protected]>
Fix use_threads logic for get_pdfinfo

Some debug code was left in place that forced pdfinfo to run with only
one worker when --use-th...

github.com/ocrmypdf/OCRmyPDF - 53c953a561cdc71192dc9b266c14da82f6f835bd authored about 1 year ago by James R. Barlow <[email protected]>
Refactor lossless reconstruction setter into separate function

Still messy but good enough as a start.

github.com/ocrmypdf/OCRmyPDF - 95b14ee282d0f44ee1e347fb1b0c771e89ebff23 authored about 1 year ago by James R. Barlow <[email protected]>
Refactor exec_page_sync -> extract _process_page

github.com/ocrmypdf/OCRmyPDF - 6827a6efe8b9fb7a1aa697cb353f551f444b971f authored about 1 year ago by James R. Barlow <[email protected]>
Refactor setup_pipeline to context manager

github.com/ocrmypdf/OCRmyPDF - 8985c0dfe97e61188d0ffe12a8dc03b8c9540b4b authored about 1 year ago by James R. Barlow <[email protected]>
Refactor conversion of ocrmypdf.ocr() arguments to cmdline

github.com/ocrmypdf/OCRmyPDF - b3de5833d317bbfded637d69d3a8db9efa4fded0 authored about 1 year ago by James R. Barlow <[email protected]>
Introduce pdf_to_hocr API

github.com/ocrmypdf/OCRmyPDF - 0443e87345555a89e704d32e87b3062efe821898 authored about 1 year ago by James R. Barlow <[email protected]>
pdf_to_hocr: improve plugin handling

github.com/ocrmypdf/OCRmyPDF - 68bb38d0addb26e18986407a5ec0a6ecf6274682 authored about 1 year ago by James R. Barlow <[email protected]>
Refactor exec_page_sync to outputs

github.com/ocrmypdf/OCRmyPDF - 86a20c41306c0fb64b9418f7ea464f503f071680 authored about 1 year ago by James R. Barlow <[email protected]>
Refactor main pipeline and start hocr pipeline

github.com/ocrmypdf/OCRmyPDF - 8991d2cb33c5ca5a154311c1ec3bfe844fa6092b authored about 1 year ago by James R. Barlow <[email protected]>
Plugin manager: set reasonable default when called without params

github.com/ocrmypdf/OCRmyPDF - 07b89e6a19959cebc440b7d364eeb67df3eded55 authored about 1 year ago by James R. Barlow <[email protected]>
Add hocr to ocr pdf pipeline

github.com/ocrmypdf/OCRmyPDF - cbb0868ae3c0f735a03d5b63e41acc9f6ba48067 authored about 1 year ago by James R. Barlow <[email protected]>
Refactor main pipeline into discrete pipelines

- test_ghostscript_pdfa_failure fails
- haven't dealt with logging record factories

Further ref...

github.com/ocrmypdf/OCRmyPDF - 1f16eb6f5048b97ab2594e771f27590a17d55345 authored about 1 year ago by James R. Barlow <[email protected]>
Refactor logging record thread local storage

github.com/ocrmypdf/OCRmyPDF - ebfe008432d5040691933d024749e0f6f4254322 authored about 1 year ago by James R. Barlow <[email protected]>
Test on release py312 not py312-rc

github.com/ocrmypdf/OCRmyPDF - c6b53326998f8dd5d383a3804aff68a42ee19008 authored about 1 year ago by James R. Barlow <[email protected]>
Correct the archive dir name in `Watched folders with Docker` (#1173)

github.com/ocrmypdf/OCRmyPDF - 68610046c6444f4489b9972a5cb983a62b0b14f9 authored about 1 year ago by Michael Flagg <[email protected]>
v15.3.0 release notes

github.com/ocrmypdf/OCRmyPDF - 880326868d24487b599ad6415a5c72b6b9ae0c78 authored about 1 year ago by James R. Barlow <[email protected]>
watcher: Improve parameter validation

github.com/ocrmypdf/OCRmyPDF - c6be3ba07650eacbe12ce84201889e0d7985af5b authored about 1 year ago by James R. Barlow <[email protected]>
misc/watcher.py: use Typer and dotenv to improve ease of use

github.com/ocrmypdf/OCRmyPDF - 0565cb0b1083ce02f158bce44e771ed63182a0b7 authored about 1 year ago by James R. Barlow <[email protected]>
Improve wait_for_file_ready loop

github.com/ocrmypdf/OCRmyPDF - dc49906704a4a5f539c9dc5b9511bd8ce926b78d authored about 1 year ago by James R. Barlow <[email protected]>
Detect and warn about Tagged PDFs

github.com/ocrmypdf/OCRmyPDF - 93fda0dd009cf8d999059ec3c6e4812b53bb0751 authored about 1 year ago by James R. Barlow <[email protected]>
Fix "pikepdf mmap disabled" spam on macOS

github.com/ocrmypdf/OCRmyPDF - d4110e78cb8b870b4c1823a95214275d57c7f821 authored about 1 year ago by James R. Barlow <[email protected]>
Add check for filenotfound due to being inside a snap

github.com/ocrmypdf/OCRmyPDF - 5285d68fcc4d8ced4889f29ca3ff68d5be1f539b authored about 1 year ago by James R. Barlow <[email protected]>
Update docker document and v15.2.0 release notes

github.com/ocrmypdf/OCRmyPDF - 2b0e1498090eebb9d348256c271c53eb8def6acc authored about 1 year ago by James R. Barlow <[email protected]>
Don't build docker alpine aarch64 for now and add alias tag for Ubuntu

github.com/ocrmypdf/OCRmyPDF - b7ce5b0d7d65e1648a184435a6f2e99bea28ebeb authored about 1 year ago by James R. Barlow <[email protected]>
Attempt at alpine on arm64 - not working

pip attempts to download musllinux_1_1 wheels and can't find pikepdf's which is only 1_2. Not cl...

github.com/ocrmypdf/OCRmyPDF - ffd6a64ce9e4bd9a05b81f08fb897a1420ab8d3e authored about 1 year ago by James R. Barlow <[email protected]>
Test docker alpine build and ubuntu aliasing

github.com/ocrmypdf/OCRmyPDF - 5727f1e081327bd01e0531b8c4c9b8a85da971b0 authored about 1 year ago by James R. Barlow <[email protected]>
Alpine image: Fix missing OSD and path

github.com/ocrmypdf/OCRmyPDF - b75a7eca2ab894d8e674e890c4eae24a59ae8e27 authored about 1 year ago by James R. Barlow <[email protected]>
docker docs: remove VirtualBox content and update a few explanations

github.com/ocrmypdf/OCRmyPDF - 2b016764346527d5ebee6baf56fbce580c7cfcf3 authored about 1 year ago by James R. Barlow <[email protected]>
Add Alpine dockerfile

github.com/ocrmypdf/OCRmyPDF - e11c386c583182d3f8dfc7427b14efd697a7c137 authored about 1 year ago by James R. Barlow <[email protected]>
Simplify deep nested with-block

github.com/ocrmypdf/OCRmyPDF - 9346d1f970cb8079aaf006904fdfc306d2d6375b authored about 1 year ago by James R. Barlow <[email protected]>
v15.1.0 release notes

github.com/ocrmypdf/OCRmyPDF - 012cbef8656e87af696be865faa812282d23e409 authored about 1 year ago by James R. Barlow <[email protected]>
sync: update documentation

github.com/ocrmypdf/OCRmyPDF - 0687568e1b216e3975cd0dbd8721f218efc899f0 authored about 1 year ago by James R. Barlow <[email protected]>
pipeline: documentation and tweak merge_sidecars

github.com/ocrmypdf/OCRmyPDF - 3086cfc3d91413724c0a5d0a578c0da13779a80c authored about 1 year ago by James R. Barlow <[email protected]>
Require Pillow >= 10.0.1 and drop shims for older versions

github.com/ocrmypdf/OCRmyPDF - 91a14660b314d54abbf348a72c9b71359330ce1f authored about 1 year ago by James R. Barlow <[email protected]>
Document some missing CLI options to API

github.com/ocrmypdf/OCRmyPDF - 539f0ee0ce7f670824a30b6e772d8238584e70b8 authored about 1 year ago by James R. Barlow <[email protected]>
Replace ExitStack with contextmanager

github.com/ocrmypdf/OCRmyPDF - 7172817cd6bd809730722dbafd2eb5bce540d913 authored about 1 year ago by James R. Barlow <[email protected]>
Fix typos in release notes

github.com/ocrmypdf/OCRmyPDF - d9cc759142272256242662f98d4d31f48eaf01ef authored about 1 year ago by James R. Barlow <[email protected]>
Fix typing for pdfinfo.layout and modernize

Also changed handling of undefined chars to be more Liskov-consistent with parent class.

github.com/ocrmypdf/OCRmyPDF - 364799fc3ecb842ae9897d757e2e09cbde1e58b6 authored about 1 year ago by James R. Barlow <[email protected]>
Use Python 3.9-style type hinting for tuple[] and AbstractSet -> Set

github.com/ocrmypdf/OCRmyPDF - f4c211fa2d70d71d8d4cedd2a77d0e877805ea95 authored about 1 year ago by James R. Barlow <[email protected]>
ruff autofixes (mostly typing.* -> collections.abc.*)

github.com/ocrmypdf/OCRmyPDF - 113a6b45bdb9a910e99aabcb5c0c81377879ff01 authored about 1 year ago by James R. Barlow <[email protected]>
Update notes on concurrency

github.com/ocrmypdf/OCRmyPDF - e9419d2c40f24ea0f4fd9ee58677d2c81f4badb5 authored about 1 year ago by James R. Barlow <[email protected]>
v15.0.2 release notes

github.com/ocrmypdf/OCRmyPDF - fb006ef39f7f8842dec1976bebe4bcd5ca2e8df8 authored about 1 year ago by James R. Barlow <[email protected]>
docs: update to discuss some v15 features not yet documented

github.com/ocrmypdf/OCRmyPDF - 890b9944031b5cd68ae63e8a42972409b32fadbc authored about 1 year ago by James R. Barlow <[email protected]>
docs: mention new features in v15, fix 32-bit text again

github.com/ocrmypdf/OCRmyPDF - 01bbf7d144bd1527e49d328168f65c1dde8824b0 authored about 1 year ago by James R. Barlow <[email protected]>
Add Python 3.12 to test matrix

github.com/ocrmypdf/OCRmyPDF - 468de5324aeb05a4791d926ac58fea53149ed2f4 authored about 1 year ago by James R. Barlow <[email protected]>
v15.0.1 release notes (again)

github.com/ocrmypdf/OCRmyPDF - 072db75fa3de6e6555c22f19c5824d41da5fa077 authored about 1 year ago by James R. Barlow <[email protected]>
docs: clarify situation around 32-bit support

Maintainers of ARM 32-bit in particular don't necessarily need to drop support....

github.com/ocrmypdf/OCRmyPDF - 8519b3f6251acbe56cb7a8a48a598355e37c98e8 authored about 1 year ago by James R. Barlow <[email protected]>
docs: update install notes for some things missed with v15 release

github.com/ocrmypdf/OCRmyPDF - dd7c4f3eaa3305c06b40e600b6289ab739cc4146 authored about 1 year ago by James R. Barlow <[email protected]>
Change 32-bit message from error to warning

github.com/ocrmypdf/OCRmyPDF - c8e6f20f8d1706a0086b01b7c8f4dc38659e4044 authored about 1 year ago by James R. Barlow <[email protected]>
Change Ghostscript version skip to fail

Reported to fail on earlier versions than the check tested for.

github.com/ocrmypdf/OCRmyPDF - 10530a8698c6c7f4169bdf8612703c95318bdce8 authored about 1 year ago by James R. Barlow <[email protected]>
v15.0.1 release notes

github.com/ocrmypdf/OCRmyPDF - 207866abf59764384622702500f472775f03d158 authored about 1 year ago by James R. Barlow <[email protected]>
Fix bdist_wheel tag set to py38

github.com/ocrmypdf/OCRmyPDF - 3829af16fb9533c50dfabbec990ffd60245d0332 authored about 1 year ago by James R. Barlow <[email protected]>
Innocuous change to bump tag

github.com/ocrmypdf/OCRmyPDF - 24db31b4c56c5e849a3873c7091c4096c465053c authored about 1 year ago by James R. Barlow <[email protected]>
Update release notes and files

github.com/ocrmypdf/OCRmyPDF - 8132a4ae10bfad7f9e9bc13c8a7ddecf0d00226f authored about 1 year ago by James R. Barlow <[email protected]>
Further improvements to image DPI calculation

github.com/ocrmypdf/OCRmyPDF - d5128c5cf5e29367fc5f2e322fcadc11c8f5301c authored about 1 year ago by James R. Barlow <[email protected]>
Fix pluginmanager typing

github.com/ocrmypdf/OCRmyPDF - 270e31fa672aeb202198042ae7967b72e0111dd2 authored about 1 year ago by James R. Barlow <[email protected]>
Tidy imports and line length

github.com/ocrmypdf/OCRmyPDF - 85e31d0a1906f6eb51234b132cadd140bd6c5f64 authored about 1 year ago by James R. Barlow <[email protected]>
Overhaul version checkers to prefer Version to str

github.com/ocrmypdf/OCRmyPDF - ea36aedb5fed635cb8d6417665de5eab5721710f authored about 1 year ago by James R. Barlow <[email protected]>
Remove Python 3.8 shim for missing str.removeprefix

github.com/ocrmypdf/OCRmyPDF - bd4d44e182617ef5cc026b36977fd9611d39680f authored about 1 year ago by James R. Barlow <[email protected]>
Rename pike local variable to pdf for consistency

github.com/ocrmypdf/OCRmyPDF - 8fcf358934f271f241a4f53fc7e0e3136bf23df5 authored about 1 year ago by James R. Barlow <[email protected]>
Minor documentation and typing fixes

github.com/ocrmypdf/OCRmyPDF - 47b0f2856417a9dcf99db15fe81af5210c403e12 authored about 1 year ago by James R. Barlow <[email protected]>
Refactor ghostscript error message deduplicating

github.com/ocrmypdf/OCRmyPDF - 7018e2b247af565ca29c9c03b0a9d8516024f614 authored about 1 year ago by James R. Barlow <[email protected]>
Refactor docinfo repair code

github.com/ocrmypdf/OCRmyPDF - 8d12ecb798af2c5681b0b7114fb5505539317130 authored about 1 year ago by James R. Barlow <[email protected]>
Update release notes

github.com/ocrmypdf/OCRmyPDF - 0ab29ec0babdebb4fc1f4a89ff0d5202987a5f3e authored about 1 year ago by James R. Barlow <[email protected]>
Update release notes

github.com/ocrmypdf/OCRmyPDF - 179714770ac51dc38f33705cf6ddb8c800531a0b authored about 1 year ago by James R. Barlow <[email protected]>
ghostscript: fix missing type annotation

github.com/ocrmypdf/OCRmyPDF - f04f45545c0c1f7f0100c783dabceb3fdb83c196 authored about 1 year ago by James R. Barlow <[email protected]>
logging: Avoid possible multiplication by None

github.com/ocrmypdf/OCRmyPDF - a3a083c1251fc7a85d96bdc6e2a386b425567d87 authored about 1 year ago by James R. Barlow <[email protected]>
Remove single dispatch version of calculate_downsample

github.com/ocrmypdf/OCRmyPDF - d855f63985f34e3e643a4da0a34cfcf34e4b5571 authored about 1 year ago by James R. Barlow <[email protected]>
helpers: fix typing

github.com/ocrmypdf/OCRmyPDF - d4863cbf0fc8a4f129f26745275b414c4a4c12ab authored about 1 year ago by James R. Barlow <[email protected]>
Enable pikepdf mmap again

github.com/ocrmypdf/OCRmyPDF - 7b8f081fbf85482090e0c43e22e5e37dc3d2990c authored about 1 year ago by James R. Barlow <[email protected]>
Add comment re: reportlab shim

github.com/ocrmypdf/OCRmyPDF - 7d33039bcd57ab43d1d7345c9e1e8380b5d33f46 authored over 1 year ago by James R. Barlow <[email protected]>
Revert "Drop reportlab warning shim"

This reverts commit 162a47f98ee4f5b08e86880d7f1b48eee6be51bb.

github.com/ocrmypdf/OCRmyPDF - fde886baf4eb64efb78e88c2245b71ea67b263fd authored over 1 year ago by James R. Barlow <[email protected]>
Regenerate test cache

github.com/ocrmypdf/OCRmyPDF - 146da79c00f0a9f42dc07ea48922726a98fe5aee authored over 1 year ago by James R. Barlow <[email protected]>
Move issue*.pdf into separate folder

github.com/ocrmypdf/OCRmyPDF - 2fc3b0d97337c6d470d4bf5d3df5959e0af4c339 authored over 1 year ago by James R. Barlow <[email protected]>
Lower pngquant version req

github.com/ocrmypdf/OCRmyPDF - 5667424530db44bae71a1e3a10feda6b0c79bd68 authored over 1 year ago by James R. Barlow <[email protected]>
Drop Ubuntu 20.04 from build matrix

github.com/ocrmypdf/OCRmyPDF - 8add531ffd7828d58a43a9db530f6d699df34fe3 authored over 1 year ago by James R. Barlow <[email protected]>
Merge branch 'feature/jbig2thresh' into v15

github.com/ocrmypdf/OCRmyPDF - 0388c23ae75abb5b25802b1669adb87f3e97e093 authored over 1 year ago by James R. Barlow <[email protected]>
Merge branch 'feature/snap22' into v15

github.com/ocrmypdf/OCRmyPDF - 9b77daae7c56a72dadec6839f92b1243cb738997 authored over 1 year ago by James R. Barlow <[email protected]>
Merge branch 'feature/fix-raster-dpi-too-high' into v15

github.com/ocrmypdf/OCRmyPDF - 3e1b3ec98d4690ff79dc7dd7c64a7014feeb4a04 authored over 1 year ago by James R. Barlow <[email protected]>
Draft v15 release notes

github.com/ocrmypdf/OCRmyPDF - 0f0ca6f517ced14d6772041ee1cb32402a70eb89 authored over 1 year ago by James R. Barlow <[email protected]>
Document that Ubuntu 22.04 and 20.04 both use Tesseract 4.1.1

github.com/ocrmypdf/OCRmyPDF - 23a37fc35c9116e5eb3bbc0e45113a5d9b97e391 authored over 1 year ago by James R. Barlow <[email protected]>
Don't use really old Python for deliverable building

github.com/ocrmypdf/OCRmyPDF - 0c287929c2e0ac66633ab0169545c632456cf0b0 authored over 1 year ago by James R. Barlow <[email protected]>
Convert workflow to trusted PyPI publisher

github.com/ocrmypdf/OCRmyPDF - c93349c350e35af41ead6f27cb47b00dc491154f authored over 1 year ago by James R. Barlow <[email protected]>
Drop Python 3.8 too

(cherry picked from commit a8cdc5a191bdf1fe401e96c2dc9a5fba39b90d79)

github.com/ocrmypdf/OCRmyPDF - ec1c3775320341d32dd7ad619de11005be0190df authored over 1 year ago by James R. Barlow <[email protected]>
Remove shim for img2pdf < 0.4.4

github.com/ocrmypdf/OCRmyPDF - e8c82ee4b6dda7ac5ada48478dddd99876894da6 authored over 1 year ago by James R. Barlow <[email protected]>
Remove tqdm dependency and TqdmConsole

Might be too aggressive? No deprecation warning....

github.com/ocrmypdf/OCRmyPDF - de2bb5ce8cf5025c5f86d86d1feec57bb48ee54a authored over 1 year ago by James R. Barlow <[email protected]>
Change minimum Ghostscript version to 9.55

github.com/ocrmypdf/OCRmyPDF - eec8a2b574276c9ea44a288603fb63ee2a98f2dc authored over 1 year ago by James R. Barlow <[email protected]>
Complain about all 32-bit interpreters, not just Windows

github.com/ocrmypdf/OCRmyPDF - 2637e84691866fda584bc66a0d79a87c81dd5826 authored over 1 year ago by James R. Barlow <[email protected]>
Require pngquant 2.13.1 or newer

github.com/ocrmypdf/OCRmyPDF - 2ad8961d0bd80f2ccafddcd896b54e709c984c25 authored over 1 year ago by James R. Barlow <[email protected]>
Drop support for gswin32c / 32-bit Ghostscript for Windows

github.com/ocrmypdf/OCRmyPDF - 6c78076beab33bd6d91e96a5530f746023e3884a authored over 1 year ago by James R. Barlow <[email protected]>
Tighten Python dependencies

github.com/ocrmypdf/OCRmyPDF - 0239f69912248090ed13ac236a374063ea08ad25 authored over 1 year ago by James R. Barlow <[email protected]>
Drop reportlab warning shim

github.com/ocrmypdf/OCRmyPDF - 162a47f98ee4f5b08e86880d7f1b48eee6be51bb authored over 1 year ago by James R. Barlow <[email protected]>
Added weighted DPI rendering

To address #1010 and other issues.

github.com/ocrmypdf/OCRmyPDF - 173428e81aa0f0a0f5fd919a02988eae6d2a63b9 authored over 1 year ago by James R. Barlow <[email protected]>
Fix incorrect printed_area calculation

github.com/ocrmypdf/OCRmyPDF - 67ed29dcea1ccc11cfea582c37299f1dd2b3fa2f authored over 1 year ago by James R. Barlow <[email protected]>
Introduce Resolution.take_min

github.com/ocrmypdf/OCRmyPDF - 3454c050edc2fa67d014fedcdeedaf8b342b0284 authored over 1 year ago by James R. Barlow <[email protected]>
Update snap to use core22

github.com/ocrmypdf/OCRmyPDF - 5ee99b26e79117317a9d610229e0dd9592a31de9 authored over 1 year ago by James R. Barlow <[email protected]>
Revise to PageResolutionProfile

github.com/ocrmypdf/OCRmyPDF - ac3aa67d8a08e0193212387bd4f642321893581c authored over 1 year ago by James R. Barlow <[email protected]>