Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched.
Collective - Host: opensource - https://opencollective.com/ocrmypdf - Code: https://github.com/jbarlow83/OCRmyPDF

Add copying of essential information from Tesseract textonly

github.com/ocrmypdf/OCRmyPDF - 7ee90890ec8e3cf68dc28ec3abf856a00ff66832 authored almost 7 years ago
Expand size growth reasons to other arguments that trigger transcoding

github.com/ocrmypdf/OCRmyPDF - 383e726d65fc17e5173efc2da576a279ff957fee authored almost 7 years ago
Set OMP_THREAD_LIMIT unconditionally, for pngquant

github.com/ocrmypdf/OCRmyPDF - e046f70642361b801cc9a034bcc825f6bb8cf0af authored almost 7 years ago
Fix --remove-background error on PDFs with colormapped images

It's unclear how exactly a colormapped image gets to this spot given the tendency of other image...

github.com/ocrmypdf/OCRmyPDF - 2131ad46706704e7f378374fa488938db33ba8ee authored almost 7 years ago
test_pageinfo: remove duplicate import

github.com/ocrmypdf/OCRmyPDF - 219fe2155be776d787ac77df4acb35c4db336ea2 authored almost 7 years ago
Add gpg key to issue template

github.com/ocrmypdf/OCRmyPDF - 4209034d2078cfdbff83ee4977d4e7a3f9706a76 authored almost 7 years ago
Fix helpers.py again

github.com/ocrmypdf/OCRmyPDF - abcae0c2a43020be505b40f0ae45e1b44006e1ce authored almost 7 years ago
Don't suppress error message from config_notfound

Since it showed up in s390x bionic

github.com/ocrmypdf/OCRmyPDF - 0934905493eea4db19d7660a147da3e7e6dfd72d authored almost 7 years ago
helpers: fix missing call to complain()

In practice this is probably unreachable.

github.com/ocrmypdf/OCRmyPDF - 11cd6201d934458200cad42206f63785a633a6cd authored almost 7 years ago
Page unsplit, development

github.com/ocrmypdf/OCRmyPDF - 8d2a917676e1b47dfbb3adbdd7a185a76babdecb authored almost 7 years ago
Begin conversion from page splitting to page markers

github.com/ocrmypdf/OCRmyPDF - 44b4afa534f936ee40e539551c6f49e5ade7ced7 authored almost 7 years ago
Cherrypick merge_pages unification

github.com/ocrmypdf/OCRmyPDF - 775be3933c77d687033b7ef61b9276910e947911 authored almost 7 years ago
Add support for PDF/A-3

No ability to attach files however

github.com/ocrmypdf/OCRmyPDF - df87e21c85077014fe89638cb761386fe87f96ff authored almost 7 years ago
Use more standard __version__ rather than PILLOW_VERSION (#257)

github.com/ocrmypdf/OCRmyPDF - d761d80750ad53a5e0e3f03c21ee17569fc3db9f authored almost 7 years ago
optimize: fix reporting of jbig2 groups

github.com/ocrmypdf/OCRmyPDF - 8052019ddeb3c4ef7d55cb09799a3403ebe02422 authored almost 7 years ago
optimize: Don't save JPEGs if larger

github.com/ocrmypdf/OCRmyPDF - a3d89500888b89c8c2c88e20262f876895162a76 authored almost 7 years ago
optimize: further improve decodeparms handling

github.com/ocrmypdf/OCRmyPDF - 004f5d3bf1176a846ecaf5dd532899b1494506e5 authored almost 7 years ago
optimize: refactor tricky /Filter and /DecodeParms handling

github.com/ocrmypdf/OCRmyPDF - f5d308a156ca0831031eec8c78d0dfbc32b54d19 authored almost 7 years ago
optimize: jbig2 error

github.com/ocrmypdf/OCRmyPDF - 386999675886d4454d0829fa6ed1a2aeeef0769d authored almost 7 years ago
optimize: jbig2 fix

github.com/ocrmypdf/OCRmyPDF - cdb2107c4e1fbae8672fd9f1aa89d2eb24cb67d2 authored almost 7 years ago
optimize: more robustness

github.com/ocrmypdf/OCRmyPDF - 4db2b3413b541c6773ac16879334145a3284eb7b authored almost 7 years ago
Make optimize a lot safer

github.com/ocrmypdf/OCRmyPDF - b2f31bec790925d4b34a5e5205e02eb77d505af2 authored almost 7 years ago
Be more defensive about accessing

github.com/ocrmypdf/OCRmyPDF - 78f9f4a26628042765e42fe28e1e3c43bbd39618 authored almost 7 years ago
optimize: more fixes

github.com/ocrmypdf/OCRmyPDF - ad6087c342aebb4b4328860f90649953fcc87c75 authored almost 7 years ago
optimize: fix "length not defined"

github.com/ocrmypdf/OCRmyPDF - 0d6ef430def82ef68ca09d2f2191fac43314afaf authored almost 7 years ago
optimize: fix error on missing /Filter

github.com/ocrmypdf/OCRmyPDF - a5942209e8a42def097cecd89a7707cb11acbb3f authored almost 7 years ago
optimize: ccitt header fixes

Changed to match TIFF spec's use of unsigned types, eliminated check for
/Columns.

There is som...

github.com/ocrmypdf/OCRmyPDF - 9a60694cfc2f166e1ef8f8f0207c51e41bd5e4b5 authored almost 7 years ago
optimize: be less chatty

github.com/ocrmypdf/OCRmyPDF - 4bf13f473788e059fd53b35ee0c0e98a08e29dd9 authored almost 7 years ago
Merge v6.1.5

github.com/ocrmypdf/OCRmyPDF - 9e89b75186e8a604ddb43aa3993d64b37583cc3c authored almost 7 years ago
Fix regression: Disable Ghostscript JPEG passthrough entirely

github.com/ocrmypdf/OCRmyPDF - 0b10db91beceb5056d28b756064a5a82c3cd3502 authored almost 7 years ago
Fix regression: time stamp test suite failures

github.com/ocrmypdf/OCRmyPDF - 1a516b2af9bd996729482bece6fcf1e5a3f091c5 authored almost 7 years ago
Disable JPEG passthrough for Ghostscript 9.23

Seems to corrupt JPEGs involved in image masks?

github.com/ocrmypdf/OCRmyPDF - 076363d78eace78d2f5864979a72f799edd22d21 authored almost 7 years ago
Update notes for v6.1.5

github.com/ocrmypdf/OCRmyPDF - 5fde2142904a8bd4f9d169100da4fd03717fcabf authored almost 7 years ago
Fix PDF/A validation failure due to timezone being omitted from /ModDate

github.com/ocrmypdf/OCRmyPDF - a620724d6ae5867f347f0b2acd0d36063401f2c6 authored almost 7 years ago
Fix PDF/A validation failure due to timezone being omitted from /ModDate

github.com/ocrmypdf/OCRmyPDF - 640b953ec7fe7d11e53c641735f9184eaa73bcc0 authored almost 7 years ago
Disable JPEG passthrough for Ghostscript 9.23

Seems to corrupt JPEGs involved in image masks?

github.com/ocrmypdf/OCRmyPDF - a009ca7597e08e0be69d6076f9ad66bd13c46bbf authored almost 7 years ago
Clarify license of two test files - https://github.com/jbarlow83/OCRmyPDF/issues/254

github.com/ocrmypdf/OCRmyPDF - 7368399f8bea48aaf8bb4be3de46af78ca03de3b authored almost 7 years ago
Search for image masks too

github.com/ocrmypdf/OCRmyPDF - c974aec934459fe16c01b9f007b39e41b27a289e authored almost 7 years ago
Iterate images with pikepdf / fix mono PNG corruption

To work around PNG corruption problem in PyMuPDF for monochrome images,
extract and save monochr...

github.com/ocrmypdf/OCRmyPDF - 3033f03f643e8282d505c27afee70e6d8c498a2f authored almost 7 years ago
optimize: be quieter

github.com/ocrmypdf/OCRmyPDF - 72723e0bb5fc2c6a39fcc83bd44b65bff7302d5e authored almost 7 years ago
Trap writePNG error

github.com/ocrmypdf/OCRmyPDF - 2fb6ab39396bc171f7196caad8cea153e9557cf9 authored almost 7 years ago
Move optimize to new file

github.com/ocrmypdf/OCRmyPDF - 25c1c160b8754cea8284af19728955c41eb397ab authored almost 7 years ago
Parallelize pngquant

github.com/ocrmypdf/OCRmyPDF - 7e9289547128a9f91c2efe0147e00bdffb7f0182 authored almost 7 years ago
PNG palette: parse PDF string from leptonica instead

Seems better to accept whatever leptonica rather than make detailed
assumptions about how it enc...

github.com/ocrmypdf/OCRmyPDF - d291d4899154ffb710962125e750c576db46aa5d authored almost 7 years ago
Implement PNG palettization

github.com/ocrmypdf/OCRmyPDF - 0e6b8042b0adfd00d93aaa52eace74a2a81825d2 authored almost 7 years ago
Fix list table for tests/resources

[ci skip]

github.com/ocrmypdf/OCRmyPDF - 34c78a892ae8cb93f5a2ac8d10e442e4abb4a5a2 authored almost 7 years ago
Update Ubuntu 14.04 instructions

Closes #252

github.com/ocrmypdf/OCRmyPDF - 9d28879505f58eaab8c4471eb2d72ac03d889f09 authored almost 7 years ago
hocr: avoid division by zero

Issue #253 - PDF that produces the error is not available, but if font_width
is zero, chances ar...

github.com/ocrmypdf/OCRmyPDF - 2482296e2bc9065895350a1622c30c4c8827e0a8 authored almost 7 years ago
Try pngquant

github.com/ocrmypdf/OCRmyPDF - f755fb76ee8a1a9500a50ee2ef88fabe9e156d4f authored almost 7 years ago
Fix PDF/A validation error from setting /Predictor 0

github.com/ocrmypdf/OCRmyPDF - c61b5dcb62c4d3e8d7e5ddd1b02fa8d4c34b795a authored almost 7 years ago
Reinstate transcoding of PNG

github.com/ocrmypdf/OCRmyPDF - fae893b9d91022a2991e2b98c0c5bca65245e25b authored almost 7 years ago
Document return codes

github.com/ocrmypdf/OCRmyPDF - 10aadefd6a2ddd7fd64328511af36dc6d05feeb2 authored almost 7 years ago
Try reading compressed data directly to see if Leptonica will add predictor

Turns out it does not transcode at all in this case, so probably going
to revert to transcoding ...

github.com/ocrmypdf/OCRmyPDF - e75b6280fd8495d03e16249add9de97a5cf7e584 authored almost 7 years ago
Release L_COMP_DATA properly

github.com/ocrmypdf/OCRmyPDF - 8c4023165aec134222801469c0a3207352bcc653 authored almost 7 years ago
Deprecate Pix.read() behaving as an open function

github.com/ocrmypdf/OCRmyPDF - b7d403f106d9339819b40618629f9de30d9f573d authored almost 7 years ago
Use Leptonica to rewrite all PNGs with predictor

Leptonica does a better job of encoding them than Ghostscript, about -15%.
For a test file 450k ...

github.com/ocrmypdf/OCRmyPDF - b069de0caa6dc8260ebd0aad1d26767f4bab7f08 authored almost 7 years ago
Update branch with v6.1.4

github.com/ocrmypdf/OCRmyPDF - 136da74bfa762da2f80986690e81318db9355cff authored almost 7 years ago
Fix NameError 'ghostscript'

github.com/ocrmypdf/OCRmyPDF - 7fc897e6dc1b320dbc477c41d8412e467f6dce50 authored almost 7 years ago
Set Ghostscript -sColorConversionStrategy the way old/new versions expect

github.com/ocrmypdf/OCRmyPDF - 9b731d63b8c130be5306b5a23bfb3924a6431593 authored almost 7 years ago
v6.1.4 fix test suite regression with Ghostscript 9.23

github.com/ocrmypdf/OCRmyPDF - 10aa59f6749a12708d52a1034b4103b4a090bb74 authored almost 7 years ago
v6.1.4 release notes update

github.com/ocrmypdf/OCRmyPDF - 1f7837e7b18f1eb5a6f4089750db7d7f68c549ba authored almost 7 years ago
Update test cache to account for unpaper --layout none change

github.com/ocrmypdf/OCRmyPDF - ba0535e3fbb6b8541d810c5f1a8cb6999680c3f1 authored almost 7 years ago
tesseract_cache: don't reveal host system file paths in manifest file

github.com/ocrmypdf/OCRmyPDF - 49fa7f6b5cd7c16346aac21fcff0bab9353181db authored almost 7 years ago
v6.1.4 merge

github.com/ocrmypdf/OCRmyPDF - c95db246d46d8dbf9ca99096dcf0cff641180410 authored almost 7 years ago
docs: Update installation to reflect qpdf 7.0.0 requirement

github.com/ocrmypdf/OCRmyPDF - 1ba93371ce98024ffd327581fdfb66a26b9af698 authored almost 7 years ago
Travis: compile qpdf from source

The older version in Travis's Ubuntu 14.04 can't pass the test suite anymore.

github.com/ocrmypdf/OCRmyPDF - fedbbdb5756608d6c6e9524b924c976dbd6e044a authored almost 7 years ago
Fix setup.py syntax

github.com/ocrmypdf/OCRmyPDF - 85ebba72bc79258fa99e17c5fab2eee7f7fc9570 authored almost 7 years ago
setup: Blacklist Pillow 5.1.0 on macos

https://github.com/python-pillow/Pillow/issues/3068

github.com/ocrmypdf/OCRmyPDF - b6cd436d5d75a197a7fbd1a203247aa7742748dd authored almost 7 years ago
Travis: use setup.py for requirements, don't override with .txt

github.com/ocrmypdf/OCRmyPDF - ec170c7e1ed822e4dca9cd888768f8602361ea2a authored almost 7 years ago
optimize: use Leptonica to compact JPEGs

Pillow could do it too, but Leptonica is somewhat more PDF aware.

github.com/ocrmypdf/OCRmyPDF - f6399eb90f85c28581c2f81c8b74c93cc84d30e7 authored almost 7 years ago
Leptonica: add L_COMP_DATA compressed data manager

github.com/ocrmypdf/OCRmyPDF - 77f2448e59a8e9ba3feeecdbac498f08c55e132a authored almost 7 years ago
Release notes

github.com/ocrmypdf/OCRmyPDF - 3d69b46fcad59b19aa1373f4b7ac3d2a70b37f3f authored almost 7 years ago
Use defusedxml for XML parsing when reading XMP

github.com/ocrmypdf/OCRmyPDF - 4b6153ad18644b6ebcdd39e6096e23794e8909a3 authored almost 7 years ago
docs: expand ocr of image usage

github.com/ocrmypdf/OCRmyPDF - 75d37eb1035ec55339e2d05770c1469e410e12f5 authored almost 7 years ago
unpaper: close images on error paths

github.com/ocrmypdf/OCRmyPDF - 11b6f77df0fb3d193fa21ce4d7dabf38983e03a3 authored almost 7 years ago
get_version: repeat system error messages if the process exits with a signal

github.com/ocrmypdf/OCRmyPDF - db8b0319dd482a8ed233cda53d0f854f22a964a8 authored almost 7 years ago
JBIG2: refactor, don't recompress existing JBIG2

github.com/ocrmypdf/OCRmyPDF - c9dd33076645f3d7fc4d67c1bf3c2e7e7abba243 authored almost 7 years ago
JBIG2: Streams created in this manner are already indirect objects

github.com/ocrmypdf/OCRmyPDF - e40228102cd5014b733885e10f6dab598cb37b71 authored almost 7 years ago
Parallelize JBIG2 execution with thread pools

github.com/ocrmypdf/OCRmyPDF - 7889c6fb4c2a23b08820355611616a3da1d3fd99 authored almost 7 years ago
Fix JBIG2Globals included multiple times in output

github.com/ocrmypdf/OCRmyPDF - 6eb17731105be0121a7414154045994230014872 authored almost 7 years ago
Implement functional, single threaded optimize

Passes verapdf

github.com/ocrmypdf/OCRmyPDF - 1d25823746ee97af8f44799881c380bb61e346e2 authored almost 7 years ago
Add issue links to release notes

github.com/ocrmypdf/OCRmyPDF - d1d4f1e1983c7e99c44df2c68f70568fe9ef69de authored almost 7 years ago
Regroup three merge steps into a single step

All take the same inputs and deliver similar outputs, so it makes sense.

github.com/ocrmypdf/OCRmyPDF - 709c01c7a1baa6820d1310a8a6a12b2424516bef authored almost 7 years ago
Merge branch 'master' into feature/jbig2-2018

github.com/ocrmypdf/OCRmyPDF - 4a341c9034b4c8a4e747c151fd63767a4f4cfa07 authored almost 7 years ago
Update flowchart

[ci skip]

github.com/ocrmypdf/OCRmyPDF - be41ff6d5436c2433360b894e56766ecab49391f authored almost 7 years ago
Notes on relevant envvars, repology

github.com/ocrmypdf/OCRmyPDF - 1dbb6f1746c31c74cc53404b2646067b19f7da8c authored almost 7 years ago
Tell unpaper to use --layout none so it won't blank out multi column text

github.com/ocrmypdf/OCRmyPDF - 753e6274ab236e9bbb78d5e948d01e8262b1419d authored almost 7 years ago
v6.1.3 notes

github.com/ocrmypdf/OCRmyPDF - 7f462c618b579d192235e191a38b5b6864f78cb9 authored almost 7 years ago
Experimental add jbig2

It appears that fitz forces conversion of jbig2 to ccitt no matter what,
so pikepdf will be need...

github.com/ocrmypdf/OCRmyPDF - a95ffcdc46f49969b98774f1f35b9804e0de4a3e authored almost 7 years ago
Convert monochrome images to JBIG2

Awkwardly using fitz and pikepdf, transcode monochrome to CCITT.
This requires _OCRMYPDF_NO_FITZ...

github.com/ocrmypdf/OCRmyPDF - d8ac6e28ab1642a566c5767c853f2ba3f35744e5 authored almost 7 years ago
Warn about Python 3.5 page count issue

github.com/ocrmypdf/OCRmyPDF - 1b01d45dd246e90b0252b692fe84a33ad42e6816 authored almost 7 years ago
Fix creation date metadata lost from input

Closes #247

github.com/ocrmypdf/OCRmyPDF - 7a1cd39b21002b29dccffccab0afeb17743b3e30 authored almost 7 years ago
Don't depend on pytest-xdist in setup.cfg

github.com/ocrmypdf/OCRmyPDF - 1c1fd9616aead13a41d816f6380378e3f8a94e47 authored almost 7 years ago
remove addopts key from tool:pytest section of setup.cfg (#246)

The '-n' command line argument is not supported by recent pytest.

github.com/ocrmypdf/OCRmyPDF - 11e19e408544452db79e23fc057572dfb56065ab authored almost 7 years ago
Update installation.rst, further info on fitz

github.com/ocrmypdf/OCRmyPDF - 2a43f7322875660648b98b9949ed0fcda53bd86b authored almost 7 years ago
Dockerfile: use fitz

github.com/ocrmypdf/OCRmyPDF - b1d1310a754439a3e7511dd97dfa39dc4d34442a authored almost 7 years ago
Remove inaccurate statement from setup.py

github.com/ocrmypdf/OCRmyPDF - 0e7fa78e65faa6797109f3d2bd689a6640eaba8d authored almost 7 years ago
Change docs for fitz/PyMuPDF

github.com/ocrmypdf/OCRmyPDF - 4032570d9737f99fe3c9058b3b1d6abe9e5c914b authored almost 7 years ago
pipeline: refactoring, use with block for images

github.com/ocrmypdf/OCRmyPDF - 90644a301724f05f616a4ae6b5e98116c140be13 authored almost 7 years ago
Update copyrights

github.com/ocrmypdf/OCRmyPDF - 4f6bffb477ed26520be04158f7b068b10b6ff377 authored almost 7 years ago