Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched.
Collective - Host: opensource - https://opencollective.com/ocrmypdf - Code: https://github.com/jbarlow83/OCRmyPDF

Fixed setup.py syntax error

github.com/ocrmypdf/OCRmyPDF - 158f902c3b59ae9c399db4a7334d477dc0f391b0 authored almost 7 years ago
v6.1.2: add license to wheels, depend on defusedxml

github.com/ocrmypdf/OCRmyPDF - 6dc25ddc6e435517cacb06b4390cf69566620d7b authored almost 7 years ago
Remove PyMuPDF 1.12.4 shim

github.com/ocrmypdf/OCRmyPDF - ace439910e71d6ea721126f8bb595c9092e986da authored almost 7 years ago
Add envvar to ease testing without PyMuPDF

github.com/ocrmypdf/OCRmyPDF - 7f038568de484296324f161c134bc0499b2121c5 authored almost 7 years ago
Test macos without fitz too

github.com/ocrmypdf/OCRmyPDF - af777c0b6aa51b80cd42d607f7a66ffdb74b656a authored almost 7 years ago
v6.1.1 release notes

Better get the last one out

github.com/ocrmypdf/OCRmyPDF - fc299032a4da0e9ff5b60b4c199234e9f953a1d2 authored almost 7 years ago
Fix text reported as found on all pages when PyMuPDF is not available

github.com/ocrmypdf/OCRmyPDF - e0f3f07907d5eeff51688a127454c3a8fee8a7fe authored almost 7 years ago
pdfa: codecs.encode -> hexlify (simpler)

github.com/ocrmypdf/OCRmyPDF - b36df9cf9e3226479c1f682590c57f78f36977fc authored almost 7 years ago
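
The commit above swaps codecs.encode for hexlify. As a minimal sketch of why the two are interchangeable for hex-encoding bytes, the snippet below compares both standard-library calls. The sample payload is made up for illustration and is not taken from the OCRmyPDF source.

```python
import binascii
import codecs

# Hypothetical sample payload; not taken from the OCRmyPDF code base.
payload = b"OCRmyPDF"

# Both calls return the lowercase hex representation as a bytes object.
via_codecs = codecs.encode(payload, "hex")
via_hexlify = binascii.hexlify(payload)
```

hexlify avoids the codec lookup by name, which is presumably what "simpler" refers to.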
Travis: Should test 3.6 Linux without fitz too

github.com/ocrmypdf/OCRmyPDF - 81c3f780d4cb4978e968a1b2dba39631d98be536 authored almost 7 years ago
Travis: don't upload to legacy PyPI anymore, it will stop working soon

github.com/ocrmypdf/OCRmyPDF - b51efdd3e3ab629a977cf6f574bbf608f18a038f authored almost 7 years ago
Update release notes

github.com/ocrmypdf/OCRmyPDF - 610b769df9f28e24e33056d9572dd0276ee3d0a7 authored almost 7 years ago
Workaround fitz not escaping parentheses

Closes #239

github.com/ocrmypdf/OCRmyPDF - 527f4d01018a0752b547c01bb06570dde8c729d6 authored almost 7 years ago
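
The commit message does not show what the workaround looks like. As a generic sketch only, assuming the problem concerns text placed inside a PDF literal string (which is delimited by parentheses), escaping could be done as follows. The function name escape_pdf_literal is hypothetical.

```python
def escape_pdf_literal(text):
    """Backslash-escape characters that are special inside a PDF literal string.

    Hypothetical helper for illustration; not the OCRmyPDF implementation.
    """
    # Escape the backslash first so the escapes added below are not doubled.
    return (
        text.replace("\\", "\\\\")
        .replace("(", "\\(")
        .replace(")", "\\)")
    )
```

For example, a bookmark title such as "Chapter (1)" would be written with a backslash before each parenthesis.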
test_bookmarks_preserved won't raise ImportError any more

Due to trapping this in ocrmypdf.lib

github.com/ocrmypdf/OCRmyPDF - 8d9be43c601b1dea51f3ba33216603b2dcba131a authored almost 7 years ago
Add new argument --skip-repair to skip the repair step

github.com/ocrmypdf/OCRmyPDF - 40ef4f0bbe7b22f4d86ccc33f6050151d899d977 authored almost 7 years ago
More debug messages on repair; update notes

github.com/ocrmypdf/OCRmyPDF - d0271d5049aa18311bd544b089c6e647069ff10c authored almost 7 years ago
Refactor fitz ImportError trap

github.com/ocrmypdf/OCRmyPDF - 5becfcf8ea2fab011e5d0ec3cc42a1e10cb3a24e authored almost 7 years ago
Fix regression: PDF/A broken without fitz

github.com/ocrmypdf/OCRmyPDF - 112e8d6c181816c41e15deb91e2bf6988c1f9f0f authored almost 7 years ago
Add PyMuPDF to preamble

github.com/ocrmypdf/OCRmyPDF - 1d8d49a01d2b6037845cfd1354101100a5da325a authored almost 7 years ago
Add warning for large file size increases

github.com/ocrmypdf/OCRmyPDF - 505015568560b3f28580dc9f6d937fd08e49d31c authored almost 7 years ago
Merge branch 'optional-fitz'

github.com/ocrmypdf/OCRmyPDF - a9bd494cc016fb2583a1ec5c89ccf835b4c9030c authored almost 7 years ago
Add _naive_find_text to search for text when fitz is not available

github.com/ocrmypdf/OCRmyPDF - 6a4df78bc060b0fcedc713d400fda788816f8242 authored almost 7 years ago
Fix test_main missing file_claims_pdfa

github.com/ocrmypdf/OCRmyPDF - 530eae38988e5219a91581b5b10d3857a58f7697 authored almost 7 years ago
Make fitz optional

github.com/ocrmypdf/OCRmyPDF - 3e444f6a9003a6a282ea0a1c67297f0eaba3288a authored almost 7 years ago
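
Several commits in this range make fitz (the import name of PyMuPDF) optional. A common way to do this in Python is to trap ImportError at import time and branch on the result later. The sketch below shows that general pattern under stated assumptions: optional_import and text_detection_backend are hypothetical names, not functions from OCRmyPDF.

```python
import importlib


def optional_import(name):
    """Return the named module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# "fitz" may or may not be installed on the machine running this code.
fitz = optional_import("fitz")


def text_detection_backend():
    """Report which (hypothetical) backend a caller would pick."""
    return "fitz" if fitz is not None else "fallback"
```

Trapping the error in one place keeps the rest of the code free of repeated try/except blocks.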
Fix table of contents not preserved in PDF/A

github.com/ocrmypdf/OCRmyPDF - 45dbff64017eb3f7c218a9aac000cb15e035bf80 authored almost 7 years ago
Move metadata tests to new test_metadata

github.com/ocrmypdf/OCRmyPDF - bc56b8e0587fbd6ad93397fbf4728d38a3aad200 authored almost 7 years ago
v6.0.1 start release notes

github.com/ocrmypdf/OCRmyPDF - d86e315c4855a9d46488ee4bb3c1539258a8a912 authored almost 7 years ago
Remove deprecated --pdf-renderer tess4, which was renamed to sandwich

Should have been cut in v6.0.0

github.com/ocrmypdf/OCRmyPDF - 746969207a531973f7534c87d465833a8302ca9f authored almost 7 years ago
tesseract: Fix FileExistsError if output file was created at timeout

github.com/ocrmypdf/OCRmyPDF - 1caebaefb55a6c30505d5d2fa47b5c86e6509f77 authored almost 7 years ago
Fix typo in release notes

github.com/ocrmypdf/OCRmyPDF - 2d10fdcf0f1c51e160e2378b12c4230e7f3f5e88 authored almost 7 years ago
Note other web frontends

github.com/ocrmypdf/OCRmyPDF - 355ec70a801f9641f1e91edfc2bcd121571e0683 authored almost 7 years ago
Remove pageinfo.py which release notes said was gone for v6

github.com/ocrmypdf/OCRmyPDF - a2f499de01cec31773b0ba524495c116b0923620 authored almost 7 years ago
Remove Tesseract 4 message

github.com/ocrmypdf/OCRmyPDF - f4bca89722c8a2955428fb27994bb6c7b296d508 authored almost 7 years ago
v6.0.0 release

github.com/ocrmypdf/OCRmyPDF - 9fbc69df3ff2b8caa3b62b22166e5e16b3e9a761 authored almost 7 years ago
conftest: py3.5 path issue

github.com/ocrmypdf/OCRmyPDF - 230d3012688dcca4920d61457cb7573db5923d30 authored almost 7 years ago
Travis: don't cache tests/cache anymore, you get it with git

github.com/ocrmypdf/OCRmyPDF - 1ce7b02d9423c4e0c60a33e456720f0d1e25e645 authored almost 7 years ago
tess cache: fix tess3 error for -psm instead of --psm

github.com/ocrmypdf/OCRmyPDF - a2d00f5f1da6c910826b45b9e735b14de2e162e2 authored almost 7 years ago
Fix PyMuPDF version for Travis

github.com/ocrmypdf/OCRmyPDF - f68eaa3b46a8dbaf6e2f5f7ae5f5a3adcda96668 authored almost 7 years ago
Tweak Manifest and .travis once more

Travis "do_not_include" moving around no longer needed, thankfully.
Manifest needed LICENSE.

github.com/ocrmypdf/OCRmyPDF - 0199ab220ea35bd5bcd00fe84a2b19de95a5abec authored almost 7 years ago
Update release notes

github.com/ocrmypdf/OCRmyPDF - 656045610aad8cc5cb4c8d666db06810fec86531 authored almost 7 years ago
test cache: fix Path + str error

github.com/ocrmypdf/OCRmyPDF - 8c1c61f20788bc8578cd9c529af505b848fa7c3f authored almost 7 years ago
Move ocrmypdf to src/ocrmypdf

github.com/ocrmypdf/OCRmyPDF - af085b79ddcc456b4742b28288429139174c90c2 authored almost 7 years ago
test cache: use .bin extension, fix .gitignore .gitattributes

github.com/ocrmypdf/OCRmyPDF - 77476965ae2519af28eb1c80dfdafb0a218d1d8c authored almost 7 years ago
Update manifest.in

github.com/ocrmypdf/OCRmyPDF - 961c1365f990c2eca53ca93e61864ee0ec821379 authored almost 7 years ago
Add test cache

github.com/ocrmypdf/OCRmyPDF - ca5151404607ec0f646edc26cbd0fa2339ed7bdd authored almost 7 years ago
Fix test_testonly_pdf generating an output file in pwd

github.com/ocrmypdf/OCRmyPDF - 8975b72a014535339a9f6d1105c3904a102d03fc authored almost 7 years ago
Add missing fixture to test_unpaper

github.com/ocrmypdf/OCRmyPDF - 874ec6a87f2f8c2bab3ec5d842f470681cd8142d authored almost 7 years ago
spoof: Allow tesseract cache to share cache

Previous incarnation was only suitable for generating a local cache
where the suite was executed...

github.com/ocrmypdf/OCRmyPDF - 909eaeeeadf3893be253a1a31f14eb365068a2ad authored almost 7 years ago
Tests: more cleanup

github.com/ocrmypdf/OCRmyPDF - c138161fae29b54db273a85acc9bb3762929cf2e authored almost 7 years ago
Refactor out unpaper-specific tests

github.com/ocrmypdf/OCRmyPDF - e48590d66cedd1f97e6245c4b112dee6f7b62c81 authored almost 7 years ago
Review some skipped tests to make sure reasons still valid

github.com/ocrmypdf/OCRmyPDF - 5b1c8541fcb7fd1f83d98da3d344d6c9908e57ed authored almost 7 years ago
Remove the OCRMYPDF_program environment variables

Really, this was just replicating the functionality of the PATH
environment variable, and users ...

github.com/ocrmypdf/OCRmyPDF - e5e011021bd7966364c6150dd0d9cdfb26414f6d authored almost 7 years ago
Remove the OCRMYPDF_program environment variables

Really, this was just replicating the functionality of the PATH
environment variable, and users ...

github.com/ocrmypdf/OCRmyPDF - 11d74dea09d01af9f76aff669693f1dd1d9ca73c authored almost 7 years ago
Update requirements

github.com/ocrmypdf/OCRmyPDF - cbdf9c88c5710bc2feced3d1b1a46ed76e6d6ee0 authored almost 7 years ago
setup: skip 1.12.4.1 since it does not provide wheels

github.com/ocrmypdf/OCRmyPDF - 46601b13507dcb0567b4edbb9d6993f14148f093 authored almost 7 years ago
v6.0.0 notes, build machinery changes

github.com/ocrmypdf/OCRmyPDF - 6f1a40b2caf364eef73b99f429686d153b7ae75f authored almost 7 years ago
Update documentation license info

github.com/ocrmypdf/OCRmyPDF - a2b1f54eb2ca13d30df0a4a308f965ee4c23765e authored almost 7 years ago
Add license notice to all files

Source files to GPL3

Exceptions:
-tests/spoof/* to MIT
-hocrtransform.py
-_unicodefun.py

Test ...

github.com/ocrmypdf/OCRmyPDF - 675601657282e1620720c8dc1517b60dc21c48da authored almost 7 years ago
pipeline: make removal of merge_qpdf more explicit

github.com/ocrmypdf/OCRmyPDF - f42123afc3c8f1efd74e5d2ed7680ecaf373272a authored almost 7 years ago
pipeline: Merge branch 'feature/mumerge' into test

Replaces qpdf page merging

github.com/ocrmypdf/OCRmyPDF - 1425ffd274fe36df4fe3373344cc698049cf688d authored almost 7 years ago
Fix regressions after --skip-text improvements

github.com/ocrmypdf/OCRmyPDF - d700154e0e25aebb4ee264f1c7b86600eb81213a authored almost 7 years ago
Add PyMuPDF and use to detect text on pages

github.com/ocrmypdf/OCRmyPDF - efecf42566f333567ca8bde4f6bd1b3840ef8ace authored almost 7 years ago
mumerge: fix regressions

github.com/ocrmypdf/OCRmyPDF - 74bdfc07fb9c9e9d6e63f55fc44c7e06f960f127 authored almost 7 years ago
Fix text/image files not closed in combine_layers

github.com/ocrmypdf/OCRmyPDF - 376dfdba1c07d1a64554a9e697b7892d6bf085a7 authored almost 7 years ago
Try out pymupdf merging

With garbage collection it reduces waste on the worst case file.
That's nice. 1 MB -> 105 MB -> ...

github.com/ocrmypdf/OCRmyPDF - 3795d6720f88692101ba13c537776b79283b4d0a authored almost 7 years ago
Remove duplication between page merge functions

github.com/ocrmypdf/OCRmyPDF - 537aaf56d763463fe1f2cd763dd3c99c46db7888 authored almost 7 years ago
Merge branch 'feature/faster-split'

github.com/ocrmypdf/OCRmyPDF - 34d51b5d3d205b9e845d775a81eaeef61b989afb authored almost 7 years ago
Move available_cpu_count to helpers

github.com/ocrmypdf/OCRmyPDF - 4f1f3b9b517440e45f46d81358dfe9fed251d3ed authored almost 7 years ago
Optimize page splitting by multiprocessing

Previously page splitting occurred in a single process because it was
not believed to affect per...

github.com/ocrmypdf/OCRmyPDF - dea8fcfb5b9842aa9105de6fa0c46926fcfe4703 authored almost 7 years ago
Document some instances of 0 vs 1-based page numbering, import cleanup

github.com/ocrmypdf/OCRmyPDF - dfeb8812adf85448100144e38604a28378dde0c5 authored almost 7 years ago
Travis: avoid using set -e since it interferes with Travis

https://github.com/travis-ci/docs-travis-ci-com/issues/1672

github.com/ocrmypdf/OCRmyPDF - 63e2b4273a59c2b3d9c6b3f691148def90a2ab1d authored almost 7 years ago
Merge commit '9e2105e08d5fc765dbf636d108809bb66ab562a5'

github.com/ocrmypdf/OCRmyPDF - 5790dbc085167f69ee9fc370a1ee64e74a2333e1 authored almost 7 years ago
Update readme shields

Drop Docker Hub for now, add homebrew

github.com/ocrmypdf/OCRmyPDF - 9e2105e08d5fc765dbf636d108809bb66ab562a5 authored almost 7 years ago
Travis: don't trigger Docker Hub anymore

Docker Cloud is set up to build on pushes to master and tagged releases.
Hopefully that will wor...

github.com/ocrmypdf/OCRmyPDF - 22582bbd1c35d90772261ccaed988e8630fdbb26 authored almost 7 years ago
Solve text detection issue with PyMuPDF

github.com/ocrmypdf/OCRmyPDF - e5f27b7a122ffc7427e2ce2a9187fbf7592b004f authored almost 7 years ago
Tweak release notes

github.com/ocrmypdf/OCRmyPDF - e88ec9822badccbce8ab5eb586f8394ccf36c8f1 authored almost 7 years ago
Not ending Py3.5 support just yet

github.com/ocrmypdf/OCRmyPDF - 5ffd2f5c966017b7842b5fbbc1a38796e77f7b55 authored almost 7 years ago
Update release notes for v5.7.0

github.com/ocrmypdf/OCRmyPDF - 11fdb4c5d84df67b1989d4893bc7b95727838c5e authored almost 7 years ago
Merge better-hocr

github.com/ocrmypdf/OCRmyPDF - 319aff6d09cfec42f58312f7c32c1ac94a6d60cd authored almost 7 years ago
hocr: simplify some math expressions and add comments

github.com/ocrmypdf/OCRmyPDF - a614fa3400a2f881ecc495eaa9fd592f8420336c authored almost 7 years ago
Fix typos in advanced.rst (#228)

github.com/ocrmypdf/OCRmyPDF - 8d691391acf7a776c8115dad91fe7fbc000e7825 authored almost 7 years ago
hocr: Make interword spaces default and non-optional for hocr

Update documentation to match.

github.com/ocrmypdf/OCRmyPDF - 0089a84c9421ed839769ac3a855ad6ba449f73a3 authored almost 7 years ago
hocr: Remove baseline dashes

github.com/ocrmypdf/OCRmyPDF - 90676e1c6a29feca9fa7e906faeb00bb39de07a1 authored almost 7 years ago
Some cleanup and variable renaming

github.com/ocrmypdf/OCRmyPDF - 062901be43c2844480527e4fec6eb65c571b1a0f authored almost 7 years ago
Force Tesseract 4 to be single threaded

Gives better performance (throughput basis) than the existing solution
and scales better on powe...

github.com/ocrmypdf/OCRmyPDF - 6d7ee987213990d610de423f7fc64d400f522cc3 authored almost 7 years ago
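
The truncated message does not say how the thread count is limited. One common approach for programs built on OpenMP is to set the standard OMP_THREAD_LIMIT environment variable for the child process. Whether this commit uses that mechanism is an assumption, and single_threaded_env is a hypothetical helper name.

```python
import os


def single_threaded_env(base=None):
    """Return a copy of the environment with OpenMP limited to one thread."""
    env = dict(os.environ if base is None else base)
    # OMP_THREAD_LIMIT is a standard OpenMP variable; using it here is an assumption.
    env["OMP_THREAD_LIMIT"] = "1"
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run when launching a worker, so that parallelism comes from running several single-threaded processes instead.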
v5.6.3 notes

github.com/ocrmypdf/OCRmyPDF - fc0800ed5d770d4587f188f7d508b55c9b1059ae authored almost 7 years ago
v5.6.2 notes

github.com/ocrmypdf/OCRmyPDF - f4e3a0e5b268a4a3d9afea2ef730b99e2eef60a2 authored almost 7 years ago
Suppress debug message when merging large files

github.com/ocrmypdf/OCRmyPDF - d631c80024c3ba8c8e5e62933b47a8a444bba482 authored almost 7 years ago
Suppress spurious debug message in --output-type pdf

github.com/ocrmypdf/OCRmyPDF - f1f0033875a418ccc8794a37ecb76c18737d10da authored almost 7 years ago
v5.6.1 notes

github.com/ocrmypdf/OCRmyPDF - 84d120e850d6d7bb48f5d2ede4003f4901dc39b8 authored almost 7 years ago
Skip one test that fails for qpdf 8.0.[0,1], due to qpdf regression

github.com/ocrmypdf/OCRmyPDF - 8159cc6b880da0a4773e0f19a84a11ca433fe283 authored almost 7 years ago
hocr: account for baseline offset to position text more accurately

github.com/ocrmypdf/OCRmyPDF - 995f8c106b7341a2e0ce1070be52249eef0b070b authored almost 7 years ago
hocr: account for skewed baseline

github.com/ocrmypdf/OCRmyPDF - 7cc104b138d657df2a58c4c8414183a7c9c458fb authored almost 7 years ago
hocr: Refactor use of text object

We don't need to declare the font on each word.

No improvement for removing trailing space or a...

github.com/ocrmypdf/OCRmyPDF - 4986afca282946f741a58c3ad75278072f10ccef authored almost 7 years ago
hocr: adjust text cursor with relative moves

github.com/ocrmypdf/OCRmyPDF - b4d66650bd89e55b1d87a488c5c27ff83734b4b8 authored almost 7 years ago
hocr: add baseline function, hocr doc link

github.com/ocrmypdf/OCRmyPDF - 0e7a4deaec24d2d1bab8b883d43dc431748fc39f authored almost 7 years ago
Suppress spurious debug message in --output-type pdf

github.com/ocrmypdf/OCRmyPDF - 04c54a7c315d672f0f1af1d15e35d866b3bf857e authored almost 7 years ago
hocr: Make words on line use the line height

Seems to improve the behavior and appearance
of selected text a fair bit.

github.com/ocrmypdf/OCRmyPDF - 2b6004a82bec07068ddf7e9a197eab08c7d2bab5 authored almost 7 years ago
hocr: refactor/improve PEP8 a bit

github.com/ocrmypdf/OCRmyPDF - b3a7299a623607d02353aca86e6b1c91c8836e97 authored almost 7 years ago