Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched.
Collective - Host: opensource - https://opencollective.com/ocrmypdf - Code: https://github.com/jbarlow83/OCRmyPDF
Acrobat insists that PDF/A-1b should not have object streams.
Other programs like veraPDF disagr...
github.com/ocrmypdf/OCRmyPDF - a23c22b0e8b373cd421f11aa6df2ce9b81ed8621 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - dd1f5f7215ed6cd12fa1c9065a4226befc41e1ca authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 5e2206bae76fe4b98f353676fdb37d8092a305e4 authored almost 4 years ago by Dima Kuznetsov <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 079ee86d4354470616051abaf07acb48d54bd1a7 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 3692868004c2c4255603b3f75b4b1d872ffd1bec authored almost 4 years ago by James R. Barlow <[email protected]>
Page size fixes in commit b26749 did account for a "kept" rotation,
but not a corrected rotati...
github.com/ocrmypdf/OCRmyPDF - 8770fff96885dd16b2bc95e5815592bb18d7ef94 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 82de78b6b0ec35b038a9e9db560bd3e99f70d650 authored almost 4 years ago by James R. Barlow <[email protected]>
They're unlikely to be handled well by our recompressors. It seems
that JBIG2 cannot handle very...
github.com/ocrmypdf/OCRmyPDF - 2898879be7bdf4b526c3924626ea673aa6768a56 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 18e613657cac71e7c12b0a3aa950747a181299b4 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - a48ca556c7f568d3245096219c0a1c34be184803 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 9cba738b485fec161c50768c0eaa8aacc1783995 authored almost 4 years ago by James R. Barlow <[email protected]>
Should improve results in some situations where the initial content
stream is messy or not well-...
github.com/ocrmypdf/OCRmyPDF - bccf2f423f43b61425ccdc10d3001d8e2a0c278a authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 166de3086b333fc2625fda59ffcccb4c414bdb52 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 206c675df68bbca06a7358b6c7a137d5f27d3032 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 6c8f9223e9e3e18f95ccd9cd60c77e0b6a6d86d0 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 85c6a974ca4ef8335d6e413ad008d959535d951a authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - dccdcfaa913db3c4f4258927a806f7eb36a8ed36 authored almost 4 years ago by James R. Barlow <[email protected]>
So that we are not tied to tqdm.
github.com/ocrmypdf/OCRmyPDF - b1da09f141fe1c9af0aa6c7e3b4cc8b39b5c31ed authored almost 4 years ago by James R. Barlow <[email protected]>
For some reason JPEG optimization was not done in parallel, and was
perhaps never done in parall...
github.com/ocrmypdf/OCRmyPDF - a9ad805347e48df3cd6ed53be46bc3519b46a429 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 16bda74974df2802f44f421402ed969b3f119e3b authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - d274d88929d7d87b0e41313325ca56b1e4739b87 authored almost 4 years ago by James R. Barlow <[email protected]>
Faster, still produces PDF/A
github.com/ocrmypdf/OCRmyPDF - 327df5cbbc78ce160e877eb27bae603791fe397b authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 46d0632fe27f75e583e957edcc21beec63a2223e authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - ef1e7a814ec7578c81589fa5e34042f3f7a7e9ae authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 108472493762cb529b158e238432e8a1f1e44cbf authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - ecb0109d79fcbfc015a3a329e113aadc8605dac3 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 386cabff001dabfb8aa3e8644fc02ba6ce7ba083 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 3bd5054634446cec9fa2152e7dc6fa66ed2c4fa8 authored almost 4 years ago by James R. Barlow <[email protected]>
Now all tests pass, except for:
-tests that check the progress bar
-tests where xdist may or may...
github.com/ocrmypdf/OCRmyPDF - 6083b4f0a7e71c7547136aaa91b14c0a0309e87e authored almost 4 years ago by James R. Barlow <[email protected]>
It works
github.com/ocrmypdf/OCRmyPDF - 1a3ce59476df8f77a9a7fb297963358833516b1d authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - c6a2716cdbe85ca94cc69216ba94b5b04f29cfcf authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - c395436ba30bc78d529c147bb75aaf18c741fe92 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 8d23d0b4414ac955483d433edc568c6b7b0c7804 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 7bccb8c74844af7b1b7f1376a2abc3bdce7bb466 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 5545bae76f986f8d965c0aa0c4787c71e77d8b30 authored almost 4 years ago by James R. Barlow <[email protected]>
For API sanity and to communicate expectations. One progress pool at
a time is plenty of complex...
We can cut down on the use of global variables and save opening
an extra copy of the Pdf when th...
However, this may not be the best idea because it involves global
state that could be overridden...
github.com/ocrmypdf/OCRmyPDF - 34e564cd7de9847c63093b8d3601968341b22148 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 504d5776d2118bfea6bddcc2b7407604378c43fb authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - ee23976858418a464956fcd48c719d18f36816d4 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - f559316881befc67d389530599d0bbd10f31019e authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 084610c242be607d5cf586146f0829c71b3a7951 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 9ff627472b24dfab4ef33a4cc06faccc98298dc3 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 956310d1ecc69c9eca121560aa35533daf0c55d6 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 1a982da442a91bea8a50b0d67b4de60820be1b9e authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 4879a1f0ded5e1e4cbeb98e6f6f1b4f4943ee4be authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - ce66bcc9c84b224bb557a8265f6332b317771ab5 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 1ebf3144afd0ba454551e043d71523fb7f46182f authored almost 4 years ago by James R. Barlow <[email protected]>
Might fix issue #709, Apple silicon support.
github.com/ocrmypdf/OCRmyPDF - 7a1cccbc4e098762fb0a6303cf5193b94a503985 authored almost 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - ebacff1b3915435365b9391e768f4547b558081d authored almost 4 years ago by James R. Barlow <[email protected]>
Since we can't directly test it
github.com/ocrmypdf/OCRmyPDF - c7c447be66cb51b9a389901e66a56d738512fad8 authored almost 4 years ago by James R. Barlow <[email protected]>
Previously if we found vectors of any sort on a page, we would bump
the DPI up to 400. We did no...
Previously we produced a raster image, then multiplied image width
by DPI to get the page size. ...
github.com/ocrmypdf/OCRmyPDF - f687180ecc3561ee129ee4991755f7d6b3bea02f authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 6f4b38b103d8481986bf870257185f5875f775ab authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - d32324859ca54cc0ef225f3724ea3296438442c7 authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 48222b87b5eafbe054178501b72a68436b7656a4 authored about 4 years ago by James R. Barlow <[email protected]>
Co-authored-by: Jonas Winkler <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 62e5edc72bfbff8640f1589355a89c1d30d46b7c authored about 4 years ago by Jonas Winkler <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 2846d46bb83ea846fb71dfafab46ef03067943ec authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 47ef1914d491f9a6652c554aa6c93099410f2439 authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - df157552f3040773eaac84a2d6dee807e75a1b7e authored about 4 years ago by James R. Barlow <[email protected]>
Our method of getting data from pdfminer would silently consume a StopIteration
if pdfminer retu...
github.com/ocrmypdf/OCRmyPDF - 1e80d412fa983e96d30128770fbaba366132cc58 authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - df6e1062033003ca3dcf5463b95d04b74d00481b authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - bd0f00586147795993629c8d362134213fae77ff authored about 4 years ago by James R. Barlow <[email protected]>
Looks like a packaging error, choco complains of bad hashes.
github.com/ocrmypdf/OCRmyPDF - 6ba4b7b3f3a2dbbec9f974df97f551d23fff025d authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 2c11349ee83475d85a21ab53cbfa972cec55e5bb authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - b0afef09efe23124adbdef95e625739719286f14 authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 72fa347c38319613516dc0b6a2004fbfc0367b39 authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 96d68c2413672dd6b37c2aef8b5365b75d76cd28 authored about 4 years ago by James R. Barlow <[email protected]>
We were not actually checking if functions we patched were called when
expected.
There is a small typo in docs/installation.rst.
Should read `installed` rather than `instsall...
github.com/ocrmypdf/OCRmyPDF - dc06990e5d67f4b89e4f342437384c3ad41925ae authored about 4 years ago by Tim Gates <[email protected]>
Since we currently log all of a process's output at debug it's
redundant to log this separate me...
github.com/ocrmypdf/OCRmyPDF - 81602cf420e8e79609097c39b62d85cdf5828604 authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 607e2d7e8172da0534744ff89b2c7a930d9eb794 authored about 4 years ago by James R. Barlow <[email protected]>
Closes #701
github.com/ocrmypdf/OCRmyPDF - b01d9e07e80435f755429c978018b850015ee60d authored about 4 years ago by James R. Barlow <[email protected]>
Closes #702
github.com/ocrmypdf/OCRmyPDF - 91db94cf2ec166b7a30bdf01391f2ed5c771f84c authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 416df803d46e39235fd05d61ab0a4e34f57efd67 authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 037b96ca16217bb91f30e0f838bdab3eee74c5d8 authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - bb258fc99c3d3676977d708dded660d3802bc283 authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 4b8ccbe8cb76480b03ab42b0c61814acd1c59a60 authored about 4 years ago by James R. Barlow <[email protected]>
Accept user-contributed fix. Not testable.
Close #690.
github.com/ocrmypdf/OCRmyPDF - ab1ff3331b3d9c42ec4b8920fedac5834f7e10aa authored about 4 years ago by James R. Barlow <[email protected]>
Closes #686
github.com/ocrmypdf/OCRmyPDF - 3675ae918cf4e91b4fed9a237350c2c0f7a5d1aa authored about 4 years ago by James R. Barlow <[email protected]>
This reverts commit ad202693b3dcf905e180a665a54f349d00d8dfba.
Temporary folder prefix was actual...
github.com/ocrmypdf/OCRmyPDF - add64e4fa2e8887be42759b8117b0feec62aa320 authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 7fe2954edef586c93cb9a1b9aaef666b10d96d65 authored about 4 years ago by James R. Barlow <[email protected]>
Remove a change that was pushed back to a future release.
github.com/ocrmypdf/OCRmyPDF - ad202693b3dcf905e180a665a54f349d00d8dfba authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 594ef83551be9233c6fe12a4b958ed793c7b51ba authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 78b71618c1ac8244f2fd96bb37d185f32c69186f authored about 4 years ago by James R. Barlow <[email protected]>
Fixes #692
github.com/ocrmypdf/OCRmyPDF - b8aa89e1ece7fbd58910e390057415dab96fea62 authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 156d5d9a9c2739a4c5b1837a99f7ec5efb5b818b authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - b4c1f66bc1d9b39811644f33ce1e3da80c0d0a53 authored about 4 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - d2908640c61480011091ddd20e1245e72507f038 authored about 4 years ago by James R. Barlow <[email protected]>