Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched.
Collective - Host: opensource - https://opencollective.com/ocrmypdf - Code: https://github.com/jbarlow83/OCRmyPDF
github.com/ocrmypdf/OCRmyPDF - e70387b1af33e5b6a5a4c5c86a4b1223236ad462 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 44f47fba216bf7927897d5bdfc0d9fdd1d946cdf authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 02584094a1bcfcc29002ac2a31f8303ef643de30 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 91d715ac934d0f8c357b335678eea95f26b0c298 authored over 8 years ago by James R. Barlow <[email protected]>
Should extend test for other Asian languages
github.com/ocrmypdf/OCRmyPDF - 35addb8a33485986766e80b1580a023a4bfbdbd8 authored over 8 years ago by James R. Barlow <[email protected]>
I tried "qpdf merge + PyPDF2 metadata patching" first. The problem is
that PyPDF2 produces a 1.3...
github.com/ocrmypdf/OCRmyPDF - 12575d594a0fa9468a892feb52473a0cd588d5e7 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 0746083301a7d074525a3fa2a970c732cb0c65dd authored over 8 years ago by James R. Barlow <[email protected]>
All but one tests pass, test_input_file_not_a_pdf
Not sure if PyPDF2 metadata generation will m...
github.com/ocrmypdf/OCRmyPDF - 5c99acf6d12997e337d582f5517c66038dd62a2f authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 2b10df7b7449d0e08df457add75f4603f1f3dc6b authored over 8 years ago by James R. Barlow <[email protected]>
Tests mostly passing. For the moment this is the new default.
Although PyPDF2 produces a PDF-1....
github.com/ocrmypdf/OCRmyPDF - ebe68de4ff9bd033dd7fe793b334b32c77fc0665 authored over 8 years ago by James R. Barlow <[email protected]>
Does not copy /Catalog metadata, but otherwise functional
github.com/ocrmypdf/OCRmyPDF - b17c6a146d18ff22d095b575842d32a05493508d authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 46d837c86652514fceaa969869e677da54aeafa1 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 24856b61e41ab06cceee5f042c694827c0dae5b4 authored over 8 years ago by James R. Barlow <[email protected]>
Sadly the Python developers are removing this script
github.com/ocrmypdf/OCRmyPDF - 8d0c6ff616d8d3f5b4ed1cbab6b445f39a5543f3 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 0b24f971cd7f38984696d2eccc9ea9d348f88473 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - bc5d3824bd0c4893e8d37e3ccbd2cac255cee58e authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 43569837071ffc8dfb4a07fe07a892cff44ae4fb authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 2414b79ee61c1352873578f36544e5e6a56f2a88 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 968e1546f01dd3ec0d17bf6145599dff8bbb21c5 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 48213c9c3f8e91295566979a2c913ce71f75829b authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - f385772d212fb6eb7fb7e7b233fb3a10b3eac50a authored over 8 years ago by James R. Barlow <[email protected]>
It seems that ruffus sometimes decides to send a ['inputfile.pdf']
instead of a bare string.
github.com/ocrmypdf/OCRmyPDF - 7b72ffec4f8ed1243c222fe83a16a679e8fb3a79 authored over 8 years ago by James R. Barlow <[email protected]>
First cut.
May have broken ruffus errors again too.
github.com/ocrmypdf/OCRmyPDF - 757f6826dca355c30ce0e081299eebca5dec0e47 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 5df83a0d30c60f5cb57229b2c5e406bb5cfe1ed1 authored over 8 years ago by James R. Barlow <[email protected]>
It's a good habit to ensure any iterator test is explicit about
allowing or disallowing strings.
github.com/ocrmypdf/OCRmyPDF - 0dfceedcfb35a373f635f9d8d2748a410a79323c authored over 8 years ago by James R. Barlow <[email protected]>
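The commit notes above mention that ruffus sometimes passes ['inputfile.pdf'] instead of a bare string, and that an iterator test should be explicit about allowing or disallowing strings. A minimal Python sketch of that idea follows. The helper name as_path_list is hypothetical and is not taken from the OCRmyPDF source. It only illustrates the general pattern of checking for strings before checking for iterables.

```python
from collections.abc import Iterable


def as_path_list(value):
    """Normalize a bare string or an iterable of strings to a list.

    Hypothetical helper for illustration; not part of OCRmyPDF or ruffus.
    """
    # A str is itself iterable, so it must be tested for first and
    # explicitly; otherwise 'inputfile.pdf' would be split into characters.
    if isinstance(value, (str, bytes)):
        return [value]
    if isinstance(value, Iterable):
        return list(value)
    raise TypeError(
        f"expected a string or an iterable, got {type(value).__name__}"
    )
```

With this check in place, both as_path_list('inputfile.pdf') and as_path_list(['inputfile.pdf']) yield the same one-element list, so downstream code does not need to care which form the caller sent.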
The build is #122
https://travis-ci.org/jbarlow83/OCRmyPDF/builds/148255615
Errors seem to be r...
github.com/ocrmypdf/OCRmyPDF - 2c30f4bfc5357b54f5eb1d850e8339beea733eaf authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 9e7fb52b4798e1e098a18b57598b3950bc8fe8d5 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - bb5fd38e3870d2c253394a8dba882c35c4a75bab authored over 8 years ago by James R. Barlow <[email protected]>
This removes some backports for packages that Ubuntu trusty offers but
for which Ubuntu precise ...
This test is exercised by page 4 of multipage.pdf. If all images are
JPEGs, and one of deskew/cl...
Tesseract renderer not immediately fixable.
github.com/ocrmypdf/OCRmyPDF - 8f77576dc4558f31e4bbe193c4685a36323939ab authored over 8 years ago by James R. Barlow <[email protected]>
Some called functions are particular about the data format of DPI and
don't like to deal with th...
Useful for people who want to reprocess text.
This also requires --oversample because DPI is un...
github.com/ocrmypdf/OCRmyPDF - 16e4d342d2d18e88f36a1a12e679380655cdc577 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 8458a51860c18ecae7fdaba0b0c15387814f8421 authored over 8 years ago by James R. Barlow <[email protected]>
-dSAFER does not work when rendering PDF/A, because that needs to load
the ICC file, and -dSAFER...
github.com/ocrmypdf/OCRmyPDF - 514efa36fcc2f79ae173f429cb208a63ae968f5b authored over 8 years ago by jbarlow83 <[email protected]>
github.com/ocrmypdf/OCRmyPDF - bd48f40d3d049d605aa571fdddd9be2a070afac2 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - c02dbc809ab0cbfbf16c0b6485eef7335979255f authored over 8 years ago by James R. Barlow <[email protected]>
Issue #79.
User submitted PDF with ICC profile attached to the monochrome image
in the input fil...
github.com/ocrmypdf/OCRmyPDF - 68cf9cbd87c188823027f9d1bfe9029017e7281f authored over 8 years ago by jbarlow83 <[email protected]>
github.com/ocrmypdf/OCRmyPDF - c9b2540d9d69c3cffd92d4881c6b6a4aaff53561 authored over 8 years ago by jbarlow83 <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 1bacf35a2c84fe57154d39d64852d522280f8305 authored over 8 years ago by jbarlow83 <[email protected]>
Adding explicit reference to help
github.com/ocrmypdf/OCRmyPDF - 8aef0d92779af722da0e18938a7907c2b4c3ac8d authored over 8 years ago by jbarlow83 <[email protected]>
github.com/ocrmypdf/OCRmyPDF - b2fa8645ba34e9ebf2546f86fea5274a13d1f3ff authored over 8 years ago by John Muccigrosso <[email protected]>
github.com/ocrmypdf/OCRmyPDF - c96823a6485cfceb722959f60107e840043c87ca authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 3807b7d65506a5bdbb1602039d72ffed7b188e1b authored over 8 years ago by James R. Barlow <[email protected]>
Issue #73. The order of operations happens to not matter for scaling
but does matter for transla...
Algorithm 4 -> PDF version 1.6
github.com/ocrmypdf/OCRmyPDF - b4a734fc0d6051c587239cf9adac351f71287a66 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - bbd02926e14d8d75f1dfcff21809e7dfaa8f0667 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 5022ded27647bd598e56c46b27fb6a2f95067ea9 authored over 8 years ago by jbarlow83 <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 8d79b94b8456742e8bebfe395af9e60e6ba0051a authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - d7f60b96c107cd5c385aec229b6b9b1fe4d7442b authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - c7612152ef8096d32d0222abeb5a012bc13a3ac9 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - af91642cd177ba4f59fb3d0020b2cc0164bbd5ff authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 9c66334c38f9fe738e4187de35c3d3b201fdee26 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - b964999427c01317b991322f46409c5e02482e1d authored over 8 years ago by James R. Barlow <[email protected]>
New file is from Debian package icc-profiles-free
github.com/ocrmypdf/OCRmyPDF - 3473345ea61b9bf57bb9146bd57984b91e92ad14 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 349ec5c81fb5153dabc78239d20be11d22be5797 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - ff78d7c56c18f3f6affb53779a6ac4245dc2ca5f authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - ff092c86299e09bc9bba5c176c791a5ad9814e16 authored over 8 years ago by James R. Barlow <[email protected]>
I found this issue in ruffus 2.6.3
https://github.com/bunbun/ruffus/issues/65
also discussed her...
github.com/ocrmypdf/OCRmyPDF - 507fbc01d5fe62397f72c3e17c47b8e8c1b63b20 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 325479e5be3daf29ddf03415841278a6d351951e authored over 8 years ago by James R. Barlow <[email protected]>
Very unlikely to occur
github.com/ocrmypdf/OCRmyPDF - e926ecb8b26adc8b8d6c3b6829f27efb87d86105 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - d0cb6c0e924586fe96205c29b7780014644820a4 authored over 8 years ago by James R. Barlow <[email protected]>
/ImageMask means the image is a stencil mask for a grayscale or
color image. From issue #63 ...
github.com/ocrmypdf/OCRmyPDF - 40baab32acbec996079adf545f5681a66e15bb3f authored over 8 years ago by James R. Barlow <[email protected]>
Take the threshold from tesseract's default value for -psm 1.
github.com/ocrmypdf/OCRmyPDF - e877d37ac869cef03df6b6d8bd5448294a39d1a8 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 5a9f77e4382aadcb6b4ae0fc3f52feb2646f15c6 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 8ddd67d1e292f0cebcac5c77f9e2e1d4370923de authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 1605408c23fa1b9252c5d3f10f279b43733b0728 authored almost 9 years ago by jbarlow83 <[email protected]>
Needs testing
github.com/ocrmypdf/OCRmyPDF - 2d3b1ebf6ef6533cc88254407e1bf1239c1197fb authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - c74eaab7f55883284d818d6e21d41aaceff8fbc6 authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - c21d231388ab52f450f47104889046e6883e9bfb authored almost 9 years ago by James R. Barlow <[email protected]>
README: Debian and Ubuntu installation option
github.com/ocrmypdf/OCRmyPDF - a73afc4e769202b916d35dee481d741cf6bb7224 authored almost 9 years ago by jbarlow83 <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 76c364150d42b80e07e053fe8aec89ff89d93b76 authored almost 9 years ago by Sean Whitton <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 94a3e447cc4cfd09b697108919df8fbc3d737bdc authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 12868b461a3dc032a49a4a0ed645406a7298eef0 authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 322085933bc1e095b0ae99e7a2ae9ad5c691a186 authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 3fed94bb796bae3686cd05270bce8b7d344243a3 authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 8c877482bd5406376dcbd2411437b19270b9d28c authored almost 9 years ago by James R. Barlow <[email protected]>
Ghostscript dev advised against. It appears that this is for
creating target for a device that c...
github.com/ocrmypdf/OCRmyPDF - 368252a2439f14116637142278f03a220b3219d0 authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - ccefda1beea2445102133d230e4cf9b05bc38a59 authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 3d0e8c9629b89b89a8e5e589746e0efd6e1d84bb authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 313bbbb94c8afd5d5f6360a67efb46c7b8a977c4 authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 0360f078de904159a641b80c6e09a185adaffeee authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - c8901666c41f45ed2715e87e84f46d94acf419be authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 7430006596fb71d367e5cc87526863a389cf8556 authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - f3e06b2dbd688e6a40cb42edb6a925151b79af7f authored almost 9 years ago by James R. Barlow <[email protected]>
Replace broken link to c't article by permalink
github.com/ocrmypdf/OCRmyPDF - e97df307ffe2f7499793a6f20c234156495204ef authored almost 9 years ago by jbarlow83 <[email protected]>
Update also the 2nd article link to use a permalink, too.
Signed-off-by: Stefan Weil <sw@weilne...
github.com/ocrmypdf/OCRmyPDF - 1443354aa2d701ebef2f28f9bd47709e4c451d05 authored almost 9 years ago by Stefan Weil <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 250e68c1cd3546a5b1349973158ad516401fac8e authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 6a380ee99ce6b91a13d160888185d6fee6111903 authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 3c90bd96a99efde05378f51aceec9e0b10a87f9a authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 06a7ceb25a8e443f6ae0a09e7f45b76aeebc5405 authored almost 9 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 733a8e7d58c26d46b714d02379f2361370ce94d1 authored almost 9 years ago by James R. Barlow <[email protected]>