Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Collective - Host: opensource - https://opencollective.com/ocrmypdf - Code: https://github.com/jbarlow83/OCRmyPDF
github.com/ocrmypdf/OCRmyPDF - 3d0dc95a06b77a1ab2f9c09447c5aa2a345d708d authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 04a57a3cc28ddae2157dac8c5517522e751a48d4 authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - d0c22ce01dd4df447066353c03cf13d627ff6b81 authored about 8 years ago by James R. Barlow <[email protected]>
It looks like GS 9.19 can incorrectly set overprinting for the text layer
even though this makes...
not exactly what we’re looking for
github.com/ocrmypdf/OCRmyPDF - eecab9b95d64cf684147ec6edc594f6a4df1a18a authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 8abc2f113cce0789be3d3888d7da7fcb5951cb90 authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 949d2ff1c23b042fb78904a133340ac25496712b authored about 8 years ago by James R. Barlow <[email protected]>
The behavior of this test will ultimately depend on what version of
img2pdf is installed, since ...
Turns out this occurred in any case where pdf-renderer hocr was used
and a tesseract timeout or ...
github.com/ocrmypdf/OCRmyPDF - cc9c0d819eaa1ea7a9c5d98df54460012213d6c7 authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - a72b8caf476c6875108708a1194ca03eb04e7cad authored about 8 years ago by James R. Barlow <[email protected]>
Mostly by reducing RGB -> monochrome and applying JBIG2 compression
github.com/ocrmypdf/OCRmyPDF - fdd9b8b8ce90d6b223ece399f6fc426b8710617e authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - c096b4ca8cac00436448976e0c2630f01497d590 authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 427add30086960dc07308df9125580dba3571343 authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - c45871700dd2a8748b4cba47ecd63e22a9d1849b authored about 8 years ago by James R. Barlow <[email protected]>
Mathjax isn't actually needed for OCRmyPDF's docs, but enabling this
extension causes the brows...
For sanity's sake, deal with tesseract streams in binary without
transcoding (via universal_newl...
github.com/ocrmypdf/OCRmyPDF - f24fb0e0c522dc98b15e4818747e7180bdfc9614 authored about 8 years ago by James R. Barlow <[email protected]>
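The commit note above about handling tesseract streams in binary can be illustrated with a minimal sketch. This is not OCRmyPDF's code: `run_binary`, `decode_lenient`, and the stand-in child command are hypothetical names chosen for illustration, and the child is a plain Python one-liner rather than tesseract. The idea is to leave `universal_newlines` (also spelled `text`) off so that `subprocess` returns raw bytes, and to decode explicitly only where text is needed.

```python
import subprocess
import sys


def run_binary(cmd):
    """Run cmd and return its stdout as raw bytes.

    universal_newlines/text is left at its default (False), so no
    newline translation or implicit decoding takes place.
    """
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True
    )
    return proc.stdout


def decode_lenient(raw):
    """Decode bytes as UTF-8, replacing undecodable bytes instead of raising."""
    return raw.decode("utf-8", errors="replace")


# Hypothetical stand-in for an external tool: emits UTF-8 text, a CRLF
# line ending, and one invalid byte.
CHILD = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(b'caf\\xc3\\xa9\\r\\n\\xff')",
]

if __name__ == "__main__":
    raw = run_binary(CHILD)
    print(repr(raw))
    print(repr(decode_lenient(raw)))
```

Keeping the stream as bytes means a stray invalid byte or an unexpected line ending in the tool's output does not raise a decoding error deep inside `subprocess`; the caller decides how lenient to be.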
github.com/ocrmypdf/OCRmyPDF - 73b88a0a6ffcbbbca6832108172257c39b06247c authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - c42f39e2d4eb4e8bf4eed1e35e8bd637dcd6ac47 authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 5e5fe3175f1bf79aee8036242fed4c1eb06f7e2c authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - cab65d1f11a18ebee7d55856de23af1ad5bc7eea authored about 8 years ago by James R. Barlow <[email protected]>
ReadTheDocs needs this.
github.com/ocrmypdf/OCRmyPDF - 245f05d5f4b94ac2b3cabd8e30ca292cb1415fd5 authored about 8 years ago by James R. Barlow <[email protected]>
# Conflicts:
# ocrmypdf/__main__.py
github.com/ocrmypdf/OCRmyPDF - 3d37ae988a4c92480ae01478c899ae7b02993231 authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 717acd9855688aeab1f47c9fd4242999c757cd66 authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 2e4431cc638d3b320e244ae197513be9889cdc55 authored about 8 years ago by James R. Barlow <[email protected]>
No need to involve 'cat', just hook the file up to stdin.
github.com/ocrmypdf/OCRmyPDF - f7387b0859133774f11b305b252b42c084fef8e4 authored about 8 years ago by James R. Barlow <[email protected]>
To ensure piping to stdout is possible.
github.com/ocrmypdf/OCRmyPDF - a09f6b8977477707b82fd2dabd832a25358b9934 authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - d63449c2143caabb61f1eaaf95d3bd0070c93160 authored about 8 years ago by James R. Barlow <[email protected]>
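The notes above about dropping 'cat' and piping to stdout describe a general pattern: open the file and hand it to the child process as stdin directly, instead of spawning a second process to feed it. A minimal sketch follows. It is an illustration under assumptions, not the project's test code: `run_with_file_on_stdin` and `ECHO_CMD` are hypothetical names, and the child is a stand-in that copies stdin to stdout.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a tool that reads stdin and writes stdout.
ECHO_CMD = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]


def run_with_file_on_stdin(path, cmd=ECHO_CMD):
    """Run cmd with the file at path connected to its stdin.

    Equivalent in effect to `cmd < path` in a shell; no `cat` process
    is needed. Returns the bytes the child wrote to stdout.
    """
    with open(path, "rb") as infile:
        proc = subprocess.run(
            cmd, stdin=infile, stdout=subprocess.PIPE, check=True
        )
    return proc.stdout


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.bin"
        sample.write_bytes(b"placeholder input bytes\n")
        print(run_with_file_on_stdin(sample))
```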
github.com/ocrmypdf/OCRmyPDF - a86805f0d98f58e1c419c3f7e95fc0d82d55ee1e authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 7d2009ccefbfe9fcfea096f62c564ce99ca69ce9 authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 18ae5db06da6db751c5b1b4880f657d87ed66290 authored about 8 years ago by James R. Barlow <[email protected]>
Some PDFs omit the traditional q/Q wrapper and alter ctm with a stack
depth of zero, so make our...
github.com/ocrmypdf/OCRmyPDF - e20346032d2663d13d48039541c8a7317ebcdc85 authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 693a27d76c91c2cb0423cb9af365fee41987b394 authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 203966d86b2fffdc3b2e1d189efcf9be83cd5e76 authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 7eca8508fd53f6faab4cd83b4b865a84f1709faf authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - b85270df1c644edba2f9b45431171653da92b6b3 authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - aff597cef4384f4690b1bc36e10c374c106f08aa authored about 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 61b05b3dee4fa8ac6371c72bcb6237850aa6b905 authored about 8 years ago by James R. Barlow <[email protected]>
`brew install tesseract` installed only the English language pack, not French, German or Spanish
github.com/ocrmypdf/OCRmyPDF - 453c4ef602ff88baa428644483f6af70519f4d44 authored about 8 years ago by Julian Kahnert <[email protected]>
github.com/ocrmypdf/OCRmyPDF - cf4b04f92d435e7d82a388f840cb270a6558215f authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 06c699998708079052b606b7aaedcaec1ec1b856 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 013c5a369f608ba756813358fd68f1dad0df82d1 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 07891d994aab92e7a14aebe1ac509aab2d4f170c authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 6baf8668a66efa4325dd3e0b3b7eef01ce1c1aa8 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 4ba2962c5698e3dc9724c645dedf7c8ffb104bfd authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 7ad92f5db4a9b7aec229d91bf8361bd03afc10f7 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 4dad09cc919575d928c5d43501079d4db5d460bb authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 7b2e0c7a7a7188fba39fbfaa798278e5cd6ebcfa authored over 8 years ago by Sean Whitton <[email protected]>
Skip the test if the fair use restricted milk.pdf is not present.
github.com/ocrmypdf/OCRmyPDF - 7f08f15fc9483b33819b3b040b52a854a911fc39 authored over 8 years ago by Sean Whitton <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 825c0f8b2a726b7f78bb1e15cb26ad6547b6a47c authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - dbe880bc417e96b6f3559f5ff7d954971ffb30f3 authored over 8 years ago by James R. Barlow <[email protected]>
Found some interesting options for background norm.
github.com/ocrmypdf/OCRmyPDF - 2ec516b6ffdf1e91b7a3621fc4aa2db6221997a7 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 7942a01e504c65feeace2a9b178cb33515a1a872 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - df684f9344b80573022aff55ff306cbc64314ca9 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - ae16e95e42977deb01a99ebbda361ef8623df1e1 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 2ac8e8a0cc1ebd313b43699beda4225726d919ed authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 0a0ceda71f51ee154c763893f3796f685488cfd6 authored over 8 years ago by James R. Barlow <[email protected]>
This helper script is still in development and needs to be changed each
release, which breaks th...
github.com/ocrmypdf/OCRmyPDF - c62a8a97c9d95429060dbe48259be1b87e0ef336 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - f8a1136979f06a7a8264c524628b5941864ea162 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 9ca29c787b85dd1924840ce006301a90cbe55004 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 6af748a251bd3a27db166b6f0d91092f98c0ea82 authored over 8 years ago by James R. Barlow <[email protected]>
More thorough testing showed that Acrobat does not presume that images
fill the page if the CTM is...
github.com/ocrmypdf/OCRmyPDF - 04099b087c501bece05a39041e7875e8e34b099b authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 6d6234714cb65d67513b69a071e71d5efbabfd6b authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 520be23481596681c1886023cff00a4d110387a7 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 346c3c8dd3f56ac6ed39a5de2c60d4f437dce73b authored over 8 years ago by James R. Barlow <[email protected]>
Executing a package with python -m packagename will check for
__main__.py inside the package. I...
github.com/ocrmypdf/OCRmyPDF - 2625368aed389a6b4fefef8b000ea14af910d31f authored over 8 years ago by James R. Barlow <[email protected]>
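The `python -m packagename` behavior mentioned in the note above can be checked with a small, self-contained sketch. It builds a throwaway package in a temporary directory and runs it as a module; the package name `demo_pkg` and the helper `run_package_as_module` are hypothetical and have nothing to do with OCRmyPDF's own layout.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_package_as_module(package_name="demo_pkg"):
    """Create a throwaway package and run it with `python -m`.

    Returns whatever the package's __main__.py printed to stdout.
    """
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / package_name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        # `python -m demo_pkg` executes this file as the entry point.
        (pkg / "__main__.py").write_text('print("hello from __main__")\n')
        result = subprocess.run(
            [sys.executable, "-m", package_name],
            cwd=tmp,  # with -m, the current directory is on sys.path
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(run_package_as_module())
```

If `__main__.py` were missing, the same command would fail with an error saying the package cannot be directly executed, which is why a package meant to be run this way needs that file.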
github.com/ocrmypdf/OCRmyPDF - 8ac94879f109948246cf755bd0d5a0d346edcf20 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - dd8c0f3756a97e4f3f61dbbd3ddd2f73fdb416ce authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 010f353a5e28610847dbb9a3810d9b550ae7c452 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - e0a18edb924745a8c3fd6b337ec311bd1d926042 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - c6f2eea058e806179be3579fccf12b49a09999cf authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - bf89e38c69c6035871c3493f4d96608e6418b5b9 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - e1f0640d42955fc2501199fca30815fc683ffcd0 authored over 8 years ago by jbarlow83 <[email protected]>
TypeError: bad operand type for unary -: 'IndirectObject'
github.com/ocrmypdf/OCRmyPDF - 71b54035ba192313956fd66f579f05c6cc835871 authored over 8 years ago by James R. Barlow <[email protected]>
As @spwhitton found:
The test suite needs to call "python3 -m ocrmypdf.main" instead of
just "o...
github.com/ocrmypdf/OCRmyPDF - 1a9f09c4d519eebce7d0e2b7890b939ab2bbc8f6 authored over 8 years ago by James R. Barlow <[email protected]>
Depends on locale being configured properly, and it's not necessary
to be able to do this.
https://sources.debian.net/src/ocrmypdf/4.2.1%2Bgit.20160824.1.5d67cc7-1/debian/patches/0003-pyt...
github.com/ocrmypdf/OCRmyPDF - 74cc2346a5cd869b7fa7037db09afbb9e8cd6b90 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - cc7e328358af8e4b1f9fdf78a7582572764cf08c authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - d25397e2b005996041d253787873ba7f47b2afa3 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - bc11454e1c70fbaff0c2236bf2911b4bd8096dea authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 2025a096c369fae6078964398476b4b7c525c534 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 38fe14b1080e750df7c9a01193aabd11bfd0a9d4 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 1b7b2f3695717ddb0e8e05f50286174342083dd3 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 5d67cc76cc7aaa4f22c45ed13138dfb880a8a902 authored over 8 years ago by James R. Barlow <[email protected]>
The recent commit to accept files from stdin broke the feature of
returning the input filename ...
github.com/ocrmypdf/OCRmyPDF - b06e0bfdcde943aa5863e5824687450db18f6ba9 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - d616f25324a4cb98e581c59942b5b914d82de26e authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - b03028e31feff9244410ad3e646023f1b2e3cbd1 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - e08c42fd3d78a4dd89f276cd4486eaa1ff7d4c5f authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 16901f7134047b9681fd579ff9c9039c2f6ebc7a authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - dffceedd85da173216836b10609776a6b18e4552 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - e5541e435caf5552d507c27f4b2cf70b369124e3 authored over 8 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - b969aad67b4adb8f47b9f9d6892b2d307b1ff764 authored over 8 years ago by James R. Barlow <[email protected]>