Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Collective - Host: opensource - https://opencollective.com/ocrmypdf - Code: https://github.com/jbarlow83/OCRmyPDF
github.com/ocrmypdf/OCRmyPDF - 1d0584c64468aed832f84e4e352d7555d6672072 authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 84b9d4d021113560948274f35712668381d00ea2 authored almost 3 years ago by James Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 41efd3bf0fb6357de0d5078e3e75de5116823a33 authored almost 3 years ago by James Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 776ada671391a6282cdf397c78a3487fb1607059 authored almost 3 years ago by James Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - f3593c915dbfdafa1117990436dd4140badd3d68 authored almost 3 years ago by James Barlow <[email protected]>
Switch to --use-threads seems to have broken tests that assumed they could
monkeypatch things. A...
github.com/ocrmypdf/OCRmyPDF - 0c43963d697038ee5e6dbdf10462ef6cb1515656 authored almost 3 years ago by James Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - f29fe7f23eb21342b9dc8ff53f7d3b824d59f9c8 authored almost 3 years ago by James Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 04996caac34a418cf233c0f3c8ac436b6f2b5920 authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 13917c051c91e677b245d2e57098d6b8ab316e0d authored almost 3 years ago by James R. Barlow <[email protected]>
Re: issue Hanging on Random Files #814
github.com/ocrmypdf/OCRmyPDF - 8182fe9c927c8ba1053cf649bf222a1b5be2c837 authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 1950acfbda3a659ca70658c848f900306ab2e35e authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - fca6403083206ee3a90098143d56dadc35169595 authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - c4e2fce1efe2f3ebe6c20198812384fa4c19cb1f authored almost 3 years ago by James R. Barlow <[email protected]>
This is overly cautious but will do for now.
github.com/ocrmypdf/OCRmyPDF - 354647965893d6803c32f69fe21767ec8397cebf authored almost 3 years ago by James R. Barlow <[email protected]>
Appears that these are spurious errors from qpdf probing the /DecodeParms
dict on images that do...
github.com/ocrmypdf/OCRmyPDF - 8f714b1375e0f8bde2cef09e6fe05b8af7041038 authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - cb05c1d1224bcb6685c41d2e87fe2fc5e5ef6410 authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - b0ad07bc5f34af7f99517ad9ba1101ed19db1364 authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 514038d4ec2ce249d039a7fc9d1ef6023181e936 authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 50d76e7f6c063bed71b88c45a9f3cbf863a64446 authored almost 3 years ago by James R. Barlow <[email protected]>
pikepdf will now get the ICC profile out and put it in the JPEG.
github.com/ocrmypdf/OCRmyPDF - 6c78a462856ef44af7272492f260769a308576bf authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 863d56063262d7c0789da65d4a90fcf4d2be2840 authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 73934c854c0ee24c21bba369a5de34636c751c8b authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 2be8eeec2cffa7860f9b078aeea4d063767ee64b authored almost 3 years ago by rdiez <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 3dfde479e2d0d6dac4c6f55093038068e475b82c authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - aea1862644caf5101b532558a53f294eb1a6e92f authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 3b406112d0b59d525ddf81996918c035168d928c authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - fcc4c2d37180cd5ef6fce57e662e1aafc50713e5 authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 3de18ed6123db1517ed0dd364051554e9bda600b authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 93cca42e2056ef323e2a4bc60bef04159581f4f7 authored almost 3 years ago by James R. Barlow <[email protected]>
Fixes #894
github.com/ocrmypdf/OCRmyPDF - 2d0ac4707c6b19614bf56bede0892656cd0e1f0c authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 7d208175cf3fe5da3db27c12a245abf360dcb64c authored almost 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - ea69e868ed95a335b362a3708628c0372cb7abb8 authored almost 3 years ago by James R. Barlow <[email protected]>
This reverts commit 7966192d6edb989f208cbbc6487346fdda635e78.
github.com/ocrmypdf/OCRmyPDF - beea603ab32eee22c1e9fa00f55a15b7e45d5ed2 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 7966192d6edb989f208cbbc6487346fdda635e78 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 5acbd7a2525aa312fc57a7b9108403937fdfb60b authored about 3 years ago by Anton Gladky <[email protected]>
github.com/ocrmypdf/OCRmyPDF - aed955ca8c70798a4ed6c642c768cbd77f483c7f authored about 3 years ago by Krasimir Nedelchev <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 298bdb8690a4cbb7ce134372e5a4f6a018134684 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 1a58abcc6a9c0cf555d91d0621e8a9ce017e9006 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - dbfceba0201af5cede0529f27d3f203e831295b7 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 0faa618c3c27642460fced1b4c2d81d1e7534c9b authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 7035002c03dd12bfc5db3dbf9479a9688d02946a authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - f8fadaef41ae6ee8b84e93f8369e4f5e840326d8 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - ee21bf9ef61faafa3444ef4a1f1efe6f09a879b8 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 190ca8195131d033dd1bb2bf4ff9bbbdd82dca1a authored about 3 years ago by James R. Barlow <[email protected]>
Closes #868
github.com/ocrmypdf/OCRmyPDF - d48254d477c45402134dfa4683781739346bf336 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 1ec2ccca14f2bc1a8c8f3ef11a659f9f96407562 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - e78f0cc56f3c144a57eedad268160cc97b066730 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 13af3252ff68d6d82b5f06ac5aa4391f3143bcea authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 0528867e0be5da69b50501ee508afc01532c7986 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 6910c48b8113ace3ddc14392c87c938a2b1d7624 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 69aa3981c43af6ba689f3e63081a975283b29341 authored about 3 years ago by James R. Barlow <[email protected]>
It's EOL.
github.com/ocrmypdf/OCRmyPDF - 9c1e5adfe61c22bf1259bb685bc2ee91756cb8bd authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - e642dd4b356ba18691648729fb7f6b7428e56bfa authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 1414a8f5dcc1a6697f2c42efb50e1672b771107b authored about 3 years ago by James R. Barlow <[email protected]>
ProcessPool/ThreadPool don't have the ability to notice when a child worker
was terminated. Proc...
github.com/ocrmypdf/OCRmyPDF - 26badf2882a64798ed14d06aa1c3f2867fe932bd authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 8f873aaa45aada0b8db0cdc3a6f0e56843cdf880 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 8fdcb15b4ea42ea875ed00c0b5de2662d763e947 authored about 3 years ago by James R. Barlow <[email protected]>
[ci skip]
github.com/ocrmypdf/OCRmyPDF - 0323738adaf213bc23747940a58a281ee2b5119c authored about 3 years ago by James R. Barlow <[email protected]>
Squashed commit of the following:
commit 974de2e8ccad7fd34694f2c3a7a17c64bb52cdab
Merge: a8d7f9...
github.com/ocrmypdf/OCRmyPDF - 4c1ff1086c7e183a185bc1384ff59aec2201411d authored about 3 years ago by James R. Barlow <[email protected]>
Also add missing test for --tesseract-oem
github.com/ocrmypdf/OCRmyPDF - f91faf97955087704366df0060df398522fb622a authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 793cc33a905ffda5ea43e7d845c590570bf5df3c authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - fbd72efd45d52cb7706eb25207928138d1f98858 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 1115923995abe01f29f63be6975b113eb656c28e authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 8478d67b2879689740cd34707d8353f1fd0ef17f authored about 3 years ago by James R. Barlow <[email protected]>
Seems acceptable. We don't normally use Ghostscript to downsample PDFs
like is happening in this...
github.com/ocrmypdf/OCRmyPDF - 312c1e51b5ca9b0be6e8ab3dd39540e36e126291 authored about 3 years ago by mara004 <[email protected]>
github.com/ocrmypdf/OCRmyPDF - cfe2bb25ba0e0c9cdc87868f8ba57d2bf616679c authored about 3 years ago by James R. Barlow <[email protected]>
Specifying option --oversample tends to introduce upsampling in rendering
by rasterizing page t...
To avoid ValueError: max() arg is an empty sequence
As suggested by @meet1919 in #833.
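The `ValueError: max() arg is an empty sequence` quoted above is what Python's built-in `max()` raises when it is given an empty iterable. The following is a minimal sketch of one common guard against it, not necessarily the change made in this commit; the function name `largest_dimension`, the `fallback` parameter, and the sample values are hypothetical.

```python
def largest_dimension(sizes, fallback=0):
    """Return the largest value in sizes, or fallback if sizes is empty.

    Hypothetical helper for illustration; not taken from OCRmyPDF.
    """
    # max() raises ValueError on an empty iterable unless default= is supplied.
    return max(sizes, default=fallback)


if __name__ == "__main__":
    print(largest_dimension([300, 600, 150]))  # non-empty input
    print(largest_dimension([]))               # empty input falls back
```

Passing `default=` keeps the call to a single expression; an explicit `if not sizes:` check is an equally valid alternative when the empty case needs different handling.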
github.com/ocrmypdf/OCRmyPDF - 7ce1692eef82880e60c921196862a2b01db42c77 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 7959f7628d3429f3f2cf0e90311f81a810231369 authored about 3 years ago by James R. Barlow <[email protected]>
PDFs are quite likely to have a lot of pixels, e.g. large high resolution scans.
250 MP is a pag...
github.com/ocrmypdf/OCRmyPDF - 3810e576ffda761a17cb1352dfbb3afc521af03d authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 01c7895044aaffa426a803beb32a36383e1c2e51 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - fdc6aa03fb54e0d72a199f12a161ed13ee089646 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 25cc17ee038fdd6eea9b8cf2d708f6a6059a9985 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - e8098a147563825cdf411aab74c7e66eac41edaf authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 6b773883dc6094a1b46b0eaf97983441d8020165 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 4ed962233519d9530c4960c7b7d546d7e85005dc authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - acc9d58c390b7977c75c888b5f2cfba29abea614 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 659e738f929dba5cf36949792d9774674ae1f725 authored about 3 years ago by James R. Barlow <[email protected]>
It seems that Chocolatey doesn't put gswin[32,64]c on PATH anymore,
so compensate.
github.com/ocrmypdf/OCRmyPDF - e3126d28068b39cff2b21e669e96517cdf4a2e6d authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 45020a7fcd56b259cb2e90625d2a0c61c304dcce authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - f51164aff8ebf29496e132393953554cf62c5bf7 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 6f58a143510431588b7d76d1eb36047f48e6653b authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 7ba04267b19ac59a55a61127e009192a032e23d6 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 974956431365272688179d2feb352b7b7e1c7b0b authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 698e8791d7724f02ff211aff5b57ca349d0b3688 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 380b981763cc38760b758722ff2a9ec8e077bc6b authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 5abfb14c2a1bdcc5bf0a287616996d6aae3e33cd authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 036afc4d88eb2677a38d22425a1862f737522f51 authored about 3 years ago by James R. Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 59642a98b2e0d8a3b60f2dbd5cdf525fb2d9c48a authored about 3 years ago by James R. Barlow <[email protected]>
New function is likely not as robust but seems capable of inexact image comparison.
github.com/ocrmypdf/OCRmyPDF - f8c6be2e26145b0f27d41d55983f812b914c4702 authored about 3 years ago by James R. Barlow <[email protected]>
Confirmed that img2pdf just inserts JPEG verbatim. Never had to go through
the trouble we did.
Tesseract now includes better thresholding (binarization) in v5. Users that have
thresholding...
github.com/ocrmypdf/OCRmyPDF - b159e021104b5a97dbb7bbc8d4c9779b6cd19285 authored about 3 years ago by James R. Barlow <[email protected]>