Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/ocrmypdf/OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
https://github.com/ocrmypdf/OCRmyPDF
1d0584c64468aed832f84e4e352d7555d6672072 authored almost 3 years ago
84b9d4d021113560948274f35712668381d00ea2 authored almost 3 years ago
41efd3bf0fb6357de0d5078e3e75de5116823a33 authored almost 3 years ago
776ada671391a6282cdf397c78a3487fb1607059 authored almost 3 years ago
f3593c915dbfdafa1117990436dd4140badd3d68 authored almost 3 years ago
Switch to --use-threads seems to have broken tests that assumed they could
monkeypatch things. A...
0c43963d697038ee5e6dbdf10462ef6cb1515656 authored almost 3 years ago
f29fe7f23eb21342b9dc8ff53f7d3b824d59f9c8 authored almost 3 years ago
04996caac34a418cf233c0f3c8ac436b6f2b5920 authored almost 3 years ago
13917c051c91e677b245d2e57098d6b8ab316e0d authored almost 3 years ago
Re: issue Hanging on Random Files #814
8182fe9c927c8ba1053cf649bf222a1b5be2c837 authored almost 3 years ago
1950acfbda3a659ca70658c848f900306ab2e35e authored almost 3 years ago
fca6403083206ee3a90098143d56dadc35169595 authored almost 3 years ago
c4e2fce1efe2f3ebe6c20198812384fa4c19cb1f authored almost 3 years ago
This is overly cautious but will do for now.
354647965893d6803c32f69fe21767ec8397cebf authored almost 3 years ago
Appears that these are spurious errors from qpdf probing the /DecodeParms
dict on images that do...
8f714b1375e0f8bde2cef09e6fe05b8af7041038 authored almost 3 years ago
cb05c1d1224bcb6685c41d2e87fe2fc5e5ef6410 authored almost 3 years ago
b0ad07bc5f34af7f99517ad9ba1101ed19db1364 authored almost 3 years ago
514038d4ec2ce249d039a7fc9d1ef6023181e936 authored almost 3 years ago
50d76e7f6c063bed71b88c45a9f3cbf863a64446 authored almost 3 years ago
pikepdf will now get the ICC profile out and put it in the JPEG.
6c78a462856ef44af7272492f260769a308576bf authored almost 3 years ago
863d56063262d7c0789da65d4a90fcf4d2be2840 authored almost 3 years ago
73934c854c0ee24c21bba369a5de34636c751c8b authored almost 3 years ago
2be8eeec2cffa7860f9b078aeea4d063767ee64b authored about 3 years ago
3dfde479e2d0d6dac4c6f55093038068e475b82c authored about 3 years ago
aea1862644caf5101b532558a53f294eb1a6e92f authored about 3 years ago
3b406112d0b59d525ddf81996918c035168d928c authored about 3 years ago
fcc4c2d37180cd5ef6fce57e662e1aafc50713e5 authored about 3 years ago
3de18ed6123db1517ed0dd364051554e9bda600b authored about 3 years ago
93cca42e2056ef323e2a4bc60bef04159581f4f7 authored about 3 years ago
Fixes #894
2d0ac4707c6b19614bf56bede0892656cd0e1f0c authored about 3 years ago
7d208175cf3fe5da3db27c12a245abf360dcb64c authored about 3 years ago
ea69e868ed95a335b362a3708628c0372cb7abb8 authored about 3 years ago
This reverts commit 7966192d6edb989f208cbbc6487346fdda635e78.
beea603ab32eee22c1e9fa00f55a15b7e45d5ed2 authored about 3 years ago
7966192d6edb989f208cbbc6487346fdda635e78 authored about 3 years ago
5acbd7a2525aa312fc57a7b9108403937fdfb60b authored about 3 years ago
aed955ca8c70798a4ed6c642c768cbd77f483c7f authored about 3 years ago
298bdb8690a4cbb7ce134372e5a4f6a018134684 authored about 3 years ago
1a58abcc6a9c0cf555d91d0621e8a9ce017e9006 authored about 3 years ago
dbfceba0201af5cede0529f27d3f203e831295b7 authored about 3 years ago
0faa618c3c27642460fced1b4c2d81d1e7534c9b authored about 3 years ago
7035002c03dd12bfc5db3dbf9479a9688d02946a authored about 3 years ago
f8fadaef41ae6ee8b84e93f8369e4f5e840326d8 authored about 3 years ago
ee21bf9ef61faafa3444ef4a1f1efe6f09a879b8 authored about 3 years ago
190ca8195131d033dd1bb2bf4ff9bbbdd82dca1a authored about 3 years ago
Closes #868
d48254d477c45402134dfa4683781739346bf336 authored about 3 years ago
1ec2ccca14f2bc1a8c8f3ef11a659f9f96407562 authored about 3 years ago
e78f0cc56f3c144a57eedad268160cc97b066730 authored about 3 years ago
13af3252ff68d6d82b5f06ac5aa4391f3143bcea authored about 3 years ago
0528867e0be5da69b50501ee508afc01532c7986 authored about 3 years ago
6910c48b8113ace3ddc14392c87c938a2b1d7624 authored about 3 years ago
69aa3981c43af6ba689f3e63081a975283b29341 authored about 3 years ago
It's EOL.
9c1e5adfe61c22bf1259bb685bc2ee91756cb8bd authored about 3 years ago
e642dd4b356ba18691648729fb7f6b7428e56bfa authored about 3 years ago
1414a8f5dcc1a6697f2c42efb50e1672b771107b authored about 3 years ago
ProcessPool/ThreadPool don't have the ability to notice when a child worker
was terminated. Proc...
26badf2882a64798ed14d06aa1c3f2867fe932bd authored about 3 years ago
8f873aaa45aada0b8db0cdc3a6f0e56843cdf880 authored about 3 years ago
8fdcb15b4ea42ea875ed00c0b5de2662d763e947 authored about 3 years ago
[ci skip]
0323738adaf213bc23747940a58a281ee2b5119c authored about 3 years ago
Squashed commit of the following:
commit 974de2e8ccad7fd34694f2c3a7a17c64bb52cdab
Merge: a8d7f9...
4c1ff1086c7e183a185bc1384ff59aec2201411d authored about 3 years ago
Also add missing test for --tesseract-oem
f91faf97955087704366df0060df398522fb622a authored about 3 years ago
793cc33a905ffda5ea43e7d845c590570bf5df3c authored about 3 years ago
fbd72efd45d52cb7706eb25207928138d1f98858 authored about 3 years ago
1115923995abe01f29f63be6975b113eb656c28e authored about 3 years ago
8478d67b2879689740cd34707d8353f1fd0ef17f authored about 3 years ago
Seems acceptable. We don't normally use Ghostscript to downsample PDFs
like is happening in this...
312c1e51b5ca9b0be6e8ab3dd39540e36e126291 authored about 3 years ago
cfe2bb25ba0e0c9cdc87868f8ba57d2bf616679c authored about 3 years ago
Specifying option --oversample tends to introduce upsampling in rendering
by rasterizing page t...
To avoid ValueError: max() arg is an empty sequence
As suggested by @meet1919 in #833.
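The `ValueError` named above is what Python's built-in `max()` raises when it is given an empty iterable. A minimal sketch of one common way to avoid it, using the `default` keyword argument of `max()`. The helper name `largest_or_default` and the sample values are hypothetical and are not taken from the OCRmyPDF commit, whose actual change is not shown here.

```python
def largest_or_default(values, default=0):
    """Return the largest item in `values`, or `default` if it is empty.

    Hypothetical helper for illustration only; not OCRmyPDF code.
    """
    # max() accepts a `default` keyword that is returned instead of
    # raising ValueError when the iterable has no items.
    return max(values, default=default)


# Without `default`, an empty sequence raises ValueError.
raised = False
try:
    max([])
except ValueError:
    raised = True

empty_result = largest_or_default([])
normal_result = largest_or_default([3, 7, 5])
```

An explicit `if not values:` guard before the call is an equally valid alternative when the fallback needs more than a constant.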
7ce1692eef82880e60c921196862a2b01db42c77 authored about 3 years ago
7959f7628d3429f3f2cf0e90311f81a810231369 authored about 3 years ago
PDFs are quite likely to have a lot of pixels, e.g. large high resolution scans.
250 MP is a pag...
3810e576ffda761a17cb1352dfbb3afc521af03d authored about 3 years ago
01c7895044aaffa426a803beb32a36383e1c2e51 authored about 3 years ago
fdc6aa03fb54e0d72a199f12a161ed13ee089646 authored about 3 years ago
25cc17ee038fdd6eea9b8cf2d708f6a6059a9985 authored about 3 years ago
e8098a147563825cdf411aab74c7e66eac41edaf authored about 3 years ago
6b773883dc6094a1b46b0eaf97983441d8020165 authored about 3 years ago
4ed962233519d9530c4960c7b7d546d7e85005dc authored about 3 years ago
acc9d58c390b7977c75c888b5f2cfba29abea614 authored about 3 years ago
659e738f929dba5cf36949792d9774674ae1f725 authored about 3 years ago
It seems that Chocolatey doesn't put gswin[32,64]c on PATH anymore,
so compensate.
e3126d28068b39cff2b21e669e96517cdf4a2e6d authored about 3 years ago
45020a7fcd56b259cb2e90625d2a0c61c304dcce authored about 3 years ago
f51164aff8ebf29496e132393953554cf62c5bf7 authored about 3 years ago
6f58a143510431588b7d76d1eb36047f48e6653b authored about 3 years ago
7ba04267b19ac59a55a61127e009192a032e23d6 authored about 3 years ago
974956431365272688179d2feb352b7b7e1c7b0b authored about 3 years ago
698e8791d7724f02ff211aff5b57ca349d0b3688 authored about 3 years ago
380b981763cc38760b758722ff2a9ec8e077bc6b authored about 3 years ago
5abfb14c2a1bdcc5bf0a287616996d6aae3e33cd authored about 3 years ago
036afc4d88eb2677a38d22425a1862f737522f51 authored about 3 years ago
59642a98b2e0d8a3b60f2dbd5cdf525fb2d9c48a authored about 3 years ago
New function is likely not as robust but seems capable of inexact image comparison.
f8c6be2e26145b0f27d41d55983f812b914c4702 authored about 3 years ago
Confirmed that img2pdf just inserts JPEG verbatim. Never had to go through
the trouble we did.
Tesseract now includes better thresholding (binarization) in v5. Users that have
thresholding...
b159e021104b5a97dbb7bbc8d4c9779b6cd19285 authored about 3 years ago