Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/ocrmypdf/OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
https://github.com/ocrmypdf/OCRmyPDF
2ce3d9e19d50bcd42e259e0fb1838919ec2f62ef authored almost 12 years ago
The fix should now be compatible with most implementations of grep
9271fe73a8519f15286381826bba2f92640d7e2d authored almost 12 years ago
- In debug mode: compute and echo time required for processing
- Resolutions (x/y) that are near...
- corrected function absolutePath() to handle paths with spaces correctly
- pdf title metadata: split ...
beb1d7ab5459dd6346e3942321eeaadd53c3c7d4 authored almost 12 years ago
5ce3e9bfec97ecc06b1ce5f1cb0692ce487d9d39 authored almost 12 years ago
- added metadata in final pdf file: fixes #4
- improved logging of PDF/A validation results
fixes #9
24415511565baf5f537603ca7d8353823c299942 authored almost 12 years ago
fixes #21
15baca5e080ae228663624ee994ce00bc149e8e6 authored almost 12 years ago
fixes #22
062ef0ca3a37fa9546d1245d9ee45b6209a40b74 authored almost 12 years ago
24b46869448b35a76e6a5de34d4f96736a27375c authored almost 12 years ago
fixes #19
d3d1c20ca2820481d789ff11d0ec41ee33e1dc88 authored almost 12 years ago
7f7b81154f7a1dbec07b647616332b617c4fe0fa authored almost 12 years ago
After deskew, the image was cropped to the wrong size
6372cec6b801b36ad31698edd108d9c200280c91 authored almost 12 years ago
5ec875325e0ea33644ddd3af0a669b3966c0919e authored almost 12 years ago
4d80709cfd63e145dd1ae9b325800675d18866ae authored almost 12 years ago
2642c1b3d31c8c17f4dda0d741c5c06fd47926c7 authored almost 12 years ago
c4cd7e198281d89bd5a2265aa4d277f5030be3f8 authored almost 12 years ago
b993c158d06d314e23e831e9f70230ffdbe0e3cd authored almost 12 years ago
1b727042fe7530d727607ef08450a2983026e6e0 authored almost 12 years ago
- put all src files (except OCRmyPDF.sh) to src
- rename tesseract_cfg to tess-cfg
a766c5f2b7df1bd555be65dbc9c5f82560ba7d09 authored almost 12 years ago
3b2c804f23a52f6ac00258210b97bd4683d3e93a authored almost 12 years ago
486ed6f2170ebffbb028fb4c88b5518ca97eb359 authored almost 12 years ago
ae716a91cbb0995b7534ed5c55ebe2d1b1dc41ce authored almost 12 years ago
d4195b4362464c7824a6649ee6220cc730b1b99e authored almost 12 years ago
6ae0452d87600c8acef87598cfc95d40efc490d9 authored almost 12 years ago
aimed at checking if the quality of the images drops quickly or not
815117f653b631b40cfdc53ff403d4cbd87c066c authored almost 12 years ago
- additional data logged
- width/height were inverted: corrected
- few other minor changes
- fixes #10
- check not only if the final PDF is well formed and valid, but also if
it conforms ...
7c173dcc679805bc42c1b9d4675a90b97ee323ca authored almost 12 years ago
- fixes #14
- minor other changes
- Font changed to Helvetica (instead of Courier)
- License text deleted (license file already a...
1860f80cae5dc549b64fe9dfa0e90c046313d924 authored almost 12 years ago
4ea97c4fe48fa8f4951a64de94116ff52de57234 authored almost 12 years ago
fixes #13
ee738be6814bdf4c0b881908abc4fa6380a9c42b authored almost 12 years ago
fixes #12
e21b3155e582940294777a35d635a2755660f25b authored almost 12 years ago
c293ffd621d63f6cbf8b6c455a0feb44e9190245 authored almost 12 years ago
- user can now define the name/location of the output file
- check if the folder in which in/out...
968a66f66bf45f3c695109376b57567dc99b9179 authored almost 12 years ago
This corresponds to the -C option
5992afb70713eeab2d128216c2408c6a8e6f1e8e authored almost 12 years ago
9aa83215c45ab9a77d88519f62759244bdeb38b3 authored almost 12 years ago
939a148812cbdba1c3cbc77af1c5fc310799e41c authored almost 12 years ago
4ce249e6eddb7b40b6e4869e13b62898620ab08f authored almost 12 years ago
-a option removed
bounding boxes for paragraphs added
color and style of bounding boxes improved
fixes #5
b9a346ce7da812b9c05f45a64bc64efa4e8dd80e authored almost 12 years ago
90fc5c9de48e766b3f10a92007ed2ef4000233b2 authored almost 12 years ago
7118c2f04b191b6000159ffa32a1f87147f0dfe8 authored almost 12 years ago
d66712ab4233e3e30d91d42b410ae325edbfcd7c authored almost 12 years ago
fixes #3
fixes #2
a5c5353fbdf2cab8f550f846febbfba3b9e3da24 authored almost 12 years ago
- check if x_dpi = y_dpi
- separate options for image deskewing and cleaning
- exit codes define...
7c1820384551190c5a51df8dd7f8cb6e86f808af authored almost 12 years ago
d7c238723bc86df347422f3ba9a646501a9738a0 authored almost 12 years ago
f3e581d162188855669943bf1470fc2998bfb662 authored almost 12 years ago
4f65a31eba54332d52ae27f8cea22dbf219b582d authored almost 12 years ago
Fixes an error that led the script not to exit correctly in case more
than 1 image is detected ...
- automatic analysis of jhove validation report
- quiet generation of PDF/A with gs
- deletion o...
fcac99bc73d21a793aba84cac93b70e40b917cf5 authored almost 12 years ago
42208aa5feed809e3ab73ef81ae32fa559daca7f authored almost 12 years ago
7c3abea2320d4291d379e210811ed5034d656e0c authored almost 12 years ago
2c23bca913ecb8c17c4947cc2ac74ce55380986a authored almost 12 years ago
Added computation of resolution of each PDF page
Added extract of image of pgm if colorspace is G...
preparation of extraction of the image at the same resolution as the
original image inside the...
Adapted to the new command-line interface of hocrTransform.py
b041c0080b4a85b7c75977d38e45cb461e540abe authored almost 12 years ago
Command line interface improved in order to allow:
- show bounding boxes border
- set OCR resolu...
df56c134e416c8ff914c976e91a78c918d9c77de authored almost 12 years ago
c51babfd279aa1ee8ccf0e92b03b9127a6b7fd71 authored almost 12 years ago
8fdbfc3c95bb0d94e14a09a85fba82979d570fb0 authored almost 12 years ago
4d378c3b148802ecabb0f13ac75c88226907f852 authored almost 12 years ago
81d5b7b5e52d7e8adde9daf531b964743ff9fd88 authored almost 12 years ago
accc082b918487e0bb85d45a87d9d9edc492ce1a authored almost 12 years ago
4e4b5ddc58067a91d019e2f9e3fad3e00cbd1f80 authored almost 12 years ago
4202826dfac73040baa21bcf7c5fe7d76bcd1e2c authored almost 12 years ago
b011ddd2d950edfccc020286931f43088c300dc8 authored almost 12 years ago
7972a156fc441c33cd6ddd60c9ff793fc523c3ff authored almost 12 years ago