Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched.
Collective - Host: opensource - https://opencollective.com/ocrmypdf - Code: https://github.com/jbarlow83/OCRmyPDF

pytest: don't run tests that happened to be part of pyvenv

github.com/ocrmypdf/OCRmyPDF - 78697341a20ce2b6b248ad7a05d620e2f277a031 authored about 9 years ago by James R. Barlow <[email protected]>
Merge commit 'b1769cbe18e6380ddfe96b3b22e6d02cb603338b' into develop

github.com/ocrmypdf/OCRmyPDF - cfb56dd8ffcce7d4f5dae466e58cac19cfdd54cc authored about 9 years ago by James R. Barlow <[email protected]>
README: El Capitan supported now, Py3.5 supported

github.com/ocrmypdf/OCRmyPDF - b1769cbe18e6380ddfe96b3b22e6d02cb603338b authored about 9 years ago by jbarlow83 <[email protected]>
Merge branch 'master' into develop

github.com/ocrmypdf/OCRmyPDF - 955b801e7f95198010288d01368a594dfc984926 authored over 9 years ago by James R. Barlow <[email protected]>
Try to work around git binary file bug again

github.com/ocrmypdf/OCRmyPDF - 3cea3f1afe235441398322c2a639d6a06fc764d1 authored over 9 years ago by James R. Barlow <[email protected]>
Force this file to stop thinking it was modified

github.com/ocrmypdf/OCRmyPDF - fd4a227ccbddbd3c277dba5359e653eb1b100440 authored over 9 years ago by James R. Barlow <[email protected]>
Update notes

github.com/ocrmypdf/OCRmyPDF - 19c30974834e06a665c72888478faf7678364f66 authored over 9 years ago by James R. Barlow <[email protected]>
Suppress failing test

github.com/ocrmypdf/OCRmyPDF - cdd1a6d03c6f13337e1c1802662596ec04c65d30 authored over 9 years ago by James R. Barlow <[email protected]>
Try new PPA for libav

github.com/ocrmypdf/OCRmyPDF - 5fb8411571ea5050c9cd63dffb528c057b1fff79 authored over 9 years ago by James R. Barlow <[email protected]>
typo fix

github.com/ocrmypdf/OCRmyPDF - 334a15b8c7494e978128a8d48434fd2025de7779 authored over 9 years ago by James R. Barlow <[email protected]>
ffmpeg-dev instead?

github.com/ocrmypdf/OCRmyPDF - 63907365775b7335e4095d80ef77db4cc19c5099 authored over 9 years ago by James R. Barlow <[email protected]>
Autoreconf?

github.com/ocrmypdf/OCRmyPDF - d55a214516a522845a6c8421776fda29a472075c authored over 9 years ago by James R. Barlow <[email protected]>
travis: apt-get install in wrong place

github.com/ocrmypdf/OCRmyPDF - 0994164b9a010e904388673bdf73b56ea935b663 authored over 9 years ago by James R. Barlow <[email protected]>
travis: fix typo

github.com/ocrmypdf/OCRmyPDF - 54ee0dd1477469962327f6069faf8d8a4750c294 authored over 9 years ago by James R. Barlow <[email protected]>
travis: build unpaper with cache

github.com/ocrmypdf/OCRmyPDF - 47c7990fb3cbc874750b66de526a45910485687b authored over 9 years ago by James R. Barlow <[email protected]>
travis: build unpaper

github.com/ocrmypdf/OCRmyPDF - 997e95de4d8404432355e5a4e83d2df33f0b3771 authored over 9 years ago by James R. Barlow <[email protected]>
Fix order of PPAs

github.com/ocrmypdf/OCRmyPDF - 44204be256278f009c6f0c8fa9ed06a10170c01d authored over 9 years ago by James R. Barlow <[email protected]>
travis: improve, add new PPA, etc.

github.com/ocrmypdf/OCRmyPDF - 9b1d9aa88ad2c499bec8bc30377d98fbf867d0fd authored over 9 years ago by James R. Barlow <[email protected]>
travis: doesn't like gcc-4.8, try just gcc

github.com/ocrmypdf/OCRmyPDF - b775762f6a38627657ea84f63e514b08e78ace33 authored over 9 years ago by James R. Barlow <[email protected]>
Travis needs sudo mode

github.com/ocrmypdf/OCRmyPDF - df1a28e31917e27b49567933c9f335599224e602 authored over 9 years ago by James R. Barlow <[email protected]>
travis: tabs -> spaces

github.com/ocrmypdf/OCRmyPDF - c300b2802a552ed00d7db05914724bf346ab9baa authored over 9 years ago by James R. Barlow <[email protected]>
More complete travis.yml

github.com/ocrmypdf/OCRmyPDF - 01040ace4cc42fec9aa6eb1a4676d0eaa6749e9f authored over 9 years ago by James R. Barlow <[email protected]>
Start setting up Travis CI

github.com/ocrmypdf/OCRmyPDF - 8367172e0bfbe6a24f408637dbcf4d96e41cad4a authored over 9 years ago by James R. Barlow <[email protected]>
Move to my repo: github.com/fritz-hh => jbarlow83

I made several efforts to contact fritz but he is no longer
communicating, and to set up Github ...

github.com/ocrmypdf/OCRmyPDF - 09afd8d25d29dead597f08e10b36e6e66f51ba0b authored over 9 years ago by James R. Barlow <[email protected]>
Test case: No longer using JHOVE

So JHOVE will not claim this is an invalid PDF and we should see it
reported as valid.

github.com/ocrmypdf/OCRmyPDF - 7ed60429b326e7c0c383e7e1e6d8fb435e7b44cb authored over 9 years ago by James R. Barlow <[email protected]>
bump to v3.0 and move repos

github.com/ocrmypdf/OCRmyPDF - 281eafada0f5057ef6461f21e54d02261a94ae0f authored over 9 years ago by James R. Barlow <[email protected]>
Bump version to -rc9

github.com/ocrmypdf/OCRmyPDF - c14e10128a8c7f01e240e50b39b9da5cdef659b8 authored over 9 years ago by James R. Barlow <[email protected]>
ghostscript: quiet startup on rasterize

github.com/ocrmypdf/OCRmyPDF - 32706351929afcf00d4aae52ce82c79dabb6b18d authored over 9 years ago by James R. Barlow <[email protected]>
Add test cases for additional image formats

github.com/ocrmypdf/OCRmyPDF - 3d26257710bb0a029b5e9033cb2bafeac7902f2d authored over 9 years ago by James R. Barlow <[email protected]>
Prevent running validation on missing file after an exception is thrown

github.com/ocrmypdf/OCRmyPDF - c4f134d694753b9f1893bc2e25f1b1b5a248b10f authored over 9 years ago by James R. Barlow <[email protected]>
Use png256 raster device when possible

Someone reported a bug where the .png input to unpaper ended up being
type 'P' (palette) for som...

github.com/ocrmypdf/OCRmyPDF - 83f9dfbac42550101bbca69fe7bb12e11f2b1aac authored over 9 years ago by James R. Barlow <[email protected]>
unpaper: support paletted files by conversion instead of bailing

github.com/ocrmypdf/OCRmyPDF - 3a445ad5f7ce206b044a45f22c7059da66853065 authored over 9 years ago by James R. Barlow <[email protected]>
Throw exception if iccprofiles not found instead of returning None

So far iccprofiles were only missing for a user who had a custom and
possibly broken ghostscript...

github.com/ocrmypdf/OCRmyPDF - c6d106ec33fbdf7f4cfee7513626f8a695799815 authored over 9 years ago by James R. Barlow <[email protected]>
Bump to -rc8

github.com/ocrmypdf/OCRmyPDF - 2ce6834be4508dd67bd589b38d6c4643c8d6d63a authored over 9 years ago by James R. Barlow <[email protected]>
Bug fix: exception thrown if input PDF was missing DocumentInfo block

github.com/ocrmypdf/OCRmyPDF - b376672dbc14da320564455578c9926d1cd490aa authored over 9 years ago by James R. Barlow <[email protected]>
Merge branch 'master' of https://github.com/fritz-hh/OCRmyPDF

github.com/ocrmypdf/OCRmyPDF - d07db8547f99e882858433ec0380d044bae97896 authored over 9 years ago by James R. Barlow <[email protected]>
Fix requirements.txt problem

github.com/ocrmypdf/OCRmyPDF - aab08bfcc7801261004dc3525db65b2d229c053f authored over 9 years ago by James R. Barlow <[email protected]>
Explain the need for multi core, etc

github.com/ocrmypdf/OCRmyPDF - e0a25494ee813fb207625835e6c81a490ca0cb30 authored over 9 years ago by jbarlow83 <[email protected]>
Merge branch 'develop'

github.com/ocrmypdf/OCRmyPDF - fd876d5e4ed99d1c6af68097f75c5973b964455a authored over 9 years ago by James R. Barlow <[email protected]>
Require unpaper 6.1; no messing around with broken versions

github.com/ocrmypdf/OCRmyPDF - ee7f008ff54ea9b5f81589af8e47b39f3216b619 authored over 9 years ago by James R. Barlow <[email protected]>
Update README: docker run instructions

github.com/ocrmypdf/OCRmyPDF - d9161a6ddb7e8a89c53ede6ee5e0652c1e22e7b3 authored over 9 years ago by jbarlow83 <[email protected]>
Update README with docker install instructions

github.com/ocrmypdf/OCRmyPDF - f8d66768e3726ed72282525cff78488d1a8d657b authored over 9 years ago by jbarlow83 <[email protected]>
Update notes for -rc6

github.com/ocrmypdf/OCRmyPDF - 4f3673d14d40f1c08f528ee61ddffb52d19248ae authored over 9 years ago by James R. Barlow <[email protected]>
Merge branch 'feature/docker-debian'

github.com/ocrmypdf/OCRmyPDF - 1712fdb74a3d428ede2876a0bec0cdefc7f7b1d3 authored over 9 years ago by James R. Barlow <[email protected]>
Stock debian unpaper is no good; replace with 6.1 built from source

Debian and Ubuntu both install unpaper 0.4.2 or so. No .deb packages
available at higher version...

github.com/ocrmypdf/OCRmyPDF - 3a5ffc79e0d66332fe93aad3c619f5ebcefa1c94 authored over 9 years ago by James R. Barlow <[email protected]>
Fixup other docker test suite errors

Outstanding failures:
test_pageinfo::test_jpeg
tests involving unpaper due to version <6.1 failures

github.com/ocrmypdf/OCRmyPDF - 859b063444d62ad61a675943f81a479094adf0c2 authored over 9 years ago by James R. Barlow <[email protected]>
dockerignore *.pyc

https://github.com/docker/docker/issues/13113
Docker kinda sucks. No recursive exclusion.

github.com/ocrmypdf/OCRmyPDF - bd61e7c644d724ea5617f07abaeb0ad93d531431 authored over 9 years ago by James R. Barlow <[email protected]>
Set docker locale to utf-8

Shocked, shocked, that there's a Linux distribution out there that isn't
doing the right thing a...

github.com/ocrmypdf/OCRmyPDF - c9abf282b5a3f60dee89e51376713ec220b9be47 authored over 9 years ago by James R. Barlow <[email protected]>
Major overhaul of the Dockerfile

Switched from Ubuntu to debian:stretch because stretch has more recent
versions of our binary pa...

github.com/ocrmypdf/OCRmyPDF - 9dad40b5a3d605b0f8f1e7b7bd47ee08b6684521 authored over 9 years ago by James R. Barlow <[email protected]>
Rework Dockerfile, setup.py to work with wheels for better cache use

github.com/ocrmypdf/OCRmyPDF - 8e2d690cb04a57d3e70c6e91d29a5a0b7fcd19a2 authored over 9 years ago by James R. Barlow <[email protected]>
Dockerfile: use local copy of application

github.com/ocrmypdf/OCRmyPDF - c132e091e119455ef4f598ad64d88fdce267d7d6 authored over 9 years ago by James R. Barlow <[email protected]>
pip chokes on Unicode filenames?

github.com/ocrmypdf/OCRmyPDF - 630e6cbf1e59da228f97eb7da8522649f3912f2f authored over 9 years ago by James R. Barlow <[email protected]>
Dockerfile comment cleanup

github.com/ocrmypdf/OCRmyPDF - 83ff5760a8da72fa3a5c8ea639c54ab117e52ff9 authored over 9 years ago by James R. Barlow <[email protected]>
Fix ruffus writing to RO directory in container

github.com/ocrmypdf/OCRmyPDF - fed0ee638e4321ed8ada16259c61d8993c4a4a16 authored over 9 years ago by James R. Barlow <[email protected]>
Replace fileinput with regular open-replace

fileinput is supposed to save time in these cases but it's not capable
of doing both in-place re...

github.com/ocrmypdf/OCRmyPDF - cc161780df8f7ef5b373c19d4cc46045315354c7 authored over 9 years ago by James R. Barlow <[email protected]>
Works

github.com/ocrmypdf/OCRmyPDF - 898b2b000a5cff4125cec9863287c465e3c8177c authored over 9 years ago by James R. Barlow <[email protected]>
WIP on docker

github.com/ocrmypdf/OCRmyPDF - b3ee743ed7bd021620f42472b47416981f7d926f authored over 9 years ago by James R. Barlow <[email protected]>
README needs ghostscript

github.com/ocrmypdf/OCRmyPDF - ef17b669fe88e7b518e2f159583a12f863f996d4 authored over 9 years ago by James R. Barlow <[email protected]>
Drop libxml2 dependency

It seems that Python's internal XML parser is good enough to do the job.

github.com/ocrmypdf/OCRmyPDF - 2dff3e07cea492f58dc753e033e100a0b725f6c3 authored over 9 years ago by James R. Barlow <[email protected]>
Bump to -rc5

github.com/ocrmypdf/OCRmyPDF - 53c88093ad69bbf0dfd190526d0f76137f7aabe8 authored over 9 years ago by James R. Barlow <[email protected]>
Fix test cases: minor issues

-os.environ directly modified when whole suite run, breaking subsequent
tests
-no longer trustin...

github.com/ocrmypdf/OCRmyPDF - 0ec13d3a17077a468252a2bfe9e74a7d7486cb20 authored over 9 years ago by James R. Barlow <[email protected]>
Update README with better install instructions

github.com/ocrmypdf/OCRmyPDF - 0d5104049aed9c8af3dec2607df39e88a84b08ee authored over 9 years ago by jbarlow83 <[email protected]>
Update readme

github.com/ocrmypdf/OCRmyPDF - ce8fa69785c1c8e55726ddaaddab1a431b94df83 authored over 9 years ago by James R. Barlow <[email protected]>
Pillow sucks

Far from being fluffy or friendly, Pillow silently allows installation
of itself without support...

github.com/ocrmypdf/OCRmyPDF - 30072e0c701c1bc9b874281e11d2cd767e070542 authored over 9 years ago by James R. Barlow <[email protected]>
Relax Pillow requirement for Ubuntu 14.04 LTS

github.com/ocrmypdf/OCRmyPDF - eb04a890b292db3bb6c422c7cad981f994cfc54d authored over 9 years ago by James R. Barlow <[email protected]>
setup: rollback lxml version to 3.3.3 - that's the latest in Ubuntu 14.04

github.com/ocrmypdf/OCRmyPDF - 0c53adb04fb2348cac0e58f2b69c3312a3317fe0 authored over 9 years ago by James R. Barlow <[email protected]>
setup: suppress jhove errors

github.com/ocrmypdf/OCRmyPDF - ee5a43fd47920478e17994ce409aea29b4e5a6cc authored over 9 years ago by James R. Barlow <[email protected]>
Merge branch 'develop' of https://github.com/fritz-hh/OCRmyPDF into develop

Conflicts:
setup.py

github.com/ocrmypdf/OCRmyPDF - c43d6c2cbee76cf3e52a0757e4e03356d0735733 authored over 9 years ago by James R. Barlow <[email protected]>
Fix erroneous instruction to "apt-get install tesseract"

Should be tesseract-ocr

github.com/ocrmypdf/OCRmyPDF - 87aeeacb04f213e6ac7321b19f425bef6dad03ea authored over 9 years ago by James R. Barlow <[email protected]>
Fix erroneous instruction to "apt-get install tesseract"

Should be tesseract-ocr

github.com/ocrmypdf/OCRmyPDF - 6b26e9cad6d653f1baae5643e685d395573d81bd authored over 9 years ago by James R. Barlow <[email protected]>
Add test case for blank PDF page

github.com/ocrmypdf/OCRmyPDF - 85af0f0d0317800affc39221114a3d40c21bd5db authored over 9 years ago by James R. Barlow <[email protected]>
Remove Java from setup.py

github.com/ocrmypdf/OCRmyPDF - f6f4705ea344ec614b3ace143437c18aac17710a authored over 9 years ago by James R. Barlow <[email protected]>
Possible fix for issue #111

github.com/ocrmypdf/OCRmyPDF - a4702bff22d24c38063030ab1be43d72522bcee1 authored over 9 years ago by James R. Barlow <[email protected]>
Update notes

github.com/ocrmypdf/OCRmyPDF - 73c5c48f797b5db11be7d12cdef68f617aa322ec authored over 9 years ago by James R. Barlow <[email protected]>
Remove JHOVE

JHOVE is not an effective PDF/A validator, as detailed in this article:
http://www.pdfa.org/2014...

github.com/ocrmypdf/OCRmyPDF - adf495e8cc68ec467c20382fe126b3b239c5790c authored over 9 years ago by James R. Barlow <[email protected]>
Improve ruffus exception handling

ruffus swallows the return code if, in the process of handling an exception,
we hit an error in ruffu...

github.com/ocrmypdf/OCRmyPDF - 9247ea00bf741b4e32493be50c7eaa1317f643ee authored over 9 years ago by James R. Barlow <[email protected]>
Document override binary test

github.com/ocrmypdf/OCRmyPDF - a1238d7bf91249f923376fa1427acdb52f66388f authored over 9 years ago by James R. Barlow <[email protected]>
Work around JHOVE bug for now, so that the test passes

github.com/ocrmypdf/OCRmyPDF - 2d63268f0f49a7fbd27f13a73c5bc28ed8be6f2b authored over 9 years ago by James R. Barlow <[email protected]>
Refactor exit codes; test for missing tessdata

Some versions of tesseract installed by homebrew end up without a
functional tessdata folder, an...

github.com/ocrmypdf/OCRmyPDF - 1cb5f6a90d7bc1ed88e7d71f77247318adb45f70 authored over 9 years ago by James R. Barlow <[email protected]>
Fix code, test case: complain when GS fails to produce PDF/A

Modified pipeline to fix regression and return the proper error code if
we did not produce a PDF...

github.com/ocrmypdf/OCRmyPDF - 8d848284dfa2b4ecaf702294f8a6ae1135040d35 authored over 9 years ago by James R. Barlow <[email protected]>
Add new test case to check invalid PDF/A case

It revealed a regression - return code not the same as v2.x for invalid
PDF/A. It's also not ea...

github.com/ocrmypdf/OCRmyPDF - 8fe54d1a5c58e7eb89e45d6af560b8760a300b26 authored over 9 years ago by James R. Barlow <[email protected]>
setup.py: block unsafe 'upload', say to use twine instead

github.com/ocrmypdf/OCRmyPDF - 11dd9f14c3ad8b77b50e570501e70c9239221a03 authored over 9 years ago by James R. Barlow <[email protected]>
Bump version to -rc4

github.com/ocrmypdf/OCRmyPDF - 16d24f116649ee83387ffd3ec94872bb5a9613db authored over 9 years ago by James R. Barlow <[email protected]>
Add a test case to check on the @argumentsfile syntax

github.com/ocrmypdf/OCRmyPDF - 97015ef775014f548eb171473c1ab78c36e912fd authored over 9 years ago by James R. Barlow <[email protected]>
New test case: ensure metadata is preserved from input to output

github.com/ocrmypdf/OCRmyPDF - 2744dafb7413f3af6a9ee2ec974d4a059f90bfcf authored over 9 years ago by James R. Barlow <[email protected]>
Remove duplication in test case

github.com/ocrmypdf/OCRmyPDF - 7b268dbe1a5c6c945532b2d27a3926ac326e6059 authored over 9 years ago by James R. Barlow <[email protected]>
Improve usage text

github.com/ocrmypdf/OCRmyPDF - 8fcbbcef94dc28be730920c1b39e68d72677ff29 authored over 9 years ago by James R. Barlow <[email protected]>
Tidy docs

github.com/ocrmypdf/OCRmyPDF - 8f93f0a06e3a59a96c43f3c34511be4cd1c2df0e authored over 9 years ago by James R. Barlow <[email protected]>
Kill duplicate file

github.com/ocrmypdf/OCRmyPDF - 387142488ca5c160017fcffedf0469853e14d1d8 authored over 9 years ago by James R. Barlow <[email protected]>
Bug fix: exception from process timeout should be TimeoutExpired

github.com/ocrmypdf/OCRmyPDF - 6887e232fc220877bce09147e337269af4215681 authored over 9 years ago by James R. Barlow <[email protected]>
Merge branch 'feature/drop-mupdf-poppler' into develop

github.com/ocrmypdf/OCRmyPDF - 6ac7ffd77bbded2310524832451e115a9a576639 authored over 9 years ago by James R. Barlow <[email protected]>
Automatically use all available cores unless told not to

github.com/ocrmypdf/OCRmyPDF - b28faa582a59844f7ceee80e1ae13516089b86f7 authored over 9 years ago by James R. Barlow <[email protected]>
Run final ghostscript in multithreaded mode

This step is serialized so all cores are not busy at this stage.

github.com/ocrmypdf/OCRmyPDF - 454ee029c87f23069c7443ce9c5b62b419879f31 authored over 9 years ago by James R. Barlow <[email protected]>
Replace mupdf and poppler with qpdf

Drop two dependencies and replace them with one that does the job of
both. Smells like progress...

github.com/ocrmypdf/OCRmyPDF - a036de318edb06dec44dc9163298dbd7ddce7882 authored over 9 years ago by James R. Barlow <[email protected]>
Use img2pdf in test case because it does a better job

github.com/ocrmypdf/OCRmyPDF - 9918c4020e1c0fd212759549de9a73eae44aba24 authored over 9 years ago by James R. Barlow <[email protected]>
Fix formatting of 'motivation'

github.com/ocrmypdf/OCRmyPDF - 3d6264e1b85a26c3c081750c40d593d04dea5f01 authored over 9 years ago by jbarlow83 <[email protected]>
Improve instructions for users that need sudo or venv

github.com/ocrmypdf/OCRmyPDF - 1c252705030a155df302133f432da624efb31ed5 authored over 9 years ago by jbarlow83 <[email protected]>
setup.py: allow mutool 1.7

github.com/ocrmypdf/OCRmyPDF - 47e50f82c48b7a5b03d32052886c8b82038a2831 authored over 9 years ago by James R. Barlow <[email protected]>
More fixes to error cases in setup.py

github.com/ocrmypdf/OCRmyPDF - 27ecdfbba8caf5b0480b7ae8dadace0de3080b8d authored over 9 years ago by James R. Barlow <[email protected]>
Fix some installer issues

github.com/ocrmypdf/OCRmyPDF - 6901550065523e575d7d1af609add653b5d2fb91 authored over 9 years ago by James R. Barlow <[email protected]>