Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched.
Collective - Host: opensource - https://opencollective.com/ocrmypdf - Code: https://github.com/jbarlow83/OCRmyPDF

More travis build tweaks

github.com/ocrmypdf/OCRmyPDF - dd6eaacc6b3398574fafa5fd54d1657aec2611e7 authored about 7 years ago by James R. Barlow <[email protected]>
Add workaround for tess4 form feed behavior change

github.com/ocrmypdf/OCRmyPDF - 37dc03eec62b0d037ec7d2341b5e5b0ac84e0f9b authored about 7 years ago by James R. Barlow <[email protected]>
Try using plain tessdata instead of tessdata_best

github.com/ocrmypdf/OCRmyPDF - 6b478172f61a4c5dfda05118fb45f1d9e1c116d5 authored about 7 years ago by James R. Barlow <[email protected]>
Still failing, did the cp work at all?

github.com/ocrmypdf/OCRmyPDF - c6e73bcfd6674fe1a8b1c9c6406d3251b5cfe6a1 authored over 7 years ago by James R. Barlow <[email protected]>
cp overwrite needs sudo

github.com/ocrmypdf/OCRmyPDF - a2d62938ce2c6a9fd98a5ecf50c841e928f0c244 authored over 7 years ago by James R. Barlow <[email protected]>
travis: Replace problematic traineddata file

github.com/ocrmypdf/OCRmyPDF - d7ae1f3cca42f4318bb80536303e6ba79a94a3ff authored over 7 years ago by James R. Barlow <[email protected]>
Disable tesseract 4 so tests can succeed

tess4 -psm 0 is broken right now

github.com/ocrmypdf/OCRmyPDF - 70219581c4a79e61732e7673fdd03df65dd5a1b5 authored over 7 years ago by James R. Barlow <[email protected]>
Workaround travis issues in build stages... maybe

github.com/ocrmypdf/OCRmyPDF - f70ac9fb8948fa133f1c41361a0248edf7466487 authored over 7 years ago by James R. Barlow <[email protected]>
Resolve merge conflicts

github.com/ocrmypdf/OCRmyPDF - 235b9fbaf0dc64ee8c942f0a0286734c4732fbe7 authored over 7 years ago by James R. Barlow <[email protected]>
travis: need script for each stage

github.com/ocrmypdf/OCRmyPDF - ebda7f42db3878770f6ed3a80f69a47383bc382e authored over 7 years ago by James R. Barlow <[email protected]>
Try out travis build matrix

github.com/ocrmypdf/OCRmyPDF - 0b04e4b9771e4dc0819047edb577c835afb0656e authored over 7 years ago by James R. Barlow <[email protected]>
Add docs on adding to docker image

github.com/ocrmypdf/OCRmyPDF - 9498601a376990ed8313a02a16d5fa24805df20f authored over 7 years ago by James R. Barlow <[email protected]>
Ignore .vscode too

github.com/ocrmypdf/OCRmyPDF - ef5d320e06f463360bb46f0230ae1fabafb945f9 authored over 7 years ago by James R. Barlow <[email protected]>
Remove meaningless version from Dockerfile.polyglot

github.com/ocrmypdf/OCRmyPDF - b00c9a562d60b9e826c9b332bcef61764d72395e authored over 7 years ago by James R. Barlow <[email protected]>
Don't say tess4 support is experimental - it's pretty good now

github.com/ocrmypdf/OCRmyPDF - 5372656893689f948b9c72e69b6c8b54785c2a4d authored over 7 years ago by James R. Barlow <[email protected]>
Update release notes

github.com/ocrmypdf/OCRmyPDF - 571de0e36885d2e2fb1ae9e20da12f0ce78a5689 authored over 7 years ago by James R. Barlow <[email protected]>
Update batch processing docs to include Synology script

github.com/ocrmypdf/OCRmyPDF - 82cea2fd858eddba50ea74d7443063cd6368fb5a authored over 7 years ago by James R. Barlow <[email protected]>
Use Ubuntu 17.04 instead of 16.10 for Docker image (issue #191)

Due to 16.10 PPAs no longer being generated by alex-p

github.com/ocrmypdf/OCRmyPDF - aed9814345e6b0223402307c81b59f10e994c8fc authored over 7 years ago by James R. Barlow <[email protected]>
Add reminder that blank.pdf is not trivial

github.com/ocrmypdf/OCRmyPDF - 34fc1f5fd78d35740cfffa8b5f9a2b2fb92a6e6e authored over 7 years ago by James R. Barlow <[email protected]>
Improve clarity of --pdf-renderer=tesseract deprecation warning

github.com/ocrmypdf/OCRmyPDF - 87c2ed8b276904555cbe33db69ea56d1b2f66b07 authored over 7 years ago by James R. Barlow <[email protected]>
Add more leptonica functions

github.com/ocrmypdf/OCRmyPDF - 1467d118ab08c45ab970b772dadd1fdfc259ed6c authored over 7 years ago by James R. Barlow <[email protected]>
Update MANIFEST rules

github.com/ocrmypdf/OCRmyPDF - 922dbe83c3023bcb506ad24820e26f4126033c0c authored over 7 years ago by James R. Barlow <[email protected]>
Fix CI failure due to spoofers not being updated to Tesseract 3.05 strings

github.com/ocrmypdf/OCRmyPDF - 6af7d61ee55dce0e161e1ffd36953c04283d3664 authored over 7 years ago by James R. Barlow <[email protected]>
Update release notes

github.com/ocrmypdf/OCRmyPDF - bafd08391dcb0d84a0ea12443824860b6af5ba70 authored over 7 years ago by James R. Barlow <[email protected]>
Fix missing error message about trying to use sandwich on old tesseract

github.com/ocrmypdf/OCRmyPDF - 82ebd8ef1a86f3e551cc59b822aa7ff267143880 authored over 7 years ago by James R. Barlow <[email protected]>
Release notes: fix indentation

github.com/ocrmypdf/OCRmyPDF - 4ed1aa4d23e86b06f64420e40bc8caaa14d14bd1 authored over 7 years ago by James R. Barlow <[email protected]>
Update copyright info for test files

[ci skip]

github.com/ocrmypdf/OCRmyPDF - d04e43d46d96526acdd5533ad77c62d41d3db50d authored over 7 years ago by James R. Barlow <[email protected]>
Dockerfiles: set LANG=C.UTF-8

Issue #184 to avoid issue with printing UTF-8 text to sidecar

github.com/ocrmypdf/OCRmyPDF - 952f0cca1551c7afe8040ced82175e0fb2f3558f authored over 7 years ago by James R. Barlow <[email protected]>
Fix Ubuntu 14.04 install instructions to account for dropping Py3.4 support

[ci skip]

github.com/ocrmypdf/OCRmyPDF - f6a4d8f1f808a1c963c85e498a773ef0439db5ed authored over 7 years ago by James R. Barlow <[email protected]>
Fix broken test case related to language packs

github.com/ocrmypdf/OCRmyPDF - b3097a2384e2c2923a2100680144f2f5a4678d23 authored over 7 years ago by James R. Barlow <[email protected]>
v5.3.1 notes

github.com/ocrmypdf/OCRmyPDF - 6d9ddbe98b942958e2ab5067dee8133f3ad88d3d authored over 7 years ago by James R. Barlow <[email protected]>
Wrong error type used for missing language

github.com/ocrmypdf/OCRmyPDF - 9bb42c0229dc2a718d61ddf21754f0b48aff9e0a authored over 7 years ago by James R. Barlow <[email protected]>
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF

github.com/ocrmypdf/OCRmyPDF - bd7226b27a98734c59b352ede40436f1657c544f authored over 7 years ago by James R. Barlow <[email protected]>
Cookbook: add "don't OCR" examples

github.com/ocrmypdf/OCRmyPDF - 5b413e38737dd2e8d830d8748df857ac62446c67 authored over 7 years ago by James R. Barlow <[email protected]>
Offer the readme as a long description for new PyPI

github.com/ocrmypdf/OCRmyPDF - be5831a62929ac457198deb121a0c1ebad621138 authored over 7 years ago by James R. Barlow <[email protected]>
More badges

github.com/ocrmypdf/OCRmyPDF - 084d2bf8e23a4e4a386941b75ad830dc2c92bfb3 authored over 7 years ago by jbarlow83 <[email protected]>
macos: Skip brew audit because it seems to crash ruby on travis

github.com/ocrmypdf/OCRmyPDF - da79e6bac74c74d75dead208d52fc628e19a0258 authored over 7 years ago by James R. Barlow <[email protected]>
v5.3 release notes

github.com/ocrmypdf/OCRmyPDF - c4831ac00c9147f0831c1d8e71b84d34f06c6e45 authored over 7 years ago by James R. Barlow <[email protected]>
Fix missing import for Py3.5

github.com/ocrmypdf/OCRmyPDF - 93a954ef9f76d19d485959b386523ed7e4a2b991 authored over 7 years ago by James R. Barlow <[email protected]>
Weaken the --user-words test so it will pass on Travis

github.com/ocrmypdf/OCRmyPDF - f7ce8f44e9dbc89b9efbd733c91c4a0ce5e94824 authored over 7 years ago by James R. Barlow <[email protected]>
Whitelist the Latin-1 languages that work with HOCR

Omitted French because the rare 'oe' and 'ÿ' glyphs are not in Latin-1.
Basically steer people a...

github.com/ocrmypdf/OCRmyPDF - 0b012697e54531110de0bd55fb304dce7fde4920 authored over 7 years ago by James R. Barlow <[email protected]>
Report location of attempted output_file that fails to write

github.com/ocrmypdf/OCRmyPDF - 58e357c992246160c4497fdf807895f10354ddd5 authored over 7 years ago by James R. Barlow <[email protected]>
Fix py3.5 test

github.com/ocrmypdf/OCRmyPDF - 71fbad83ad8afea71033213df91b1b8071d23e7e authored over 7 years ago by James R. Barlow <[email protected]>
Add a differential test that checks tesseract uses supplied word list

github.com/ocrmypdf/OCRmyPDF - 52483072dcab45dba45643ee7ef105b3087330ce authored over 7 years ago by James R. Barlow <[email protected]>
Tests: accept rich path objects without having to str() everything

github.com/ocrmypdf/OCRmyPDF - 7f0b8621f3aa651fe4007ab8eb668616eafcac4f authored over 7 years ago by James R. Barlow <[email protected]>
Crash test all renderers, not just two

github.com/ocrmypdf/OCRmyPDF - cd8db60b064b8bc4a1f602fa2747c7d175df1773 authored over 7 years ago by James R. Barlow <[email protected]>
Make some interfaces accepting of both str-paths and Path objects

github.com/ocrmypdf/OCRmyPDF - 1aa34f5d2e187c6adcd511cdbb3a4d49dda4200e authored over 7 years ago by James R. Barlow <[email protected]>
Fix missing user_words/user_patterns from textonly_pdf case

github.com/ocrmypdf/OCRmyPDF - dfa1d88ce98ee5ba3a3bc02c89e741221d9884e8 authored over 7 years ago by James R. Barlow <[email protected]>
Merge branch 'feature/user-words' into develop

# Conflicts:
# ocrmypdf/exec/tesseract.py

github.com/ocrmypdf/OCRmyPDF - dd38519f078bda0bdaf141b68e278a1f0a53bbff authored over 7 years ago by James R. Barlow <[email protected]>
docs: remove deprecated example of pdftotext

github.com/ocrmypdf/OCRmyPDF - 098f5d4f0ba9b2f166e65e454d4a47cb754b6b9b authored over 7 years ago by James R. Barlow <[email protected]>
docs: envvar markup

github.com/ocrmypdf/OCRmyPDF - ffc685d5369a921004ab5aef2feffb368f55aa0a authored over 7 years ago by James R. Barlow <[email protected]>
Refactor int(os.path.basename(s)[0:6]) -> page_number(s)

github.com/ocrmypdf/OCRmyPDF - cd1a99a0de49d6e5475e782a4a98a266d91d0a5e authored over 7 years ago by James R. Barlow <[email protected]>
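
The refactor named in the commit above replaces a repeated inline expression with a helper. A minimal sketch, assuming that intermediate file names begin with a six-digit, zero-padded page number (the `[0:6]` slice in the commit title suggests this); the example file name is hypothetical, and the project's actual helper may differ:

```python
import os


def page_number(input_file):
    """Return the page number encoded in the first six characters of the file name."""
    # os.fspath lets the helper accept both str and pathlib.Path arguments.
    name = os.path.basename(os.fspath(input_file))
    return int(name[0:6])


# Hypothetical intermediate file name, used only for illustration.
print(page_number("/tmp/work/000012.page.png"))
```

Naming the expression once keeps the file-name convention in a single place, so a later change to the naming scheme touches one function rather than every call site.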
Accept PDFs with whitespace ahead of %PDF marker

Noticed in @aagahi 's fork

github.com/ocrmypdf/OCRmyPDF - 48e3b267fc89690373a998918efa8ea1254b7fdf authored over 7 years ago by James R. Barlow <[email protected]>
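
One way to tolerate leading whitespace before the `%PDF` marker is to strip it before checking the header. This is only a sketch of the idea; `looks_like_pdf` is a hypothetical name, and the commit's actual change is not shown in this listing:

```python
def looks_like_pdf(header: bytes) -> bool:
    """Check for the %PDF marker, tolerating leading whitespace."""
    # bytes.lstrip() with no argument removes leading ASCII whitespace.
    return header.lstrip().startswith(b"%PDF")


print(looks_like_pdf(b"\n  %PDF-1.4\n"))
print(looks_like_pdf(b"GIF89a"))
```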
Don’t check tags and branch at the same time as Travis doesn’t get this

Travis is weird

github.com/ocrmypdf/OCRmyPDF - 3a7c3417bbad72e21d1a8c63678d2c66e7a4ece9 authored over 7 years ago by James R. Barlow <[email protected]>
Give the ‘auto’ renderer setting more test covfefe

github.com/ocrmypdf/OCRmyPDF - d792ef7222d279a9ae62266be0598ad2e964c797 authored over 7 years ago by James R. Barlow <[email protected]>
Rename “tess4” renderer to “sandwich” and make it default in Tess 3.05.01

Tesseract 3.05.01 backported the textonly_pdf=1 which allows the use
of this superior PDF render...

github.com/ocrmypdf/OCRmyPDF - 2c24f67debf93999016a896c571cd4d84817c64d authored over 7 years ago by James R. Barlow <[email protected]>
Homebrew needs x11 to compile Pillow

github.com/ocrmypdf/OCRmyPDF - 9e75e28d0cb4298509cefa9900a9d8fe68c0170c authored over 7 years ago by James R. Barlow <[email protected]>
Support “textonly PDF” renderer in Tesseract 3.05.01

github.com/ocrmypdf/OCRmyPDF - 32326438094660ef6341ee37585f019fbdb7e79a authored over 7 years ago by James R. Barlow <[email protected]>
Document what is meant by the ocrmypdf “API”

github.com/ocrmypdf/OCRmyPDF - f7ee9e90ce330406f0652c444b2fdaec2f2b75c2 authored over 7 years ago by James R. Barlow <[email protected]>
Remove Python <3.5 test

github.com/ocrmypdf/OCRmyPDF - 47298be13261e3e15abcb8ccf5b10cf969a740d0 authored over 7 years ago by James R. Barlow <[email protected]>
Travis: fix deploy conditions for homebrew autobrew

github.com/ocrmypdf/OCRmyPDF - a88fa8351590cbb351cabb987514125407a76267 authored over 7 years ago by James R. Barlow <[email protected]>
v5.1 release notes

github.com/ocrmypdf/OCRmyPDF - 12bfe203859a16cdd41583854519032528c738dc authored over 7 years ago by James R. Barlow <[email protected]>
Fix tess4 test using old-style pageinfo API

github.com/ocrmypdf/OCRmyPDF - 3d2f6f07720b38bf012de1a732c77c6f70ebed16 authored over 7 years ago by James R. Barlow <[email protected]>
Merge UserUnit

github.com/ocrmypdf/OCRmyPDF - 1cb607f64bbc802686ac40154c4e12c7cca43d45 authored over 7 years ago by James R. Barlow <[email protected]>
For --rotate-pages, rasterize preview at half DPI instead of 200 DPI

Ensures that time is not wasted on previews at higher resolution than
the input as was sometimes...

github.com/ocrmypdf/OCRmyPDF - d3c54fbbde923282b7858c9c301868337ef2c059 authored over 7 years ago by James R. Barlow <[email protected]>
Refactor common test fixtures

github.com/ocrmypdf/OCRmyPDF - 28341b755f4c9295488e51c770584398fc69ec90 authored over 7 years ago by James R. Barlow <[email protected]>
Add new test file

github.com/ocrmypdf/OCRmyPDF - 4b5cd420e1d5ba966676cbfdf1ad31a53c48598f authored over 7 years ago by James R. Barlow <[email protected]>
Fix Ghostscript rasterizing of UserUnit pages and related sizing issues

github.com/ocrmypdf/OCRmyPDF - 1d57bcc99e50b660572cebd696b5222e618404fb authored over 7 years ago by James R. Barlow <[email protected]>
Ghostscript: refactor image output resizing

github.com/ocrmypdf/OCRmyPDF - facdd138797c8ccc04b7a7467d84b02f6f4fd569 authored over 7 years ago by James R. Barlow <[email protected]>
ghostscript, qpdf: Restore API backward compatibility

github.com/ocrmypdf/OCRmyPDF - 6e891f91d38fdb1b97e013bf221b18b7d5c42b04 authored over 7 years ago by James R. Barlow <[email protected]>
Partially solve ghostscript rasterize_pdf producing wrong file size

Kludge. Assumes JPEG for now. Messy.

github.com/ocrmypdf/OCRmyPDF - 9b50ede977d1500464c95aad336121afb51514c8 authored over 7 years ago by James R. Barlow <[email protected]>
Error out if trying to produce PDF/A >200” due to Ghostscript limitation

github.com/ocrmypdf/OCRmyPDF - 82cf010333b66113653e80892e43f9cb85968cf1 authored over 7 years ago by James R. Barlow <[email protected]>
--output-type=pdf now outputs /UserUnit PDFs at the correct size

This currently distorts the output size because Tesseract assumes it
knows the DPI better than ...

github.com/ocrmypdf/OCRmyPDF - 6ff6c8614f2c4f34a8b1d2b39833ff185d508d70 authored over 7 years ago by James R. Barlow <[email protected]>
Add an open helper that is compatible with pathlib

github.com/ocrmypdf/OCRmyPDF - eb1cd38f6c7e67bb143e99e33d596545d6e176ae authored over 7 years ago by James R. Barlow <[email protected]>
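
For context, the built-in `open()` only began accepting `pathlib.Path` objects in Python 3.6, so code that still supports Python 3.5 needs a small wrapper. A minimal sketch of such a helper follows; the name `open_compat` is hypothetical and the project's own helper may look different:

```python
from pathlib import Path


def open_compat(path, *args, **kwargs):
    """Open a file given either a str path or a pathlib.Path."""
    # Converting to str works on every Python 3 version, including 3.5.
    if isinstance(path, Path):
        path = str(path)
    return open(path, *args, **kwargs)
```

Callers can then pass `Path` objects directly instead of wrapping every argument in `str()`.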
Prove multiprocessing works, although it is still racy in some places

github.com/ocrmypdf/OCRmyPDF - 148b632b4fb2d58fcfe06216ef217b0c1e0df162 authored over 7 years ago by James R. Barlow <[email protected]>
Add more dependencies for autobrew

github.com/ocrmypdf/OCRmyPDF - 591e213713828262c47e4f2bb1b09765ee33f64c authored over 7 years ago by James R. Barlow <[email protected]>
Ensure JobContext stuff is actually tested for IPC consistency

github.com/ocrmypdf/OCRmyPDF - 75f2262659df0f3d19c219f8619910c915a4d1c4 authored over 7 years ago by James R. Barlow <[email protected]>
pdfinfo: replace most remaining dict-style access

github.com/ocrmypdf/OCRmyPDF - d9005a10740f5eb54ea2c4cccfac2ed1ef8f4420 authored over 7 years ago by James R. Barlow <[email protected]>
pageinfo: deprecation warning

github.com/ocrmypdf/OCRmyPDF - 3e73fa81bf15c351273a656db9463b636777ad74 authored over 7 years ago by James R. Barlow <[email protected]>
Restore old pageinfo.py to avoid breaking compatibility

github.com/ocrmypdf/OCRmyPDF - ba6e2902314d3bf6f2980a71a7cc6eb95cf796fb authored over 7 years ago by James R. Barlow <[email protected]>
Rename pageinfo to pdfinfo

github.com/ocrmypdf/OCRmyPDF - 08e47117a3ccc87bf185eac5c0e31c15d7f087c2 authored over 7 years ago by James R. Barlow <[email protected]>
/UserUnit is a scalar, not an array

github.com/ocrmypdf/OCRmyPDF - 532ef38157e2a9805e41c3fded445af7c9c3fece authored over 7 years ago by James R. Barlow <[email protected]>
docs: upload unpaper Dropbox link, .rst typo blocking macOS install

[ci skip]

github.com/ocrmypdf/OCRmyPDF - 4c09875890afe0b566829827e6e37957d9f224f2 authored over 7 years ago by James R. Barlow <[email protected]>
Upload to upload.pypi.org/legacy as recommended by PyPA

https://github.com/pypa/warehouse/issues/1996#issuecomment-302784126

github.com/ocrmypdf/OCRmyPDF - 0e98139712ffa3891a162518738889e7beee90c8 authored over 7 years ago by James R. Barlow <[email protected]>
Introduce /UserUnit checking

github.com/ocrmypdf/OCRmyPDF - 4c04d802d7c634d5b76317743f2c5e4f475b45a3 authored over 7 years ago by James R. Barlow <[email protected]>
Update unpaper.deb link (fixes #171)

*Shakes fist at Dropbox*

github.com/ocrmypdf/OCRmyPDF - b3dc404571b46001ae9d770509338c7b80d4c7d3 authored over 7 years ago by James R. Barlow <[email protected]>
Replace magic strings colorspace and encoding with Enums

github.com/ocrmypdf/OCRmyPDF - 8694f8d2ebdb639f8f2cdb9a71508ec767e19bca authored over 7 years ago by James R. Barlow <[email protected]>
pageinfo: debug stuff

github.com/ocrmypdf/OCRmyPDF - 263f9b79f4e5f49f088777d38f37cc11ff35d52a authored over 7 years ago by James R. Barlow <[email protected]>
Refactor from ImageInfo index to attribute accessing

github.com/ocrmypdf/OCRmyPDF - 56d2aae96387d2c3cf123578f68acd3cd2de4ab5 authored over 7 years ago by James R. Barlow <[email protected]>
Refactor dictionary based image info to ImageInfo

github.com/ocrmypdf/OCRmyPDF - 127706153dc96d9f05f265ccc20f58d0829f58eb authored over 7 years ago by James R. Barlow <[email protected]>
Access PageInfo instance variables instead of dictionary

github.com/ocrmypdf/OCRmyPDF - caee5b14283f72fd8014a367744a42cee12f585b authored over 7 years ago by James R. Barlow <[email protected]>
Refactor pageinfo dictionary to PageInfo()

github.com/ocrmypdf/OCRmyPDF - 6c12e7e944afccda17535a9e2cc0ea9a4d2ed355 authored over 7 years ago by James R. Barlow <[email protected]>
Refactor PdfInfo(str(filename)) -> PdfInfo(filename)

github.com/ocrmypdf/OCRmyPDF - cd04ae6949c0ea2cdfc999d9bc87a1049fc5f05e authored over 7 years ago by James R. Barlow <[email protected]>
Refactor pdf_get_all_pageinfo to PdfInfo

github.com/ocrmypdf/OCRmyPDF - 6a0b68298f2813702f9a6cbfbea027cdd200f1e1 authored over 7 years ago by James R. Barlow <[email protected]>
docs: Fix restructured text typos

github.com/ocrmypdf/OCRmyPDF - 0a2f7322678f061a1d7d070d022765d667b5134c authored over 7 years ago by James R. Barlow <[email protected]>
docs: Remark that someone got bash on Windows working

github.com/ocrmypdf/OCRmyPDF - 4bade99f27ff3e6ea9e53c59eca591ddce5abe81 authored over 7 years ago by James R. Barlow <[email protected]>
Join the build badge club

github.com/ocrmypdf/OCRmyPDF - 0b048cd24edf17bd0bd21d0e54a1dd48c2093b69 authored over 7 years ago by James R. Barlow <[email protected]>
Travis, true is a program, not a keyword

github.com/ocrmypdf/OCRmyPDF - c69ee63d82bf4c1fe40af2f12af2bb4c2ade278a authored over 7 years ago by James R. Barlow <[email protected]>
v5.0.1 release notes (anticipating)

github.com/ocrmypdf/OCRmyPDF - 744fa104d7ddce5ef11821e07b3d98e11d307961 authored over 7 years ago by James R. Barlow <[email protected]>
Travis: don’t update the homebrew version because we pushed to testpypi

github.com/ocrmypdf/OCRmyPDF - e24ff0fd646c112c582c94cc15e34079b5256f7b authored over 7 years ago by James R. Barlow <[email protected]>