Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ocrmypdf/OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
https://github.com/ocrmypdf/OCRmyPDF

Refactor page rotation and re-enable message at info level

f6d7aa6e333642c660c7feae237b9f799b4a3a47 authored about 5 years ago
Fix typos, add instructions for training data (#477)

6f66232d44dc75c2af5b2b09c67cb5b0b13d75f2 authored about 5 years ago
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF

a005d14f9147eefa829eccc21f82430df0eb307f authored about 5 years ago
Wait for file based on pikepdf

b8a780d684bb85260fbe9dc70ead21e8e153ad78 authored about 5 years ago
Order of events

82f393dd096a5895c23cb2b43b9de2b23607d84d authored about 5 years ago
watcher: some refactoring

4952af16047bef36b38c115c2b5f563b7c626b9d authored about 5 years ago
Fix grammar in output message

bcf77375c015060c42f0c3108b798f188ec22ca6 authored about 5 years ago
Update logging and env var extensibility

3eab161771f63aac04891f4e7a0f4d47dae96afa authored about 5 years ago
Watched folder bug fixes, new flags, and docs updates.

b7f38e976b5a85d9936d3b1c40c46a1663eb0de0 authored about 5 years ago
v9.5.0 release notes revised

a6567f2ae4602da734533c8ed5823c16668ba7a5 authored about 5 years ago
Fix regression: metadata updates not taking effect

e860c56b7517cac6e0c6ef0cad6e6961ccc76f5b authored about 5 years ago
v9.5.0 release notes

2e15d52895f9faea3df6365cc7a8ba8a1f3ae0a4 authored about 5 years ago
Add OCR quality measurement API

ce97af5a7971136a17d1972a920139ee50520993 authored about 5 years ago
Refactor metadata_fixup

3831c4cd4dde9273e487db3598ca9b6223746d1c authored about 5 years ago
Skip test that needs chmod when on Windows

61a26743175b1579ccdb802043ecadecb5ba4b15 authored about 5 years ago
Fix assert that depends on POSIX-y file handling

9ad8cbf1f65836e63aadbafaf7236b7740f1a422 authored about 5 years ago
Don't use debug.log in pytest

pytest does not reset the state of logging if we install a file handler,
which will cause FileNo...

123fde174d4a51833475a5639bf5f5b34a419b79 authored about 5 years ago
Allow pdfminer.six 20200104 and update recommended versions

fd991a2380f1803924b1b8192e42e67a80998dde authored about 5 years ago
Also generate log file in temp folder on verbose mode

6f5d77d930933ee58b9e9fc1347dcb3239ff3981 authored about 5 years ago
docs: mention pdfgrep too

5169ac633b49719a6d03d14cba7e06f9ce71c84b authored about 5 years ago
lept: improve lib not found error message

Closes #471

5b6ab1e003234f0dd8cb8552f319c5850ecd2d80 authored about 5 years ago
docs: add note on limitations of sidecar file

8f984bf9589025c5651fc9f3cc771de12418c727 authored about 5 years ago
Eliminate last use of PyPDF2 from test suite

9c5f0d0ec60bde8f5782d64b9aa74208658e9605 authored about 5 years ago
tests: improve tesseract coverage

32041c43e1f1751eb92b69a0757e18e38bbbf24d authored about 5 years ago
tesseract: don't explicitly set lstm_use_matrix

Apparently tesseract does this on its own as needed.

599028bebbafdaf1c3b320fe93a806d9bcb9dc62 authored about 5 years ago
logging: always log process arguments and stderr when at debug

Also remove ad-hoc logging of this information.

6faa8f72215ff8d2ad49eed078b3b835e9156642 authored about 5 years ago
logging: fix incorrect usage: logging.Logger()

a4dc5e365f3741f318cea6e192c5a40929c693df authored about 5 years ago
logging: create a debug log when -k parameter is issued

e2a563cc761a6e2201b202f1a7b62d40fe12fa2d authored about 5 years ago
tests: use smaller files for ghostscript

1037d73efbf7cc91856568fffb6bbf12008e83e4 authored about 5 years ago
tests: skip tests not compatible with coverage

For reasons not entirely clear, stdout will get some data injected when
pytest-cov is running. O...

aeb7b142a96937be8177a7446bb88599a870b89d authored about 5 years ago
Remove session scope from fixtures

pytest seems to prepare os.environ in complex ways, so we want to ensure
these fixtures are not ...

422ea9777ecc321201799348005fdcbf8b3f92aa authored about 5 years ago
Rewrite main pool loop

pytest-cov documentation recommends using explicit
management of multiprocessing.Pool rather tha...

2f1c743227b062b08a983d9e235f47f1e8e3f3f6 authored about 5 years ago
Try to set up subprocess coverage better

96ee21aee92b2a6389127ea5d2cfba7b68c8ca78 authored about 5 years ago
tests: fix problems with ghostscript spoofers

4b759af6ffbc5637deedc7856f3a804722a046f3 authored about 5 years ago
test: environment warnings/cleanup

25d2b0cda4cb50de4cc0f13b14af555b46d4e470 authored about 5 years ago
ghostscript: don't delete output_file that will never exist

We stream output now, so no point in deleting.

16dd8b54a8d2dfdfd3eb576d8b96af62bd03651f authored about 5 years ago
tests: remove some obscure things from coverage

c4dc5269d29442e1b4fe62303b40adc9595fba14 authored about 5 years ago
tests: test TqdmConsole

c36e9950ae2b64f1e4df3ea724803f4036a1a983 authored about 5 years ago
tests: AcroForm test case did not work correctly; fixed

0c0d53b10ff72c79c76d571ee97f9f68bd721c0a authored about 5 years ago
Improve error message for unreadable input files

63de7e1677819b187923dcce7628ae5209cd6e4a authored about 5 years ago
tests: add coverage for helpers

b0e92760a24539fe9426093a7a516c0ea5a6e58c authored about 5 years ago
Update completions

054c0773a3c8ec6fe0bab6dc586d4f0217263e57 authored about 5 years ago
docs: fix obsolete statement to "brew install tesseract-lang"

Closes #469

89aa78b724ca935e1fef9f3475eae4fbc6eeb6c0 authored about 5 years ago
Windows: Remove Program Files cache from ocrmypdf.exec

@lru_cache doesn't work here, so let's just remove it.

708113a514924391343a90fd52f36990fc4ac20b authored about 5 years ago
azure: tweak windows script

95ef5410c23c38ccc3bebcf3c7827df45db01d4f authored about 5 years ago
exec/init: os.get_exec_path() returns list not str

868b3b4abde71927ccd576e552aa9fd9e77654bf authored about 5 years ago
azure: homebrew broke something to do with python@2?

045bdff95a636bc71c8521d01b989a88b332aeb2 authored about 5 years ago
v9.3.0 release notes

d12b27ac1ddc571aef1f7ba6be15f6c4ec500868 authored about 5 years ago
Add improved example demonstrating watched folder functionality

Closes #466

e4e00de79fb07129a1c1962d181372a34748a68f authored about 5 years ago
Fix exception on parsing Ghostscript error messages

a53a3937c2b891e0e4e5f11383cce3a13b9f87f3 authored about 5 years ago
azure: only publish code coverage for macOS

macOS (due to Homebrew) currently has the most comprehensive code
coverage. Azure's code coverag...

343424b4d2e51425eb5075a52682ec2ede36ef45 authored about 5 years ago
Sort imports

c5edff2c2ff2a4e8082be567e0881dc1c4d58296 authored about 5 years ago
Add isort to precommit

8c5f8b8ddd81ef34f972866336dc52eaf959d78e authored about 5 years ago
Look in Program Files for executables and liblept5.dll

39da931a565c7f6e6c46b2d81ba320095c3191b9 authored about 5 years ago
Generally update documentation about available platforms

9fe354359b8885b25bf811c9f7979cdd5f3990d8 authored about 5 years ago
Remove command line qpdf from azure and travis

facc4750bc68ca583203a87d91dd13cecbfb5ec7 authored about 5 years ago
v9.2.0 release notes and docs

437c2357385ce70ac6cf62aae5095142796d1a3c authored about 5 years ago
Use pikepdf to perform qpdf.check()

9559b0b18660202111600505f42884b2ad38f38b authored about 5 years ago
pdfa.py: Fix misleading comment

91456e19a4b58a517b4fbcc7b86de0a956833eca authored about 5 years ago
Improve help messages for Windows

a2d89f67c4846136b06b185f424cda1480e87156 authored about 5 years ago
Fixed case where page image was not converted to JPEG

If a preprocessing option was used,
and all original images on the page were JPEGs,
and --output...

f34130d193f00f53c2b0d1a1dc7d214f4cb887b0 authored about 5 years ago
Improve test coverage of _sync.py

c5571388e2d0976bce142b098af33a2ad20d6e24 authored about 5 years ago
docs: improvements for Windows

9af59c0d6d47e6c5716f25278dc60564c4dfe836 authored about 5 years ago
azure: fix extra build step

55ae838cb75b442963cc7ee649d28ee1d2393b23 authored about 5 years ago
tests: split out preprocessing tests

607eee198d81a614f67e50d8f11fdde35b39e2e3 authored about 5 years ago
docs: more install notes

c434b97f55a4fb39d80f098920c7aef9dd78391b authored about 5 years ago
tests: speed up several slow tests

5e2a7f8a56bb010de44fd96b12d96dc5b900c7cb authored about 5 years ago
Add Azure Pipelines CI/CD

fd9550acdacfe8d63bf54f6a5d2675313e9d8ba4 authored about 5 years ago
Address tests that fail on Windows with Python 3.7 or 3.6

7be293f628f0f13fbb2c2ba45443fce115d1050d authored about 5 years ago
Fix close_fds=True on Windows Python 3.6

65855dc14c1b1c4573f47f5cd14fb5552fd664c3 authored about 5 years ago
ghostscript: document need to write to stdout when using txtwrite

b354511ac92c0ec988eef556dd145177b8151564 authored about 5 years ago
Suppress duplicate error messages from Ghostscript

cac4a8b9b67b3b513422274df439e5b158cce1ab authored about 5 years ago
Ignore mypy cache

17d97b354adcfe36d99d88ba7dd11603bf9c50b4 authored about 5 years ago
Add typing hints for ocr() function

1c1b60fa9f4a5392be917518b32ac1cb0296bfbf authored about 5 years ago
Merge branch 'windows'

6b745d892fbdd21978c8d7410d2074c7791c070c authored about 5 years ago
Remove Tesseract < 4.0 specific check

fbf271a3ecdbb6c3303bc869becb2a737ea75578 authored about 5 years ago
Possible fix to loss of log adapter state

8077718804c5e7978291f4d24c2f05d1f1029a58 authored about 5 years ago
docs: caution about using Windows in production

66bda3420a2fbd13fe6ae2aa5089fee556e3deea authored about 5 years ago
Document function of symlink shim

f6510e2b1512a2c760256c42fe57b5e0b8e68612 authored about 5 years ago
Tesseract no longer posts an error message if config file not found

51abd791363116b0fd4150e247a4e2bd99e067df authored about 5 years ago
tests: error message from tesseract change

5607429d9a5e6e2bbfdee50b001681def84db5ed authored about 5 years ago
Fix DecompressionBomb related errors due to Windows process differences

b8b7ecfe7f7d05d30037f4f2d9ac97d24c952c94 authored about 5 years ago
Add Windows install advice

cb3cfaa055e2f6b2fca99657daf4f60ec4b1dbd7 authored about 5 years ago
Remove test_bad_utf8

Due to difficulties of getting this to work on Python 3.8, Windows, and
high probability that th...

9db01c7ff5cdea61f5b5d807fd9d5493754ad8b9 authored about 5 years ago
Make test_german more Windows-friendly

cff37bf6814d3ea44c96f1bd4f43cee3e77e1113 authored about 5 years ago
Don't expect filenames to be replicated on NT

66d04dd6e32c7930baec14ebf3fbb008992c1c1a authored about 5 years ago
docs: sketch Windows install procedure

d4abe88452e07f919406246c9d1e4b1926c8472b authored about 5 years ago
ghosttext: mention page number differences

d0301813cc30f4282f2d360de04426e544da7ec5 authored about 5 years ago
Use _OCRMYPDF_TEST_PATH for testing and .py stubs to simulate symlinks

06a1f987d499f855d1576c3c67a4e49483c3e40d authored about 5 years ago
ghostscript: Refactor checking for executable name on Windows

e51e21c6b6422fdf96287d296f1f1bae5efe7a96 authored about 5 years ago
ghostscript: use run(check=True) for more consistent error handling

c5fa72bd4ec7d46dad864263ab1e2883dc1a5282 authored about 5 years ago
Remove os_environ() context manager

43ab7c88d7604d7c9b6f59d5eb15a21699879c42 authored about 5 years ago
ghostscript: don't use NamedTemporaryFile

Temporary files are more awkward for Windows.

d249aef57d60fd38b07613c483aca5b669bbb7fa authored about 5 years ago
ghostscript: use correct executable name on Windows

bf99587aa1024255907eff74cb7bdbf532da91f7 authored about 5 years ago
test: Replace many instances of run_ocrmypdf in subprocess with inline

fde550f9a708d62bde595b4a056cac5d51f3f1a6 authored about 5 years ago
Move gs tests to test_ghostscript

ca9669742d632ff1c773e45921d1c98401727441 authored about 5 years ago
Enforce str-only environment for Windows since it's more strict

0cd424ffcbecf3b85e30c6476f1145e68b278549 authored about 5 years ago
Don't worry about closed streams on Windows

8a1dddc3eeec9e7ca3bdbe9921ba72346d40bf48 authored about 5 years ago
Fix test_metadata: use mmap in a Windows and POSIX compatible way

a3726e4ce3ba092b8f9981814fa690aa89ca6670 authored about 5 years ago
tests: a few Windows fixes

37f6f72df3aad5e24fb1db1792371ef87ed038a5 authored about 5 years ago
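
One of the commits listed above is titled "ghostscript: use run(check=True) for more consistent error handling". The following is a minimal sketch of that general pattern using Python's standard subprocess module. It is not OCRmyPDF's actual code: the helper name run_tool and the choice to re-raise as RuntimeError are assumptions made for illustration, and the current Python interpreter stands in for an external tool such as Ghostscript.

```python
import subprocess
import sys


def run_tool(args):
    """Run an external command and return its stdout as text.

    Hypothetical helper, for illustration only. With check=True,
    subprocess.run raises CalledProcessError on any non-zero exit
    status, so every failure is handled in one place.
    """
    try:
        proc = subprocess.run(
            args,
            check=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
    except subprocess.CalledProcessError as e:
        # e.stderr holds the bytes the failed process wrote to stderr
        message = e.stderr.decode(errors="replace").strip()
        raise RuntimeError(
            f"{args[0]} exited with status {e.returncode}: {message}"
        ) from e
    return proc.stdout.decode(errors="replace")


if __name__ == "__main__":
    # The current interpreter is used as a stand-in for an external tool
    print(run_tool([sys.executable, "-c", "print('ok')"]))
```

Without check=True the caller has to inspect returncode after every call, and a forgotten check silently ignores the failure. Raising on failure keeps the success path free of status checks.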