Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Collective - Host: opensource - https://opencollective.com/ocrmypdf - Code: https://github.com/jbarlow83/OCRmyPDF
github.com/ocrmypdf/OCRmyPDF - efce7de9aea113015babdb0a301bcb2b5bd47c0c authored almost 11 years ago by fritz-hh <[email protected]>
concatenation is now done also with ghostscript
github.com/ocrmypdf/OCRmyPDF - 38c64ac689aa9a3d40d0b786c8dbd38fc4e6e5c9 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 6d203e3eee87137220ca0275a95ad2b2973f9126 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 81f461e5576af86859fd55a452bbb3071afd7cfc authored almost 11 years ago by fritz-hh <[email protected]>
mktemp: consider both FreeBSD/OSX and Linux OS having incompatible
syntax
From now on temporary ...
github.com/ocrmypdf/OCRmyPDF - aedbabdbe8d0965612a4eaeb409e33b56a510808 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 6ed53e53c7d95d9245f372b60204fdb078fb9834 authored almost 11 years ago by fritz-hh <[email protected]>
Signed-off-by: Mansour Behabadi <[email protected]>
github.com/ocrmypdf/OCRmyPDF - a78630ce99c9ec8844f3c6a14b1a195b535909c0 authored almost 11 years ago by Mansour Behabadi <[email protected]>
Signed-off-by: Mansour Behabadi <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 6653066784275ccacd2d600405b23bf88d0b9e41 authored almost 11 years ago by Mansour Behabadi <[email protected]>
github.com/ocrmypdf/OCRmyPDF - e40f1fa0811e9fb06100c548453c5aa406a6a8cc authored almost 11 years ago by fritz-hh <[email protected]>
to be make which parameters are allowed to be changed by the user
github.com/ocrmypdf/OCRmyPDF - a872ce751d33f9a7faf88eb9927839b3f19ae2b5 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 317846fbdca262026c76c6c1bb72836fae95791b authored almost 11 years ago by fritz-hh <[email protected]>
Fix temporary folder name generation collisions
github.com/ocrmypdf/OCRmyPDF - f581a5554416cc00dfe659b6edba9942eaf93ba0 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 447b291e70aa45ab74ff3d7484db834afcfb732f authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 01d07253e84c92753f9ca4f31cce201fe291ff9e authored almost 11 years ago by fritz-hh <[email protected]>
Fix AttributeError on self.width if Tesseract finds no OCR text
github.com/ocrmypdf/OCRmyPDF - 034a4660942bf25072821c79de5907bbaa8d4502 authored almost 11 years ago by fritz-hh <[email protected]>
Verify that pdftoppm is the Poppler version, not xpdf version
github.com/ocrmypdf/OCRmyPDF - c6211e23354de7110f66f1b94e6b2d39e31f8ccd authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 1d03a6417dce15b179e72f2e259a328d8d8bcedc authored almost 11 years ago by Jim Barlow <[email protected]>
self.width remains undefined unless hOCR finds text. It might not, if
a page contains only an i...
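The failure mode described in this message can be illustrated with a minimal Python sketch. The class and method names below are hypothetical and are not taken from the OCRmyPDF source. The sketch only shows why an attribute assigned inside a conditional path can be missing later, and one common way to guard against it.

```python
class PageGeometry:
    """Hypothetical stand-in for a parser that reads page size from OCR word boxes."""

    def __init__(self, words):
        # Assign defaults up front so the attributes always exist,
        # even when the OCR engine reports no text for the page.
        self.width = None
        self.height = None
        for x0, y0, x1, y1 in words:
            # Only reached when at least one word box was found.
            self.width = max(self.width or 0, x1)
            self.height = max(self.height or 0, y1)

    def has_text(self):
        # True only if at least one word box set the page extent.
        return self.width is not None


empty_page = PageGeometry([])                  # page with no recognized text
text_page = PageGeometry([(10, 20, 110, 40)])  # page with a single word box
```

Without the two default assignments, reading `empty_page.width` would raise `AttributeError`, which is the kind of error the commit title refers to.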
First, the regular expression matches everything after the first period
in a filename. Adding t...
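The problem named in this message, a pattern that matches everything after the first period in a filename, can be reproduced in a few lines of Python. The message is truncated, so the actual fix is not known; the function names here are hypothetical, and `os.path.splitext` is shown only as one alternative that splits at the last period instead.

```python
import os
import re


def strip_after_first_period(name):
    # The pattern starts at the FIRST period and consumes the rest of the
    # string, so a dotted base name loses more than its extension.
    return re.sub(r"\..*$", "", name)


def strip_last_extension(name):
    # os.path.splitext splits only at the LAST period of the final component.
    return os.path.splitext(name)[0]


sample = "scan.2014.pdf"
first_period_result = strip_after_first_period(sample)
last_period_result = strip_last_extension(sample)
```

For a name with a single period the two functions agree; they differ only when the base name itself contains periods.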
github.com/ocrmypdf/OCRmyPDF - bf02ee3bdc7399fd5d34b9229d493eff6b754671 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - a3c7fba02d5cff767eb7174707cadb2562e75d32 authored almost 11 years ago by fritz-hh <[email protected]>
Tell the script that "nbImg" is a number, so that leading/trailing
spaces are removed
github.com/ocrmypdf/OCRmyPDF - 20c008b84fd0383929f788c01e43592bb4c8e55c authored almost 11 years ago by fritz-hh <[email protected]>
Check if reportlab and lxml are installed, otherwise exit with an error
github.com/ocrmypdf/OCRmyPDF - 7cd73566bef8f93385267da4e87fb03b7819e6fc authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - e56fd53d0649147d27a499abb93ecdd5e7ad8448 authored almost 11 years ago by fritz-hh <[email protected]>
Fix pdffonts error when filename contains a space
github.com/ocrmypdf/OCRmyPDF - 810b1b3b3e335e944524624f840df961c7b63ee9 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - cb0b033fe76ee91605eece0d71d3426b67c57d74 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 46f673a3b7696b2b4357d39ff732b3cfb2c1939d authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 455303b3d47a413086be96641ad1535d6ce1fb6e authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 24a84d63803adb1b46ee7b252ed7e66cbc9683ae authored almost 11 years ago by Jim Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 9aa21710522896f6aaacc47e383b95f3ebe75181 authored almost 11 years ago by Jim Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 3a46ea1f3660536ccb1ddcb1326795d3b30563f5 authored almost 11 years ago by Jim Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - d33779f301d4c116b5d179a70407cd8699f63335 authored almost 11 years ago by Jim Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - d6ea0793b8434c6a879f22b1ce506ac47a65eba5 authored almost 11 years ago by Jim Barlow <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 4e5e5bb92539f56a4043bbabc9ab7257c5ffcf15 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 3232ed8e38c98ba23d7d61cdf72dcd286d7a3cea authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 29d6748af8f372ddc2360420a355b0bbc0b8a22a authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 828f1950716d96c6e455974ef16ca7d148f4cf35 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - b0b7e327830030d56b8bc24c6b1573ed8623b858 authored almost 11 years ago by fritz-hh <[email protected]>
fixes #41
versions older than 3.02.02 are known to produce invalid hocr output (in
some cases)
github.com/ocrmypdf/OCRmyPDF - 940a016e952f7faab737d008d80faf02ee8abb8c authored almost 11 years ago by fritz-hh <[email protected]>
If deskew and/or cleanup is not requested, do not copy the files, but
just create symbolic link....
github.com/ocrmypdf/OCRmyPDF - 54f47ab89bb4c7ef8117e6d667c227191c2106e6 authored almost 11 years ago by fritz-hh <[email protected]>
In order to have the debug page after the normal panel in the final PDF
file
github.com/ocrmypdf/OCRmyPDF - 414c4e3f3cad4c12278f30211d514c68e268fa2b authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 6a9f38d31eaae7338292736bb2cd8fa06e89a9c1 authored almost 11 years ago by fritz-hh <[email protected]>
The x/y resolutions are not computed separately anymore.
We do not check anymore if x and y reso...
github.com/ocrmypdf/OCRmyPDF - 8a1241ba44e7d9aa028b42f45f2455196bfc8dca authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 7eab052e0f9d19f279fc46670880ab026ad2d658 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 552d19e36b01e74f8efbf93ebf3cdbe5aae0cf76 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 463b04e795ac5eae256cc6f3ea4e9ba401d85c7f authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - c0d8508264afe6e856d25c7c55ce046a3cc638a1 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 6ef4ba31e21dbac8f35f23d50539967948d38e35 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 10a3d26291a9f2ff74253b50651ece0b8a44635c authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - ab994b32eefc0dbc0dd65b31973461e69cda4171 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 9352b71d787c7309983cec21cf63a2040be3c9e3 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 71593421ed6159505b80ea924f922f797192910d authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 2754970f37d5d2302851080d4c91d37e5dcced4a authored almost 11 years ago by fritz-hh <[email protected]>
Fixes #16
github.com/ocrmypdf/OCRmyPDF - 5945454597cb11df2fb8e4e75f8368055b03d4be authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 884dbce712c6107405e4feb6aa766e697f69a302 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 8ee1bc659821c7a62b0e0e538cef8997e0e4792a authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 7d76c467312485e3814df0242e13ccd5be35d8a8 authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - f8ccf42c06b4040e10d7a45562922273877e447d authored almost 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 0abe0f1f10c38d6ed51a7f6fe75b360cacde7c5c authored about 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - ee8a5d80ff17d929d6319bb17edf538247d576cc authored about 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - f08893b5c8dfa5e6af12d30df626920d52011d33 authored about 11 years ago by fritz-hh <[email protected]>
Fixes #35
github.com/ocrmypdf/OCRmyPDF - 41cd88506e040597b0252936ee4f91837bc34dbf authored about 11 years ago by fritz-hh <[email protected]>
file committed by mistake... So deleting it now
github.com/ocrmypdf/OCRmyPDF - 081223b1381811bac5be7ed8b7df9b4891aa0c8a authored about 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 4e60c9ba0964e0bcad87d8c796951e16a48d0a46 authored about 11 years ago by fritz-hh <[email protected]>
- Oversampling resolution can now be set from the cmd line (-o option)
- If a page contains more...
- If resolution is too low (<250dpi) perform automatic oversampling of
the image
- comments impr...
github.com/ocrmypdf/OCRmyPDF - 045362425f2170c7a4438e39fadeed29371ab8b8 authored about 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 2b2637fbc3f43710a0ea2d155c7aea8cf6898601 authored about 11 years ago by fritz-hh <[email protected]>
- dpi computation moved to a dedicated function
- do not exit in case of resolution mismatch (f...
github.com/ocrmypdf/OCRmyPDF - 407670e1f3a5e1a2023146885962ec7f6986b40f authored about 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - d0671d81b5c91fa97534165e6b25f509633a63ca authored about 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 7a74ebbcc3cf60633a03802bc8a12e4e9b88d9d9 authored about 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 9e698003326e5a7e1266b7f85d998b3436641a66 authored about 11 years ago by fritz-hh <[email protected]>
== does not exist in bourne shell
github.com/ocrmypdf/OCRmyPDF - 7542188592a46252bc817b88db16a4c50aa1fcb8 authored about 11 years ago by fritz-hh <[email protected]>
tell GNU parallel to protect against evaluation by the sub shell (-q
flag).
This is required in ...
- Constants moved to config.sh
- Use "python2" cmd instead of "python"
- few other minor changes
github.com/ocrmypdf/OCRmyPDF - 50dee556069f01ddd22e5ecdd8209fff1d4d5cf1 authored about 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - da5cd01fe48cb62954f38200fe677d4ec379c017 authored over 11 years ago by fritz-hh <[email protected]>
new feature: Process several pages in parallel if more than one CPU core
is available
github.com/ocrmypdf/OCRmyPDF - 88ddeb1fb64a72ce8e95bf293c4d0e0d51b3079e authored over 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - f9e2e74bf3846af7a05bb9ed1035fd9a4427d7fd authored over 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - 87e01aff607c56ad7e315669a22af9b745708d83 authored over 11 years ago by fritz-hh <[email protected]>
Removed feature to add metadata in final pdf file (because it led to a
final PDF file that doe...
- basic implementation of parallel page processing using GNU parallel
- processing around 40% fa...
Conflicts:
OCRmyPDF.sh
Fixes #31
github.com/ocrmypdf/OCRmyPDF - 064d4be83cf406a51260e5ac75c4d6531d5e3b75 authored over 11 years ago by fritz-hh <[email protected]>
fixes #31
github.com/ocrmypdf/OCRmyPDF - ab536d5678b8441f8830d5add1603d25f620f4fd authored over 11 years ago by fritz-hh <[email protected]>
Required to perform OCR of several pages in parallel (using GNU
parallel)
github.com/ocrmypdf/OCRmyPDF - f7923a9761e22a5156000b8ee0c7f1bbf21e7f22 authored over 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - fd52650255060b9dabaf6b5665637ac5e147a5ba authored over 11 years ago by fritz-hh <[email protected]>
github.com/ocrmypdf/OCRmyPDF - f0fe2951752fd23544e2bed787e47cd6b6e7a85f authored over 11 years ago by fritz-hh <[email protected]>
.gitignore file corrected, because it prevented some required jhove
binary files from being chec...
github.com/ocrmypdf/OCRmyPDF - 5aa27343e0d399163e1cc5b719c27ae86045bdb6 authored over 11 years ago by fritz-hh <[email protected]>
Deleted a number of jhove files that are not required
(documentation and java source code mainly)
...
github.com/ocrmypdf/OCRmyPDF - e4ffb58269a5ce55e223eb4dd87c6f2b728c54ad authored over 11 years ago by fritz-hh <[email protected]>