Docker Brew



The easiest way to install OCRmyPDF is to follow the steps for your operating system/platform. This version may be out of date, however.

These platforms have one-liner installs:

Platform                      Install command
Debian, Ubuntu                apt install ocrmypdf
Windows Subsystem for Linux   apt install ocrmypdf
Fedora                        dnf install ocrmypdf
macOS                         brew install ocrmypdf
LinuxBrew                     brew install ocrmypdf
FreeBSD                       pkg install py37-ocrmypdf
Conda (WSL, macOS, Linux)     conda install ocrmypdf

More detailed procedures are outlined below. If you want to do a manual install, or install a more recent version than your platform provides, read on.

Platform-specific steps

  • Installing on Linux
  • Installing on macOS
  • Installing on Windows
  • Installing with Python pip
  • Installing HEAD revision from sources
OCRmyPDF versions in Debian & Ubuntu

Users of Debian 9 (“stretch”) or later, or Ubuntu 18.04 or later, including users of Windows Subsystem for Linux, may simply run apt install ocrmypdf.

As indicated in the table above, Debian and Ubuntu releases may lag behind the latest version. If the version available for your platform is out of date, you could opt to install the latest version from source. See Installing HEAD revision from sources. Ubuntu 16.10 to 17.10 inclusive also had ocrmypdf, but these versions are end of life.

For full details on version availability for your platform, check the Debian Package Tracker or Ubuntu launchpad.net.

Note

OCRmyPDF for Debian and Ubuntu currently omits the JBIG2 encoder. OCRmyPDF works fine without it but will produce larger output files. If you build jbig2enc from source, ocrmypdf 7.0.0 and later will automatically detect it (specifically the jbig2 binary) on the PATH. To add JBIG2 encoding, see Installing the JBIG2 encoder.

OCRmyPDF version

Users of Fedora 29 or later may simply run dnf install ocrmypdf.

For full details on version availability, check the Fedora Package Tracker.

If the version available for your platform is out of date, you could opt to install the latest version from source. See Installing HEAD revision from sources.

Note

OCRmyPDF for Fedora currently omits the JBIG2 encoder due to patent issues. OCRmyPDF works fine without it but will produce larger output files. If you build jbig2enc from source, ocrmypdf 7.0.0 and later will automatically detect it on the PATH. To add JBIG2 encoding, see Installing the JBIG2 encoder.

Ubuntu 20.04 includes ocrmypdf 9.6.0 - you can install that with apt. To install a more recent version, uninstall the system-provided version of ocrmypdf, and install the following dependencies:

To install ocrmypdf for the system:

To install for the current user only:

Ubuntu 18.04 includes ocrmypdf 6.1.2 - you can install that with apt, but it is quite old now. To install a more recent version, uninstall the old version of ocrmypdf, and install the following dependencies:

We will need a newer version of pip than was available for Ubuntu 18.04:

Then install the most recent ocrmypdf for the local user and set the user’s PATH to check for the user’s Python packages.

To add JBIG2 encoding, see Installing the JBIG2 encoder.

No package is available for Ubuntu 16.04. OCRmyPDF 8.0 and newer require Python 3.6. Ubuntu 16.04 ships Python 3.5, but you can install Python 3.6 on it. Or, you can skip Python 3.6 and install OCRmyPDF 7.x or older - for that procedure, please see the installation documentation for the version of OCRmyPDF you plan to use.

Install system packages for OCRmyPDF

This will install a Python 3.6 binary at /usr/bin/python3.6 alongside the system’s Python 3.5. Do not remove the system Python. This will also install Tesseract 4.0 from a PPA, since the version available in Ubuntu 16.04 is too old for OCRmyPDF.

Now install pip for Python 3.6. This will install the Python 3.6 version of pip at /usr/local/bin/pip.

Install OCRmyPDF

OCRmyPDF requires the locale to be set for UTF-8. On some minimal Ubuntu installations, such as the Ubuntu 16.04 Docker images, it may be necessary to set the locale.
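A minimal sketch of setting a UTF-8 locale for the current shell session. It assumes the C.UTF-8 locale is present on the image, which is typical for stock Debian and Ubuntu images but not guaranteed elsewhere; substitute another installed UTF-8 locale if it is not.

```shell
# Assumption: the C.UTF-8 locale exists on this system.
export LC_ALL=C.UTF-8
export LANG=C.UTF-8

# Print the resulting settings as a quick sanity check.
echo "LC_ALL=$LC_ALL LANG=$LANG"
```

These exports only affect the current session; to make them persistent they would have to go into a shell profile or the image's environment.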

Now install OCRmyPDF for the current user, and ensure that the PATH environment variable contains $HOME/.local/bin.
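The PATH requirement could be met with something like the following sketch, which prepends $HOME/.local/bin only when it is not already present. The pip3 command is left as a comment because it needs network access; it is the same user-level install used elsewhere in this guide.

```shell
# Prepend the per-user script directory to PATH, but only once.
case ":$PATH:" in
    *":$HOME/.local/bin:"*) ;;                    # already present, nothing to do
    *) export PATH="$HOME/.local/bin:$PATH" ;;
esac

# Then install for the current user:
# pip3 install --user ocrmypdf
```

The change lasts for the current shell only; adding the same lines to a shell profile would make it permanent.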

To add JBIG2 encoding, see Installing the JBIG2 encoder.

There is an Arch User Repository (AUR) package for OCRmyPDF.

Installing AUR packages as root is not allowed, so you must first set up a non-root user and configure sudo. The standard Docker image, archlinux/base:latest, does not have a non-root user configured, so users of that image must follow these guides. If you are using a VM image, such as the official Vagrant image, this work may already be completed for you.

Next you should install the base-devel package group. This includes the standard tooling needed to build packages, such as a compiler and binary tools.

Now you are ready to install the OCRmyPDF package.

At this point you will have a working install of OCRmyPDF, but the Tesseract install won’t include any OCR language data. You can install the tesseract-data package group to add all supported languages, or use that package listing to identify the appropriate package for your desired language.

As an alternative to this manual procedure, consider using an AUR helper. Such a tool will automatically fetch, build and install the AUR package, resolve dependencies (including dependencies on AUR packages), and ease the upgrade procedure.

If you have any difficulties with installation, check the repository package page.

Note

The OCRmyPDF AUR package currently omits the JBIG2 encoder. OCRmyPDF works fine without it but will produce larger output files. The encoder is available from the jbig2enc-git AUR package and may be installed using the same series of steps as for installing the OCRmyPDF AUR package. Alternatively, it may be built manually from source following the instructions in Installing the JBIG2 encoder. If JBIG2 is installed, OCRmyPDF 7.0.0 and later will automatically detect it.

To install OCRmyPDF for Alpine Linux:

There is no OS-level packaging available for Mageia, so you must install the dependencies:

To install ocrmypdf for the system:

    # As root user
    pip3 install ocrmypdf
    ldconfig

Or, to install for the current user only:

    export PATH=$HOME/.local/bin:$PATH
    pip3 install --user ocrmypdf

See the Repology page.

In general, first install the OCRmyPDF package for your system, then optionally use the procedure Installing with Python pip to install a more recent version.

OCRmyPDF is now a standard Homebrew formula. To install on macOS, run brew install ocrmypdf.

This will include only the English language pack. If you need other languages you can optionally install them all:

Note

Users who previously installed OCRmyPDF on macOS using pip install ocrmypdf should remove the pip version (pip3 uninstall ocrmypdf) before switching to the Homebrew version.

Note

Users who previously installed OCRmyPDF from the private tap should switch to the mainline version (brew untap jbarlow83/ocrmypdf) and install from there.

These instructions probably work on all macOS versions supported by Homebrew, and are for installing a more current version of OCRmyPDF than is available from Homebrew. Note that the Homebrew versions usually track the release versions fairly closely.

If it’s not already present, install Homebrew.

Update Homebrew:

Install or upgrade the required Homebrew packages, if any are missing. To do this, use brew edit ocrmypdf to obtain a recent list of Homebrew dependencies. You could also check the azure-pipelines.yml.

This will include the English, French, German and Spanish language packs. If you need other languages you can optionally install them all:

Update the Homebrew pip:

You can then install OCRmyPDF from PyPI, for the current user:

or system-wide:

The command line program should now be available:

Note

Administrator privileges will be required for some of these steps.

You must install the following for Windows:

  • Python 3.7 (64-bit) or later
  • Tesseract 4.0 or later
  • Ghostscript 9.50 or later

Using the Chocolatey package manager, install the following when running in an Administrator command prompt:

  • choco install python3
  • choco install --pre tesseract
  • choco install ghostscript
  • choco install pngquant (optional)

The commands above will install Python 3.x (latest version), Tesseract, Ghostscript and pngquant. Chocolatey may also need to install the Windows Visual C++ Runtime DLLs or other Windows patches, and may require a reboot.

You may then use pip to install ocrmypdf. (This can be performed by a user or Administrator.):

  • pip install ocrmypdf

Chocolatey automatically selects appropriate versions of these applications. If you are installing them manually, please install 64-bit versions of all applications for 64-bit Windows, or 32-bit versions of all applications for 32-bit Windows. Mixing the “bitness” of these programs will lead to errors.

OCRmyPDF will check the Windows Registry and standard locations in your Program Files for third party software it needs (specifically, Tesseract and Ghostscript). To override the versions OCRmyPDF selects, you can modify the PATH environment variable. Follow these directions to change the PATH.

Warning

As of early 2021, users have reported problems with the Microsoft Store version of Python affecting most third party Python packages, including OCRmyPDF. Please use Python downloaded from Python.org or Chocolatey as recommended here.

  1. Install Ubuntu 18.04 for Windows Subsystem for Linux, if not already installed.
  2. Follow the procedure to install OCRmyPDF on Ubuntu 18.04.
  3. Open the Windows command prompt and create a symlink:

Then confirm that the expected version from PyPI is installed:

You can then run OCRmyPDF in the Windows command prompt or PowerShell, prefixing wsl, and call it from Windows programs or batch files.

First install the following prerequisite Cygwin packages using setup-x86_64.exe:

Note

The Cygwin package for Ghostscript in versions 9.52 and 9.52-1 contained a bug that caused an exception to occur when ocrmypdf invoked gs. Make sure you have either 9.50 (or earlier) or 9.52-2 (or later).

Then open a Cygwin terminal (i.e. mintty) and run the following commands. Note that if you are using the version of pip that was installed with the Cygwin Python package, the command name will be pip3. If you have since updated pip (with, for instance, pip3 install --upgrade pip), the command is likely just pip instead of pip3:
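Because the command name depends on how pip was installed, a small helper can pick whichever one is present. This is only a sketch; find_pip is a hypothetical helper name, not something shipped with Cygwin or OCRmyPDF.

```shell
# find_pip: print the first pip command name found on PATH (pip3, then pip).
# Returns non-zero and prints nothing when neither is available.
find_pip() {
    for candidate in pip3 pip; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Example use (commented out because it needs network access):
# "$(find_pip)" install ocrmypdf
```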

The optional dependency “unpaper” is currently not available under Cygwin. Without it, certain options such as --clean will produce an error message. However, the OCR-to-text-layer functionality is available.

You can also install the Docker container on Windows. Ensure that your command prompt can run the docker “hello world” container.

FreeBSD 11.3, 12.0, 12.1-RELEASE and 13.0-CURRENT are supported. Other versions likely work but have not been tested.

To install a more recent version, you could attempt to first install the system version with pkg, then use pip install --user ocrmypdf.

For some users, installing the Docker image will be easier than installing all of OCRmyPDF’s dependencies.

See OCRmyPDF Docker image for more information.

OCRmyPDF is delivered by PyPI because it is a convenient way to install the latest version. However, PyPI and pip cannot address the fact that ocrmypdf depends on certain non-Python system libraries and programs being installed.

For best results, first install your platform’s version of ocrmypdf, using the instructions elsewhere in this document. Then you can use pip to get the latest version if your platform version is out of date. Chances are that this will satisfy most dependencies.

Use ocrmypdf --version to confirm what version was installed.
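The version check can be scripted. The sketch below compares the installed version against 7.0.0, the release this guide names as the first to auto-detect the jbig2 binary. It assumes that ocrmypdf --version prints only the version number and that sort supports -V (GNU coreutils and recent BSD/macOS versions do); version_ge is a hypothetical helper name.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Assumption: sort(1) supports -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v ocrmypdf >/dev/null 2>&1; then
    # Assumption: the output is a bare version string such as 9.6.0.
    installed=$(ocrmypdf --version)
    if version_ge "$installed" 7.0.0; then
        echo "ocrmypdf $installed is 7.0.0 or later"
    else
        echo "ocrmypdf $installed is older than 7.0.0"
    fi
else
    echo "ocrmypdf is not on PATH"
fi
```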

Then you can install the latest OCRmyPDF from the Python wheels. First try:

You should then be able to run ocrmypdf --version and see that the latest version was located.

Since pip3 install --user does not work correctly on some platforms, notably Ubuntu 16.04 and older, and the Homebrew version of Python, instead use this for a system wide installation:

Note

AArch64 (ARM64) users: this process will be difficult because most Python packages are not available as binary wheels for your platform. You’re probably better off using a platform install on Debian, Ubuntu, or Fedora.

OCRmyPDF currently requires these external programs and libraries to be installed. They must be provided by the operating system package manager; pip cannot provide them.

  • Python 3.6 or newer
  • Ghostscript 9.15 or newer
  • qpdf 8.1.0 or newer
  • Tesseract 4.0.0-beta or newer

As of ocrmypdf 7.2.1, the following versions are recommended:

  • Python 3.7 or 3.8
  • Ghostscript 9.23 or newer
  • qpdf 8.2.1
  • Tesseract 4.0.0 or newer
  • jbig2enc 0.29 or newer
  • pngquant 2.5 or newer
  • unpaper 6.1

jbig2enc, pngquant, and unpaper are optional. If they are missing, certain features are disabled. OCRmyPDF will discover them as soon as they are available.

jbig2enc, if present, will be used to optimize the encoding of monochrome images. This can significantly reduce the file size of the output file. It is not required. jbig2enc is not generally available for Ubuntu or Debian due to lingering concerns about patent issues, but can easily be built from source. To add JBIG2 encoding, see Installing the JBIG2 encoder.

pngquant, if present, is optionally used to optimize the encoding of PNG-style images in PDFs (actually, any that are losslessly encoded) by lossily quantizing to a smaller color palette. It is only activated when the --optimize argument is 2 or 3.

unpaper, if present, enables the --clean and --clean-final command line options.

These are in addition to the Python packaging dependencies, meaning that unfortunately, the pip install command cannot satisfy all of them.
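Since pip cannot install these programs, it can help to check which of them are visible on PATH before installing. The following is a sketch only: check_tool is a hypothetical helper, and apart from gs and jbig2, which this guide names, the binary names (qpdf, tesseract, pngquant, unpaper) are assumed to be the usual ones for each package.

```shell
# check_tool NAME: report whether NAME is on PATH; non-zero status if missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%-10s found at %s\n' "$1" "$(command -v "$1")"
    else
        printf '%-10s missing\n' "$1"
        return 1
    fi
}

# Required tools first, then the optional ones (jbig2, pngquant, unpaper).
for tool in gs qpdf tesseract jbig2 pngquant unpaper; do
    check_tool "$tool" || true
done
```

This only checks for presence; it does not verify the minimum versions listed above.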

If you have git and Python 3.6 or newer installed, you can install from source. When the pip installer runs, it will alert you if dependencies are missing.

If you prefer to build everything from source, you will need to build pikepdf from source. First ensure you can build and install pikepdf.

To install the HEAD revision from sources in the current Python 3 environment:

Or, to install in development mode, allowing customization of OCRmyPDF, use the -e flag:

You may find it easiest to install in a virtual environment, rather than system-wide:

However, ocrmypdf will only be accessible on the system PATH when you activate the virtual environment.

To run the program:

If not yet installed, the script will notify you about dependencies that need to be installed. The script requires specific versions of the dependencies. Versions older than the ones mentioned in the release notes are likely not to be compatible with OCRmyPDF.

To install all of the development and test requirements:

To add JBIG2 encoding, see Installing the JBIG2 encoder.

Completions for bash and fish are available in the project’s misc/completion folder. The bash completions are likely zsh compatible but this has not been confirmed. Package maintainers, please install these at the appropriate locations for your system.

To manually install the bash completion, copy misc/completion/ocrmypdf.bash to /etc/bash_completion.d/ocrmypdf (rename the file).

To manually install the fish completion, copy misc/completion/ocrmypdf.fish to ~/.config/fish/completions/ocrmypdf.fish.
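The two copy steps can be wrapped in a small function. This is a sketch under stated assumptions: install_completions is a hypothetical helper, its first argument is a local checkout of the OCRmyPDF sources, and the destination directories default to the paths named above (writing to /etc normally requires root).

```shell
# install_completions SRC [BASH_DIR] [FISH_DIR]
# Copies the completion files from SRC/misc/completion to the target directories.
install_completions() {
    src=$1
    bash_dir=${2:-/etc/bash_completion.d}
    fish_dir=${3:-$HOME/.config/fish/completions}

    mkdir -p "$bash_dir" "$fish_dir" || return 1
    # The bash file is renamed on copy: the .bash suffix is dropped.
    cp "$src/misc/completion/ocrmypdf.bash" "$bash_dir/ocrmypdf" || return 1
    cp "$src/misc/completion/ocrmypdf.fish" "$fish_dir/ocrmypdf.fish"
}

# Example (hypothetical checkout path):
# install_completions "$HOME/src/OCRmyPDF"
```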

You can install the CLI with a curl utility script, brew or by downloading the binary from the releases page. Once installed you'll get the faas-cli command and faas alias.

Linux or macOS

Utility script with curl:

The flag -E allows any http_proxy environment variables to be passed through to the installation bash script.

Non-root with curl downloads the binary into your current directory and will then print installation instructions:

Via brew:

Note

The brew release may not run the latest minor release but is updated regularly.

Windows

In PowerShell:

Environment variable overrides

Several overrides exist which will be used by default if set and no other command-line flag has been set.

  • OPENFAAS_TEMPLATE_URL - to set the default URL to pull templates from
  • OPENFAAS_PREFIX - for use with faas-cli new - this can act in place of --prefix
  • OPENFAAS_URL - to override the default gateway URL
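As an illustration, the overrides could be set in a shell session or CI job like this. Every value is a hypothetical placeholder, not a default or a recommendation.

```shell
# All values below are hypothetical placeholders.
export OPENFAAS_URL="https://gateway.example.com"                         # gateway to talk to
export OPENFAAS_PREFIX="example-user"                                     # used by faas-cli new in place of --prefix
export OPENFAAS_TEMPLATE_URL="https://github.com/example-org/templates"   # where templates are pulled from

# These apply only when the corresponding command-line flag is not given.
echo "gateway: $OPENFAAS_URL"
```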

Running faas-cli with sudo

If you're running the faas-cli with sudo, we recommend using sudo -E to pass through any environment variables you may have configured, such as an http_proxy, https_proxy or no_proxy entry.

Docker image

The faas-cli is also available as a Docker image making it convenient for use in CI jobs such as with a Jenkins pipeline or a task in cron.

There is no 'latest' tag, so find the version of the CLI you want to use from the tags page on the Docker Hub. These correspond to the release from GitHub.

Note: the Docker image cannot be used to perform a build directly, but you can use it to generate a build context which can be used with a container builder such as Docker, buildkit or Kaniko in another part of your build pipeline.

Use-cases for the Docker image:

  • Generate the build context without running docker build - faas-cli build --shrinkwrap
  • Deploy an existing image to a remote server with faas-cli deploy
  • Manage secrets with faas-cli secret
  • Invoke functions via cron with faas-cli invoke
  • Check the health of your remote gateway with faas-cli info

Building from source

The contributing guide has instructions for building from source and for configuring a Golang development environment.

  • Star/fork on GitHub: faas-cli

Tutorial: learn how to use the CLI




