Installation

The tool can be installed from PyPI with

pip install -U sec-certs && python -m spacy download en_core_web_sm

Note that Python 3.10 or later is required.

The tool can also be pulled as a Docker image with

docker pull seccerts/sec-certs

The stable release is also published on GitHub, from where it can be set up for development with

git clone https://github.com/crocs-muni/sec-certs.git
cd sec-certs
python3 -m venv venv
source venv/bin/activate
pip install -e .
python -m spacy download en_core_web_sm

Alternatively, our Dockerfile provides a reproducible way of setting up the environment.

If you’re not using Docker, you must install the dependencies as described below.

Dependencies

  • Java is needed to parse tables in FIPS PDF documents and must be available on PATH.

  • Some imported libraries have non-trivial dependencies to resolve:

    • pdftotext requires Poppler to be installed. We’ve experienced issues with older versions of Poppler (0.x), so make sure to install a 20.x version of Poppler.

    • tesseract is required for OCR of malformed PDF documents, together with data files for English, French and German.
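
Before running the tool outside Docker, it can help to confirm that the prerequisites listed above are reachable. The following is a minimal sketch, not part of sec-certs itself: the helper names `missing_tools` and `python_is_supported` are hypothetical, and the executable names `java`, `pdftotext` and `tesseract` are assumptions about how these dependencies are exposed on your system. Note that finding a `pdftotext` executable is only a rough proxy for Poppler being installed, and the sketch does not check the Poppler version or the Tesseract language data.

```python
import shutil
import sys

# Executable names are assumptions for illustration; they may differ per platform.
REQUIRED_TOOLS = ("java", "pdftotext", "tesseract")
# Minimum interpreter version stated in the installation notes.
MIN_PYTHON = (3, 10)


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def python_is_supported(version=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) part of `version` meets `minimum`."""
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.10 or later is required")
    for tool in missing_tools():
        print(f"not found on PATH: {tool}")
```

If the script prints nothing, the interpreter version is sufficient and all three executables were found on PATH. Whether the English, French and German data files for Tesseract are installed still has to be verified separately, for example by inspecting the output of `tesseract --list-langs`.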