Encrypting Large Amounts of Data Automatically

Some sites need to encrypt large sets of data efficiently while remaining compliant with the approved use of GnuPG VS-Desktop® for restricted communication. The approval documents demand the implementation of certain checks in the scripting process, which are non-trivial to do on Windows. Fortunately, a command line option introduced with version 3.1.22 makes this much easier.

We will describe here step by step how to use this option for single file processing and then how to use it for multiple files with "gpgtar".

The require-compliance option

GnuPG VS-Desktop is by design flexible enough to be used in restricted mode and in standard mode. This allows the software to be used for non-classified data as well. The GUI (Graphical User Interface), which is known under the name “Kleopatra”, takes care of informing and warning the user if non-classified data or inappropriate keys are used.

For efficiency reasons it is sometimes required to resort to the CLI (Command Line Interface). In particular, the processing of data in the range of tens of gigabytes can be sped up greatly by using the CLI. To make sure that the processing is compliant with the approval, a new option was introduced with version 3.1.22 for the OpenPGP tool (gpg) and the S/MIME tool (gpgsm):

--require-compliance
To check that data has been encrypted according to the rules of the current compliance mode, a gpg user needs to evaluate the status lines. This allows frontends to handle compliance checks in a more flexible way. However, for scripted use the required evaluation of the status line requires quite some effort; this option can be used instead to make sure that the gpg process exits with a failure if the compliance rules are not fulfilled. Note that this option has currently an effect only in "de-vs" mode.

The “de-vs” mode mentioned here is the approved mode and is always active in a proper installation of GnuPG VS-Desktop. With this knowledge we can now encrypt data on Windows or Linux. The examples are for Windows; on Linux they work the same way, just use slashes instead of backslashes.

Encryption of a single file

The first example encrypts the data in file.dat to the key described by “foo@example.com” and writes the result to file.dat.gpg. Note that the backslash at the end of the line should not be entered as it merely indicates that the line has been split for easier reading.

gpg --require-compliance --encrypt -r foo@example.com \
    --batch --yes -o file.dat.gpg file.dat

As long as “foo@example.com” describes a valid and compliant key (shown as green in the GUI), the command starts to work and will return without an error. If, however, the key is non-compliant, the command will terminate with an error. This can easily be tested for, even on Windows. In this case it is best to remove any partly created output file.
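
The check of the return code and the removal of a partly created output file can be sketched in a small Python wrapper. This is only an illustration and not part of GnuPG VS-Desktop: the function name run_and_cleanup is made up for this example, and it is assumed that the program named in the command can be found in the search path.

```python
import os
import subprocess


def run_and_cleanup(command, output_path):
    """Run a command line; delete output_path if the command fails.

    Returns True if the exit status is 0 and False otherwise.
    (Hypothetical helper, not part of GnuPG.)
    """
    result = subprocess.run(command)
    if result.returncode != 0:
        # A non-zero exit status covers a compliance failure as well as
        # any other error, so a partly written output file is not kept.
        if os.path.exists(output_path):
            os.remove(output_path)
        return False
    return True
```

For the command shown above, the call would pass the list ["gpg", "--require-compliance", "--encrypt", "-r", "foo@example.com", "--batch", "--yes", "-o", "file.dat.gpg", "file.dat"] as the first argument and "file.dat.gpg" as the second.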

The option --batch ensures that gpg never falls into interactive mode; however, it may ask for the passphrase via the Pinentry dialog. The option --yes overwrites an existing output file “file.dat.gpg”; without this option gpg would throw an error.

To encrypt to more than one key, simply add more -r options, like here:

gpg --require-compliance --encrypt -r foo@example.com -r bar@example.de \
    --batch --yes -o file.dat.gpg file.dat

It is often required that the data is first signed and then encrypted. To do this simply add the option --sign. To select a dedicated signing key use the option -u like here:

gpg --require-compliance --encrypt --sign -u me@example.fr \
    -r foo@example.com -r bar@example.de \
    --batch --yes -o file.dat.gpg file.dat

The sign operation will pop up the so-called “Pinentry” to enter the password to unlock the signing key. The section on conveying a password has hints on how to avoid this.

Note: For the options -r and -u a fingerprint may also be used instead of a mail address. This is for example necessary if several valid keys with the same mail address exist or if the key has no other unique identifier. Here is an example where one of the keys is specified by a fingerprint.

gpg --require-compliance --encrypt -r foo@example.com \
    -r 6FE78D1B9F38ACA68C300F76AF99952165A3D8C5 \
    --batch --yes -o file.dat.gpg file.dat

Often it is required to encrypt using a pre-shared password (symmetric-only encryption). This can be done in a similar way:

gpg --require-compliance --symmetric \
    --batch --yes -o file.dat.gpg file.dat

The “Pinentry” will pop up as it does with the sign option. See below on how to automate this. This can also be combined with signing (--sign) and public key encryption (--encrypt and -r) so that the recipients may decrypt either with their private key (if given by the -r option) or with the pre-shared password.
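
The variants in this section differ only in a few options, so a script may want to assemble the command line from its inputs. The following Python function is a minimal sketch of that idea; its name and parameters are invented for this example, and it only builds the argument list without running gpg.

```python
def build_encrypt_command(source, output, recipients=(), signer=None,
                          symmetric=False):
    """Return the gpg argument list for one of the variants shown above.

    Hypothetical helper: recipients are mail addresses or fingerprints
    for -r, signer is the value for -u, symmetric adds --symmetric.
    """
    if not recipients and not symmetric:
        raise ValueError("need at least one recipient or symmetric=True")
    command = ["gpg", "--require-compliance"]
    if recipients:
        command.append("--encrypt")
    if symmetric:
        command.append("--symmetric")
    if signer is not None:
        command += ["--sign", "-u", signer]
    for recipient in recipients:
        command += ["-r", recipient]
    # --batch and --yes as in all examples of this section.
    command += ["--batch", "--yes", "-o", output, source]
    return command
```

Returning a list instead of a single string avoids quoting problems with file names that contain spaces when the list is later passed to a process runner.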

Decryption of a single file

Using the CLI for the decryption will also speed up the processing. Example:

gpg --require-compliance --decrypt \
    --batch --yes -o file.dat file.dat.gpg

The system will figure out whether file.dat.gpg has been encrypted to a public key for which you hold the corresponding private key and then decrypt it. If the file has been encrypted with a pre-shared password (symmetric-only), the user will be asked to enter this password. In any case the output file will only be written if it has been properly encrypted using algorithms and keys compliant with the approval. If this is not the case, the program is terminated with an error.

To get more verbose, human-readable information, add the option -v to the invocation.

Encryption of multiple files or directories

We need to distinguish two cases:

  1. Encrypting several single files into separate encrypted files.
  2. Encrypting several single files or directories into one encrypted archive.

The first case is usually handled by scripting and using the mechanism described above. Another way is to make use of the --multifile option. The next example shows how to encrypt 3 files in one run:

gpg --require-compliance --encrypt -r foo@example.com  \
    --batch --yes --multifile a.txt b.txt c.txt

The result is 3 files: a.txt.gpg, b.txt.gpg, and c.txt.gpg. Decryption works similarly:

gpg --require-compliance --decrypt \
    --batch --yes --multifile a.txt.gpg b.txt.gpg c.txt.gpg

and brings back the 3 original files. Instead of passing the file names on the command line it is also possible to read them from a file. To do this put the file names one per line into a file and run

gpg --require-compliance --encrypt -r foo@example.com  \
    --batch --yes --multifile <file_with_filenames.lst

Note that adding a signature is currently not possible with this option.
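
If the list of files is produced by a script, the names can be passed on standard input directly instead of going through a list file. The following Python sketch illustrates this under some assumptions: the function names are invented for this example, gpg is assumed to be in the search path, and names containing a line break are rejected because they cannot be expressed in a one-name-per-line list.

```python
import subprocess


def multifile_listing(paths):
    """Return the text to feed to gpg: one file name per line."""
    for path in paths:
        if "\n" in path or "\r" in path:
            # A line break inside a name would be read as two names.
            raise ValueError(f"file name contains a line break: {path!r}")
    return "".join(f"{path}\n" for path in paths)


def encrypt_each(paths, recipient):
    """Encrypt every file to its own .gpg file in one gpg run.

    Hypothetical helper; returns True if gpg exits with status 0.
    """
    command = ["gpg", "--require-compliance", "--encrypt", "-r", recipient,
               "--batch", "--yes", "--multifile"]
    result = subprocess.run(command, input=multifile_listing(paths), text=True)
    return result.returncode == 0
```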

The second case is the more common one. The files are to be stored in an encrypted archive. For example, to encrypt all files in the current directory with a suffix of “.txt” as well as the directories “foo” and “bar” with all their files and sub-directories, this command can be used:

gpgtar --require-compliance --encrypt -r foo@example.com \
       --batch --yes -o archive.tar.gpg   *.txt foo bar

The created file “archive.tar.gpg” is encrypted to the key of “foo@example.com” and contains all files along with some metadata. If it has been encrypted to one of your keys, you may quickly check the content of the archive using the command

gpgtar -t archive.tar.gpg

which should yield an output like:

-rw-r--r-- 0 1000/1000  61 2023-05-09 11:30:36 a.txt
-rw-r--r-- 0 1000/1000  63 2023-05-09 11:30:28 b.txt
-rw-r--r-- 0 1000/1000  86 2023-05-09 11:30:28 c.txt
drwxr-xr-x 0 1000/1000   0 2023-05-09 11:27:59 foo
-rw-r--r-- 0 1000/1000  68 2023-05-09 11:27:59 foo/a-long-file-name.txt
drwxr-xr-x 0 1000/1000   0 2023-05-09 11:42:23 bar
-rw-r--r-- 0 1000/1000  73 2023-05-09 11:41:35 bar/A name with spaces.dat
-rw-r--r-- 0 1000/1000   0 2023-05-09 11:41:15 bar/another-file
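
A script that wants to check the content of an archive may split such lines into fields. The following Python sketch assumes that the listing always has the seven-column layout of the sample above (mode, an unnamed second column, owner, size, date, time, name); this layout is inferred from the sample only, and the function name is made up for this example.

```python
def parse_listing_line(line):
    """Split one line of the listing shown above into its fields.

    Hypothetical helper. The name is taken as the rest of the line so
    that names with embedded spaces stay intact.
    """
    fields = line.rstrip("\r\n").split(None, 6)
    if len(fields) != 7:
        raise ValueError(f"unexpected listing line: {line!r}")
    mode, _second, owner, size, date, time, name = fields
    return {
        "mode": mode,
        "owner": owner,
        "size": int(size),
        "date": date,
        "time": time,
        "name": name,
        "is_dir": mode.startswith("d"),
    }
```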

The tool “gpgtar” accepts a couple of other options, most notably “--sign” and “-u” to also sign the encrypted archive. See the GnuPG manual for a description of other options. For smooth integration with the Kleopatra frontend, the name of the archive files should always end in “.tar.gpg”. Note that only proper files are stored in the archive; symlinks and other special files are ignored.

If the files are already compressed, it is often a good idea to add the --no-compress option to “gpgtar” to avoid the extra overhead of useless compression.

Decryption of an encrypted tar archive

Decryption of an encrypted archive is straightforward:

gpgtar --require-compliance --decrypt --status-fd 2 \
       --batch --yes archive.tar.gpg

This creates a new directory “archive.tar_1_” and extracts all files into that directory. If that directory already exists, the number in the suffix is incremented until a new directory can be created. The extraction into a newly created directory is done for security reasons. If the archive shall be extracted into a given directory, add the option -C NAME and make sure that the directory NAME exists. The use of the “--status-fd” option is required up until version 3.1.26 to avoid a false compliance warning. The calling process shall in any case check the return code of “gpgtar” to assure the integrity of the archive and to detect extraction errors.
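
For scripted extraction into a given directory, the two points made above (the directory for -C must exist, and the return code must be checked) can be sketched in Python as follows. The function names are invented for this example and gpgtar is assumed to be in the search path.

```python
import os
import subprocess


def build_extract_command(archive, target_dir):
    """Create target_dir if needed and return the gpgtar argument list."""
    # -C expects an existing directory, so make sure it is there.
    os.makedirs(target_dir, exist_ok=True)
    return ["gpgtar", "--require-compliance", "--decrypt", "--status-fd", "2",
            "--batch", "--yes", "-C", target_dir, archive]


def extract_archive(archive, target_dir):
    """Run gpgtar and report success only for exit status 0.

    Hypothetical helper; on False the content of target_dir should not
    be trusted.
    """
    result = subprocess.run(build_extract_command(archive, target_dir))
    return result.returncode == 0
```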

Conveying a password

According to the approval, the password may not be provided on the command line. However, it is possible to convey it using a file descriptor, via a file, or using a custom Pinentry module.

Often passwords are available in files stored on the local machine; this can be used by adding these options to the invocation of gpg:

--pinentry-mode=loopback --passphrase-file c:/foo/password.txt

(On Windows it doesn't matter whether you use forward or backward slashes for file names.)

The file c:/foo/password.txt should have just one line with the password optionally followed by a linefeed.
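
Before handing such a file to gpg, a script may want to verify that it really has the expected form. The following Python sketch checks for exactly one non-empty line, optionally followed by a linefeed, and returns the two options shown above. The function name is invented for this example. Rejecting a carriage return is a cautious assumption made here; the text above does not say how gpg treats one.

```python
def passphrase_options(path):
    """Check the password file and return the gpg options that use it.

    Hypothetical helper: raises ValueError unless the file holds exactly
    one non-empty line, optionally followed by a linefeed.
    """
    with open(path, "rb") as handle:
        data = handle.read()
    # Drop the single optional trailing linefeed before checking.
    body = data[:-1] if data.endswith(b"\n") else data
    if not body:
        raise ValueError("password file is empty")
    if b"\n" in body or b"\r" in body:
        raise ValueError("password file must contain a single line")
    return ["--pinentry-mode=loopback", "--passphrase-file", path]
```

The returned list can be appended to the other options of a gpg invocation; the password itself never appears on the command line.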