Security -- Diffie-Hellman Key Exchange and XTEA-CTR Cipher

Cryptographic library providing Diffie-Hellman key exchange, XTEA symmetric encryption in CTR mode, and a DRBG-based pseudo-random number generator. Optimized for 486-class DOS hardware running under DJGPP/DPMI.

This library has no dependencies on the serial stack and can be used independently for any application requiring key exchange, encryption, or random number generation.

Components

1. XTEA Cipher (CTR Mode)

XTEA (eXtended Tiny Encryption Algorithm) is a 64-bit block cipher with a 128-bit key, run for 32 cycles (64 Feistel rounds in the standard definition, two per cycle). In CTR (counter) mode, it operates as a stream cipher: an incrementing counter is encrypted with the key to produce a keystream, which is XOR'd with the plaintext. Because XOR is its own inverse, the same operation encrypts and decrypts.

Why XTEA instead of AES or DES:

XTEA needs no lookup tables and no precomputed key schedule, and compiles to approximately 20 instructions per round (shifts, adds, and XORs only). That suits a 486, whose on-chip cache is only 8KB: the roughly 4KB of lookup tables used by fast table-driven AES implementations would thrash it (the AES S-box itself is only 256 bytes, but table-free AES is much slower). DES is similarly table-heavy and has a complex key schedule. XTEA has no library dependencies -- the entire cipher fits in about a dozen lines of C. At its full round count, XTEA pairs a 128-bit key with negligible per-byte overhead even on the slowest target hardware.

CTR mode properties:

  • Encrypt and decrypt are the same function (XOR is symmetric)
  • No padding required -- operates on arbitrary-length data
  • Random access possible (set the counter to any value)
  • CRITICAL: the same counter value must never be reused with the same key. Reuse reveals the XOR of two plaintexts. The secLink layer prevents this by deriving separate TX/RX cipher keys for each direction.
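The counter-reuse warning can be made concrete with a small sketch. It uses a fixed, made-up keystream in place of real XTEA output, and both helper names are hypothetical (they are not part of the library). It shows that two ciphertexts produced with the same keystream XOR together to the XOR of their plaintexts, so the key no longer protects them.

```c
#include <stdint.h>

/* Hypothetical helper: XOR a buffer with a keystream. This is what CTR
 * mode does once the keystream bytes have been generated. */
static void xorWithKeystream(uint8_t *data, const uint8_t *keystream, int len) {
    for (int i = 0; i < len; i++) {
        data[i] ^= keystream[i];
    }
}

/* Returns 1 if c1 XOR c2 equals p1 XOR p2 for every byte, meaning the
 * keystream has cancelled out and only plaintext structure remains. */
static int keystreamCancels(const uint8_t *p1, const uint8_t *p2,
                            const uint8_t *c1, const uint8_t *c2, int len) {
    for (int i = 0; i < len; i++) {
        if ((uint8_t)(c1[i] ^ c2[i]) != (uint8_t)(p1[i] ^ p2[i])) {
            return 0;
        }
    }
    return 1;
}
```

An eavesdropper who sees both ciphertexts can compute that XOR without knowing the key, which is why each direction needs its own key or a disjoint counter range.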

XTEA block cipher internals:

The Feistel network uses the golden-ratio constant (delta = 0x9E3779B9) as a round key mixer. Each round combines the two 32-bit halves using shifts, additions, and XORs. The delta ensures each round uses a different effective subkey, preventing slide attacks. No S-boxes or lookup tables are involved anywhere in the computation.
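As a rough illustration of the description above, here is the widely published XTEA reference loop together with a minimal CTR wrapper. The names xteaEncryptBlockSketch, CtrSketchT, and ctrCryptSketch are hypothetical, and the little-endian keystream byte order and the choice to discard unused keystream bytes of a partial block are assumptions; the library's actual layout may differ.

```c
#include <stdint.h>

/* Standard XTEA encryption of one 64-bit block (32 loop iterations). */
static void xteaEncryptBlockSketch(uint32_t v[2], const uint32_t key[4]) {
    uint32_t v0 = v[0], v1 = v[1], sum = 0;
    for (int i = 0; i < 32; i++) {
        v0 += (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
        sum += 0x9E3779B9u;                       /* golden-ratio delta */
        v1 += (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
    }
    v[0] = v0;
    v[1] = v1;
}

/* Hypothetical CTR context: key plus a 64-bit counter in two words. */
typedef struct {
    uint32_t key[4];
    uint32_t ctrLo;
    uint32_t ctrHi;
} CtrSketchT;

/* Encrypts or decrypts in place: keystream = XTEA(counter), then XOR. */
static void ctrCryptSketch(CtrSketchT *c, uint8_t *data, int len) {
    while (len > 0) {
        uint32_t block[2] = { c->ctrLo, c->ctrHi };
        xteaEncryptBlockSketch(block, c->key);
        int n = len < 8 ? len : 8;
        for (int i = 0; i < n; i++) {
            /* Assumed byte order: little-endian within each word. */
            data[i] ^= (uint8_t)(block[i / 4] >> (8 * (i % 4)));
        }
        data += n;
        len -= n;
        if (++c->ctrLo == 0) {                    /* 64-bit increment */
            c->ctrHi++;
        }
    }
}
```

Running ctrCryptSketch twice over the same buffer, with the counter reset to its starting value in between, restores the original bytes.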

2. Diffie-Hellman Key Exchange (1024-bit)

Uses the RFC 2409 Group 2 safe prime (1024-bit MODP group) with a generator of 2. Private exponents are 256 bits for fast computation on 486-class hardware.

Why 1024-bit DH with 256-bit private exponents:

RFC 2409 Group 2 provides a well-audited, interoperable safe prime. With left-to-right square-and-multiply, a full 1024-bit exponent costs approximately 1024 squarings + approximately 512 multiplies; a 256-bit private exponent cuts this to approximately 256 squarings + approximately 128 multiplies (half the exponent bits are 1 on average), about a quarter of the work. This makes key generation feasible on a 486 in under a second rather than a few seconds. The security reduction is negligible -- the best generic attack on a short exponent (Pollard's kangaroo, also called the lambda method) requires approximately 2^128 operations for a 256-bit exponent, matching XTEA's key strength.

Key validation:

secDhComputeSecret() validates that the remote public key is in the range [2, p-2] to prevent small-subgroup attacks. Keys of 0, 1, or p-1 would produce trivially guessable shared secrets.
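The range check can be sketched on big-endian byte arrays as follows. This is only an illustration under assumptions: cmpBeSketch and dhPublicKeyInRangeSketch are hypothetical names, and the library itself presumably performs the comparison on its BigNumT words rather than on bytes.

```c
#include <stdint.h>
#include <string.h>

/* Compare two big-endian numbers of equal length: returns -1, 0, or +1. */
static int cmpBeSketch(const uint8_t *a, const uint8_t *b, int len) {
    for (int i = 0; i < len; i++) {
        if (a[i] != b[i]) {
            return a[i] < b[i] ? -1 : 1;
        }
    }
    return 0;
}

/* Returns 1 if 2 <= pub <= p - 2, otherwise 0. len is at most 128 bytes. */
static int dhPublicKeyInRangeSketch(const uint8_t *pub, const uint8_t *p, int len) {
    uint8_t bound[128];
    if (len < 1 || len > 128) {
        return 0;
    }
    memset(bound, 0, (size_t)len);                /* bound = 2 */
    bound[len - 1] = 2;
    if (cmpBeSketch(pub, bound, len) < 0) {
        return 0;                                 /* rejects 0 and 1 */
    }
    memcpy(bound, p, (size_t)len);                /* bound = p - 2 */
    int borrow = 2;
    for (int i = len - 1; i >= 0 && borrow; i--) {
        int d = bound[i] - borrow;
        borrow = d < 0;
        bound[i] = (uint8_t)d;
    }
    return cmpBeSketch(pub, bound, len) <= 0;     /* rejects p-1 and above */
}
```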

Key derivation:

The 128-byte shared secret is reduced to a symmetric key via XOR-folding: each byte of the secret is XOR'd into the output key at position i % keyLen. For a 16-byte XTEA key, each output byte is the XOR of 8 secret bytes, providing thorough mixing. A proper KDF (HKDF, etc.) would be more rigorous but adds complexity and code size for marginal benefit in this use case.
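The XOR-fold described above amounts to a few lines. The function name xorFoldSketch is hypothetical; the i % keyLen indexing is taken directly from the description.

```c
#include <stdint.h>

/* Fold secretLen bytes down to keyLen bytes: each secret byte is XOR'd
 * into the output position i % keyLen. */
static void xorFoldSketch(const uint8_t *secret, int secretLen,
                          uint8_t *key, int keyLen) {
    for (int i = 0; i < keyLen; i++) {
        key[i] = 0;
    }
    for (int i = 0; i < secretLen; i++) {
        key[i % keyLen] ^= secret[i];
    }
}
```

With a 128-byte secret and a 16-byte key, output byte 0 is the XOR of secret bytes 0, 16, 32, and so on up to 112. Note that XOR-folding is linear, which is the main reason a hash-based KDF is considered more rigorous.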

3. Pseudo-Random Number Generator

XTEA-CTR based DRBG (Deterministic Random Bit Generator). The RNG encrypts a monotonically increasing 64-bit counter with a 128-bit XTEA key, producing 8 bytes of pseudorandom output per block. The counter never repeats (64-bit space is sufficient for any practical session length), so the output is a pseudorandom stream as long as the key has sufficient entropy.

Hardware entropy sources:

  • PIT (Programmable Interval Timer) -- runs at 1.193182 MHz. Its LSBs change rapidly and provide approximately 10 bits of entropy per read, depending on timing jitter. Two readings with intervening code execution provide additional jitter.
  • BIOS tick count -- 18.2 Hz timer at real-mode address 0040:006C (linear address 0x46C). Adds a few more bits of entropy.

Total from hardware: roughly 20 bits of real entropy per call to secRngGatherEntropy(). This is not enough on its own for cryptographic use but is sufficient to seed the DRBG when supplemented by user interaction timing (keyboard, mouse jitter).
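A minimal sketch of how these two sources might be read under DJGPP is shown below. The helper names and the 8-byte layout (two PIT samples around one tick read) are assumptions, not the library's actual code. The non-DJGPP branch exists only so the sketch compiles on a host machine, and it is not a comparable entropy source.

```c
#include <stdint.h>

#ifdef __DJGPP__
#include <pc.h>          /* inportb, outportb */
#include <sys/farptr.h>  /* _farpeekl */
#include <go32.h>        /* _dos_ds */

static uint16_t readPitCounterSketch(void) {
    outportb(0x43, 0x00);                 /* latch PIT channel 0 count */
    uint8_t lo = inportb(0x40);
    uint8_t hi = inportb(0x40);
    return (uint16_t)(lo | (hi << 8));
}

static uint32_t readBiosTicksSketch(void) {
    return _farpeekl(_dos_ds, 0x46C);     /* BIOS data area tick count */
}
#else
#include <time.h>

/* Host stand-ins so the sketch builds off-target. */
static uint16_t readPitCounterSketch(void) { return (uint16_t)clock(); }
static uint32_t readBiosTicksSketch(void) { return (uint32_t)time(NULL); }
#endif

/* Writes up to 8 bytes of raw timer samples; returns the count written. */
static int gatherEntropySketch(uint8_t *buf, int len) {
    uint8_t raw[8];
    uint16_t pit1 = readPitCounterSketch();
    uint32_t ticks = readBiosTicksSketch();   /* work between PIT reads adds jitter */
    uint16_t pit2 = readPitCounterSketch();
    raw[0] = (uint8_t)pit1;
    raw[1] = (uint8_t)(pit1 >> 8);
    raw[2] = (uint8_t)ticks;
    raw[3] = (uint8_t)(ticks >> 8);
    raw[4] = (uint8_t)(ticks >> 16);
    raw[5] = (uint8_t)(ticks >> 24);
    raw[6] = (uint8_t)pit2;
    raw[7] = (uint8_t)(pit2 >> 8);
    int n = len < 0 ? 0 : (len < 8 ? len : 8);
    for (int i = 0; i < n; i++) {
        buf[i] = raw[i];
    }
    return n;
}
```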

Seeding and mixing:

The seed function (secRngSeed()) XOR-folds the entropy into the XTEA key, derives the initial counter from the key bits, and then generates and discards 64 bytes to advance past any weak initial output. This discard step is standard DRBG practice -- it ensures the first bytes the caller receives do not leak information about the seed material.

Additional entropy can be stirred in at any time via secRngAddEntropy() without resetting the RNG state. This function XOR-folds new entropy into the key and then re-mixes by encrypting the key with itself, diffusing the new entropy across all key bits.

Auto-seeding: if secRngBytes() is called before secRngSeed(), it automatically gathers hardware entropy and seeds itself as a safety net.
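The seed, stir, and generate steps can be sketched as below. Everything here is illustrative: the DrbgSketchT type and the function names are hypothetical, the counter derivation (XOR of key words) is a guess at what "derives the initial counter from the key bits" might mean, and the re-mix step simply encrypts the two key halves under the current key.

```c
#include <stdint.h>

typedef struct {
    uint32_t key[4];
    uint32_t ctrLo;
    uint32_t ctrHi;
} DrbgSketchT;

static void drbgXteaSketch(uint32_t v[2], const uint32_t key[4]) {
    uint32_t v0 = v[0], v1 = v[1], sum = 0;
    for (int i = 0; i < 32; i++) {
        v0 += (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
        sum += 0x9E3779B9u;
        v1 += (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
    }
    v[0] = v0;
    v[1] = v1;
}

/* XOR-fold input bytes into the 16-byte key. */
static void drbgFoldSketch(DrbgSketchT *g, const uint8_t *data, int len) {
    for (int i = 0; i < len; i++) {
        int pos = i % 16;
        g->key[pos / 4] ^= (uint32_t)data[i] << (8 * (pos % 4));
    }
}

/* Encrypt the counter, emit 8 bytes per block, advance the counter. */
static void drbgBytesSketch(DrbgSketchT *g, uint8_t *buf, int len) {
    while (len > 0) {
        uint32_t block[2] = { g->ctrLo, g->ctrHi };
        drbgXteaSketch(block, g->key);
        if (++g->ctrLo == 0) {
            g->ctrHi++;
        }
        for (int i = 0; i < 8 && len > 0; i++, len--) {
            *buf++ = (uint8_t)(block[i / 4] >> (8 * (i % 4)));
        }
    }
}

static void drbgSeedSketch(DrbgSketchT *g, const uint8_t *entropy, int len) {
    uint8_t discard[64];
    g->key[0] = g->key[1] = g->key[2] = g->key[3] = 0;
    drbgFoldSketch(g, entropy, len);
    g->ctrLo = g->key[0] ^ g->key[2];     /* assumed counter derivation */
    g->ctrHi = g->key[1] ^ g->key[3];
    drbgBytesSketch(g, discard, (int)sizeof(discard));  /* discard 64 bytes */
}

static void drbgAddEntropySketch(DrbgSketchT *g, const uint8_t *data, int len) {
    uint32_t lo[2], hi[2];
    drbgFoldSketch(g, data, len);
    lo[0] = g->key[0]; lo[1] = g->key[1];
    hi[0] = g->key[2]; hi[1] = g->key[3];
    drbgXteaSketch(lo, g->key);           /* re-mix: encrypt key with itself */
    drbgXteaSketch(hi, g->key);
    g->key[0] = lo[0]; g->key[1] = lo[1];
    g->key[2] = hi[0]; g->key[3] = hi[1];
}
```

Two generators seeded with the same bytes produce the same stream, which is the "deterministic" part of DRBG; the output is only as unpredictable as the entropy folded in.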

BigNum Arithmetic

All modular arithmetic uses a 1024-bit big number type (BigNumT) stored as 32 x uint32_t words in little-endian order. Operations:

Function       Description
bnAdd          Add two bignums, return carry
bnSub          Subtract two bignums, return borrow
bnCmp          Compare two bignums (-1, 0, +1)
bnBit          Test a single bit by index
bnBitLength    Find the highest set bit position
bnShiftLeft1   Left-shift by 1, return carry
bnClear        Zero all words
bnSet          Set to a 32-bit value (clear upper words)
bnCopy         Copy from source to destination
bnFromBytes    Convert big-endian byte array to little-endian words
bnToBytes      Convert little-endian words to big-endian byte array
bnMontMul      Montgomery multiplication (CIOS variant)
bnModExp       Modular exponentiation via Montgomery multiply
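To give a feel for the representation, here is a sketch of a few of these operations on 32 little-endian words. The type and function names carry a Sketch suffix because the real signatures are not shown in this document; argument order and return conventions are assumptions based on the table.

```c
#include <stdint.h>

#define BN_SKETCH_WORDS 32

typedef struct {
    uint32_t w[BN_SKETCH_WORDS];          /* w[0] is the least significant word */
} BnSketchT;

/* r = a + b, returns the carry out of the top word (0 or 1). */
static uint32_t bnAddSketch(BnSketchT *r, const BnSketchT *a, const BnSketchT *b) {
    uint64_t carry = 0;
    for (int i = 0; i < BN_SKETCH_WORDS; i++) {
        uint64_t s = (uint64_t)a->w[i] + b->w[i] + carry;
        r->w[i] = (uint32_t)s;
        carry = s >> 32;
    }
    return (uint32_t)carry;
}

/* r = a - b, returns the borrow out of the top word (0 or 1). */
static uint32_t bnSubSketch(BnSketchT *r, const BnSketchT *a, const BnSketchT *b) {
    uint32_t borrow = 0;
    for (int i = 0; i < BN_SKETCH_WORDS; i++) {
        uint64_t d = (uint64_t)a->w[i] - b->w[i] - borrow;
        r->w[i] = (uint32_t)d;
        borrow = (uint32_t)(d >> 63);     /* set when the subtraction wrapped */
    }
    return borrow;
}

/* Returns -1, 0, or +1, scanning from the most significant word. */
static int bnCmpSketch(const BnSketchT *a, const BnSketchT *b) {
    for (int i = BN_SKETCH_WORDS - 1; i >= 0; i--) {
        if (a->w[i] != b->w[i]) {
            return a->w[i] < b->w[i] ? -1 : 1;
        }
    }
    return 0;
}

/* 128 big-endian bytes -> 32 little-endian words. */
static void bnFromBytesSketch(BnSketchT *n, const uint8_t bytes[128]) {
    for (int i = 0; i < BN_SKETCH_WORDS; i++) {
        const uint8_t *p = bytes + 124 - 4 * i;
        n->w[i] = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                  ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    }
}

/* 32 little-endian words -> 128 big-endian bytes. */
static void bnToBytesSketch(const BnSketchT *n, uint8_t bytes[128]) {
    for (int i = 0; i < BN_SKETCH_WORDS; i++) {
        uint8_t *p = bytes + 124 - 4 * i;
        p[0] = (uint8_t)(n->w[i] >> 24);
        p[1] = (uint8_t)(n->w[i] >> 16);
        p[2] = (uint8_t)(n->w[i] >> 8);
        p[3] = (uint8_t)n->w[i];
    }
}
```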

Montgomery Multiplication

The CIOS (Coarsely Integrated Operand Scanning) variant computes a * b * R^(-1) mod m in a single pass without explicit division by the modulus. This replaces the expensive modular reduction step (division by a 1024-bit number) with cheaper additions and right-shifts.

For each of the 32 outer iterations (one per word of operand a):

  1. Accumulate a[i] * b into the temporary product t
  2. Compute the Montgomery reduction factor u = t[0] * m0inv mod 2^32
  3. Add u * mod to t and shift right by 32 bits (implicit division)

After all iterations, the result is in the range [0, 2m), so a single conditional subtraction brings it into [0, m).

Montgomery constants (computed once, lazily on first DH use):

  • R^2 mod p -- computed via 2048 iterations of shift-left-1 with conditional subtraction. This is the Montgomery domain conversion factor.
  • -p[0]^(-1) mod 2^32 -- computed via Newton's method (5 iterations, doubling precision each step: 1->2->4->8->16->32 correct bits). This is the Montgomery reduction constant.
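The three CIOS steps and both constants can be sketched on a reduced size. The version below uses 2 words (64-bit numbers) instead of 32 so it is easy to check by hand; all names are hypothetical and the modulus must be odd and at least 3. It follows the textbook CIOS layout and is not a copy of the library's bnMontMul.

```c
#include <stdint.h>
#include <string.h>

#define MW 2                               /* words per number in this sketch */

typedef struct { uint32_t w[MW]; } SmallNumT;

/* -m0^(-1) mod 2^32 via Newton iteration; m0 must be odd. */
static uint32_t montNegInv32(uint32_t m0) {
    uint32_t x = 1;                        /* correct to 1 bit */
    for (int i = 0; i < 5; i++) {
        x *= 2u - m0 * x;                  /* doubles the correct bits */
    }
    return 0u - x;
}

static int geqWords(const uint32_t *a, const uint32_t *b) {
    for (int i = MW - 1; i >= 0; i--) {
        if (a[i] != b[i]) return a[i] > b[i];
    }
    return 1;
}

static void subWords(uint32_t *a, const uint32_t *b) {
    uint32_t borrow = 0;
    for (int i = 0; i < MW; i++) {
        uint64_t d = (uint64_t)a[i] - b[i] - borrow;
        a[i] = (uint32_t)d;
        borrow = (uint32_t)(d >> 63);
    }
}

/* R^2 mod m with R = 2^(32*MW): double 1 a total of 2*32*MW times. */
static void montR2(SmallNumT *r2, const SmallNumT *m) {
    memset(r2, 0, sizeof(*r2));
    r2->w[0] = 1;
    for (int i = 0; i < 2 * 32 * MW; i++) {
        uint32_t carry = 0;
        for (int j = 0; j < MW; j++) {
            uint32_t next = r2->w[j] >> 31;
            r2->w[j] = (r2->w[j] << 1) | carry;
            carry = next;
        }
        if (carry || geqWords(r2->w, m->w)) subWords(r2->w, m->w);
    }
}

/* out = a * b * R^(-1) mod m (CIOS). */
static void montMul(SmallNumT *out, const SmallNumT *a, const SmallNumT *b,
                    const SmallNumT *m, uint32_t m0inv) {
    uint32_t t[MW + 2] = {0};
    for (int i = 0; i < MW; i++) {
        uint64_t acc = 0;
        for (int j = 0; j < MW; j++) {     /* step 1: t += a[i] * b */
            acc += (uint64_t)a->w[i] * b->w[j] + t[j];
            t[j] = (uint32_t)acc;
            acc >>= 32;
        }
        acc += t[MW];
        t[MW] = (uint32_t)acc;
        t[MW + 1] = (uint32_t)(acc >> 32);

        uint32_t u = t[0] * m0inv;         /* step 2: reduction factor */
        acc = ((uint64_t)u * m->w[0] + t[0]) >> 32;   /* low word is now 0 */
        for (int j = 1; j < MW; j++) {     /* step 3: t = (t + u*m) >> 32 */
            acc += (uint64_t)u * m->w[j] + t[j];
            t[j - 1] = (uint32_t)acc;
            acc >>= 32;
        }
        acc += t[MW];
        t[MW - 1] = (uint32_t)acc;
        t[MW] = t[MW + 1] + (uint32_t)(acc >> 32);
    }
    if (t[MW] || geqWords(t, m->w)) subWords(t, m->w);  /* [0,2m) -> [0,m) */
    memcpy(out->w, t, sizeof(out->w));
}
```

To multiply ordinary numbers, convert each operand into the Montgomery domain with montMul(x, R^2), multiply there, then convert back with montMul(result, 1).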

Modular exponentiation uses left-to-right binary square-and-multiply scanning. For a 256-bit private exponent, this requires approximately 256 squarings plus approximately 128 multiplies (half the bits are 1 on average), where each operation is a Montgomery multiplication on 32-word numbers.
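The operation counts quoted above follow directly from the scanning loop. The sketch below uses a 32-bit exponent and plain 64-bit modular multiplication in place of Montgomery multiplication, purely to show the control flow and the cost accounting; the names are hypothetical and the modulus must be nonzero.

```c
#include <stdint.h>

typedef struct {
    uint32_t squarings;
    uint32_t multiplies;
} ModExpCostT;

/* Left-to-right binary square-and-multiply over a fixed 32-bit exponent.
 * m must be nonzero; all products fit in 64 bits because m < 2^32. */
static uint32_t modExpSketch(uint32_t base, uint32_t exp, uint32_t m,
                             ModExpCostT *cost) {
    uint64_t result = 1u % m;
    uint64_t b = base % m;
    cost->squarings = 0;
    cost->multiplies = 0;
    for (int bit = 31; bit >= 0; bit--) {
        result = (result * result) % m;    /* one squaring per exponent bit */
        cost->squarings++;
        if ((exp >> bit) & 1u) {
            result = (result * b) % m;     /* one multiply per set bit */
            cost->multiplies++;
        }
    }
    return (uint32_t)result;
}
```

The squaring count always equals the exponent width and the multiply count equals the number of set bits, which is where the figures of 256 squarings and about 128 multiplies for a random 256-bit exponent come from.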

Secure Zeroing

Key material (private keys, shared secrets, cipher contexts) is erased using a volatile-pointer loop:

static void secureZero(void *ptr, int len) {
    volatile uint8_t *p = (volatile uint8_t *)ptr;
    for (int i = 0; i < len; i++) {
        p[i] = 0;
    }
}

The volatile qualifier prevents the compiler from optimizing away the zeroing as a dead store. Without it, a compiler that can see the buffer is about to be freed may treat a plain loop or memset() as a dead store and remove it entirely. This is critical for preventing sensitive key material from lingering in freed memory where a later malloc could expose it.

Performance

At serial port speeds, XTEA-CTR encryption overhead is minimal:

Speed (bps)   Blocks/sec   CPU cycles/sec   % of 33 MHz 486
9600          120          ~240K            < 1%
57600         720          ~1.4M            ~4%
115200        1440         ~2.9M            ~9%
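The table can be reproduced with a few lines of arithmetic. Two inputs are inferred from the figures rather than stated in the text: 10 bits on the wire per byte (as with 8N1 framing) and roughly 2000 CPU cycles per XTEA block. Treat both constants, and the function names, as assumptions.

```c
#include <stdint.h>

#define BITS_PER_SERIAL_BYTE 10u   /* assumed: start + 8 data + stop */
#define XTEA_BLOCK_BYTES      8u
#define CYCLES_PER_BLOCK   2000u   /* assumed: implied by the table */

static uint32_t blocksPerSecond(uint32_t baud) {
    return baud / BITS_PER_SERIAL_BYTE / XTEA_BLOCK_BYTES;
}

static uint32_t cyclesPerSecond(uint32_t baud) {
    return blocksPerSecond(baud) * CYCLES_PER_BLOCK;
}

static double cpuPercent(uint32_t baud, uint32_t clockHz) {
    return 100.0 * cyclesPerSecond(baud) / clockHz;
}
```

For example, 115200 bps gives 11520 bytes per second, or 1440 blocks, and 1440 blocks at 2000 cycles each is 2.88 million cycles per second.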

DH key exchange takes approximately 0.3 seconds at 66 MHz or 0.6 seconds at 33 MHz (256-bit private exponent, 1024-bit modulus, Montgomery multiplication).

API Reference

Constants

Name                Value   Description
SEC_DH_KEY_SIZE     128     DH public key size in bytes
SEC_XTEA_KEY_SIZE   16      XTEA key size in bytes
SEC_SUCCESS         0       Success
SEC_ERR_PARAM       -1      Invalid parameter or NULL pointer
SEC_ERR_NOT_READY   -2      Keys not yet generated/derived
SEC_ERR_ALLOC       -3      Memory allocation failed

Types

typedef struct SecDhS SecDhT;         // Opaque DH context
typedef struct SecCipherS SecCipherT; // Opaque cipher context

RNG Functions

int secRngGatherEntropy(uint8_t *buf, int len);

Reads hardware entropy from the PIT counter and BIOS tick count. Returns the number of bytes written (up to 8). Provides roughly 20 bits of true entropy -- not sufficient alone, but enough to seed the DRBG when supplemented by user interaction timing.

void secRngSeed(const uint8_t *entropy, int len);

Initializes the DRBG with the given entropy. XOR-folds the input into the XTEA key, derives the initial counter, and generates and discards 64 bytes to advance past weak initial output.

void secRngAddEntropy(const uint8_t *data, int len);

Mixes additional entropy into the running RNG state without resetting it. XOR-folds data into the key and re-mixes by encrypting the key with itself. Use this to stir in keyboard timing, mouse jitter, or other runtime entropy sources.

void secRngBytes(uint8_t *buf, int len);

Generates len pseudorandom bytes. Auto-seeds from hardware entropy if not previously seeded. Produces 8 bytes per XTEA block encryption of the internal counter.

Diffie-Hellman Functions

SecDhT *secDhCreate(void);

Allocates a new DH context. Returns NULL on allocation failure. The context must be destroyed with secDhDestroy() when no longer needed.

int secDhGenerateKeys(SecDhT *dh);

Generates a 256-bit random private key and computes the corresponding 1024-bit public key (g^private mod p). Lazily initializes Montgomery constants on first call. The RNG should be seeded before calling this.

int secDhGetPublicKey(SecDhT *dh, uint8_t *buf, int *len);

Exports the public key as a big-endian byte array into buf. On entry, *len must be at least SEC_DH_KEY_SIZE (128). On return, *len is set to 128.

int secDhComputeSecret(SecDhT *dh, const uint8_t *remotePub, int len);

Computes the shared secret from the remote side's public key (remote^private mod p). Validates the remote key is in range [2, p-2]. Both sides compute this independently and arrive at the same value.

int secDhDeriveKey(SecDhT *dh, uint8_t *key, int keyLen);

Derives a symmetric key by XOR-folding the 128-byte shared secret down to keyLen bytes. Each output byte is the XOR of 128/keyLen input bytes.

void secDhDestroy(SecDhT *dh);

Securely zeroes the entire DH context (private key, shared secret, public key) and frees the memory. Must be called to prevent key material from lingering in the heap.

Cipher Functions

SecCipherT *secCipherCreate(const uint8_t *key);

Creates an XTEA-CTR cipher context with the given 16-byte key. The internal counter starts at zero. Returns NULL on allocation failure or NULL key.

void secCipherCrypt(SecCipherT *c, uint8_t *data, int len);

Encrypts or decrypts data in place. CTR mode is symmetric -- the same function handles both directions. The internal counter advances by one for every 8 bytes processed (one XTEA block). The counter must never repeat with the same key; callers are responsible for ensuring this (secLink handles it by using separate cipher instances per direction).

void secCipherSetNonce(SecCipherT *c, uint32_t nonceLo, uint32_t nonceHi);

Sets the 64-bit nonce/counter to a specific value. Both the nonce (baseline) and the running counter are set to the same value. Call this before encrypting if you need a deterministic starting point.

void secCipherDestroy(SecCipherT *c);

Securely zeroes the cipher context (key and counter state) and frees the memory.

Usage Examples

Full Key Exchange

#include "security.h"
#include <string.h>

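// Seed the RNG. secRngGatherEntropy() writes at most 8 bytes, so seed
// with the count it returns rather than the full buffer size.
uint8_t entropy[16];
int     entropyLen = secRngGatherEntropy(entropy, sizeof(entropy));
secRngSeed(entropy, entropyLen);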

// Create DH context and generate keys
SecDhT *dh = secDhCreate();
secDhGenerateKeys(dh);

// Export public key to send to remote
uint8_t myPub[SEC_DH_KEY_SIZE];
int     pubLen = SEC_DH_KEY_SIZE;
secDhGetPublicKey(dh, myPub, &pubLen);
uint8_t remotePub[SEC_DH_KEY_SIZE];
// ... send myPub to remote, receive remotePub ...

// Compute shared secret and derive a 16-byte XTEA key
secDhComputeSecret(dh, remotePub, SEC_DH_KEY_SIZE);

uint8_t key[SEC_XTEA_KEY_SIZE];
secDhDeriveKey(dh, key, SEC_XTEA_KEY_SIZE);
secDhDestroy(dh);  // private key no longer needed

// Create cipher and encrypt
SecCipherT *cipher = secCipherCreate(key);
uint8_t message[] = "Secret message";
secCipherCrypt(cipher, message, sizeof(message));
// message is now encrypted

// Decrypt (reset counter first, then apply same operation)
secCipherSetNonce(cipher, 0, 0);
secCipherCrypt(cipher, message, sizeof(message));
// message is now plaintext again

secCipherDestroy(cipher);

Standalone Encryption (Without DH)

// XTEA-CTR can be used independently of Diffie-Hellman
uint8_t key[SEC_XTEA_KEY_SIZE] = { /* your key */ };
SecCipherT *c = secCipherCreate(key);

uint8_t data[1024];
// ... fill data ...
secCipherCrypt(c, data, sizeof(data));  // encrypt in place

secCipherDestroy(c);

Random Number Generation

// Seed from hardware (use the byte count actually gathered, at most 8)
uint8_t hwEntropy[16];
int     hwLen = secRngGatherEntropy(hwEntropy, sizeof(hwEntropy));
secRngSeed(hwEntropy, hwLen);

// Stir in user-derived entropy (keyboard timing, etc.)
uint8_t userEntropy[4];
// ... gather from timing events ...
secRngAddEntropy(userEntropy, sizeof(userEntropy));

// Generate random bytes
uint8_t randomBuf[32];
secRngBytes(randomBuf, sizeof(randomBuf));

Building

make        # builds ../lib/libsecurity.a
make clean  # removes objects and library

Cross-compiled with the DJGPP toolchain targeting i486+ CPUs. Compiler flags: -O2 -Wall -Wextra -march=i486 -mtune=i586.

Objects are placed in ../obj/security/, the library in ../lib/.

No external dependencies -- the library is self-contained. It uses only DJGPP's <pc.h>, <sys/farptr.h>, and <go32.h> for hardware entropy collection (PIT and BIOS tick count access).

Files

  • security.h -- Public API header (types, constants, function prototypes)
  • security.c -- Complete implementation (bignum, Montgomery, DH, XTEA, RNG)
  • Makefile -- DJGPP cross-compilation build rules

Used By

  • seclink/ -- Secure serial link (DH handshake, cipher creation, RNG seeding)