Security -- Diffie-Hellman Key Exchange and XTEA-CTR Cipher
Cryptographic library providing Diffie-Hellman key exchange, XTEA symmetric encryption in CTR mode, and a DRBG-based pseudo-random number generator. Optimized for 486-class DOS hardware running under DJGPP/DPMI.
This library has no dependencies on the serial stack and can be used independently for any application requiring key exchange, encryption, or random number generation.
Components
1. XTEA Cipher (CTR Mode)
XTEA (eXtended Tiny Encryption Algorithm) is a 64-bit block cipher with a 128-bit key, run here for 32 rounds. Each round in this document updates both 32-bit halves, so references that count each half-update separately describe the same cipher as 64 Feistel rounds. In CTR (counter) mode, it operates as a stream cipher: an incrementing counter is encrypted with the key to produce a keystream, which is XOR'd with the plaintext. Because XOR is its own inverse, the same operation encrypts and decrypts.
Why XTEA instead of AES or DES:
XTEA requires zero lookup tables, no key schedule, and compiles to approximately 20 instructions per round (shifts, adds, and XORs only). This makes it ideal for a 486 where the data cache is tiny (8KB) and AES's 4KB S-boxes would thrash it. DES is similarly table-heavy and has a complex key schedule. XTEA has no library dependencies -- the entire cipher fits in about a dozen lines of C. At 32 rounds, XTEA provides 128-bit security with negligible per-byte overhead even on the slowest target hardware.
CTR mode properties:
- Encrypt and decrypt are the same function (XOR is symmetric)
- No padding required -- operates on arbitrary-length data
- Random access possible (set the counter to any value)
- CRITICAL: the same counter value must never be reused with the same key. Reuse reveals the XOR of two plaintexts. The secLink layer prevents this by deriving separate TX/RX cipher keys for each direction.
XTEA block cipher internals:
The Feistel network uses the golden-ratio constant (delta = 0x9E3779B9) as a round key mixer. Each round combines the two 32-bit halves using shifts, additions, and XORs. The delta ensures each round uses a different effective subkey, preventing slide attacks. No S-boxes or lookup tables are involved anywhere in the computation.
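As a rough illustration of the block function and the CTR construction described above, the sketch below pairs the widely published XTEA reference loop with a minimal counter-mode wrapper. The names xteaEncryptBlock, CtrSketchT, and ctrSketchCrypt are hypothetical. How the counter is loaded into the block and the keystream byte order (little-endian within each 32-bit word) are assumptions, so security.c may differ on these details.

```c
#include <stddef.h>
#include <stdint.h>

#define XTEA_DELTA  0x9E3779B9u   /* golden-ratio constant */
#define XTEA_ROUNDS 32

/* Encrypt one 64-bit block (two 32-bit halves) in place with a 128-bit key. */
static void xteaEncryptBlock(uint32_t v[2], const uint32_t key[4]) {
    uint32_t v0 = v[0], v1 = v[1], sum = 0;
    for (int i = 0; i < XTEA_ROUNDS; i++) {
        v0 += (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
        sum += XTEA_DELTA;
        v1 += (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
    }
    v[0] = v0;
    v[1] = v1;
}

/* Hypothetical CTR state: the key plus a 64-bit counter held as two words. */
typedef struct {
    uint32_t key[4];
    uint32_t ctrLo;
    uint32_t ctrHi;
} CtrSketchT;

/* XOR data with the keystream in place; the same call encrypts and decrypts. */
static void ctrSketchCrypt(CtrSketchT *c, uint8_t *data, size_t len) {
    size_t pos = 0;
    while (pos < len) {
        uint32_t block[2] = { c->ctrLo, c->ctrHi };
        xteaEncryptBlock(block, c->key);       /* keystream = E(key, counter) */
        if (++c->ctrLo == 0) {                 /* 64-bit counter increment */
            c->ctrHi++;
        }
        for (int i = 0; i < 8 && pos < len; i++, pos++) {
            /* Assumed byte order: little-endian within each keystream word. */
            data[pos] ^= (uint8_t)(block[i >> 2] >> (8 * (i & 3)));
        }
    }
}
```

Resetting the counter to its starting value and calling ctrSketchCrypt() again on the ciphertext restores the plaintext. A trailing partial block simply uses fewer keystream bytes, which is why no padding is needed.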
2. Diffie-Hellman Key Exchange (1024-bit)
Uses the RFC 2409 Group 2 safe prime (1024-bit MODP group) with a generator of 2. Private exponents are 256 bits for fast computation on 486-class hardware.
Why 1024-bit DH with 256-bit private exponents:
RFC 2409 Group 2 provides a well-audited, interoperable safe prime. 256-bit private exponents (versus full 1024-bit) reduce the modular exponentiation from approximately 1024 squarings + 512 multiplies to approximately 256 squarings + 128 multiplies (half the exponent bits are 1 on average), roughly a 4x saving. This keeps key generation under a second on a 486 rather than several seconds. The security reduction from the shorter exponent is negligible -- the best generic attack on a 256-bit exponent (Pollard's lambda, also called the kangaroo method) requires approximately 2^128 operations, matching XTEA's key strength.
Key validation:
secDhComputeSecret() validates that the remote public key is in the
range [2, p-2] to prevent small-subgroup attacks. Keys of 0, 1, or p-1
would produce trivially guessable shared secrets.
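A minimal sketch of such a range check on big-endian byte arrays might look like the following. The function name dhPubKeyInRange is hypothetical and this is not necessarily how secDhComputeSecret() implements it. It relies on p being odd, so that p-1 is simply p with its last byte decremented.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 if 2 <= pub <= p-2, else 0. Both values are big-endian byte
 * arrays of the same length, and p is assumed to be an odd prime.
 */
static int dhPubKeyInRange(const uint8_t *pub, const uint8_t *p, size_t len) {
    size_t i;
    int atLeastTwo = 0;

    if (len == 0) {
        return 0;
    }

    /* pub >= 2: a byte above the last is nonzero, or the last byte is >= 2. */
    for (i = 0; i + 1 < len; i++) {
        if (pub[i] != 0) {
            atLeastTwo = 1;
        }
    }
    if (pub[len - 1] >= 2) {
        atLeastTwo = 1;
    }
    if (!atLeastTwo) {
        return 0;                       /* rejects 0 and 1 */
    }

    /* pub < p-1: compare from the most significant byte downward. */
    for (i = 0; i < len; i++) {
        uint8_t limit = (i == len - 1) ? (uint8_t)(p[i] - 1) : p[i];
        if (pub[i] < limit) {
            return 1;
        }
        if (pub[i] > limit) {
            return 0;                   /* rejects p and anything larger */
        }
    }
    return 0;                           /* pub == p-1 is rejected */
}
```

With the toy prime p = 23, the accepted range is 2 through 21. The comparison is not constant-time, which is acceptable here because public keys are not secret.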
Key derivation:
The 128-byte shared secret is reduced to a symmetric key via XOR-folding:
each byte of the secret is XOR'd into the output key at position
i % keyLen. For a 16-byte XTEA key, each output byte is the XOR of
8 secret bytes, providing thorough mixing. A proper KDF (HKDF, etc.)
would be more rigorous but adds complexity and code size for marginal
benefit in this use case.
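The folding step described above can be sketched in a few lines. The function name xorFoldKey and its signature are illustrative only; the library exposes this operation through secDhDeriveKey().

```c
#include <stddef.h>
#include <stdint.h>

/* XOR every secret byte into key[i % keyLen]. keyLen must be nonzero. */
static void xorFoldKey(const uint8_t *secret, size_t secretLen,
                       uint8_t *key, size_t keyLen) {
    size_t i;
    if (keyLen == 0) {
        return;
    }
    for (i = 0; i < keyLen; i++) {
        key[i] = 0;
    }
    for (i = 0; i < secretLen; i++) {
        key[i % keyLen] ^= secret[i];
    }
}
```

For a 128-byte secret and a 16-byte key, key[0] ends up as the XOR of secret bytes 0, 16, 32, ..., 112, which is the "8 secret bytes per output byte" figure given above.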
3. Pseudo-Random Number Generator
XTEA-CTR based DRBG (Deterministic Random Bit Generator). The RNG encrypts a monotonically increasing 64-bit counter with a 128-bit XTEA key, producing 8 bytes of pseudorandom output per block. The counter never repeats (64-bit space is sufficient for any practical session length), so the output is a pseudorandom stream as long as the key has sufficient entropy.
Hardware entropy sources:
- PIT (Programmable Interval Timer) -- runs at 1.193182 MHz. Its LSBs change rapidly and provide approximately 10 bits of entropy per read, depending on timing jitter. Two readings with intervening code execution provide additional jitter.
- BIOS tick count -- 18.2 Hz timer at real-mode address 0040:006C (linear address 0x46C). Adds a few more bits of entropy.
Total from hardware: roughly 20 bits of real entropy per call to
secRngGatherEntropy(). This is not enough on its own for
cryptographic use but is sufficient to seed the DRBG when supplemented
by user interaction timing (keyboard, mouse jitter).
Seeding and mixing:
The seed function (secRngSeed()) XOR-folds the entropy into the XTEA
key, derives the initial counter from the key bits, and then generates and
discards 64 bytes to advance past any weak initial output. This discard
step is a common precaution -- it ensures the first bytes the caller
receives do not leak information about the seed material.
Additional entropy can be stirred in at any time via secRngAddEntropy()
without resetting the RNG state. This function XOR-folds new entropy into
the key and then re-mixes by encrypting the key with itself, diffusing
the new entropy across all key bits.
Auto-seeding: if secRngBytes() is called before secRngSeed(), it
automatically gathers hardware entropy and seeds itself as a safety net.
BigNum Arithmetic
All modular arithmetic uses a 1024-bit big number type (BigNumT)
stored as 32 x uint32_t words in little-endian order. Operations:
| Function | Description |
|---|---|
| bnAdd | Add two bignums, return carry |
| bnSub | Subtract two bignums, return borrow |
| bnCmp | Compare two bignums (-1, 0, +1) |
| bnBit | Test a single bit by index |
| bnBitLength | Find the highest set bit position |
| bnShiftLeft1 | Left-shift by 1, return carry |
| bnClear | Zero all words |
| bnSet | Set to a 32-bit value (clear upper words) |
| bnCopy | Copy from source to destination |
| bnFromBytes | Convert big-endian byte array to little-endian words |
| bnToBytes | Convert little-endian words to big-endian byte array |
| bnMontMul | Montgomery multiplication (CIOS variant) |
| bnModExp | Modular exponentiation via Montgomery multiply |
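To make the word layout concrete, here is a hedged sketch of three of these primitives operating on 32 little-endian uint32_t words. The Sketch-suffixed names and the bare-array signatures are assumptions for illustration; the real functions take BigNumT and may be written differently.

```c
#include <stdint.h>

#define BN_SKETCH_WORDS 32   /* 32 x 32 bits = 1024 bits */

/* r = a + b; returns the carry out of the top word (0 or 1). */
static uint32_t bnAddSketch(uint32_t *r, const uint32_t *a, const uint32_t *b) {
    uint64_t carry = 0;
    for (int i = 0; i < BN_SKETCH_WORDS; i++) {
        uint64_t sum = (uint64_t)a[i] + b[i] + carry;
        r[i] = (uint32_t)sum;
        carry = sum >> 32;
    }
    return (uint32_t)carry;
}

/* Compare from the most significant word down: returns -1, 0, or +1. */
static int bnCmpSketch(const uint32_t *a, const uint32_t *b) {
    for (int i = BN_SKETCH_WORDS - 1; i >= 0; i--) {
        if (a[i] > b[i]) return 1;
        if (a[i] < b[i]) return -1;
    }
    return 0;
}

/* Convert a 128-byte big-endian array into little-endian words. */
static void bnFromBytesSketch(uint32_t *r, const uint8_t *be) {
    for (int i = 0; i < BN_SKETCH_WORDS; i++) {
        /* Word 0 is least significant, so it comes from the last 4 bytes. */
        const uint8_t *src = be + (BN_SKETCH_WORDS - 1 - i) * 4;
        r[i] = ((uint32_t)src[0] << 24) | ((uint32_t)src[1] << 16) |
               ((uint32_t)src[2] << 8)  |  (uint32_t)src[3];
    }
}
```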
Montgomery Multiplication
The CIOS (Coarsely Integrated Operand Scanning) variant computes
a * b * R^(-1) mod m in a single pass without explicit division by the
modulus. This replaces the expensive modular reduction step (division by a
1024-bit number) with cheaper additions and right-shifts.
For each of the 32 outer iterations (one per word of operand a):
- Accumulate a[i] * b into the temporary product t
- Compute the Montgomery reduction factor u = t[0] * m0inv mod 2^32
- Add u * mod to t and shift right by 32 bits (implicit division)
After all iterations, the result is in the range [0, 2m), so a single conditional subtraction brings it into [0, m).
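The three steps and the final subtraction can be sketched as follows. This is an illustrative version, not the code in security.c: the name montMulSketch, the explicit word-count parameter n (so the routine can be exercised with small moduli), and the use of uint64_t intermediates are all assumptions. The sketch is also not constant-time.

```c
#include <stdint.h>

#define MONT_MAX_WORDS 32

/*
 * r = a * b * R^(-1) mod m, with R = 2^(32*n). Operands are n-word
 * little-endian arrays, a and b are below m, m is odd, n <= MONT_MAX_WORDS,
 * and m0inv = -m[0]^(-1) mod 2^32.
 */
static void montMulSketch(uint32_t *r, const uint32_t *a, const uint32_t *b,
                          const uint32_t *m, uint32_t m0inv, int n) {
    uint32_t t[MONT_MAX_WORDS + 2] = { 0 };
    uint64_t acc;
    int i, j;

    for (i = 0; i < n; i++) {
        /* Step 1: t += a[i] * b */
        uint32_t carry = 0;
        for (j = 0; j < n; j++) {
            acc = (uint64_t)a[i] * b[j] + t[j] + carry;
            t[j] = (uint32_t)acc;
            carry = (uint32_t)(acc >> 32);
        }
        acc = (uint64_t)t[n] + carry;
        t[n] = (uint32_t)acc;
        t[n + 1] = (uint32_t)(acc >> 32);

        /* Step 2: choose u so the low word of t + u*m becomes zero */
        uint32_t u = t[0] * m0inv;

        /* Step 3: t = (t + u*m) >> 32, dropping the zero low word */
        acc = (uint64_t)u * m[0] + t[0];
        carry = (uint32_t)(acc >> 32);
        for (j = 1; j < n; j++) {
            acc = (uint64_t)u * m[j] + t[j] + carry;
            t[j - 1] = (uint32_t)acc;
            carry = (uint32_t)(acc >> 32);
        }
        acc = (uint64_t)t[n] + carry;
        t[n - 1] = (uint32_t)acc;
        t[n] = t[n + 1] + (uint32_t)(acc >> 32);
    }

    /* t is in [0, 2m): decide whether t >= m (equal counts as >=). */
    int ge = (t[n] != 0);
    if (!ge) {
        ge = 1;
        for (j = n - 1; j >= 0; j--) {
            if (t[j] != m[j]) {
                ge = (t[j] > m[j]);
                break;
            }
        }
    }

    /* Conditional subtraction: r = t - m if t >= m, otherwise r = t. */
    uint32_t borrow = 0;
    for (j = 0; j < n; j++) {
        uint64_t diff = (uint64_t)t[j] - (ge ? m[j] : 0u) - borrow;
        r[j] = (uint32_t)diff;
        borrow = (uint32_t)(diff >> 63);
    }
}
```

As a small sanity check, take the one-word modulus m = 2^32 - 5, for which R mod m = 5 and m0inv = 0xCCCCCCCD. Multiplying 35 (which is 7 in Montgomery form) by 1 strips one factor of R and yields 7.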
Montgomery constants (computed once, lazily on first DH use):
- R^2 mod p -- computed via 2048 iterations of shift-left-1 with conditional subtraction. This is the Montgomery domain conversion factor.
- -p[0]^(-1) mod 2^32 -- computed via Newton's method (5 iterations, doubling precision each step: 1->2->4->8->16->32 correct bits). This is the Montgomery reduction constant.
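Both constants can be illustrated on a single 32-bit word. The sketch below is a scaled-down assumption, not the library's code: montNegInv32 shows the five Newton iterations, and montR2Word shows the shift-and-subtract idea for a one-word modulus, where 64 doublings play the role of the 2048 used for the 1024-bit prime.

```c
#include <stdint.h>

/* Returns -p0^(-1) mod 2^32 for odd p0, using Newton's iteration. */
static uint32_t montNegInv32(uint32_t p0) {
    uint32_t x = 1;                /* p0 * 1 == 1 (mod 2): one correct bit */
    for (int i = 0; i < 5; i++) {
        x *= 2u - p0 * x;          /* doubles the number of correct low bits */
    }
    return (uint32_t)(0u - x);     /* negate: Montgomery needs -p0^(-1) */
}

/* Returns R^2 mod m for a one-word odd modulus m > 1, where R = 2^32. */
static uint32_t montR2Word(uint32_t m) {
    uint64_t r = 1;
    for (int i = 0; i < 64; i++) { /* 2 * 32 doublings give 2^64 mod m */
        r <<= 1;                   /* shift left by 1 */
        if (r >= m) {
            r -= m;                /* conditional subtraction keeps r < m */
        }
    }
    return (uint32_t)r;
}
```

For m = 2^32 - 5, R mod m is 5, so montR2Word() returns 25, and montNegInv32() returns 0xCCCCCCCD (the inverse of 5 modulo 2^32).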
Modular exponentiation uses left-to-right binary square-and-multiply scanning. For a 256-bit private exponent, this requires approximately 256 squarings plus approximately 128 multiplies (half the bits are 1 on average), where each operation is a Montgomery multiplication on 32-word numbers.
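The scanning order is easier to see on a toy modulus. The sketch below uses plain 64-bit arithmetic with the % operator in place of Montgomery multiplication, so it only illustrates the left-to-right bit scan. The name modExpSketch and the 32-bit operand sizes are assumptions.

```c
#include <stdint.h>

/* Computes base^exp mod m for a nonzero 32-bit modulus m. */
static uint32_t modExpSketch(uint32_t base, uint32_t exp, uint32_t m) {
    uint64_t result = 1 % m;
    uint64_t b = base % m;
    /* Scan the exponent from its most significant bit down to bit 0. */
    for (int bit = 31; bit >= 0; bit--) {
        result = (result * result) % m;      /* square for every bit */
        if ((exp >> bit) & 1u) {
            result = (result * b) % m;       /* multiply only on 1 bits */
        }
    }
    return (uint32_t)result;
}
```

With the textbook parameters p = 23 and g = 5, private exponents 4 and 3 give public values 4 and 10, and both sides arrive at the shared secret 18. Squaring through leading zero bits is harmless (1 squared stays 1); an implementation that knows the exponent's bit length can start at its highest set bit instead.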
Secure Zeroing
Key material (private keys, shared secrets, cipher contexts) is erased using a volatile-pointer loop:
static void secureZero(void *ptr, int len) {
    volatile uint8_t *p = (volatile uint8_t *)ptr;
    for (int i = 0; i < len; i++) {
        p[i] = 0;
    }
}
The volatile qualifier prevents the compiler from optimizing away the
zeroing as a dead store. Without it, the compiler could see that the
buffer is never read again before being freed and remove the stores
(or an equivalent memset call) entirely. This is critical for preventing
sensitive key material from lingering in freed memory where a later
malloc could expose it.
Performance
At serial port speeds, XTEA-CTR encryption overhead is minimal:
| Speed (bps) | Blocks/sec | CPU Cycles/sec | % of 33 MHz 486 |
|---|---|---|---|
| 9600 | 120 | ~240K | < 1% |
| 57600 | 720 | ~1.4M | ~4% |
| 115200 | 1440 | ~2.9M | ~9% |
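The table's figures are consistent with a simple back-of-the-envelope model, sketched below. Two inputs are assumptions rather than documented facts: 8N1 serial framing (10 bits on the wire per byte) and roughly 2000 CPU cycles per XTEA block, which is the ratio implied by the table rather than a measured number. The function names are hypothetical.

```c
#include <stdint.h>

#define WIRE_BITS_PER_BYTE 10u     /* assumed 8N1 framing */
#define XTEA_BLOCK_BYTES    8u
#define CYCLES_PER_BLOCK 2000u     /* assumed; implied by the table */

/* XTEA blocks needed per second to keep up with the link. */
static uint32_t xteaBlocksPerSec(uint32_t baud) {
    return baud / WIRE_BITS_PER_BYTE / XTEA_BLOCK_BYTES;
}

/* CPU cycles per second spent on encryption at that rate. */
static uint32_t xteaCyclesPerSec(uint32_t baud) {
    return xteaBlocksPerSec(baud) * CYCLES_PER_BLOCK;
}

/* CPU load in tenths of a percent (87 means 8.7%). */
static uint32_t xteaLoadTenthsPercent(uint32_t baud, uint32_t cpuHz) {
    return (uint32_t)(((uint64_t)xteaCyclesPerSec(baud) * 1000u) / cpuHz);
}
```

At 115200 bps this gives 1440 blocks and 2.88 million cycles per second, or about 8.7% of a 33 MHz CPU, in line with the ~9% shown in the table.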
DH key exchange takes approximately 0.3 seconds at 66 MHz or 0.6 seconds at 33 MHz (256-bit private exponent, 1024-bit modulus, Montgomery multiplication).
API Reference
Constants
| Name | Value | Description |
|---|---|---|
| SEC_DH_KEY_SIZE | 128 | DH public key size in bytes |
| SEC_XTEA_KEY_SIZE | 16 | XTEA key size in bytes |
| SEC_SUCCESS | 0 | Success |
| SEC_ERR_PARAM | -1 | Invalid parameter or NULL pointer |
| SEC_ERR_NOT_READY | -2 | Keys not yet generated/derived |
| SEC_ERR_ALLOC | -3 | Memory allocation failed |
Types
typedef struct SecDhS SecDhT; // Opaque DH context
typedef struct SecCipherS SecCipherT; // Opaque cipher context
RNG Functions
int secRngGatherEntropy(uint8_t *buf, int len);
Reads hardware entropy from the PIT counter and BIOS tick count. Returns the number of bytes written (up to 8). Provides roughly 20 bits of true entropy -- not sufficient alone, but enough to seed the DRBG when supplemented by user interaction timing.
void secRngSeed(const uint8_t *entropy, int len);
Initializes the DRBG with the given entropy. XOR-folds the input into the XTEA key, derives the initial counter, and generates and discards 64 bytes to advance past weak initial output.
void secRngAddEntropy(const uint8_t *data, int len);
Mixes additional entropy into the running RNG state without resetting it. XOR-folds data into the key and re-mixes by encrypting the key with itself. Use this to stir in keyboard timing, mouse jitter, or other runtime entropy sources.
void secRngBytes(uint8_t *buf, int len);
Generates len pseudorandom bytes. Auto-seeds from hardware entropy if
not previously seeded. Produces 8 bytes per XTEA block encryption of the
internal counter.
Diffie-Hellman Functions
SecDhT *secDhCreate(void);
Allocates a new DH context. Returns NULL on allocation failure. The
context must be destroyed with secDhDestroy() when no longer needed.
int secDhGenerateKeys(SecDhT *dh);
Generates a 256-bit random private key and computes the corresponding
1024-bit public key (g^private mod p). Lazily initializes Montgomery
constants on first call. The RNG should be seeded before calling this.
int secDhGetPublicKey(SecDhT *dh, uint8_t *buf, int *len);
Exports the public key as a big-endian byte array into buf. On entry,
*len must be at least SEC_DH_KEY_SIZE (128). On return, *len is
set to 128.
int secDhComputeSecret(SecDhT *dh, const uint8_t *remotePub, int len);
Computes the shared secret from the remote side's public key
(remote^private mod p). Validates the remote key is in range [2, p-2].
Both sides compute this independently and arrive at the same value.
int secDhDeriveKey(SecDhT *dh, uint8_t *key, int keyLen);
Derives a symmetric key by XOR-folding the 128-byte shared secret down
to keyLen bytes. Each output byte is the XOR of 128/keyLen input
bytes.
void secDhDestroy(SecDhT *dh);
Securely zeroes the entire DH context (private key, shared secret, public key) and frees the memory. Must be called to prevent key material from lingering in the heap.
Cipher Functions
SecCipherT *secCipherCreate(const uint8_t *key);
Creates an XTEA-CTR cipher context with the given 16-byte key. The
internal counter starts at zero. Returns NULL on allocation failure or
NULL key.
void secCipherCrypt(SecCipherT *c, uint8_t *data, int len);
Encrypts or decrypts data in place. CTR mode is symmetric -- the same
function handles both directions. The internal counter advances by one
for every 8 bytes processed (one XTEA block). The counter must never
repeat with the same key; callers are responsible for ensuring this
(secLink handles it by using separate cipher instances per direction).
void secCipherSetNonce(SecCipherT *c, uint32_t nonceLo, uint32_t nonceHi);
Sets the 64-bit nonce/counter to a specific value. Both the nonce (baseline) and the running counter are set to the same value. Call this before encrypting if you need a deterministic starting point.
void secCipherDestroy(SecCipherT *c);
Securely zeroes the cipher context (key and counter state) and frees the memory.
Usage Examples
Full Key Exchange
#include "security.h"
#include <string.h>
// Seed the RNG
uint8_t entropy[16];
int entropyLen = secRngGatherEntropy(entropy, sizeof(entropy)); // bytes actually written (up to 8)
secRngSeed(entropy, entropyLen); // seed only with the bytes that were filled
// Create DH context and generate keys
SecDhT *dh = secDhCreate();
secDhGenerateKeys(dh);
// Export public key to send to remote
uint8_t myPub[SEC_DH_KEY_SIZE];
int pubLen = SEC_DH_KEY_SIZE;
secDhGetPublicKey(dh, myPub, &pubLen);
uint8_t remotePub[SEC_DH_KEY_SIZE];
// ... send myPub to remote, receive remotePub ...
// Compute shared secret and derive a 16-byte XTEA key
secDhComputeSecret(dh, remotePub, SEC_DH_KEY_SIZE);
uint8_t key[SEC_XTEA_KEY_SIZE];
secDhDeriveKey(dh, key, SEC_XTEA_KEY_SIZE);
secDhDestroy(dh); // private key no longer needed
// Create cipher and encrypt
SecCipherT *cipher = secCipherCreate(key);
uint8_t message[] = "Secret message";
secCipherCrypt(cipher, message, sizeof(message));
// message is now encrypted
// Decrypt (reset counter first, then apply same operation)
secCipherSetNonce(cipher, 0, 0);
secCipherCrypt(cipher, message, sizeof(message));
// message is now plaintext again
secCipherDestroy(cipher);
Standalone Encryption (Without DH)
// XTEA-CTR can be used independently of Diffie-Hellman
uint8_t key[SEC_XTEA_KEY_SIZE] = { /* your key */ };
SecCipherT *c = secCipherCreate(key);
uint8_t data[1024];
// ... fill data ...
secCipherCrypt(c, data, sizeof(data)); // encrypt in place
secCipherDestroy(c);
Random Number Generation
// Seed from hardware
uint8_t hwEntropy[16];
int hwLen = secRngGatherEntropy(hwEntropy, sizeof(hwEntropy)); // bytes actually written (up to 8)
secRngSeed(hwEntropy, hwLen); // seed only with the bytes that were filled
// Stir in user-derived entropy (keyboard timing, etc.)
uint8_t userEntropy[4];
// ... gather from timing events ...
secRngAddEntropy(userEntropy, sizeof(userEntropy));
// Generate random bytes
uint8_t randomBuf[32];
secRngBytes(randomBuf, sizeof(randomBuf));
Building
make # builds ../lib/libsecurity.a
make clean # removes objects and library
Cross-compiled with the DJGPP toolchain targeting i486+ CPUs. Compiler
flags: -O2 -Wall -Wextra -march=i486 -mtune=i586.
Objects are placed in ../obj/security/, the library in ../lib/.
No external dependencies -- the library is self-contained. It uses only
DJGPP's <pc.h>, <sys/farptr.h>, and <go32.h> for hardware entropy
collection (PIT and BIOS tick count access).
Files
- security.h -- Public API header (types, constants, function prototypes)
- security.c -- Complete implementation (bignum, Montgomery, DH, XTEA, RNG)
- Makefile -- DJGPP cross-compilation build rules
Used By
- seclink/ -- Secure serial link (DH handshake, cipher creation, RNG seeding)