Advanced BASIC to C transpiler with easy add-on library integration.

basic2c

A BASIC-to-C transpiler. Translates BASIC source code into equivalent C source code with an embedded runtime library.

Build

cc -Wall -o basic2c basic2c.c -lm

Usage

basic2c [--release|-r] input.bas [output.c]
  • If output.c is omitted, C code is written to stdout.
  • --release (or -r) selects the release runtime (see Runtime Modes).

Compile the generated C:

cc -Wall -o program output.c -lm

Architecture

The transpiler is a single-file C program with three phases:

  1. Lexer — tokenizes BASIC source (case-insensitive keywords)
  2. Parser — recursive descent, builds an AST
  3. Codegen — walks the AST, emits C source with a small runtime library
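As a small illustration of the case-insensitive keyword matching mentioned for the lexer, here is a minimal sketch in C. The helper name `keyword_eq` is hypothetical and is not taken from basic2c.c; the real lexer may do this differently.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if word matches keyword ignoring case,
 * so "print", "Print" and "PRINT" all match the keyword "PRINT". */
int keyword_eq(const char *word, const char *keyword) {
    while (*word && *keyword) {
        if (toupper((unsigned char)*word) != toupper((unsigned char)*keyword))
            return 0;
        word++;
        keyword++;
    }
    /* Both strings must end at the same time for a full match. */
    return *word == '\0' && *keyword == '\0';
}
```

For example, `keyword_eq("next", "NEXT")` returns 1, while `keyword_eq("Prin", "PRINT")` returns 0.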

Data Types

BASIC Type   C Type    Suffix   Notes
BYTE         uint8_t   (none)   Unsigned 8-bit
INTEGER      int16_t   %        Signed 16-bit
LONG         int32_t   (none)   Signed 32-bit
FLOAT        float     !        Single precision
DOUBLE       double    #        Double precision (default numeric)
STRING       char*     $        Dynamic, heap-allocated

Type suffixes on variable names are recognized: name$ is STRING, count% is INTEGER, total# is DOUBLE, rate! is FLOAT. Variables without a suffix or explicit type declaration default to DOUBLE.

Numeric types follow a promotion hierarchy: BYTE < INTEGER < LONG < FLOAT < DOUBLE. Mixed-type expressions promote to the higher-ranked type.
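The promotion rule can be pictured as "take the higher rank". The sketch below is an illustration only: the `NumType` enum and both function names are hypothetical, and the C type names come from the Data Types table.

```c
/* Hypothetical rank enum mirroring BYTE < INTEGER < LONG < FLOAT < DOUBLE. */
typedef enum { T_BYTE, T_INTEGER, T_LONG, T_FLOAT, T_DOUBLE } NumType;

/* Result type of a mixed-type binary expression: the higher-ranked operand. */
NumType promote(NumType a, NumType b) {
    return a > b ? a : b;
}

/* C type a result would map to, following the Data Types table. */
const char *c_type_name(NumType t) {
    static const char *names[] = {
        "uint8_t", "int16_t", "int32_t", "float", "double"
    };
    return names[t];
}
```

Under this rule, an INTEGER added to a FLOAT yields a FLOAT, and a BYTE combined with a LONG yields a LONG (`int32_t`).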

Variables and Arrays

Declaration

DIM x AS DOUBLE
DIM name AS STRING
DIM count AS INTEGER

Variables can also be used without declaration — they are implicitly declared based on their type suffix or as DOUBLE by default.

Arrays

DIM arr(10) AS INTEGER          ' 1D array, indices 0..10
DIM matrix(3, 4) AS DOUBLE      ' 2D array, indices 0..3 x 0..4
DIM cube(2, 3, 4) AS INTEGER    ' 3D array

Arrays are zero-based. The dimension value is the upper bound (inclusive), so DIM arr(10) allocates 11 elements (0 through 10).
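The inclusive upper bound is easy to get wrong when reading the generated C, so here is a sketch of the arithmetic. The function names are hypothetical, and the row-major layout for multidimensional arrays is an assumption of this sketch, not something the README states.

```c
#include <stdint.h>
#include <stdlib.h>

/* Element count for one dimension declared as (upper): indices 0..upper. */
size_t dim_count(int upper) {
    return (size_t)upper + 1;
}

/* Flat offset of matrix(i, j) for DIM matrix(ub_i, ub_j),
 * assuming row-major storage (an assumption of this sketch). */
size_t index2(int i, int j, int ub_j) {
    return (size_t)i * dim_count(ub_j) + (size_t)j;
}

/* Zero-initialized storage for DIM arr(upper) AS INTEGER. */
int16_t *dim_integer(int upper) {
    return calloc(dim_count(upper), sizeof(int16_t));
}
```

With `DIM matrix(3, 4)`, the array holds 4 x 5 = 20 elements, and `matrix(3, 4)` is the last one at flat offset 19.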

REDIM

REDIM arr(20) AS INTEGER        ' Resize array (contents reset to zero)
REDIM matrix(5, 5) AS DOUBLE    ' Resize multidimensional array

REDIM frees the previous allocation and creates a new zero-initialized array.
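In C terms, the described behavior amounts to a free followed by a zeroing allocation. The `IntArray` struct and `redim_integer` below are hypothetical names used only for illustration; the runtime's actual bookkeeping may differ.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical array descriptor for this sketch. */
typedef struct {
    int16_t *data;   /* heap storage, NULL until dimensioned */
    int upper;       /* inclusive upper bound */
} IntArray;

/* REDIM arr(new_upper): discard old contents, allocate zeroed storage.
 * Returns 0 on success, -1 if the allocation fails (array left unchanged). */
int redim_integer(IntArray *a, int new_upper) {
    int16_t *fresh = calloc((size_t)new_upper + 1, sizeof *fresh);
    if (fresh == NULL)
        return -1;
    free(a->data);            /* free(NULL) is a no-op */
    a->data = fresh;
    a->upper = new_upper;
    return 0;
}
```

Because the old block is released rather than resized, values stored before the REDIM are not carried over.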

Operators

Arithmetic

Operator Description
+ Addition
- Subtraction / unary negation
* Multiplication
/ Division
\ Integer division
MOD Modulo
^ Exponentiation

Comparison

Operator Description
= Equal
<> Not equal
< Less than
> Greater than
<= Less than or equal
>= Greater than or equal

Bitwise / Logical

Operator Description
AND Bitwise AND
OR Bitwise OR
NOT Bitwise NOT
XOR Bitwise XOR

These operators work as both bitwise and logical operators. When used with comparisons (which return 0 or 1), they behave logically: x > 5 AND y < 10. When used with integers, they operate on individual bits: 15 AND 9 gives 9.
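The two readings of AND can be checked with C's bitwise `&`. This sketch assumes AND maps onto `&`; the function names are made up for the example.

```c
/* AND on two integers is a bitwise operation: 15 AND 9 is 1111 & 1001. */
int basic_and(int a, int b) {
    return a & b;
}

/* x > 5 AND y < 10: each comparison yields 0 or 1 in C,
 * so the bitwise & of the two behaves like a logical AND. */
int in_window(int x, int y) {
    return (x > 5) & (y < 10);
}
```

Here `basic_and(15, 9)` is 9, and `in_window(6, 3)` is 1 while `in_window(2, 3)` is 0.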

String

Operator Description
+ Concatenation (when operands are strings)
& Concatenation (explicit)

Control Flow

IF / THEN / ELSE

Single-line:

IF x > 0 THEN PRINT "positive" ELSE PRINT "non-positive"

Multi-line:

IF x > 0 THEN
    PRINT "positive"
ELSEIF x = 0 THEN
    PRINT "zero"
ELSE
    PRINT "negative"
END IF

FOR / NEXT

FOR i = 1 TO 10
    PRINT i
NEXT i

FOR i = 10 TO 0 STEP -2
    PRINT i
NEXT i

WHILE / WEND

WHILE x > 0
    x = x - 1
WEND

DO / LOOP

DO
    x = x + 1
LOOP UNTIL x >= 10

DO WHILE x < 100
    x = x * 2
LOOP

SELECT CASE

SELECT CASE grade
    CASE 90 TO 100
        PRINT "A"
    CASE 80 TO 89
        PRINT "B"
    CASE 70 TO 79
        PRINT "C"
    CASE IS < 60
        PRINT "F"
    CASE ELSE
        PRINT "D"
END SELECT

CASE values support single values (CASE 1), comma-separated values (CASE 1, 2, 3), ranges (CASE 5 TO 10), comparisons (CASE IS > 100), and a default (CASE ELSE). Works with both numeric and string expressions.
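Since C's `switch` cannot express ranges or comparisons, one plausible lowering of the grade example is an if/else chain tested in source order. This is a sketch of that idea, not the code basic2c actually emits; `grade_letter` is a hypothetical name.

```c
/* The SELECT CASE grade example as an if/else chain.
 * Cases are tested top to bottom; the first match wins. */
const char *grade_letter(int grade) {
    if (grade >= 90 && grade <= 100) {        /* CASE 90 TO 100 */
        return "A";
    } else if (grade >= 80 && grade <= 89) {  /* CASE 80 TO 89 */
        return "B";
    } else if (grade >= 70 && grade <= 79) {  /* CASE 70 TO 79 */
        return "C";
    } else if (grade < 60) {                  /* CASE IS < 60 */
        return "F";
    } else {                                  /* CASE ELSE */
        return "D";
    }
}
```

Note that a grade of 65 matches none of the explicit cases and falls to CASE ELSE, giving "D".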

EXIT

EXIT FOR
EXIT WHILE
EXIT DO
EXIT SUB
EXIT FUNCTION

CONTINUE

CONTINUE FOR
CONTINUE WHILE
CONTINUE DO

Skips the rest of the current loop iteration and jumps to the next iteration.

GOTO

GOTO 100       ' Jump to line number
GOTO myLabel   ' Jump to named label

GOSUB / RETURN

GOSUB 200
GOSUB myRoutine
' ...
200 PRINT "in subroutine"
RETURN

myRoutine:
PRINT "named routine"
RETURN

GOSUB uses a compile-time dispatch mechanism — each GOSUB site gets a unique return-point ID, and RETURN uses a switch statement to jump back.
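The dispatch idea can be sketched in plain C with a small stack of return-point IDs. The label names, the stack size, and the function below are assumptions made for this illustration; the generated code will look different in detail.

```c
#define GOSUB_MAX 64   /* assumed stack depth for this sketch */

/* Equivalent of:
 *     GOSUB bump
 *     GOSUB bump
 *     END
 *   bump:
 *     counter = counter + 1
 *     RETURN
 * Returns the final value of counter. */
int run_gosub_demo(void) {
    int stack[GOSUB_MAX];   /* pending return-point IDs */
    int sp = 0;
    int counter = 0;

    stack[sp++] = 1;        /* first GOSUB site gets ID 1 */
    goto bump;
ret_1:
    stack[sp++] = 2;        /* second GOSUB site gets ID 2 */
    goto bump;
ret_2:
    return counter;         /* END */

bump:
    counter = counter + 1;
    /* RETURN: pop the most recent ID and jump back to that site. */
    switch (stack[--sp]) {
        case 1: goto ret_1;
        case 2: goto ret_2;
    }
    return -1;              /* not reached with valid IDs */
}
```

Each GOSUB pushes its own ID before jumping, so a single RETURN can serve any number of call sites.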

ON GOTO / ON GOSUB

ON choice GOTO label1, label2, label3
ON choice GOSUB routine1, routine2, routine3

Branches to the Nth label based on the expression value (1-based). If the value is out of range, execution continues at the next statement.
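In C this maps naturally onto a `switch` whose default case does nothing. The sketch below uses a hypothetical function that returns which branch ran, with 0 meaning execution continued at the next statement.

```c
/* ON choice GOTO label1, label2, label3 as a 1-based switch. */
int on_goto(int choice) {
    switch (choice) {
        case 1: goto label1;
        case 2: goto label2;
        case 3: goto label3;
        default: break;      /* out of range: fall through */
    }
    return 0;                /* "next statement" after ON GOTO */

label1:
    return 1;
label2:
    return 2;
label3:
    return 3;
}
```

So `on_goto(2)` takes the second label, while `on_goto(0)` and `on_goto(4)` both continue past the statement.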

Labels

Both classic line numbers and named labels are supported:

10 PRINT "line 10"
20 GOTO 10

myLabel:
PRINT "named label"
GOTO myLabel

Constants

CONST PI = 3.14159
CONST MAX_SIZE = 100
CONST GREETING$ = "Hello"

Constants are evaluated at compile time and substituted directly into expressions. They cannot be reassigned.

SWAP

SWAP a, b
SWAP s1$, s2$

Exchanges the values of two variables of the same type.

Procedures

SUB

SUB greet(name AS STRING)
    PRINT "Hello, "; name
END SUB

CALL greet("World")
greet "World"           ' CALL keyword is optional

FUNCTION

FUNCTION square(x AS DOUBLE) AS DOUBLE
    square = x * x
END FUNCTION

PRINT square(5)

Functions return values by assigning to the function name or using RETURN expr.

Parameter Passing

SUB increment(BYREF x AS INTEGER)
    x = x + 1
END SUB

SUB display(BYVAL x AS INTEGER)
    PRINT x
END SUB
  • BYREF (default) — passes a pointer; changes affect the caller's variable
  • BYVAL — passes a copy; changes are local to the procedure
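The difference is the familiar pointer-versus-copy distinction in C. The signatures below are a sketch of how the two SUBs could be lowered; `bump_copy` is a hypothetical BYVAL example that returns its local copy so the effect can be observed.

```c
#include <stdint.h>

/* SUB increment(BYREF x AS INTEGER): receives the address of the
 * caller's variable, so the update is visible to the caller. */
void increment(int16_t *x) {
    *x = (int16_t)(*x + 1);
}

/* BYVAL parameter: the procedure works on its own copy. */
int16_t bump_copy(int16_t x) {
    x = (int16_t)(x + 1);    /* changes only the local copy */
    return x;
}
```

After `int16_t n = 5; increment(&n);` the caller's `n` is 6, whereas `bump_copy(n)` returns 7 and leaves `n` at 6.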

LOCAL and STATIC

SUB counter()
    STATIC count AS INTEGER
    LOCAL temp AS INTEGER
    count = count + 1
    temp = count
    PRINT temp
END SUB
  • LOCAL — declares a variable scoped to the procedure
  • STATIC — declares a variable that persists across calls
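A C `static` local has the same persistence as a BASIC STATIC variable. The sketch below mirrors the `counter` SUB, returning the value instead of printing it so the behavior is easy to check; whether basic2c emits exactly this is an assumption.

```c
#include <stdint.h>

/* Mirrors SUB counter(): count survives between calls, temp does not. */
int16_t counter(void) {
    static int16_t count = 0;   /* STATIC count AS INTEGER */
    int16_t temp;               /* LOCAL temp AS INTEGER */

    count = (int16_t)(count + 1);
    temp = count;
    return temp;                /* the BASIC version PRINTs this value */
}
```

Three successive calls return 1, 2 and 3.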

User-Defined Types

TYPE / END TYPE

TYPE PersonRecord
    firstName AS STRING * 20
    lastName AS STRING * 30
    age AS INTEGER
    salary AS DOUBLE
END TYPE

DIM person AS PersonRecord
person.firstName = "John"
person.lastName = "Doe"
person.age = 30
person.salary = 55000.50

String fields in TYPE definitions require a fixed length (STRING * N). Dynamic strings (AS STRING without a length) are not permitted in TYPE definitions because struct copy would produce dangling pointers.
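To see why fixed-length fields make struct copies safe, consider a C layout where each `STRING * N` field is an inline `char` array. The struct layout, the `set_fixed` helper, and the space padding (borrowed from classic BASIC dialects) are all assumptions of this sketch.

```c
#include <stdint.h>
#include <string.h>

/* Assumed C layout for TYPE PersonRecord: strings live inside the struct. */
typedef struct {
    char firstName[20];   /* firstName AS STRING * 20 */
    char lastName[30];    /* lastName AS STRING * 30 */
    int16_t age;
    double salary;
} PersonRecord;

/* Hypothetical helper: store src into an n-byte fixed field,
 * truncating if too long and padding with spaces if too short. */
void set_fixed(char *field, size_t n, const char *src) {
    size_t len = strlen(src);
    if (len > n)
        len = n;
    memcpy(field, src, len);
    memset(field + len, ' ', n - len);
}
```

Because the characters are stored inline, `b = a` copies them along with the other fields, and later changes to `a` do not affect `b`. With a `char*` field, both structs would share one heap buffer.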

Supported field types: BYTE, INTEGER, LONG, FLOAT, DOUBLE, STRING * N, and other user-defined types (nesting).

Nested UDTs

TYPE Vec2
    x AS DOUBLE
    y AS DOUBLE
END TYPE

TYPE Circle
    center AS Vec2
    radius AS DOUBLE
END TYPE

DIM c AS Circle
c.center.x = 10.0
c.center.y = 20.0
c.radius = 5.0

Nesting depth is unlimited. Chained dot-access works for both reads and writes.

UDT Arrays

DIM points(10) AS Vec2
points(0).x = 1.5
points(0).y = 2.5

UDT Assignment

Whole-struct copy via assignment:

DIM a AS Vec2
DIM b AS Vec2
a.x = 1.0
a.y = 2.0
b = a               ' Copies all fields

Sub-struct copy also works:

DIM saved AS Vec2
saved = c.center     ' Copy nested struct out
c.center = saved     ' Copy nested struct in

Array element copy:

circles(0) = circles(2)

SIZEOF

DIM sz AS LONG
sz = SIZEOF(PersonRecord)

Returns the byte size of a user-defined type. Used primarily with random-access file I/O to specify record length.

Built-in Functions

String Functions

Function Description
LEN(s$) Length of string
MID$(s$, start, len) Substring (1-based start position)
LEFT$(s$, n) First n characters
RIGHT$(s$, n) Last n characters
CHR$(n) Character from ASCII code
ASC(s$) ASCII code of first character
STR$(n) Convert number to string
VAL(s$) Convert string to number
UCASE$(s$) Convert to uppercase
LCASE$(s$) Convert to lowercase
INSTR(haystack$, needle$) Find substring position (1-based, 0 if not found)
STRING$(n, char$) Repeat a character n times
LTRIM$(s$) Remove leading spaces
RTRIM$(s$) Remove trailing spaces
TRIM$(s$) Remove leading and trailing spaces
SPACE$(n) String of n spaces
HEX$(n) Hexadecimal string representation
OCT$(n) Octal string representation

MID$ Assignment

DIM s AS STRING
s = "Hello World"
MID$(s, 7, 5) = "BASIC"    ' s is now "Hello BASIC"

Replaces characters in a string starting at a 1-based position. The length parameter limits how many characters are replaced.
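A minimal C sketch of this in-place replacement follows. It assumes, as in classic BASIC dialects, that the string never grows and that the replacement is clipped to the space available; `mid_assign` is a hypothetical name.

```c
#include <string.h>

/* MID$(s, start, len) = repl, with a 1-based start position.
 * Overwrites at most len characters and never extends s. */
void mid_assign(char *s, size_t start, size_t len, const char *repl) {
    size_t slen = strlen(s);
    size_t rlen = strlen(repl);
    size_t room;

    if (start < 1 || start > slen)
        return;                      /* start lies outside the string */
    room = slen - (start - 1);       /* characters from start to the end */
    if (len > room)
        len = room;
    if (len > rlen)
        len = rlen;
    memcpy(s + (start - 1), repl, len);
}
```

Applied to "Hello World" with start 7, length 5 and replacement "BASIC", the buffer becomes "Hello BASIC".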

Math Functions

Function Description
ABS(n) Absolute value
INT(n) Truncate to integer
SQR(n) Square root
SIN(n) Sine (radians)
COS(n) Cosine (radians)
TAN(n) Tangent (radians)
ATN(n) Arctangent (returns radians)
LOG(n) Natural logarithm
EXP(n) e raised to the power n
SGN(n) Sign: -1, 0, or 1
RND Random number between 0 and 1

Numeric expressions also support ^ for exponentiation (emitted as pow()).

RND can be called with or without parentheses, and accepts an optional argument (which is ignored) for compatibility with other BASIC dialects. Use RANDOMIZE to seed the random number generator:

RANDOMIZE         ' Seed from system clock
RANDOMIZE 12345   ' Seed with specific value
x = RND           ' Random double 0..1
x = RND(1)        ' Same as RND (argument ignored)

Print Formatting Functions

Function Description
TAB(n) Output spaces to reach column n
SPC(n) Output exactly n spaces

These functions are used within PRINT statements:

PRINT "Name"; TAB(20); "Value"
PRINT "A"; SPC(5); "B"         ' Outputs "A     B"

Array Functions

Function Description
LBOUND(arr) Lower bound of array (always 0)
UBOUND(arr) Upper bound of array

I/O Functions

Function Description
EOF(n) Returns true (-1) if at end of file n
LOF(n) Returns byte length of file n
FREEFILE() Returns the next available file number

Console I/O

PRINT

PRINT "Hello, World!"
PRINT "x = "; x
PRINT x; " "; y           ' Semicolon separates items without advancing to a tab stop
PRINT x, y                 ' Comma advances to next tab stop
PRINT "no newline";        ' Trailing semicolon suppresses final newline
? "shortcut"               ' ? is a shortcut for PRINT

The ? character can be used as a shortcut for PRINT, for compatibility with classic BASIC dialects and interactive use.

PRINT USING

PRINT USING "###.##"; 123.456        ' Outputs: 123.46
PRINT USING "$$#,###.##"; 1234.56    ' Outputs: $1,234.56
PRINT USING "+###.##"; -45.6         ' Outputs: -45.60
PRINT USING "**###.##"; 9.99         ' Outputs: ****9.99
PRINT USING "!"; "Hello"             ' Outputs: H
PRINT USING "&"; "World"             ' Outputs: World
PRINT USING "\    \"; "Testing"      ' Outputs: Testin (6 chars)

Format specifiers for numbers:

Format Description
# Digit placeholder
. Decimal point position
, Thousands separator (in format, not output)
+ Show sign (+ or -) at start
- Trailing minus for negative numbers
$$ Floating dollar sign
** Fill leading spaces with asterisks

Format specifiers for strings:

Format Description
! First character only
& Entire string
\ \ Fixed width (spaces between backslashes + 2)

Multiple values can be formatted with one format string:

PRINT USING "### + ### = ###"; 10; 20; 30
' Outputs:  10 +  20 =  30

INPUT

INPUT "Enter name: "; name$
INPUT x

LINE INPUT

LINE INPUT "Enter text: "; line$

Reads an entire line including commas and spaces.

File I/O

Sequential Files

' Write
OPEN "data.txt" FOR OUTPUT AS #1
PRINT #1, "Hello"
PRINT #1, 42
CLOSE #1

' Read
OPEN "data.txt" FOR INPUT AS #1
LINE INPUT #1, text$
INPUT #1, value
CLOSE #1

' Append
OPEN "log.txt" FOR APPEND AS #1
PRINT #1, "new entry"
CLOSE #1

WRITE

WRITE #1, name$, age, salary

Outputs CSV-style: strings are quoted, values are comma-separated, terminated with a newline.
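As a rough picture of the record layout, the sketch below formats one WRITE line into a buffer. The `%d` and `%g` number formatting is an assumption of this sketch, and `format_write` is a hypothetical name; the runtime may format numbers differently.

```c
#include <stdio.h>

/* One WRITE record: the string is quoted, fields are separated by
 * commas, and the line ends with a newline.
 * Returns the number of characters snprintf would have written. */
int format_write(char *out, size_t outsz,
                 const char *name, int age, double salary) {
    return snprintf(out, outsz, "\"%s\",%d,%g\n", name, age, salary);
}
```

For the name "Ann", age 30 and salary 55000.5, this produces the line `"Ann",30,55000.5`.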

Binary Files

OPEN "file.dat" FOR BINARY AS #1

Random-Access Files

TYPE Record
    name AS STRING * 20
    value AS DOUBLE
END TYPE

DIM rec AS Record
rec.name = "test"
rec.value = 3.14

OPEN "data.dat" FOR RANDOM AS #1 LEN = SIZEOF(Record)
PUT #1, 1, rec          ' Write record at position 1 (1-based)
GET #1, 1, rec          ' Read record at position 1
CLOSE #1

Random-access uses GET and PUT with 1-based record numbers. The LEN clause specifies record size in bytes. Records can be read and written in any order.
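The 1-based record number translates to a byte offset of (record - 1) * record length. The helpers below sketch that with standard C file calls; their names are hypothetical and the runtime's own GET/PUT code may be organized differently.

```c
#include <stdio.h>

/* Byte offset of a 1-based record number. */
long record_offset(long recno, long reclen) {
    return (recno - 1) * reclen;
}

/* PUT #n, recno, rec: seek to the record and write it. 0 on success. */
int put_record(FILE *f, long recno, const void *rec, size_t reclen) {
    if (fseek(f, record_offset(recno, (long)reclen), SEEK_SET) != 0)
        return -1;
    return fwrite(rec, reclen, 1, f) == 1 ? 0 : -1;
}

/* GET #n, recno, rec: seek to the record and read it. 0 on success. */
int get_record(FILE *f, long recno, void *rec, size_t reclen) {
    if (fseek(f, record_offset(recno, (long)reclen), SEEK_SET) != 0)
        return -1;
    return fread(rec, reclen, 1, f) == 1 ? 0 : -1;
}
```

With a record length of 28 bytes, record 1 starts at offset 0 and record 3 at offset 56. Because each call seeks first, records can be written and read in any order.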

File Modes

Mode     C Mode   Description
INPUT    "r"      Read sequential text
OUTPUT   "w"      Write sequential text (truncates)
APPEND   "a"      Append sequential text
BINARY   "rb"     Binary read
RANDOM   "r+b"    Random access (creates if not found)

DATA / READ / RESTORE

DATA 10, 20, 30, "hello"

DIM x AS INTEGER
DIM s AS STRING
READ x          ' x = 10
READ x          ' x = 20
READ x          ' x = 30
READ s          ' s = "hello"

RESTORE         ' Reset read pointer to beginning
READ x          ' x = 10 again

DATA statements define a pool of literal values. READ consumes them in order. RESTORE resets the read pointer (optionally to a specific line number).
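The pool-and-pointer model can be sketched in a few lines of C. This simplified version handles numeric items only (the string item from the example is left out), and the names `data_pool`, `data_read` and `data_restore` are hypothetical.

```c
#include <stddef.h>

/* Hypothetical pool for: DATA 10, 20, 30 */
static const double data_pool[] = { 10, 20, 30 };
static size_t data_pos = 0;          /* the read pointer */

/* READ x: copy the next item and advance the pointer.
 * Returns 1 on success, 0 when the pool is exhausted. */
int data_read(double *out) {
    if (data_pos >= sizeof data_pool / sizeof data_pool[0])
        return 0;
    *out = data_pool[data_pos++];
    return 1;
}

/* RESTORE: move the read pointer back to the first item. */
void data_restore(void) {
    data_pos = 0;
}
```

Three reads yield 10, 20 and 30; after `data_restore()` the next read yields 10 again.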

Comments

' This is a comment
REM This is also a comment
x = 5  ' Inline comment

$INCLUDE Metacommand

'$INCLUDE: 'helpers.bas'

The $INCLUDE metacommand inserts the contents of another file at the point of the directive, before lexing and parsing. The directive is placed inside a comment (the leading ' makes it invisible to editors that don't understand it).

Syntax

The filename is enclosed in single quotes after '$INCLUDE:. The keyword is case-insensitive. Any amount of whitespace may appear between the colon and the opening quote.

Nested Includes

Included files may themselves contain $INCLUDE directives:

' main.bas
'$INCLUDE: 'math_lib.bas'
'$INCLUDE: 'string_lib.bas'

' math_lib.bas (can include further files)
'$INCLUDE: 'constants.bas'
FUNCTION Square(x AS DOUBLE) AS DOUBLE
    Square = x * x
END FUNCTION

Path Resolution

Filenames are resolved relative to the including file's directory, not the working directory. If src/main.bas includes 'lib/util.bas', the transpiler looks for src/lib/util.bas.
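The resolution rule amounts to prefixing the included name with the directory part of the including file. The sketch below handles only `/` separators, and `resolve_include` is a hypothetical name rather than a function from basic2c.c.

```c
#include <stdio.h>
#include <string.h>

/* Build the path of an included file relative to the directory of the
 * file that includes it. Returns 0 on success, -1 if out is too small. */
int resolve_include(const char *including, const char *included,
                    char *out, size_t outsz) {
    const char *slash = strrchr(including, '/');
    /* Length of the directory prefix, keeping the trailing '/'. */
    size_t dirlen = slash ? (size_t)(slash - including) + 1 : 0;
    int n = snprintf(out, outsz, "%.*s%s", (int)dirlen, including, included);
    return (n >= 0 && (size_t)n < outsz) ? 0 : -1;
}
```

Given `src/main.bas` and `lib/util.bas`, the result is `src/lib/util.bas`. A file with no directory part, such as `main.bas`, leaves the included name unchanged.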

Error Reporting

When $INCLUDE is used, error messages show the originating file and line:

Error (math_lib.bas:12): undeclared variable 'q'

Without includes, the format is the same but shows the input filename:

Error (main.bas:5): type mismatch

Circular Include Detection

If file A includes file B which includes file A, the transpiler reports a fatal error rather than looping infinitely:

Error: Circular include detected: main.bas
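One common way to detect such cycles is to keep a stack of the files currently being expanded and refuse to push a file that is already on it. The sketch below takes that approach; the struct, function names and return codes are assumptions, and the depth of 16 is taken from the include nesting limit listed under Limits.

```c
#include <string.h>

#define MAX_DEPTH 16   /* include nesting limit from the Limits section */

/* Files currently being expanded, outermost first. */
typedef struct {
    const char *files[MAX_DEPTH];
    int depth;
} IncludeStack;

/* Enter a file. Returns 0 on success, -1 if the file is already being
 * expanded (circular include), -2 if the nesting limit is reached. */
int include_push(IncludeStack *st, const char *path) {
    for (int i = 0; i < st->depth; i++) {
        if (strcmp(st->files[i], path) == 0)
            return -1;
    }
    if (st->depth == MAX_DEPTH)
        return -2;
    st->files[st->depth++] = path;
    return 0;
}

/* Leave the most recently entered file. */
void include_pop(IncludeStack *st) {
    if (st->depth > 0)
        st->depth--;
}
```

Popping a file when its expansion finishes means the same library can still be included from two different places without triggering the error.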

Extensible Functions

The transpiler supports two mechanisms for defining additional functions:

Built-in Functions (builtins.def)

The builtins.def file is compiled into basic2c and provides functions that are always available. To add permanent built-in functions, edit builtins.def and recompile basic2c.

Default built-ins include:

Math functions:

Function Description
SQR(n) Square root
SIN(n) Sine (radians)
COS(n) Cosine (radians)
TAN(n) Tangent (radians)
ATN(n) Arctangent (returns radians)
LOG(n) Natural logarithm
EXP(n) e raised to power n
SGN(n) Sign: -1, 0, or 1
RND() Random number 0 to 1
CEIL(n) Round up to integer
FLOOR(n) Round down to integer
ROUND(n) Round to nearest integer
FIX(n) Truncate toward zero
FRAC(n) Fractional part
HYPOT(x, y) Hypotenuse (sqrt(x² + y²))
MAX(a, b) Maximum of two values
MIN(a, b) Minimum of two values

String functions:

Function Description
CHR$(n) Character from ASCII code
STR$(n) Convert number to string
UCASE$(s) Convert to uppercase
LCASE$(s) Convert to lowercase
LTRIM$(s) Remove leading spaces
RTRIM$(s) Remove trailing spaces
TRIM$(s) Remove leading and trailing spaces
SPACE$(n) String of n spaces
HEX$(n) Hexadecimal representation
OCT$(n) Octal representation
TAB(n) Spaces to reach column n
SPC(n) Output n spaces
ENVIRON$(name) Get environment variable

System:

Function Description
TIMER() Seconds since program start

External Functions (functions.def)

The functions.def file is not compiled in. It is loaded each time basic2c runs, from two locations (both if present):

  1. The directory containing the basic2c binary (global extensions)
  2. The directory containing the input .bas file (project-specific)

Functions from the input file's directory are loaded second, allowing project-specific definitions to supplement or override earlier ones.

Definition Format

Both builtins.def and functions.def use the same format:

# Comment lines start with #
# Format: name : type : c_template

SQUARE : double : ((%) * (%))
CUBE   : double : ((%) * (%) * (%))

Each line defines:

  • name — The BASIC function name (case-insensitive)
  • type — Return type: byte, integer, long, float, double, or string
  • c_template — C code with argument placeholders

Argument Placeholders

  • % or %1 — First argument
  • %2 — Second argument
  • %3 — Third argument (and so on)

Arguments are substituted directly, so use parentheses in templates to ensure correct precedence: ((%) * (%2)) not % * %2.
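C preprocessor macros substitute text in much the same way, so they make the hazard easy to demonstrate. The macro and function names below are made up for this example.

```c
/* Unparenthesized, like the template  % * %2  */
#define PRODUCT_BAD(a, b) a * b

/* Parenthesized, like the template  ((%) * (%2))  */
#define PRODUCT_OK(a, b) ((a) * (b))

/* Expands to 1 + 2 * 3, which C evaluates as 1 + (2 * 3). */
int product_bad(void) {
    return PRODUCT_BAD(1 + 2, 3);
}

/* Expands to ((1 + 2) * (3)). */
int product_ok(void) {
    return PRODUCT_OK(1 + 2, 3);
}
```

The unparenthesized form returns 7 and the parenthesized form returns 9 for the same arguments.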

Usage

PRINT CEIL(3.7)          ' Outputs: 4
PRINT MAX(5, 10)         ' Outputs: 10
t = TIMER()              ' Get elapsed time
PRINT ENVIRON$("HOME")   ' Print home directory

Extensible functions require parentheses, even with no arguments: TIMER() not TIMER.

Runtime Modes

The transpiler supports two runtime modes selected at transpile time:

Debug Mode (default)

The debug runtime includes error checking and diagnostics:

  • NULL guards on string function arguments
  • malloc/calloc failure checks with error messages
  • File number bounds checking
  • fopen failure reporting with filename
  • GOSUB stack overflow/underflow detection
  • All errors print to stderr and call exit(1)

Release Mode (--release or -r)

The release runtime strips all diagnostic checks for minimal generated code:

  • No NULL guards on string functions
  • No malloc failure checks
  • No file number bounds checking
  • No GOSUB stack overflow/underflow checks
  • ~8% fewer lines of generated C code

Functional guards are preserved in release mode to prevent crashes:

  • EOF() returns true (-1) for NULL file handles (enables file existence checks)
  • LOF() returns 0 for NULL file handles
  • CLOSE is a no-op for NULL file handles
  • LINE INPUT is a no-op for NULL file handles
  • Temp string pool management (_bfree_temps, _btmp)
  • String variable management (_bstr_assign)
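The first two guards can be sketched as follows. These are hypothetical stand-ins for the runtime's EOF() and LOF() helpers, written only to show the NULL checks; the peek-based end-of-file test is an assumption of this sketch.

```c
#include <stdio.h>

/* EOF(n): an unopened file (NULL handle) reports end of file.
 * BASIC true is -1, false is 0. */
int basic_eof(FILE *f) {
    int c;
    if (f == NULL)
        return -1;
    c = fgetc(f);            /* peek at the next character */
    if (c == EOF)
        return -1;
    ungetc(c, f);            /* put it back for the next read */
    return 0;
}

/* LOF(n): length in bytes, or 0 for a NULL handle. */
long basic_lof(FILE *f) {
    long here, size;
    if (f == NULL)
        return 0;
    here = ftell(f);
    if (here < 0 || fseek(f, 0, SEEK_END) != 0)
        return 0;
    size = ftell(f);
    fseek(f, here, SEEK_SET);   /* restore the original position */
    return size < 0 ? 0 : size;
}
```

With the NULL guard in place, a program can open a file that may not exist and test `EOF()` straight away without crashing.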

Limits

Resource Maximum
Token length 4096
Identifier length 128
Parameters per procedure 32
Symbol table entries 2048
GOSUB return sites 512
Line number labels 4096
AST nodes 65536
Arguments per call 64
User-defined types 64
Fields per type 32
Constants 256
Include nesting depth 16
Included files 64
Total source lines 65536

Example

TYPE Item
    name AS STRING * 20
    price AS DOUBLE
END TYPE

DIM items(2) AS Item
items(0).name = "Widget"
items(0).price = 9.99
items(1).name = "Gadget"
items(1).price = 24.95
items(2).name = "Doohickey"
items(2).price = 4.50

DIM i AS INTEGER
DIM total AS DOUBLE
total = 0
FOR i = 0 TO 2
    PRINT items(i).name; " $"; items(i).price
    total = total + items(i).price
NEXT i
PRINT "Total: $"; total

Transpile and run:

./basic2c example.bas example.c
cc -Wall -o example example.c -lm
./example