basic2c
A BASIC-to-C transpiler. Translates BASIC source code into equivalent C source code with an embedded runtime library.
Build
cc -Wall -o basic2c basic2c.c -lm
Usage
basic2c [--release|-r] input.bas [output.c]
- If `output.c` is omitted, C code is written to stdout.
- `--release` (or `-r`) selects the release runtime (see Runtime Modes).
Compile the generated C:
cc -Wall -o program output.c -lm
Architecture
The transpiler is a single-file C program with three phases:
- Lexer — tokenizes BASIC source (case-insensitive keywords)
- Parser — recursive descent, builds an AST
- Codegen — walks the AST, emits C source with a small runtime library
Data Types
| BASIC Type | C Type | Suffix | Notes |
|---|---|---|---|
| `BYTE` | `uint8_t` | | Unsigned 8-bit |
| `INTEGER` | `int16_t` | `%` | Signed 16-bit |
| `LONG` | `int32_t` | | Signed 32-bit |
| `FLOAT` | `float` | `!` | Single precision |
| `DOUBLE` | `double` | `#` | Double precision (default numeric) |
| `STRING` | `char*` | `$` | Dynamic, heap-allocated |
Type suffixes on variable names are recognized: name$ is STRING, count% is
INTEGER, total# is DOUBLE, rate! is FLOAT. Variables without a suffix or
explicit type declaration default to DOUBLE.
Numeric types follow a promotion hierarchy: BYTE < INTEGER < LONG < FLOAT < DOUBLE. Mixed-type expressions promote to the higher-ranked type.
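As a rough illustration of the promotion rule, the sketch below ranks the numeric types in an enum and picks the higher rank for a binary expression. The names `NumType`, `promote`, and `c_type_name` are hypothetical and are not taken from basic2c.c.

```c
/* Hypothetical rank enum: the order mirrors
 * BYTE < INTEGER < LONG < FLOAT < DOUBLE. */
typedef enum {
    TYPE_BYTE,
    TYPE_INTEGER,
    TYPE_LONG,
    TYPE_FLOAT,
    TYPE_DOUBLE
} NumType;

/* Result type of a mixed-type binary expression: the higher-ranked operand. */
NumType promote(NumType lhs, NumType rhs)
{
    return lhs > rhs ? lhs : rhs;
}

/* C type name for each rank, following the Data Types table. */
const char *c_type_name(NumType t)
{
    static const char *const names[] = {
        "uint8_t", "int16_t", "int32_t", "float", "double"
    };
    return names[t];
}
```

Under this sketch, an INTEGER plus a FLOAT would be evaluated as `float`, and anything combined with a DOUBLE as `double`.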
Variables and Arrays
Declaration
DIM x AS DOUBLE
DIM name AS STRING
DIM count AS INTEGER
Variables can also be used without declaration — they are implicitly declared based on their type suffix or as DOUBLE by default.
Arrays
DIM arr(10) AS INTEGER ' 1D array, indices 0..10
DIM matrix(3, 4) AS DOUBLE ' 2D array, indices 0..3 x 0..4
DIM cube(2, 3, 4) AS INTEGER ' 3D array
Arrays are zero-based. The dimension value is the upper bound (inclusive), so
DIM arr(10) allocates 11 elements (0 through 10).
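The inclusive upper bound means every dimension contributes `bound + 1` elements. The helper names below are hypothetical, and the row-major layout is an assumption of this sketch rather than a documented property of the generated code.

```c
#include <stddef.h>

/* Element count for DIM with inclusive upper bounds,
 * e.g. bounds {3, 4} give 4 * 5 = 20 elements. */
size_t dim_element_count(const int *upper_bounds, int ndims)
{
    size_t count = 1;
    for (int d = 0; d < ndims; d++)
        count *= (size_t)upper_bounds[d] + 1;
    return count;
}

/* Offset of an index tuple in a flat buffer (row-major order assumed). */
size_t dim_flat_index(const int *upper_bounds, const int *idx, int ndims)
{
    size_t offset = 0;
    for (int d = 0; d < ndims; d++)
        offset = offset * ((size_t)upper_bounds[d] + 1) + (size_t)idx[d];
    return offset;
}
```

For the three declarations above this gives 11, 20, and 60 elements respectively.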
REDIM
REDIM arr(20) AS INTEGER ' Resize array (contents reset to zero)
REDIM matrix(5, 5) AS DOUBLE ' Resize multidimensional array
REDIM frees the previous allocation and creates a new zero-initialized array.
Operators
Arithmetic
| Operator | Description |
|---|---|
| `+` | Addition |
| `-` | Subtraction / unary negation |
| `*` | Multiplication |
| `/` | Division |
| `\` | Integer division |
| `MOD` | Modulo |
| `^` | Exponentiation |
Comparison
| Operator | Description |
|---|---|
| `=` | Equal |
| `<>` | Not equal |
| `<` | Less than |
| `>` | Greater than |
| `<=` | Less than or equal |
| `>=` | Greater than or equal |
Bitwise / Logical
| Operator | Description |
|---|---|
| `AND` | Bitwise AND |
| `OR` | Bitwise OR |
| `NOT` | Bitwise NOT |
| `XOR` | Bitwise XOR |
These operators work as both bitwise and logical operators. When used with
comparisons (which return 0 or 1), they behave logically: x > 5 AND y < 10.
When used with integers, they operate on individual bits: 15 AND 9 gives 9.
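The dual behaviour falls out of plain bitwise arithmetic. The sketch below assumes `AND` maps onto C's `&`; the function names are hypothetical. It also shows a consequence worth keeping in mind: two non-zero integers with no bits in common combine to 0.

```c
/* BASIC: a AND b, assumed here to map to C's bitwise & */
int basic_and(int a, int b)
{
    return a & b;
}

/* BASIC: x > 5 AND y < 10
 * Each comparison yields 0 or 1, so & acts like a logical AND. */
int in_range(int x, int y)
{
    return (x > 5) & (y < 10);
}
```

With these definitions, `basic_and(15, 9)` is 9 (1111 & 1001), `in_range(6, 3)` is 1, and `basic_and(5, 2)` is 0 because 101 and 010 share no bits.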
String
| Operator | Description |
|---|---|
| `+` | Concatenation (when operands are strings) |
| `&` | Concatenation (explicit) |
Control Flow
IF / THEN / ELSE
Single-line:
IF x > 0 THEN PRINT "positive" ELSE PRINT "non-positive"
Multi-line:
IF x > 0 THEN
PRINT "positive"
ELSEIF x = 0 THEN
PRINT "zero"
ELSE
PRINT "negative"
END IF
FOR / NEXT
FOR i = 1 TO 10
PRINT i
NEXT i
FOR i = 10 TO 0 STEP -2
PRINT i
NEXT i
WHILE / WEND
WHILE x > 0
x = x - 1
WEND
DO / LOOP
DO
x = x + 1
LOOP UNTIL x >= 10
DO WHILE x < 100
x = x * 2
LOOP
SELECT CASE
SELECT CASE grade
CASE 90 TO 100
PRINT "A"
CASE 80 TO 89
PRINT "B"
CASE 70 TO 79
PRINT "C"
CASE IS < 60
PRINT "F"
CASE ELSE
PRINT "D"
END SELECT
CASE values support single values (CASE 1), comma-separated values
(CASE 1, 2, 3), ranges (CASE 5 TO 10), comparisons (CASE IS > 100),
and a default (CASE ELSE). Works with both numeric and string expressions.
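One plausible C shape for the grade example is an if/else-if chain, since C's `switch` cannot express ranges or comparisons portably. This is a hand-written sketch with a hypothetical function name, not the transpiler's actual output.

```c
/* Hand translation of the SELECT CASE grade example. */
char grade_letter(int grade)
{
    if (grade >= 90 && grade <= 100)      /* CASE 90 TO 100 */
        return 'A';
    else if (grade >= 80 && grade <= 89)  /* CASE 80 TO 89 */
        return 'B';
    else if (grade >= 70 && grade <= 79)  /* CASE 70 TO 79 */
        return 'C';
    else if (grade < 60)                  /* CASE IS < 60 */
        return 'F';
    else                                  /* CASE ELSE */
        return 'D';
}
```

The first matching branch wins, so values from 60 to 69 (and anything above 100) reach the `CASE ELSE` arm.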
EXIT
EXIT FOR
EXIT WHILE
EXIT DO
EXIT SUB
EXIT FUNCTION
CONTINUE
CONTINUE FOR
CONTINUE WHILE
CONTINUE DO
Skips the rest of the current loop iteration and jumps to the next iteration.
GOTO
GOTO 100 ' Jump to line number
GOTO myLabel ' Jump to named label
GOSUB / RETURN
GOSUB 200
GOSUB myRoutine
' ...
200 PRINT "in subroutine"
RETURN
myRoutine:
PRINT "named routine"
RETURN
GOSUB uses a compile-time dispatch mechanism — each GOSUB site gets a unique return-point ID, and RETURN uses a switch statement to jump back.
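The dispatch idea can be sketched in C with `goto` and a small stack of return-point IDs. The stack size, label names, and function name below are illustrative assumptions; the code basic2c actually emits may differ.

```c
/* Two GOSUB sites calling the same subroutine; returns how many
 * times the subroutine body ran. */
int gosub_demo(void)
{
    int stack[16];   /* return-point IDs (size chosen for the sketch) */
    int sp = 0;
    int calls = 0;

    /* GOSUB 200, call site 1 */
    stack[sp++] = 1;
    goto sub_200;
ret_1:
    /* GOSUB 200, call site 2 */
    stack[sp++] = 2;
    goto sub_200;
ret_2:
    return calls;    /* END */

sub_200:
    calls++;         /* body of the subroutine */
    /* RETURN: pop the ID and jump back to the matching call site */
    switch (stack[--sp]) {
        case 1: goto ret_1;
        case 2: goto ret_2;
    }
    return -1;       /* unknown ID: not reachable in this sketch */
}
```

Because every return target is known at compile time, no computed `goto` or function pointer is needed; a plain `switch` covers all sites.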
ON GOTO / ON GOSUB
ON choice GOTO label1, label2, label3
ON choice GOSUB routine1, routine2, routine3
Branches to the Nth label based on the expression value (1-based). If the value is out of range, execution continues at the next statement.
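In C terms the same behaviour is a `switch` whose default case does nothing. The sketch below uses hypothetical labels and returns a number identifying where control ended up.

```c
/* ON choice GOTO label1, label2, label3
 * Returns 1..3 for the label reached, or 0 when control falls
 * through to the next statement. */
int on_goto_demo(int choice)
{
    switch (choice) {
        case 1: goto label1;
        case 2: goto label2;
        case 3: goto label3;
        default: break;   /* out of range: continue below */
    }
    return 0;             /* the "next statement" */

label1:
    return 1;
label2:
    return 2;
label3:
    return 3;
}
```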
Labels
Both classic line numbers and named labels are supported:
10 PRINT "line 10"
20 GOTO 10
myLabel:
PRINT "named label"
GOTO myLabel
Constants
CONST PI = 3.14159
CONST MAX_SIZE = 100
CONST GREETING$ = "Hello"
Constants are evaluated at compile time and substituted directly into expressions. They cannot be reassigned.
SWAP
SWAP a, b
SWAP s1$, s2$
Exchanges the values of two variables of the same type.
Procedures
SUB
SUB greet(name AS STRING)
PRINT "Hello, "; name
END SUB
CALL greet("World")
greet "World" ' CALL keyword is optional
FUNCTION
FUNCTION square(x AS DOUBLE) AS DOUBLE
square = x * x
END FUNCTION
PRINT square(5)
Functions return values by assigning to the function name or using RETURN expr.
Parameter Passing
SUB increment(BYREF x AS INTEGER)
x = x + 1
END SUB
SUB display(BYVAL x AS INTEGER)
PRINT x
END SUB
- `BYREF` (default): passes a pointer; changes affect the caller's variable
- `BYVAL`: passes a copy; changes are local to the procedure
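The difference between the two modes corresponds to pointer versus value parameters in C. The signatures below are a hand-written guess at that mapping, and `bump_copy` is a hypothetical procedure added only to show that a `BYVAL` argument is not modified.

```c
#include <stdint.h>

/* SUB increment(BYREF x AS INTEGER): receives the caller's address */
void increment(int16_t *x)
{
    *x = (int16_t)(*x + 1);
}

/* SUB bump_copy(BYVAL x AS INTEGER): works on a private copy */
int16_t bump_copy(int16_t x)
{
    x = (int16_t)(x + 1);
    return x;
}
```

Calling `increment(&n)` changes `n` in the caller, while `bump_copy(n)` leaves it untouched.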
LOCAL and STATIC
SUB counter()
STATIC count AS INTEGER
LOCAL temp AS INTEGER
count = count + 1
temp = count
PRINT temp
END SUB
- `LOCAL`: declares a variable scoped to the procedure
- `STATIC`: declares a variable that persists across calls
User-Defined Types
TYPE / END TYPE
TYPE PersonRecord
firstName AS STRING * 20
lastName AS STRING * 30
age AS INTEGER
salary AS DOUBLE
END TYPE
DIM person AS PersonRecord
person.firstName = "John"
person.lastName = "Doe"
person.age = 30
person.salary = 55000.50
String fields in TYPE definitions require a fixed length (STRING * N). Dynamic
strings (AS STRING without a length) are not permitted in TYPE definitions
because struct copy would produce dangling pointers.
Supported field types: BYTE, INTEGER, LONG, FLOAT, DOUBLE,
STRING * N, and other user-defined types (nesting).
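The fixed-length requirement makes sense once the type is viewed as a C struct: character arrays live inside the struct, so plain assignment copies the text itself instead of a pointer to it. The layout below is a sketch; in particular, reserving one extra byte per field for a terminating NUL is an assumption, and `set_fixed` is a hypothetical helper.

```c
#include <stdint.h>
#include <string.h>

/* Possible C layout for TYPE PersonRecord (NUL byte per field assumed). */
typedef struct {
    char    firstName[21];   /* STRING * 20 */
    char    lastName[31];    /* STRING * 30 */
    int16_t age;             /* INTEGER */
    double  salary;          /* DOUBLE */
} PersonRecord;

/* Store a string in a fixed-length field, truncating if it is too long. */
void set_fixed(char *field, size_t field_size, const char *value)
{
    strncpy(field, value, field_size - 1);
    field[field_size - 1] = '\0';
}
```

After `PersonRecord b = a;`, later changes to `a.firstName` do not affect `b`, which would not hold if the field were a heap pointer shared by both copies.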
Nested UDTs
TYPE Vec2
x AS DOUBLE
y AS DOUBLE
END TYPE
TYPE Circle
center AS Vec2
radius AS DOUBLE
END TYPE
DIM c AS Circle
c.center.x = 10.0
c.center.y = 20.0
c.radius = 5.0
Nesting depth is unlimited. Chained dot-access works for both reads and writes.
UDT Arrays
DIM points(10) AS Vec2
points(0).x = 1.5
points(0).y = 2.5
UDT Assignment
Whole-struct copy via assignment:
DIM a AS Vec2
DIM b AS Vec2
a.x = 1.0
a.y = 2.0
b = a ' Copies all fields
Sub-struct copy also works:
DIM saved AS Vec2
saved = c.center ' Copy nested struct out
c.center = saved ' Copy nested struct in
Array element copy:
circles(0) = circles(2)
SIZEOF
DIM sz AS LONG
sz = SIZEOF(PersonRecord)
Returns the byte size of a user-defined type. Used primarily with random-access file I/O to specify record length.
Built-in Functions
String Functions
| Function | Description |
|---|---|
| `LEN(s$)` | Length of string |
| `MID$(s$, start, len)` | Substring (1-based start position) |
| `LEFT$(s$, n)` | First n characters |
| `RIGHT$(s$, n)` | Last n characters |
| `CHR$(n)` | Character from ASCII code |
| `ASC(s$)` | ASCII code of first character |
| `STR$(n)` | Convert number to string |
| `VAL(s$)` | Convert string to number |
| `UCASE$(s$)` | Convert to uppercase |
| `LCASE$(s$)` | Convert to lowercase |
| `INSTR(haystack$, needle$)` | Find substring position (1-based, 0 if not found) |
| `STRING$(n, char$)` | Repeat a character n times |
| `LTRIM$(s$)` | Remove leading spaces |
| `RTRIM$(s$)` | Remove trailing spaces |
| `TRIM$(s$)` | Remove leading and trailing spaces |
| `SPACE$(n)` | String of n spaces |
| `HEX$(n)` | Hexadecimal string representation |
| `OCT$(n)` | Octal string representation |
MID$ Assignment
DIM s AS STRING
s = "Hello World"
MID$(s, 7, 5) = "BASIC" ' s is now "Hello BASIC"
Replaces characters in a string starting at a 1-based position. The length parameter limits how many characters are replaced.
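A minimal C sketch of the replacement logic follows. It assumes the classic BASIC rule that a `MID$` assignment never changes the length of the target string, which the text above does not state explicitly; `mid_assign` is a hypothetical name.

```c
#include <string.h>

/* MID$(s, start, len) = repl, applied in place.
 * Copies at most len characters of repl, and never writes past the
 * current end of s (assumed classic semantics). */
void mid_assign(char *s, int start, int len, const char *repl)
{
    size_t slen = strlen(s);
    if (start < 1 || (size_t)start > slen || len <= 0)
        return;                                 /* nothing to replace */

    size_t room = slen - (size_t)(start - 1);   /* chars from start to end */
    size_t n = strlen(repl);
    if (n > (size_t)len) n = (size_t)len;       /* honour the length limit */
    if (n > room) n = room;                     /* keep s the same length */
    memcpy(s + (start - 1), repl, n);
}
```

Applied to "Hello World" with start 7 and length 5, the replacement "BASIC" overwrites "World" exactly.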
Math Functions
| Function | Description |
|---|---|
| `ABS(n)` | Absolute value |
| `INT(n)` | Truncate to integer |
| `SQR(n)` | Square root |
| `SIN(n)` | Sine (radians) |
| `COS(n)` | Cosine (radians) |
| `TAN(n)` | Tangent (radians) |
| `ATN(n)` | Arctangent (returns radians) |
| `LOG(n)` | Natural logarithm |
| `EXP(n)` | e raised to the power n |
| `SGN(n)` | Sign: -1, 0, or 1 |
| `RND` | Random number between 0 and 1 |
Numeric expressions also support ^ for exponentiation (emitted as pow()).
Print Formatting Functions
| Function | Description |
|---|---|
| `TAB(n)` | Output spaces to reach column n |
| `SPC(n)` | Output exactly n spaces |
These functions are used within PRINT statements:
PRINT "Name"; TAB(20); "Value"
PRINT "A"; SPC(5); "B" ' Outputs "A     B"
RND can be called with or without parentheses, and accepts an optional argument
(which is ignored) for compatibility with other BASIC dialects. Use RANDOMIZE
to seed the random number generator:
RANDOMIZE ' Seed from system clock
RANDOMIZE 12345 ' Seed with specific value
x = RND ' Random double 0..1
x = RND(1) ' Same as RND (argument ignored)
Array Functions
| Function | Description |
|---|---|
| `LBOUND(arr)` | Lower bound of array (always 0) |
| `UBOUND(arr)` | Upper bound of array |
I/O Functions
| Function | Description |
|---|---|
| `EOF(n)` | Returns true (-1) if at end of file n |
| `LOF(n)` | Returns byte length of file n |
| `FREEFILE()` | Returns the next available file number |
Console I/O
PRINT "Hello, World!"
PRINT "x = "; x
PRINT x; " "; y ' Semicolon suppresses newline between items
PRINT x, y ' Comma advances to next tab stop
PRINT "no newline"; ' Trailing semicolon suppresses final newline
? "shortcut" ' ? is a shortcut for PRINT
The ? character can be used as a shortcut for PRINT, for compatibility with
classic BASIC dialects and interactive use.
PRINT USING
PRINT USING "###.##"; 123.456 ' Outputs: 123.46
PRINT USING "$$#,###.##"; 1234.56 ' Outputs: $1,234.56
PRINT USING "+###.##"; -45.6 ' Outputs: -45.60
PRINT USING "**###.##"; 9.99 ' Outputs: ****9.99
PRINT USING "!"; "Hello" ' Outputs: H
PRINT USING "&"; "World" ' Outputs: World
PRINT USING "\    \"; "Testing" ' Outputs: Testin (6 chars)
Format specifiers for numbers:
| Format | Description |
|---|---|
| `#` | Digit placeholder |
| `.` | Decimal point position |
| `,` | Thousands separator (in format, not output) |
| `+` | Show sign (+ or -) at start |
| `-` | Trailing minus for negative numbers |
| `$$` | Floating dollar sign |
| `**` | Fill leading spaces with asterisks |
Format specifiers for strings:
| Format | Description |
|---|---|
| `!` | First character only |
| `&` | Entire string |
| `\ \` | Fixed width (spaces between backslashes + 2) |
Multiple values can be formatted with one format string:
PRINT USING "### + ### = ###"; 10; 20; 30
' Outputs:  10 +  20 =  30
INPUT
INPUT "Enter name: "; name$
INPUT x
LINE INPUT
LINE INPUT "Enter text: "; line$
Reads an entire line including commas and spaces.
File I/O
Sequential Files
' Write
OPEN "data.txt" FOR OUTPUT AS #1
PRINT #1, "Hello"
PRINT #1, 42
CLOSE #1
' Read
OPEN "data.txt" FOR INPUT AS #1
LINE INPUT #1, text$
INPUT #1, value
CLOSE #1
' Append
OPEN "log.txt" FOR APPEND AS #1
PRINT #1, "new entry"
CLOSE #1
WRITE
WRITE #1, name$, age, salary
Outputs CSV-style: strings are quoted, values are comma-separated, terminated with a newline.
Binary Files
OPEN "file.dat" FOR BINARY AS #1
Random-Access Files
TYPE Record
name AS STRING * 20
value AS DOUBLE
END TYPE
DIM rec AS Record
rec.name = "test"
rec.value = 3.14
OPEN "data.dat" FOR RANDOM AS #1 LEN = SIZEOF(Record)
PUT #1, 1, rec ' Write record at position 1 (1-based)
GET #1, 1, rec ' Read record at position 1
CLOSE #1
Random-access uses GET and PUT with 1-based record numbers. The LEN
clause specifies record size in bytes. Records can be read and written in any
order.
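With 1-based record numbers and a fixed record length, the byte offset of record N is `(N - 1) * LEN`. The C sketch below shows that arithmetic with standard `fseek`, `fwrite`, and `fread`; the helper names are hypothetical and error handling is reduced to a return code.

```c
#include <stdio.h>

/* Byte offset of a 1-based record number. */
long record_offset(long recno, long reclen)
{
    return (recno - 1) * reclen;
}

/* PUT #f, recno, rec: returns 0 on success, -1 on failure. */
int put_record(FILE *f, long recno, const void *rec, size_t reclen)
{
    if (fseek(f, record_offset(recno, (long)reclen), SEEK_SET) != 0)
        return -1;
    return fwrite(rec, reclen, 1, f) == 1 ? 0 : -1;
}

/* GET #f, recno, rec: returns 0 on success, -1 on failure. */
int get_record(FILE *f, long recno, void *rec, size_t reclen)
{
    if (fseek(f, record_offset(recno, (long)reclen), SEEK_SET) != 0)
        return -1;
    return fread(rec, reclen, 1, f) == 1 ? 0 : -1;
}
```

Because every access seeks first, records can be written out of order, for example record 3 before record 1.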
File Modes
| Mode | C Mode | Description |
|---|---|---|
| `INPUT` | `"r"` | Read sequential text |
| `OUTPUT` | `"w"` | Write sequential text (truncates) |
| `APPEND` | `"a"` | Append sequential text |
| `BINARY` | `"rb"` | Binary read |
| `RANDOM` | `"r+b"` | Random access (creates if not found) |
DATA / READ / RESTORE
DATA 10, 20, 30, "hello"
DIM x AS INTEGER
DIM s AS STRING
READ x ' x = 10
READ x ' x = 20
READ x ' x = 30
READ s ' s = "hello"
RESTORE ' Reset read pointer to beginning
READ x ' x = 10 again
DATA statements define a pool of literal values. READ consumes them in
order. RESTORE resets the read pointer (optionally to a specific line number).
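Conceptually the DATA pool is an array plus a read index. The sketch below stores every item as text and converts on READ, which is an assumption made for illustration; the names `data_pool`, `read_data`, `read_int`, and `restore_data` are hypothetical.

```c
#include <stdlib.h>

/* DATA 10, 20, 30, "hello" */
static const char *const data_pool[] = { "10", "20", "30", "hello" };
#define DATA_COUNT (sizeof data_pool / sizeof data_pool[0])

static size_t data_pos = 0;   /* index of the next item to READ */

/* READ into a string variable; returns NULL once the pool is exhausted. */
const char *read_data(void)
{
    return data_pos < DATA_COUNT ? data_pool[data_pos++] : NULL;
}

/* READ into an integer variable (0 if the pool is exhausted). */
int read_int(void)
{
    const char *item = read_data();
    return item ? atoi(item) : 0;
}

/* RESTORE: rewind the read pointer to the first item. */
void restore_data(void)
{
    data_pos = 0;
}
```

Restoring to a specific line number would set `data_pos` to the index of that line's first item instead of 0.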
Comments
' This is a comment
REM This is also a comment
x = 5 ' Inline comment
$INCLUDE Metacommand
'$INCLUDE: 'helpers.bas'
The $INCLUDE metacommand inserts the contents of another file at the point
of the directive, before lexing and parsing. The directive is placed inside a
comment (the leading ' makes it invisible to editors that don't understand it).
Syntax
The filename is enclosed in single quotes after '$INCLUDE:. The keyword is
case-insensitive. Any amount of whitespace may appear between the colon and the
opening quote.
Nested Includes
Included files may themselves contain $INCLUDE directives:
' main.bas
'$INCLUDE: 'math_lib.bas'
'$INCLUDE: 'string_lib.bas'
' math_lib.bas — can include further files
'$INCLUDE: 'constants.bas'
FUNCTION Square(x AS DOUBLE) AS DOUBLE
Square = x * x
END FUNCTION
Path Resolution
Filenames are resolved relative to the including file's directory, not the
working directory. If src/main.bas includes 'lib/util.bas', the transpiler
looks for src/lib/util.bas.
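The resolution rule amounts to taking the directory part of the including file and appending the include name. The sketch below assumes `/` as the path separator and a relative include name; `resolve_include` is a hypothetical helper.

```c
#include <stdio.h>
#include <string.h>

/* Build the path of an included file relative to the including file's
 * directory. Returns 0 on success, -1 if the result does not fit. */
int resolve_include(const char *including_file, const char *include_name,
                    char *out, size_t cap)
{
    const char *slash = strrchr(including_file, '/');
    /* Keep everything up to and including the last '/', if any. */
    size_t dir_len = slash ? (size_t)(slash - including_file) + 1 : 0;

    int n = snprintf(out, cap, "%.*s%s",
                     (int)dir_len, including_file, include_name);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

For `src/main.bas` including `lib/util.bas` this produces `src/lib/util.bas`; a file with no directory part resolves includes relative to the current directory.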
Error Reporting
When $INCLUDE is used, error messages show the originating file and line:
Error (math_lib.bas:12): undeclared variable 'q'
Without includes, the format is the same but shows the input filename:
Error (main.bas:5): type mismatch
Circular Include Detection
If file A includes file B which includes file A, the transpiler reports a fatal error rather than looping infinitely:
Error: Circular include detected: main.bas
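One common way to detect such cycles is to keep a stack of the files currently being expanded and refuse to push a path that is already on it. The sketch below illustrates that technique with hypothetical names; whether basic2c uses this exact approach is not stated here. The depth constant matches the include nesting limit listed under Limits.

```c
#include <string.h>

#define MAX_INCLUDE_DEPTH 16

typedef struct {
    const char *active[MAX_INCLUDE_DEPTH];   /* files being expanded */
    int depth;
} IncludeStack;

/* Returns 0 if pushed, -1 on a circular include, -2 if nested too deep. */
int include_push(IncludeStack *st, const char *path)
{
    for (int i = 0; i < st->depth; i++)
        if (strcmp(st->active[i], path) == 0)
            return -1;                       /* already being processed */
    if (st->depth == MAX_INCLUDE_DEPTH)
        return -2;
    st->active[st->depth++] = path;
    return 0;
}

/* Call when a file has been fully expanded. */
void include_pop(IncludeStack *st)
{
    if (st->depth > 0)
        st->depth--;
}
```

Popping on exit matters: including the same library from two sibling files is legitimate, and only a path that is still open counts as circular.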
Extensible Functions
The transpiler supports two mechanisms for defining additional functions:
Built-in Functions (builtins.def)
The builtins.def file is compiled into basic2c and provides functions that are
always available. To add permanent built-in functions, edit builtins.def and
recompile basic2c.
Default built-ins include:
Math functions:
| Function | Description |
|---|---|
| `SQR(n)` | Square root |
| `SIN(n)` | Sine (radians) |
| `COS(n)` | Cosine (radians) |
| `TAN(n)` | Tangent (radians) |
| `ATN(n)` | Arctangent (returns radians) |
| `LOG(n)` | Natural logarithm |
| `EXP(n)` | e raised to power n |
| `SGN(n)` | Sign: -1, 0, or 1 |
| `RND()` | Random number 0 to 1 |
| `CEIL(n)` | Round up to integer |
| `FLOOR(n)` | Round down to integer |
| `ROUND(n)` | Round to nearest integer |
| `FIX(n)` | Truncate toward zero |
| `FRAC(n)` | Fractional part |
| `HYPOT(x, y)` | Hypotenuse (sqrt(x² + y²)) |
| `MAX(a, b)` | Maximum of two values |
| `MIN(a, b)` | Minimum of two values |
String functions:
| Function | Description |
|---|---|
| `CHR$(n)` | Character from ASCII code |
| `STR$(n)` | Convert number to string |
| `UCASE$(s)` | Convert to uppercase |
| `LCASE$(s)` | Convert to lowercase |
| `LTRIM$(s)` | Remove leading spaces |
| `RTRIM$(s)` | Remove trailing spaces |
| `TRIM$(s)` | Remove leading and trailing spaces |
| `SPACE$(n)` | String of n spaces |
| `HEX$(n)` | Hexadecimal representation |
| `OCT$(n)` | Octal representation |
| `TAB(n)` | Spaces to reach column n |
| `SPC(n)` | Output n spaces |
| `ENVIRON$(name)` | Get environment variable |
System:
| Function | Description |
|---|---|
| `TIMER()` | Seconds since program start |
External Functions (functions.def)
The functions.def file is read each time basic2c runs, from two locations (both are loaded if present):
- The directory containing the `basic2c` binary (global extensions)
- The directory containing the input `.bas` file (project-specific)
Functions from the input file's directory are loaded second, allowing project-specific definitions to supplement or override earlier ones.
Definition Format
Both builtins.def and functions.def use the same format:
# Comment lines start with #
# Format: name : type : c_template
SQUARE : double : ((%) * (%))
CUBE : double : ((%) * (%) * (%))
Each line defines:
- name: the BASIC function name (case-insensitive)
- type: the return type, one of `byte`, `integer`, `long`, `float`, `double`, or `string`
- c_template: C code with argument placeholders
Argument Placeholders
- `%` or `%1`: first argument
- `%2`: second argument
- `%3`: third argument (and so on)
Arguments are substituted directly, so use parentheses in templates to ensure
correct precedence: ((%) * (%2)) not % * %2.
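To see why the parentheses matter, the sketch below performs the kind of textual substitution described above. It is a simplified stand-in written for illustration, not the code in basic2c.c: `expand_template` is a hypothetical name, and reading all digits after `%` greedily is an assumption.

```c
#include <ctype.h>
#include <string.h>

/* Expand tmpl into out, replacing % or %N with args[N-1] verbatim.
 * Returns 0 on success, -1 on a bad index or if out is too small. */
int expand_template(const char *tmpl, const char *const *args, int nargs,
                    char *out, size_t cap)
{
    size_t used = 0;
    if (cap == 0)
        return -1;

    for (const char *p = tmpl; *p != '\0'; ) {
        if (*p == '%') {
            p++;
            int index = 1;                    /* bare % is the first argument */
            if (isdigit((unsigned char)*p)) {
                index = 0;
                while (isdigit((unsigned char)*p)) {
                    index = index * 10 + (*p - '0');
                    if (index > nargs)
                        return -1;            /* no such argument */
                    p++;
                }
            }
            if (index < 1 || index > nargs)
                return -1;
            size_t len = strlen(args[index - 1]);
            if (used + len >= cap)
                return -1;
            memcpy(out + used, args[index - 1], len);
            used += len;
        } else {
            if (used + 1 >= cap)
                return -1;
            out[used++] = *p++;
        }
    }
    out[used] = '\0';
    return 0;
}
```

With arguments `a + 1` and `b`, the template `((%) * (%2))` expands to `((a + 1) * (b))`, whereas `% * %2` expands to `a + 1 * b`, which C evaluates as `a + (1 * b)`.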
Usage
PRINT CEIL(3.7) ' Outputs: 4
PRINT MAX(5, 10) ' Outputs: 10
t = TIMER() ' Get elapsed time
PRINT ENVIRON$("HOME") ' Print home directory
Extensible functions require parentheses, even with no arguments: TIMER() not TIMER.
Runtime Modes
The transpiler supports two runtime modes selected at transpile time:
Debug Mode (default)
The debug runtime includes error checking and diagnostics:
- NULL guards on string function arguments
- `malloc`/`calloc` failure checks with error messages
- File number bounds checking
- `fopen` failure reporting with filename
- GOSUB stack overflow/underflow detection
- All errors print to stderr and call `exit(1)`
Release Mode (--release or -r)
The release runtime strips all diagnostic checks for minimal generated code:
- No NULL guards on string functions
- No malloc failure checks
- No file number bounds checking
- No GOSUB stack overflow/underflow checks
- ~8% fewer lines of generated C code
Functional guards are preserved in release mode to prevent crashes:
- `EOF()` returns true (-1) for NULL file handles (enables file existence checks)
- `LOF()` returns 0 for NULL file handles
- `CLOSE` is a no-op for NULL file handles
- `LINE INPUT` is a no-op for NULL file handles
- Temp string pool management (`_bfree_temps`, `_btmp`)
- String variable management (`_bstr_assign`)
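The NULL-handle guards can be pictured as small wrappers around stdio. The functions below are a sketch with hypothetical names (`basic_eof`, `basic_lof`); they show the documented results for NULL handles, while the peek-and-unget approach to end-of-file detection is an assumption.

```c
#include <stdio.h>

/* EOF(n): -1 (true) at end of file or for a NULL handle, else 0. */
int basic_eof(FILE *f)
{
    if (f == NULL)
        return -1;            /* file never opened: report end of file */
    int c = fgetc(f);         /* peek at the next byte */
    if (c == EOF)
        return -1;
    ungetc(c, f);             /* put it back for the next read */
    return 0;
}

/* LOF(n): file length in bytes, or 0 for a NULL handle. */
long basic_lof(FILE *f)
{
    if (f == NULL)
        return 0;
    long here = ftell(f);
    if (here < 0 || fseek(f, 0, SEEK_END) != 0)
        return 0;
    long size = ftell(f);
    fseek(f, here, SEEK_SET); /* restore the previous position */
    return size < 0 ? 0 : size;
}
```

Since `fopen` yields NULL for a missing file, a program can open a file for input and test `EOF` immediately to find out whether it exists, without the runtime crashing.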
Limits
| Resource | Maximum |
|---|---|
| Token length | 4096 |
| Identifier length | 128 |
| Parameters per procedure | 32 |
| Symbol table entries | 2048 |
| GOSUB return sites | 512 |
| Line number labels | 4096 |
| AST nodes | 65536 |
| Arguments per call | 64 |
| User-defined types | 64 |
| Fields per type | 32 |
| Constants | 256 |
| Include nesting depth | 16 |
| Included files | 64 |
| Total source lines | 65536 |
Example
TYPE Item
name AS STRING * 20
price AS DOUBLE
END TYPE
DIM items(2) AS Item
items(0).name = "Widget"
items(0).price = 9.99
items(1).name = "Gadget"
items(1).price = 24.95
items(2).name = "Doohickey"
items(2).price = 4.50
DIM i AS INTEGER
DIM total AS DOUBLE
total = 0
FOR i = 0 TO 2
PRINT items(i).name; " $"; items(i).price
total = total + items(i).price
NEXT i
PRINT "Total: $"; total
Transpile and run:
./basic2c example.bas example.c
cc -Wall -o example example.c -lm
./example