A lightweight JIT compiler based on MIR (Medium Internal Representation) and C11 JIT compiler and interpreter based on MIR
#define Size 819000
int sieve (int N) {
int64_t i, k, prime, count, n; char flags[Size];
for (n = 0; n < N; n++) {
count = 0;
for (i = 0; i < Size; i++)
flags[i] = 1;
for (i = 0; i < Size; i++)
if (flags[i]) {
prime = i + i + 3;
for (k = i + prime; k < Size; k += prime)
flags[k] = 0;
count++;
}
}
return count;
}
void ex100 (void) {
printf ("sieve (100) = %d\", sieve (100));
}
m_sieve: module
export sieve
sieve: func i32, i32:N
local i64:iter, i64:count, i64:i, i64:k, i64:prime, i64:temp, i64:flags
alloca flags, 819000
mov iter, 0
loop: bge fin, iter, N
mov count, 0; mov i, 0
loop2: bge fin2, i, 819000
mov u8:(flags, i), 1; add i, i, 1
jmp loop2
fin2: mov i, 0
loop3: bge fin3, i, 819000
beq cont3, u8:(flags,i), 0
add temp, i, i; add prime, temp, 3; add k, i, prime
loop4: bge fin4, k, 819000
mov u8:(flags, k), 0; add k, k, prime
jmp loop4
fin4: add count, count, 1
cont3: add i, i, 1
jmp loop3
fin3: add iter, iter, 1
jmp loop
fin: ret count
endfunc
endmodule
m_ex100: module
format: string "sieve (10) = %d\n"
p_printf: proto p:fmt, i32:result
p_sieve: proto i32, i32:iter
export ex100
import sieve, printf
ex100: func v, 0
local i64:r
call p_sieve, sieve, r, 100
call p_printf, printf, format, r
endfunc
endmodule
func
describes signature of the function (taking 32-bit signed
integer argument and returning 32-bit signed integer value)
and function argument N
which will be local
variable of 64-bit signed integer type
;
string
describes data in form of C string
export
describes the module functions or data which are visible outside the current moduleimport
describes the module functions or data which should be defined in other MIR modulesproto
describes function prototypes. Its syntax is the same as func
syntaxcall
are MIR instruction to call functionsMIR_load_external
m1
and m2
are modules
m_sieve
and m_e100
, func
is function ex100
, sieve
is function sieve
): /* ctx is a context created by MIR_init */
MIR_load_module (ctx, m1); MIR_load_module (ctx, m2);
MIR_load_external (ctx, "printf", printf);
MIR_link (ctx, MIR_set_interp_interface, import_resolver);
/* or use MIR_set_gen_interface to generate and use the machine code */
/* or use MIR_set_lazy_gen_interface to generate function code on its 1st call */
/* use MIR_gen (ctx, func) to explicitly generate the function machine code */
MIR_interp (ctx, func, &result, 0); /* zero here is arguments number */
/* or ((void (*) (void)) func->addr) (); to call interpr. or gen. code through the interface */
binfmt_misc
The mir-bin-run
binary is prepared to be used from binfmt_misc
with the
following line (example):
line=:mir:M::MIR::/usr/local/bin/mir-bin-run:P
echo $line > /proc/sys/fs/binfmt_misc/register
Do adapt the mir-bin-run binary path to your system, that is the default one
And run with
c2m your-file.c -o your-file
chmod +x your-file
./your-file your args
The executable is "configurable" with environment variables:
MIR_TYPE
sets the interface for code execution: interp
(for interpretation),
jit
(for generation) and lazy
(for lazy generation, default);MIR_LIBS
(colon separated list) defines a list of extra libraries to load;MIR_LIB_DIRS
or LD_LIBRARY_PATH
(colon separated list) defines an extra list
of directories to search the libraries on.Due to the tied nature of
mir-bin-run
withbinfmt_misc
, it may be a bit weird to callmir-bin-run
directly. TheP
flag on the binfmt_misc passes an extra argument with the full path to the MIR binary.
Compiler Performance Goals relative to GCC -O2:
Very short optimization pipeline for speed and light-weight
Only the most valuable optimization usage:
Different optimization levels to tune compilation speed vs generated code performance
SSA form of MIR is used before register allocation
Simplicity of optimizations implementation over extreme generated code performance
More details about full JIT compiler pipeline:
Simplify: lowering MIR
Inline: inlining MIR calls
Build CFG: building Control Flow Graph (basic blocks and CFG edges)
Build SSA: Building Single Static Assignment Form by adding phi nodes and SSA edges to operands
Copy Propagation: SSA copy propagation keeping conventional SSA form and removing redundant extension insns
Global Value Numbering: Removing redundant insns through GVN
Dead Code Elimination: removing insns with unused outputs
Sparse Conditional Constant Propagation: constant propagation and removing death paths of CFG
Out of SSA: Removing phi nodes and SSA edges (we keep conventional SSA all the time)
Machinize: run machine-dependent code transforming MIR for calls ABI, 2-op insns, etc
Find Loops: finding natural loops and building loop tree
Build Live Info: calculating live in and live out for the basic blocks
Build Live Ranges: calculating program point ranges for registers
Assign: fast RA for -O0
or priority-based linear scan RA for -O1
and above
Rewrite: transform MIR according to the assign using reserved hard regs
Combine (code selection): merging data-depended insns into one
Dead Code Elimination: removing insns with unused outputs
Generate Machine Insns: run machine-dependent code creating machine insns
mir.h
and mir.c
contain major API code including input/output of MIR binary
and MIR text representationmir-dlist.h
, mir-mp.h
, mir-varr.h
, mir-bitmap.h
, mir-htab.h
contain generic code
correspondingly for double-linked lists, memory pools, variable length arrays, bitmaps,
hash tables. File mir-hash.h
is a general, simple, high quality hash function used
by hashtablesmir-interp.c
contains code for interpretation of MIR code. It is included in mir.c
and never compiled separatelymir-gen.h
, mir-gen.c
, mir-gen-x86_64.c
, mir-gen-aarch64.c
, mir-gen-ppc64.c
, mir-gen-s390x.c
,
and mir-gen-riscv.c
contain code for MIR JIT compiler
mir-gen-x86_64.c
, mir-gen-aarch64.c
, mir-gen-ppc64.c
, mir-gen-s390x.c
,
and mir-gen-riscv.c
is machine dependent code of JIT compilermir-<target>.c
contain simple machine dependent code common for interpreter and
JIT compilermir2c/mir2c.h
and mir2c/mir2c.c
contain code for MIR to C compilerc2mir/c2mir.h
, c2mir/c2mir.c
, c2mir/c2mir-driver.c
, and c2mir/mirc.h
contain code for
C to MIR compiler. Files in directories c2mir/x86_64
and c2mir/aarch64
, c2mir/ppc64
, c2mir/s390x
,
and c2mir/riscv
contain correspondingly x86_64, aarch64, ppc64, s390x, and riscv machine-dependent
code for C to MIR compilermake bench
and make test
Intel i7-9700K with 16GB memory under FC29 with GCC-8.2.1
MIR-gen | MIR-interp | gcc -O2 | gcc -O0 | |
---|---|---|---|---|
compilation [1] | 1.0 (69us) | 0.17 (12us) | 193 (13.35ms) | 186 (12.8ms) |
compilation [2] | 1.0 (116us) | 0.10 (12us) | 115 (13.35ms) | 110 (12.8ms) |
execution [3] | 1.0 (3.05s) | 6.0 (18.3s) | 0.95 (2.9s) | 2.08 (6.34s) |
code size [4] | 1.0 (408KB) | 0.51 (209KB) | 62 (25.2MB) | 62 (25.2MB) |
startup [5] | 1.0 (1.3us) | 1.0 (1.3us) | 9310 (12.1ms) | 9850 (12.8ms) |
LOC [6] | 1.0 (19.1K) | 0.53 (10.1K) | 77 (1480K) | 77 (1480K) |
[1] is based on wall time of compilation of sieve code (w/o any include file and with using memory file system for GCC) 100 times. The used optimization level is 1
[2] is analogous to [1] but with MIR-optimization level 2
[3] is based on the best wall time of 10 runs with used MIR-generator optimization level 1
[4] is based on stripped sizes of cc1 for GCC and MIR core and interpreter or generator for MIR
[5] is based on wall time of generation of object code for empty C file or generation of empty MIR module through API
[6] is based only on files required for x86-64 C compiler and files for minimal program to create and run MIR code