whtoo/how_to_implment_pl_in_antlr4
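A concise tutorial on building your own programming language, which also serves as an unofficial ANTLR reference example. It is also where the cyson language originated.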
How to Implement a Programming Language in ANTLR4
A progressive, hands-on guide to building a complete compiler from scratch using ANTLR4 and Java 21.
📖 Overview
This educational project teaches compiler construction through 21 progressive episodes (EPs), each building upon the previous. Starting with basic parsing and advancing to sophisticated optimizations, you'll build a complete compiler for the Cymbol programming language.
What You'll Build
- A full-featured compiler pipeline: lexing → parsing → AST → type checking → IR generation → optimization → code generation
- A virtual machine (VM) with garbage collection for executing compiled programs
- Advanced compiler optimizations: SSA, dataflow analysis, tail recursion optimization
- Tools for analysis: call graphs, control flow graphs, symbol tables
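As a rough illustration of one of the analysis tools listed above, a call graph can be kept as a map from caller to callees and printed in Graphviz DOT. This is a minimal sketch, not the project's code: `CallGraphSketch`, `addCall`, and `toDot` are hypothetical names chosen for this example.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: records caller -> callee edges and renders them as Graphviz DOT.
final class CallGraphSketch {
    // Insertion-ordered so the DOT output is deterministic.
    private final Map<String, Set<String>> edges = new LinkedHashMap<>();

    // Registers one call edge; duplicate edges are ignored by the set.
    void addCall(String caller, String callee) {
        edges.computeIfAbsent(caller, k -> new LinkedHashSet<>()).add(callee);
    }

    // Renders every recorded edge as a line of a DOT digraph.
    String toDot() {
        StringBuilder sb = new StringBuilder("digraph CallGraph {\n");
        for (Map.Entry<String, Set<String>> e : edges.entrySet()) {
            for (String callee : e.getValue()) {
                sb.append("  \"").append(e.getKey()).append("\" -> \"")
                  .append(callee).append("\";\n");
            }
        }
        return sb.append("}\n").toString();
    }
}
```

In a real compiler, `addCall` would typically be invoked from a tree visitor each time a call expression is seen inside a function body.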
🎯 Learning Path
Phase 1: Foundations (EP1-EP12)
Goal: Learn ANTLR4 basics and build an interpreter
| EP | Topic | Key Concepts |
|---|---|---|
| EP1-EP2 | Basic Parsing | Lexical analysis, grammar definition, parsing basics |
| EP3-EP4 | Expression Evaluation | Arithmetic operations, operator precedence |
| EP5-EP6 | Statements | Control flow (if/else, while), block statements |
| EP7-EP8 | Functions | Function definition, parameters, return values |
| EP9-EP10 | Symbol Tables | Scoping, variable declarations, name resolution |
| EP11-EP12 | Arrays & More | Array operations, advanced language features |
Outcome: A working interpreter that executes Cymbol programs directly.
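To make "executes programs directly" concrete, the sketch below evaluates a tiny expression tree by walking it recursively. It assumes a hand-built AST rather than an ANTLR parse tree, and the node types (`Num`, `Var`, `BinOp`) and the `TreeWalkSketch` class are hypothetical names for this illustration only.

```java
import java.util.Map;

// Hypothetical AST node types; the project's own node classes may differ.
interface Expr {}
record Num(int value) implements Expr {}
record Var(String name) implements Expr {}
record BinOp(char op, Expr left, Expr right) implements Expr {}

final class TreeWalkSketch {
    // Evaluates an expression tree against a flat variable environment.
    static int eval(Expr e, Map<String, Integer> env) {
        if (e instanceof Num n) return n.value();
        if (e instanceof Var v) {
            Integer value = env.get(v.name());
            if (value == null) throw new IllegalStateException("undefined variable: " + v.name());
            return value;
        }
        if (e instanceof BinOp b) {
            // Evaluate both operands first, then apply the operator.
            int l = eval(b.left(), env);
            int r = eval(b.right(), env);
            return switch (b.op()) {
                case '+' -> l + r;
                case '-' -> l - r;
                case '*' -> l * r;
                case '/' -> l / r;
                default -> throw new IllegalArgumentException("unknown operator: " + b.op());
            };
        }
        throw new IllegalArgumentException("unknown node: " + e);
    }
}
```

Operator precedence does not appear in the evaluator itself: it is already encoded in the shape of the tree, which is the parser's job.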
Phase 2: Compilation Basics (EP13-EP16)
Goal: Transform interpretation into compilation
| EP | Topic | Key Concepts |
|---|---|---|
| EP13 | AST Construction | Abstract syntax trees, visitor pattern |
| EP14 | Symbol Resolution | Multi-pass compilation, scope management |
| EP15 | Type Checking | Static type analysis, error reporting |
| EP16 | Simple Code Generation | Basic bytecode generation, three-address code |
Outcome: A compiler that generates simple bytecode for execution.
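The three-address code mentioned for EP16 can be illustrated with a small generator that flattens a nested expression into single-operator instructions, each writing to a fresh temporary. This is a sketch under assumptions: the node types (`Lit`, `Name`, `Bin`), the `TacSketch` class, and the `t0`, `t1` naming scheme are invented for this example and are not taken from the project.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical expression nodes, used only for this illustration.
interface Node {}
record Lit(int value) implements Node {}
record Name(String id) implements Node {}
record Bin(String op, Node left, Node right) implements Node {}

final class TacSketch {
    private final List<String> code = new ArrayList<>();
    private int nextTemp = 0;

    // Emits instructions for the node and returns the operand that holds its value.
    String gen(Node n) {
        if (n instanceof Lit lit) return Integer.toString(lit.value());
        if (n instanceof Name name) return name.id();
        if (n instanceof Bin bin) {
            String l = gen(bin.left());
            String r = gen(bin.right());
            String t = "t" + nextTemp++;   // fresh temporary for this result
            code.add(t + " = " + l + " " + bin.op() + " " + r);
            return t;
        }
        throw new IllegalArgumentException("unknown node: " + n);
    }

    // Returns an immutable copy of the instructions emitted so far.
    List<String> code() {
        return List.copyOf(code);
    }
}
```

For the expression `a + b * 2`, this sketch emits `t0 = b * 2` followed by `t1 = a + t0`, so every instruction has at most one operator and two operands.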
Phase 3: Modern Compiler Architecture (EP17-EP21)
Goal: Build a production-quality compiler
| EP | Topic | Key Concepts |
|---|---|---|
| EP17 | Call Graph Analysis | ANTLR 4.13.2, function call relationships, DOT visualization |
| EP18 | Virtual Machine | Stack-based VM, instruction set, memory management, garbage collection |
| EP18R | Enhanced VM | Advanced GC, optimizations, instruction set extensions |
| EP19 | IR Generation | Three-address code, SSA basics, IR design patterns |
| EP20 | CFG & Optimization | Control flow graphs, basic blocks, local optimizations |
| EP21 | Advanced Optimizations | Full SSA form, dataflow analysis, tail recursion optimization, global optimizations |
Outcome: A complete compiler with modern architecture and advanced optimizations.
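For readers new to the stack-based VM idea from EP18, the core is a fetch-decode-execute loop over an operand stack. The sketch below assumes a made-up four-opcode instruction set (`PUSH`, `ADD`, `MUL`, `HALT`); it is not the EP18 instruction set and leaves out memory management and garbage collection.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical instruction set for illustration; it is not the EP18 instruction set.
final class StackVmSketch {
    static final int PUSH = 0, ADD = 1, MUL = 2, HALT = 3;

    // Runs the program and returns the value left on top of the operand stack.
    static int run(int[] program) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // program counter: index of the next word to read
        while (true) {
            int op = program[pc++];
            switch (op) {
                case PUSH -> stack.push(program[pc++]); // operand follows the opcode
                case ADD -> stack.push(stack.pop() + stack.pop());
                case MUL -> stack.push(stack.pop() * stack.pop());
                case HALT -> { return stack.pop(); }
                default -> throw new IllegalStateException("bad opcode " + op + " at " + (pc - 1));
            }
        }
    }
}
```

A program for `(2 + 3) * 4` would be the word sequence `PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT`: operands are pushed first and each operator consumes the two topmost values.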
🏗️ Project Structure
antlr4-project/
├── ep1-ep16/ # Foundation episodes (historical, not in active build)
├── ep17/ # Call graph analysis (currently active)
├── ep18/ # Virtual machine implementation
├── ep18r/ # Enhanced virtual machine
├── ep19/ # Intermediate representation generation
├── ep20/ # Full compiler with CFG and optimization
├── ep21/ # Advanced compiler with SSA and TRO
├── pom.xml # Parent Maven POM
├── AGENTS.md # AI agent development guide
└── README.md # This file
Active Modules
The root POM currently builds EP17-EP21:
<modules>
    <module>ep17</module>
    <module>ep18</module>
    <module>ep18r</module>
    <module>ep19</module>
    <module>ep20</module>
    <module>ep21</module>
</modules>
🚀 Quick Start
Prerequisites
- Java 21 or higher
- Maven 3.8+
- Git (for cloning)
Building the Project
# Clone the repository
git clone <repository-url>
cd How_to_implment_PL_in_Antlr4
# Build all active modules (EP17-EP21)
mvn clean compile
# Run all tests
mvn test
# Build specific module
cd ep21
mvn clean compile test
Running the Compiler
# Using EP20 compiler (current production-ready version)
cd ep20
mvn exec:java -Dexec.args="src/main/resources/t.cymbol"
# Using EP21 compiler (advanced optimizations)
cd ep21
mvn compile exec:java -Dexec.mainClass="org.teachfx.antlr4.ep21.integration.EP21Compiler" \
-Dexec.args="src/main/resources/example.cymbol output.vm"
📚 The Cymbol Language
Cymbol is a C-like educational programming language that grows with each episode:
Basic Example
int factorial(int n) {
    if (n <= 1) {
        return 1;
    }
    return n * factorial(n - 1);
}
void main() {
    int result = factorial(5);
    print(result); // Output: 120
}
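The factorial above also helps explain the tail recursion optimization covered in EP21. The Java sketch below is an illustration only, not the transformation EP21 performs: it shows the same function in its original form, in an accumulator form where the recursive call is in tail position, and as the loop that such a tail call is equivalent to. The names `TailCallSketch`, `factorialAcc`, and `factorialLoop` are hypothetical.

```java
// Illustration only: three equivalent ways to compute a factorial.
final class TailCallSketch {
    // Mirrors the Cymbol example: the multiplication happens after the call returns,
    // so the recursive call is not in tail position.
    static int factorial(int n) {
        if (n <= 1) return 1;
        return n * factorial(n - 1);
    }

    // Accumulator form: the recursive call is the last action, so it is a tail call.
    static int factorialAcc(int n, int acc) {
        if (n <= 1) return acc;
        return factorialAcc(n - 1, acc * n);
    }

    // Loop form of factorialAcc: the tail call becomes a parameter update
    // followed by a jump back to the top, so no new stack frame is needed.
    static int factorialLoop(int n, int acc) {
        while (n > 1) {
            acc = acc * n;
            n = n - 1;
        }
        return acc;
    }
}
```

Calling `factorialAcc(5, 1)` or `factorialLoop(5, 1)` gives the same result as `factorial(5)`, namely 120.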
Features by Episode
- EP1-EP8: Basic types, arithmetic, control flow, functions
- EP9-EP10: Scoping, local/global variables
- EP11-EP12: Arrays and array ope
...