whtoo/how_to_implment_pl_in_antlr4

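A concise tutorial on building your own programming language, which doubles as an unofficial ANTLR reference example. It is also the origin of the cyson language.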

33 stars · 5 forks · Updated Jan 22, 2026
npx skills add whtoo/how_to_implment_pl_in_antlr4

README

How to Implement a Programming Language in ANTLR4

A progressive, hands-on guide to building a complete compiler from scratch using ANTLR4 and Java 21.

📖 Overview

This educational project teaches compiler construction through 21 progressive episodes (EPs), each building upon the previous. Starting with basic parsing and advancing to sophisticated optimizations, you'll build a complete compiler for the Cymbol programming language.

What You'll Build

  • A full-featured compiler pipeline: lexing → parsing → AST → type checking → IR generation → optimization → code generation
  • A virtual machine (VM) with garbage collection for executing compiled programs
  • Advanced compiler optimizations: SSA, dataflow analysis, tail recursion optimization
  • Tools for analysis: call graphs, control flow graphs, symbol tables

🎯 Learning Path

Phase 1: Foundations (EP1-EP12)

Goal: Learn ANTLR4 basics and build an interpreter

| EP | Topic | Key Concepts |
|---|---|---|
| EP1-EP2 | Basic Parsing | Lexical analysis, grammar definition, parsing basics |
| EP3-EP4 | Expression Evaluation | Arithmetic operations, operator precedence |
| EP5-EP6 | Statements | Control flow (if/else, while), block statements |
| EP7-EP8 | Functions | Function definition, parameters, return values |
| EP9-EP10 | Symbol Tables | Scoping, variable declarations, name resolution |
| EP11-EP12 | Arrays & More | Array operations, advanced language features |

Outcome: A working interpreter that executes Cymbol programs directly.
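
The scoping and name-resolution ideas from EP9-EP10 can be sketched in a few lines of plain Java. This is a minimal illustration under stated assumptions, not the project's actual implementation: the names `Scope`, `define`, and `resolve` are hypothetical, and types are represented as plain strings for brevity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical names throughout; this is not the project's API.
class ScopeDemo {
    /** One lexical scope: its own symbols plus a link to the enclosing scope. */
    static final class Scope {
        private final Scope parent;                                   // null for the global scope
        private final Map<String, String> symbols = new HashMap<>();  // name -> type name

        Scope(Scope parent) { this.parent = parent; }

        void define(String name, String type) { symbols.put(name, type); }

        /** Look in this scope first, then walk outward through enclosing scopes. */
        Optional<String> resolve(String name) {
            for (Scope s = this; s != null; s = s.parent) {
                String type = s.symbols.get(name);
                if (type != null) return Optional.of(type);
            }
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        Scope global = new Scope(null);
        global.define("x", "int");

        Scope local = new Scope(global);   // e.g. a function body
        local.define("x", "float");        // shadows the global x
        local.define("y", "int");

        System.out.println(local.resolve("x"));   // Optional[float]
        System.out.println(global.resolve("x"));  // Optional[int]
        System.out.println(global.resolve("y"));  // Optional.empty
    }
}
```

Chaining each scope to its parent means shadowing needs no special handling: the innermost definition is simply found first.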

Phase 2: Compilation Basics (EP13-EP16)

Goal: Transform interpretation into compilation

| EP | Topic | Key Concepts |
|---|---|---|
| EP13 | AST Construction | Abstract syntax trees, visitor pattern |
| EP14 | Symbol Resolution | Multi-pass compilation, scope management |
| EP15 | Type Checking | Static type analysis, error reporting |
| EP16 | Simple Code Generation | Basic bytecode generation, three-address code |

Outcome: A compiler that generates simple bytecode for execution.
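
To make the step from an AST to three-address code more concrete, here is a small sketch in Java 21. It is an illustration only: the node types (`Num`, `Var`, `BinOp`), the `Generator` class, and the textual instruction format are assumptions made for this example, and it uses a sealed interface with a pattern-matching `switch` in place of a classic visitor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical AST and generator, not the project's classes.
class TacDemo {
    // A tiny expression AST as a sealed hierarchy.
    sealed interface Expr permits Num, Var, BinOp {}
    record Num(int value) implements Expr {}
    record Var(String name) implements Expr {}
    record BinOp(String op, Expr left, Expr right) implements Expr {}

    static final class Generator {
        final List<String> code = new ArrayList<>();
        private int nextTemp = 0;

        /** Emits instructions for e and returns the operand that holds its value. */
        String gen(Expr e) {
            return switch (e) {
                case Num n -> Integer.toString(n.value());
                case Var v -> v.name();
                case BinOp b -> {
                    String l = gen(b.left());
                    String r = gen(b.right());
                    String t = "t" + nextTemp++;          // fresh temporary per operation
                    code.add(t + " = " + l + " " + b.op() + " " + r);
                    yield t;
                }
            };
        }
    }

    /** Lowers one expression tree to a list of three-address instructions. */
    static List<String> lower(Expr e) {
        Generator g = new Generator();
        g.gen(e);
        return g.code;
    }

    public static void main(String[] args) {
        // a + b * 2
        Expr e = new BinOp("+", new Var("a"), new BinOp("*", new Var("b"), new Num(2)));
        lower(e).forEach(System.out::println);
        // t0 = b * 2
        // t1 = a + t0
    }
}
```

Each instruction has at most one operator and writes to a fresh temporary, which is the property that later analyses such as SSA construction build on.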

Phase 3: Modern Compiler Architecture (EP17-EP21)

Goal: Build a production-quality compiler

| EP | Topic | Key Concepts |
|---|---|---|
| EP17 | Call Graph Analysis | ANTLR4 4.13.2, function call relationships, DOT visualization |
| EP18 | Virtual Machine | Stack-based VM, instruction set, memory management, garbage collection |
| EP18R | Enhanced VM | Advanced GC, optimizations, instruction set extensions |
| EP19 | IR Generation | Three-address code, SSA basics, IR design patterns |
| EP20 | CFG & Optimization | Control flow graphs, basic blocks, local optimizations |
| EP21 | Advanced Optimizations | Full SSA form, dataflow analysis, tail recursion optimization, global optimizations |

Outcome: A complete compiler with modern architecture and advanced optimizations.
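
For readers new to stack-based virtual machines (the EP18 topic), the core fetch-decode-execute loop can be sketched as follows. The opcodes `PUSH`, `ADD`, `MUL`, and `HALT` and their encoding are invented for this example and should not be read as the project's instruction set; memory management and garbage collection are left out entirely.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical opcodes and encoding, for illustration only.
class StackVmDemo {
    static final int PUSH = 0, ADD = 1, MUL = 2, HALT = 3;

    /** Runs the program and returns the value on top of the stack at HALT. */
    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;                                   // program counter
        while (true) {
            int op = code[pc++];                      // fetch
            switch (op) {                             // decode and execute
                case PUSH -> stack.push(code[pc++]);  // operand follows the opcode
                case ADD -> { int b = stack.pop(), a = stack.pop(); stack.push(a + b); }
                case MUL -> { int b = stack.pop(), a = stack.pop(); stack.push(a * b); }
                case HALT -> { return stack.pop(); }
                default -> throw new IllegalStateException("bad opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // (2 + 3) * 4
        int[] program = { PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT };
        System.out.println(run(program));  // 20
    }
}
```

Because operands live on the stack rather than in named registers, instructions stay compact and the code generator never has to allocate registers, which is one common reason teaching VMs start with this design.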

🏗️ Project Structure

antlr4-project/
├── ep1-ep16/           # Foundation episodes (historical, not in active build)
├── ep17/               # Call graph analysis (currently active)
├── ep18/               # Virtual machine implementation
├── ep18r/              # Enhanced virtual machine
├── ep19/               # Intermediate representation generation
├── ep20/               # Full compiler with CFG and optimization
├── ep21/               # Advanced compiler with SSA and TRO
├── pom.xml             # Parent Maven POM
├── AGENTS.md           # AI agent development guide
└── README.md           # This file

Active Modules

The root POM currently builds EP17-EP21:

<modules>
    <module>ep17</module>
    <module>ep18</module>
    <module>ep18r</module>
    <module>ep19</module>
    <module>ep20</module>
    <module>ep21</module>
</modules>

🚀 Quick Start

Prerequisites

  • Java 21 or higher
  • Maven 3.8+
  • Git (for cloning)

Building the Project

# Clone the repository
git clone <repository-url>
cd How_to_implment_PL_in_Antlr4

# Build all active modules (EP17-EP21)
mvn clean compile

# Run all tests
mvn test

# Build specific module
cd ep21
mvn clean compile test

Running the Compiler

# Using EP20 compiler (current production-ready version)
cd ep20
mvn exec:java -Dexec.args="src/main/resources/t.cymbol"

# Using EP21 compiler (advanced optimizations)
cd ep21
mvn compile exec:java -Dexec.mainClass="org.teachfx.antlr4.ep21.integration.EP21Compiler" \
    -Dexec.args="src/main/resources/example.cymbol output.vm"

📚 The Cymbol Language

Cymbol is a C-like educational programming language that grows with each episode:

Basic Example

int factorial(int n) {
    if (n <= 1) {
        return 1;
    }
    return n * factorial(n - 1);
}

void main() {
    int result = factorial(5);
    print(result);  // Output: 120
}
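
The `factorial` above is not tail-recursive as written, because the multiplication happens after the recursive call returns. As a rough illustration of what the tail recursion optimization listed for EP21 is aimed at, the Java sketch below shows an accumulator-based tail-recursive variant next to the loop it can conceptually be rewritten into. The method names are hypothetical, and this is not a description of how EP21 performs the transformation.

```java
// Hypothetical method names; a conceptual sketch of tail recursion elimination.
class TailCallDemo {
    // Tail-recursive form: the recursive call is the last action, so nothing remains to do after it.
    static long factTail(int n, long acc) {
        if (n <= 1) return acc;
        return factTail(n - 1, acc * n);
    }

    // Equivalent loop: the parameters are reassigned and control jumps back to the top,
    // so no new stack frame is needed per step.
    static long factLoop(int n, long acc) {
        while (true) {
            if (n <= 1) return acc;
            acc = acc * n;
            n = n - 1;
        }
    }

    public static void main(String[] args) {
        System.out.println(factTail(5, 1));  // 120
        System.out.println(factLoop(5, 1));  // 120
    }
}
```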

Features by Episode

  • EP1-EP8: Basic types, arithmetic, control flow, functions
  • EP9-EP10: Scoping, local/global variables
  • EP11-EP12: Arrays and array ope

...


Publisher

whtoo

Statistics

Stars: 33
Forks: 5
Open Issues: 0
License: BSD 3-Clause "New" or "Revised" License
Created: Oct 18, 2020