@paigeadelethompson
Last active October 2, 2025 21:19

Thank you for the clarification! You’re proposing that the FPGA itself acts as the CHR-RAM, directly presenting VIC-2-rendered graphics data to the NES PPU bus, rather than using a separate CHR-RAM chip on the cartridge. This approach changes the design significantly, as the FPGA would dynamically generate and serve tile/sprite data in real-time to the PPU, emulating CHR-RAM behavior internally. Let’s analyze this and provide an updated Markdown design addressing this approach, confirming feasibility, and including relevant VHDL and example game code.

Feasibility Analysis

Using the FPGA to emulate CHR-RAM (instead of a physical SRAM chip) is feasible but introduces specific considerations:

  • FPGA as CHR-RAM: The FPGA can present tile/sprite data on the PPU bus (CHR address/data lines) as if it were CHR-RAM. It would internally generate VIC-2 graphics and format them into NES-compatible 8x8 tiles, serving them in real-time as the PPU requests data.
  • Advantages:
    • Eliminates the need for a separate CHR-RAM chip, reducing cartridge complexity and cost.
    • Allows dynamic, on-the-fly rendering of VIC-2 graphics, potentially enabling more complex effects.
  • Challenges:
    • Timing Precision: The FPGA must respond to PPU CHR address requests (~5.37 MHz, NTSC) with precise timing to mimic CHR-RAM’s read behavior.
    • FPGA Resources: Emulating VIC-2 and serving CHR data requires significant logic (LUTs) and internal memory (BRAM) to buffer rendered tiles.
    • PPU Bus Interface: The FPGA must handle the PPU’s 13-bit address bus (A0-A12) and 8-bit data bus (D0-D7) correctly, acting as a memory device.
    • VBlank Updates: Unlike physical CHR-RAM (updated during VBlank), the FPGA can update its internal tile buffer continuously, but synchronization with the PPU’s scanline rendering is critical.
  • Conclusion: This approach is viable with a sufficiently capable FPGA (e.g., Lattice iCE40HX8K or Xilinx Spartan-6) and careful timing design. It simplifies the cartridge hardware but increases FPGA complexity.
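The timing figures quoted above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the NTSC master crystal of 21.477272 MHz with the PPU dividing it by 4 and the CPU by 12; the names `MASTER_HZ` and `period_ns` are illustrative only.

```python
# NTSC NES clocks derived from the master crystal (assumed 21.477272 MHz).
MASTER_HZ = 21_477_272


def period_ns(freq_hz: float) -> float:
    """Return the clock period in nanoseconds."""
    return 1e9 / freq_hz


ppu_hz = MASTER_HZ / 4    # PPU dot clock, ~5.37 MHz
cpu_hz = MASTER_HZ / 12   # CPU clock (M2), ~1.79 MHz

ppu_period = period_ns(ppu_hz)
cpu_period = period_ns(cpu_hz)

print(f"PPU: {ppu_hz / 1e6:.3f} MHz, {ppu_period:.1f} ns per dot")
print(f"CPU: {cpu_hz / 1e6:.3f} MHz, {cpu_period:.1f} ns per cycle")
```

The results line up with the ~186 ns and ~558 ns budgets used in this design; the real setup and hold margins depend on the FPGA's I/O timing and any level shifters in the path.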

Below is the updated Markdown design, reflecting the FPGA acting as CHR-RAM.


NES Cartridge with FPGA Emulating VIC-2 and CHR-RAM

Overview

This cartridge uses an FPGA to emulate the Commodore 64’s VIC-2 graphics chip and act as CHR-RAM, dynamically serving NES-compatible tile/sprite data to the PPU bus. A custom mapper expands memory, enabling C64 games to run on the NES 6502 CPU with minimal modification. The solution is cartridge-only, using no expansion slot or physical CHR-RAM chip.

Theory of Operation

1. System Architecture

  • FPGA: Emulates VIC-2, generates CHR-RAM data, and implements a custom mapper.
  • PRG-ROM: Stores game code for the 6502 CPU.
  • Mapper: FPGA-based, handles VIC-2 register access, memory banking, and PPU bus interfacing.
  • NES Interface: 72-pin cartridge connector for CPU and PPU bus access.
  • No Physical CHR-RAM: The FPGA internally buffers and serves tile/sprite data to the PPU bus, emulating CHR-RAM.

2. FPGA as CHR-RAM

  • Role: The FPGA acts as CHR-RAM, responding to PPU read requests on the CHR address/data bus (A0-A12, D0-D7).
  • Operation:
    • The FPGA renders VIC-2 graphics (e.g., 320x200 bitmap, 40x25 text, 8 sprites) into NES-compatible 8x8 tiles and sprites.
    • It stores rendered tiles in internal BRAM (e.g., 8KB, which holds 512 tiles at 16 bytes per tile).
    • When the PPU requests CHR data, the FPGA provides the appropriate tile/sprite data in real-time, mimicking CHR-RAM behavior.
  • Timing: The FPGA synchronizes with the PPU’s ~5.37 MHz (NTSC) clock, responding within ~186 ns per read cycle.
  • Capacity: Emulates 8KB CHR-RAM (512 tiles of 16 bytes each); larger sizes (e.g., 32KB) possible with more BRAM.
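To make the tile format concrete, here is a small Python sketch of how one 8x8 tile of 2-bit pixels is packed into the 16 bytes the PPU reads from pattern memory. It is a software model for illustration, not part of the FPGA design; `encode_tile` is a hypothetical helper name.

```python
def encode_tile(pixels):
    """Pack an 8x8 grid of 2-bit color indices into 16 CHR bytes.

    Bytes 0-7 hold bit 0 of each row (plane 0); bytes 8-15 hold bit 1
    (plane 1). The leftmost pixel is the most significant bit.
    """
    if len(pixels) != 8 or any(len(row) != 8 for row in pixels):
        raise ValueError("tile must be 8x8")
    plane0, plane1 = [], []
    for row in pixels:
        lo = hi = 0
        for value in row:
            if not 0 <= value <= 3:
                raise ValueError("pixel values must be 0..3")
            lo = (lo << 1) | (value & 1)
            hi = (hi << 1) | (value >> 1)
        plane0.append(lo)
        plane1.append(hi)
    return bytes(plane0 + plane1)


TILE_BYTES = 16
CHR_BYTES = 8 * 1024
tiles_in_chr = CHR_BYTES // TILE_BYTES

# Example: left half of the tile uses color 1, right half uses color 2.
demo = encode_tile([[1, 1, 1, 1, 2, 2, 2, 2]] * 8)
print(demo.hex(), tiles_in_chr)
```

Because each tile carries only 2 bits per pixel, the VIC-2 renderer also has to choose which NES palette each tile uses; that step is outside this sketch.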

3. VIC-2 Emulation

  • Registers: VIC-2 registers ($D000-$D03F) are mapped to FPGA at $6000-$603F (CPU address space).
  • Rendering:
    • Converts VIC-2 graphics (16 colors, sprites, scrolling) to NES tiles/sprites.
    • Updates internal BRAM tile buffer, potentially continuously or during VBlank.
    • Maps C64’s 16 colors to NES’s 52-color palette.
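  • Sprites: Maps VIC-2’s 8 sprites (24x21 pixels each) onto groups of NES sprites. The PPU still draws at most 8 NES sprites per scanline, so sprite priority has to be rotated (flicker) whenever a line needs more.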
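The sprite budget can be estimated with a short calculation. The sketch below assumes 24x21-pixel VIC-2 sprites, 64 NES OAM entries, and the PPU limit of 8 sprites per scanline; `nes_sprites_needed` is an illustrative helper, not something defined elsewhere in this design.

```python
import math

VIC_SPRITE_W, VIC_SPRITE_H = 24, 21   # VIC-2 hardware sprite size in pixels
VIC_SPRITES = 8
NES_OAM_ENTRIES = 64                  # total NES hardware sprites
NES_PER_SCANLINE = 8                  # PPU limit per scanline
NES_SPRITE_W = 8


def nes_sprites_needed(nes_height):
    """NES sprites needed to cover one VIC-2 sprite (nes_height is 8 or 16)."""
    cols = math.ceil(VIC_SPRITE_W / NES_SPRITE_W)
    rows = math.ceil(VIC_SPRITE_H / nes_height)
    return cols * rows


cols_per_vic_sprite = math.ceil(VIC_SPRITE_W / NES_SPRITE_W)
side_by_side = NES_PER_SCANLINE // cols_per_vic_sprite

for nes_height in (8, 16):
    total = nes_sprites_needed(nes_height) * VIC_SPRITES
    fits = "fits" if total <= NES_OAM_ENTRIES else "exceeds OAM"
    print(f"8x{nes_height}: {total} NES sprites for 8 VIC-2 sprites ({fits})")
print(f"VIC-2 sprites side by side without dropout: {side_by_side}")
```

Under these assumptions 8x16 mode is the only one where all eight VIC-2 sprites fit in OAM at once, and only two of them can share a scanline before the PPU starts dropping sprites.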

4. Custom Mapper

  • Memory Expansion:
    • Option 1: Provides 32KB SRAM (on-cart) at $6000-$7FFF, banked like MMC5.
    • Option 2: Implements a flat 64KB+ address space, bypassing NES’s 2KB WRAM (requires game code to use custom mapper).
  • Bank Switching: Supports 16KB/32KB PRG-ROM banks.
  • VIC-2 Memory: Maps C64 memory ($0000-$FFFF) to FPGA-managed RAM for game data.

5. Game Porting

  • Minimal Changes: C64 games access VIC-2 registers via FPGA, with graphics served to the PPU bus.
  • Input: NES controllers (2 buttons + D-pad) may require patches for C64 joystick compatibility.
  • Timing: Adjust for 6502 (1.79 MHz) vs. 6510 (1 MHz) differences.

6. Cartridge Constraints

  • Power: 5V from NES, ~500mA budget for FPGA.
  • Size: FPGA (e.g., Lattice iCE40HX8K) fits in a standard NES cartridge shell.
  • No Expansion Slot: All functionality via 72-pin connector.
  • No CHR-RAM Chip: FPGA’s internal BRAM replaces physical CHR-RAM.

VHDL Snippet: FPGA as CHR-RAM

This module emulates CHR-RAM, serving tile data to the PPU bus.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity chr_ram_emulator is
    Port (
        clk : in STD_LOGIC; -- Sampling clock, assumed synchronous to the PPU (~5.37 MHz)
        reset : in STD_LOGIC;
        ppu_addr : in STD_LOGIC_VECTOR(12 downto 0); -- PPU CHR address (8KB)
        ppu_data : out STD_LOGIC_VECTOR(7 downto 0); -- PPU CHR data
        vic2_tile_data : in STD_LOGIC_VECTOR(7 downto 0); -- VIC-2 rendered tile byte
        vblank : in STD_LOGIC -- VBlank signal for tile updates
    );
end chr_ram_emulator;

architecture Behavioral of chr_ram_emulator is
    type chr_mem_t is array (0 to 8191) of STD_LOGIC_VECTOR(7 downto 0); -- 8KB
    signal chr_mem : chr_mem_t := (others => (others => '0')); -- Internal BRAM for tiles
    signal tile_counter : unsigned(12 downto 0) := (others => '0');
begin
    -- Write pointer: the only state that needs a reset
    process(clk, reset)
    begin
        if reset = '1' then
            tile_counter <= (others => '0');
        elsif rising_edge(clk) then
            if vblank = '1' then
                tile_counter <= tile_counter + 1;
            end if;
        end if;
    end process;

    -- Memory process has no reset so the array can infer block RAM;
    -- clearing all 8KB in a reset branch would force it into registers
    process(clk)
    begin
        if rising_edge(clk) then
            -- Update tile buffer (e.g., during VBlank)
            if vblank = '1' then
                chr_mem(to_integer(tile_counter)) <= vic2_tile_data;
            end if;
            -- Serve CHR data to PPU (registered read, one clock of latency)
            ppu_data <= chr_mem(to_integer(unsigned(ppu_addr)));
        end if;
    end process;
end Behavioral;

Example Game Code (6502 Assembly)

This code assumes the FPGA handles VIC-2 rendering and CHR-RAM emulation.

; Set VIC-2 colors (emulated in FPGA; $D000-$D03F appears at $6000-$603F)
    LDA #$00       ; C64 black
    STA $6021      ; FPGA VIC-2 register $D021 (background color)
    LDA #$01       ; C64 white
    STA $6020      ; FPGA VIC-2 register $D020 (border color)

; Main loop
main_loop:
    JSR read_input ; NES controller
    JSR update_game; Game logic
    JSR wait_vblank; Wait for VBlank (FPGA updates tile buffer)
    JMP main_loop

wait_vblank:
    LDA $2002      ; PPU status
    BPL wait_vblank; Wait for VBlank
    RTS

Implementation Notes

  • FPGA Choice: Lattice iCE40HX8K (7,680 LUTs, 128 Kbit of BRAM, i.e. 16KB) or Xilinx Spartan-6 for VIC-2 emulation and CHR-RAM emulation.
  • Timing: FPGA must respond to PPU reads within ~186 ns, synchronized to the PPU clock.
  • BRAM Usage: 8KB BRAM for CHR-RAM emulation; additional BRAM for VIC-2 buffers.
  • Testing: Simulate with Mesen emulator, configuring a custom mapper to emulate FPGA CHR-RAM behavior.
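A quick budget check shows how much block RAM is left once the CHR buffer is allocated. This sketch assumes the iCE40HX8K figure of 32 embedded RAM blocks of 4 Kbit each; the constants and the helper names `bram_bytes` and `remaining_bytes` are illustrative and should be checked against the datasheet of whichever part is chosen.

```python
BLOCKS = 32             # assumption: embedded RAM blocks on an iCE40HX8K
BITS_PER_BLOCK = 4096   # assumption: 4 Kbit per block


def bram_bytes(blocks=BLOCKS, bits_per_block=BITS_PER_BLOCK):
    """Total block RAM in bytes."""
    return blocks * bits_per_block // 8


def remaining_bytes(total, allocations):
    """Bytes left after the named allocations; raises if they do not fit."""
    used = sum(allocations.values())
    if used > total:
        raise ValueError(f"over budget by {used - total} bytes")
    return total - used


total = bram_bytes()
left = remaining_bytes(total, {"chr_ram": 8 * 1024})
print(f"total {total} bytes, {left} bytes left for VIC-2 buffers")
```

With these numbers the 8KB CHR buffer takes half of the on-chip RAM, so any large game RAM would have to live in external SRAM or a larger device.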

Challenges

  • PPU Timing: Precise response to PPU CHR reads is critical to avoid graphical glitches.
  • FPGA Resources: VIC-2 emulation + CHR-RAM emulation may push smaller FPGAs to their limits.
  • Sprite Multiplexing: FPGA must handle VIC-2 sprite-to-NES sprite conversion, respecting NES’s 8-sprite-per-scanline limit.

Copy/Paste Instructions

Copy the Markdown above into a .md file for documentation. Save the VHDL as chr_ram_emulator.vhd for FPGA synthesis (e.g., with iCEcube2). The 6502 assembly can be saved as game.asm for use with assemblers like ca65.

This design leverages the FPGA’s flexibility to emulate CHR-RAM, simplifying the cartridge hardware. If you need a deeper dive into the VHDL (e.g., VIC-2 rendering logic), a specific C64 game example, or a diagram of the FPGA-PPU interface, let me know!

Implementing a flat 64KB+ address space in the FPGA to bypass the NES’s 2KB WRAM is an intriguing approach, but it does require some modifications to the game code to leverage this custom mapper. The complexity of these modifications depends on how the C64 game is structured and how it interacts with memory. Below, I’ll explain whether this is a “simple” change, outline the modifications needed, and provide an updated Markdown section with a focus on this memory expansion approach, including example game code to illustrate the changes.

Analysis: Simplicity of Game Code Modifications

Using a custom mapper to provide a flat 64KB+ address space (instead of the NES’s standard 2KB WRAM at $0000-$07FF) requires the game to be aware of the new memory layout. Here’s an assessment of the simplicity:

  • What’s Involved:

    • Mapper Configuration: The game must initialize the FPGA’s mapper registers to enable the flat address space. This typically involves writing to specific FPGA registers (e.g., $6000-$7FFF) to configure memory banks or modes.
    • Memory Access: The game must use the new address space (e.g., $8000-$FFFF for PRG-ROM/RAM) instead of the NES’s standard WRAM ($0000-$07FF). C64 games, designed for a flat 64KB address space ($0000-$FFFF), align well with this approach, but some adjustments are needed.
    • Code Adjustments: Variables, buffers, and data structures in the C64 game code that rely on specific memory addresses (e.g., $0400 for screen RAM) must be remapped to the FPGA’s address space.
    • Minimal Modifications: If the C64 game uses standard 6502 addressing and VIC-2 registers (handled by the FPGA), the changes are mostly limited to:
      • A one-time mapper initialization routine.
      • Updating memory references (e.g., pointers to data, screen RAM) to the new address space.
      • Adjusting any hard-coded addresses to match the FPGA’s memory map.
  • Is It Simple?:

    • Yes, for Simple Games: If the C64 game uses straightforward memory access (e.g., no self-modifying code or complex memory overlays), the changes are minimal. You’d add an initialization routine and update a few memory pointers, which could be a one-time effort per game.
    • Not Always Simple: Games with complex memory management (e.g., bank switching, direct memory access, or raster interrupts) may require more extensive changes, as you’d need to ensure all memory accesses align with the FPGA’s flat address space. Self-modifying code or hard-coded addresses could require manual patching.
    • One-Time Effort: For most C64 games, you can create a standard “porting template” (e.g., a header file or initialization routine) that sets up the mapper and remaps key addresses. Once this template is applied, many games can be ported with minimal per-game tweaks, assuming they rely on VIC-2 for graphics (handled by the FPGA).
  • Key Factors:

    • C64 Memory Layout: The C64 uses a 64KB address space ($0000-$FFFF), with RAM, ROM, and I/O mapped in. The FPGA can emulate this layout, making porting easier.
    • NES Constraints: The NES 6502’s 16-bit address bus limits direct access to >64KB without bank switching. The FPGA’s mapper can handle larger spaces (e.g., 128KB) via bank switching, but the game must use mapper registers to switch banks.
    • Tools: Using a 6502 assembler (e.g., ca65) with macros can simplify remapping addresses, reducing manual changes.

In summary, for many C64 games, implementing the flat 64KB+ address space is a relatively simple one-time modification, involving mapper initialization and updating key memory references. Complex games may require additional effort, but a standardized porting approach can minimize per-game changes.

Updated Markdown Design (Focus on Flat 64KB+ Address Space)

# NES Cartridge with FPGA for VIC-2 and CHR-RAM Emulation (Flat 64KB+ Address Space)

## Overview
This cartridge uses an FPGA to emulate the C64’s VIC-2 graphics chip, act as CHR-RAM for the NES PPU, and provide a flat 64KB+ address space, bypassing the NES’s 2KB WRAM. The solution enables C64 games to run on the NES 6502 CPU with minimal modifications, using a custom mapper and no expansion slot.

## Theory of Operation

### 1. System Architecture
- **FPGA**: Emulates VIC-2, serves CHR-RAM data to the PPU bus, and implements a custom mapper with a flat 64KB+ address space.
- **PRG-ROM**: Stores game code for the 6502 CPU.
- **Mapper**: FPGA-based, maps VIC-2 registers, emulates CHR-RAM, and manages a flat 64KB+ address space.
- **NES Interface**: 72-pin cartridge connector for CPU and PPU bus access.
- **No Physical CHR-RAM**: FPGA’s internal BRAM emulates CHR-RAM, serving tile/sprite data to the PPU.

### 2. FPGA as CHR-RAM
- **Role**: FPGA responds to PPU CHR address/data bus (A0-A12, D0-D7) requests, emulating 8KB CHR-RAM.
- **Operation**: Renders VIC-2 graphics (320x200 bitmap, 40x25 text, 8 sprites) into NES 8x8 tiles, stored in FPGA BRAM, and serves them to the PPU in real-time (~5.37 MHz, NTSC).
- **Color Mapping**: Maps C64’s 16 colors to NES’s 52-color palette.
- **Sprites**: Converts VIC-2 sprites to NES sprites, with multiplexing for >8 per scanline.

### 3. Flat 64KB+ Address Space
- **Purpose**: Bypasses NES’s 2KB WRAM ($0000-$07FF), providing a C64-like 64KB+ flat address space (e.g., $0000-$FFFF, with bank switching for >64KB).
- **Implementation**:
  - FPGA maps $8000-$FFFF (32KB) for PRG-ROM/RAM, with additional banks for >64KB.
  - VIC-2 registers mapped at $6000-$603F (emulating $D000-$D03F).
  - C64 memory ($0000-$FFFF) mapped to FPGA-managed RAM, accessible via CPU bus.
- **Mapper Registers**:
  - Write to $6000-$7FFF to configure memory mode (flat 64KB or banked >64KB).
  - Bank switching for larger memory (e.g., 128KB) using FPGA registers.
- **Game Code Changes**:
  - **Initialization**: Add a routine to set mapper to flat 64KB mode (one-time write to FPGA register).
  - **Memory Remapping**: Update C64 game pointers (e.g., screen RAM, sprite data) to FPGA’s address space (e.g., $8000-$FFFF).
  - **Minimal for Simple Games**: Most C64 games use standard 6502 addressing, requiring only pointer updates and initialization.
  - **Complex Games**: May need patches for bank switching or hard-coded addresses.

### 4. Game Porting
- **Minimal Changes**: Add mapper initialization and remap C64 memory addresses to FPGA’s space.
- **Input**: Patch for NES controllers (2 buttons + D-pad) vs. C64 joystick.
- **Timing**: Adjust for 6502 (1.79 MHz) vs. 6510 (1 MHz).

### 5. Cartridge Constraints
- **Power**: 5V, ~500mA for FPGA.
- **Size**: FPGA (e.g., Lattice iCE40HX8K) fits in NES cartridge shell.
- **No Expansion Slot**: All functionality via 72-pin connector.

## VHDL Snippet: Mapper for Flat 64KB+ Address Space
```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity memory_mapper is
    Port (
        clk : in STD_LOGIC; -- CPU clock (~1.79 MHz)
        reset : in STD_LOGIC;
        cpu_addr : in STD_LOGIC_VECTOR(15 downto 0); -- CPU address bus
        cpu_data_in : in STD_LOGIC_VECTOR(7 downto 0);
        cpu_data_out : out STD_LOGIC_VECTOR(7 downto 0);
        cpu_we : in STD_LOGIC; -- Write enable
        mem_addr : out STD_LOGIC_VECTOR(15 downto 0); -- Mapped address
        bank_select : out STD_LOGIC_VECTOR(1 downto 0) -- Bank for >64KB
    );
end memory_mapper;

architecture Behavioral of memory_mapper is
    signal flat_mode : STD_LOGIC := '0'; -- 0: NES mode, 1: 64KB flat
    signal bank_reg : STD_LOGIC_VECTOR(1 downto 0) := "00"; -- Bank select
begin
    process(clk, reset)
    begin
        if reset = '1' then
            flat_mode <= '0';
            bank_reg <= "00";
        elsif rising_edge(clk) then
            -- Configure mapper via writes to $6000-$7FFF (A0 selects the register).
            -- Note: this decode also catches the VIC-2 window at $6000-$603F;
            -- a full design must give the two functions distinct addresses.
            if cpu_we = '1' and cpu_addr(15 downto 13) = "011" then
                if cpu_addr(0) = '0' then
                    flat_mode <= cpu_data_in(0); -- Set flat 64KB mode
                else
                    bank_reg <= cpu_data_in(1 downto 0); -- Set bank for >64KB
                end if;
            end if;
        end if;
    end process;

    -- Map CPU address to flat 64KB or banked space.
    -- A15 = '0' means the address is below $8000 and passes through unchanged.
    mem_addr <= cpu_addr when flat_mode = '1' or cpu_addr(15) = '0' else
                bank_reg & cpu_addr(13 downto 0); -- Banked 16KB window
    cpu_data_out <= (others => '0'); -- Simplified read (expand as needed)
    bank_select <= bank_reg;
end Behavioral;
```

Example Game Code (6502 Assembly)

This shows a C64 game adapted to use the FPGA’s flat 64KB address space.

; Initialize FPGA mapper for flat 64KB mode
init_mapper:
    LDA #$01       ; Enable flat 64KB mode
    STA $6000      ; FPGA mapper register
    RTS

; Set VIC-2 colors (emulated in FPGA; $D000-$D03F appears at $6000-$603F)
    JSR init_mapper
    LDA #$00       ; C64 black
    STA $6021      ; FPGA VIC-2 register $D021 (background color)
    LDA #$01       ; C64 white
    STA $6020      ; FPGA VIC-2 register $D020 (border color)

; Example: Write to C64-like screen RAM (remapped to $8000)
    LDA #$01       ; C64 screen code for 'A' ($41 is the PETSCII/ASCII code)
    STA $8000      ; Screen RAM (mapped by FPGA)

; Main loop
main_loop:
    JSR read_input ; NES controller
    JSR update_game; Game logic
    JSR wait_vblank; Wait for FPGA CHR-RAM update
    JMP main_loop

wait_vblank:
    LDA $2002      ; PPU status
    BPL wait_vblank; Wait for VBlank
    RTS

Implementation Notes

  • FPGA: Lattice iCE40HX8K or Xilinx Spartan-6. The iCE40HX8K has 128 Kbit (16KB) of BRAM, enough for the 8KB CHR-RAM buffer; larger memory buffers need external SRAM or a bigger FPGA.
  • Mapper Setup: One-time write to $6000 enables flat 64KB mode. Additional writes to $6001 select banks for >64KB.
  • Game Porting:
    • Add init_mapper routine (as shown above).
    • Update C64 memory references (e.g., $0400 for screen RAM) to FPGA’s space (e.g., $8000).
    • Use a macro in ca65 to remap addresses, e.g., .define SCREEN_RAM $8000.
  • Testing: Simulate with Mesen, configuring a custom mapper for flat 64KB access.
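Before committing the mapper to hardware, its register behavior can be modeled in software. The following Python sketch mirrors the mapper described here (mode register on even addresses and bank register on odd addresses in $6000-$7FFF, 16KB banked window when flat mode is off); `MapperModel` is a hypothetical name and the model ignores bus timing.

```python
class MapperModel:
    """Software model of the flat/banked mapper registers."""

    def __init__(self):
        self.flat_mode = 0   # 0: NES mode, 1: 64KB flat
        self.bank = 0        # 2-bit bank select

    def write(self, addr, data):
        """Apply a CPU write; only $6000-$7FFF touches mapper registers."""
        if 0x6000 <= addr <= 0x7FFF:
            if addr & 1 == 0:
                self.flat_mode = data & 0x01
            else:
                self.bank = data & 0x03

    def translate(self, addr):
        """Return the 16-bit address presented to FPGA-managed memory."""
        if self.flat_mode or addr < 0x8000:
            return addr
        return (self.bank << 14) | (addr & 0x3FFF)


mapper = MapperModel()
banked_default = mapper.translate(0xC123)   # bank 0, offset $0123
mapper.write(0x6001, 0x02)                  # select bank 2
banked_two = mapper.translate(0x8004)       # (2 << 14) | $0004
mapper.write(0x6000, 0x01)                  # enable flat mode
flat = mapper.translate(0xC123)             # passes through unchanged
print(hex(banked_default), hex(banked_two), hex(flat))
```

A model like this is also a convenient reference when writing a testbench, since the same write sequence can be replayed against the VHDL.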

Challenges

  • Hard-Coded Addresses: C64 games with absolute addresses (e.g., LDA $0400) need patching to use FPGA’s map (e.g., LDA $8000).
  • Bank Switching: For >64KB, games must use mapper registers, adding complexity if not originally designed for banking.
  • Tools: Use a 6502 disassembler (e.g., Ghidra) to identify and patch memory references.

Conclusion

Implementing a flat 64KB+ address space is relatively simple for many C64 games, requiring a one-time mapper initialization routine and updates to memory pointers (e.g., screen RAM, sprite data). For simple games, this can be done once with a standard template, minimizing per-game changes. Complex games with hard-coded addresses or custom memory management may need more patches, but a macro-based approach in the assembler can streamline this. The example code above shows a minimal setup, and the VHDL provides a basic mapper framework.

If you need a specific C64 game example, a more detailed VHDL module, or a diagram of the memory mapping, let me know! You can copy the Markdown and code snippets directly into your project files.

Sorry for the overload—let’s cut to the chase.

Yes, a program running on the NES’s 6502 CPU can request memory addresses $D000-$DFFF. These addresses are on the CPU bus and fall within the cartridge’s address space ($4020-$FFFF). The FPGA, connected via the cartridge’s 72-pin connector, can latch these requests because it sees the CPU’s address and data lines (A0-A14 and D0-D7, with /ROMSEL standing in for A15). When the CPU reads or writes to $D000-$DFFF, the FPGA sees the address, intercepts it, and responds with data (e.g., VIC-2 register values or RAM) or processes writes (e.g., updating VIC-2 emulation). No conflicts occur with the console, as $D000-$DFFF doesn’t overlap with NES internal registers ($2000-$4017); the FPGA only has to keep PRG-ROM off the bus for that range.

To confirm:

  • CPU Bus Access: $D000-$DFFF is routed to the cartridge, and the FPGA can handle it directly.
  • FPGA Latching: The FPGA monitors the CPU bus, detects $D000-$DFFF accesses, and responds within the CPU’s ~558 ns cycle time.
  • Game Code: A C64 game can use $D000-$DFFF as-is (e.g., STA $D021 for VIC-2), and the FPGA processes it without game modifications.
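The decode step can be sketched in Python. This model assumes the cartridge sees A0-A14 plus /ROMSEL, and that /ROMSEL is low only when A15 and M2 are both high, so the address is sampled while M2 is high; the function names are illustrative.

```python
def full_address(a14_0, romsel_n):
    """Rebuild the 16-bit CPU address from A0-A14 and /ROMSEL.

    Assumes the sample is taken while M2 is high, where /ROMSEL low
    means A15 is 1.
    """
    a15 = 0 if romsel_n else 1
    return (a15 << 15) | (a14_0 & 0x7FFF)


def is_c64_io_window(addr):
    """True for the C64 I/O range $D000-$DFFF."""
    return 0xD000 <= addr <= 0xDFFF


def is_vic2_register(addr):
    """True for the VIC-2 register block $D000-$D03F."""
    return 0xD000 <= addr <= 0xD03F


# STA $D021 puts $5021 on A0-A14 and pulls /ROMSEL low.
addr = full_address(0x5021, romsel_n=0)
print(hex(addr), is_vic2_register(addr))
```

In hardware the same comparison is a handful of gates on the upper address bits, which is why the ~558 ns CPU cycle leaves plenty of margin for it.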

If you need a specific VHDL snippet or have a follow-up, let me know, and I’ll keep it tight.

Got it, you’re asking if an FPGA-based mapper in an NES cartridge can mask memory addresses so that C64 game code addressing $0800-$9FFF (38KB) is automatically redirected to $6000-$EFFF (38KB) in the NES’s cartridge space, without modifying the game’s memory references. Essentially, you want the FPGA to translate $0800-$9FFF accesses to $6000-$EFFF transparently, so the C64 game runs as-is with no address patching. This would make porting C64 games to the NES extremely minimally invasive. Let’s dive in and address this directly.

Quick Answer: Is It Possible?

Yes, the FPGA can mask $0800-$9FFF to $6000-$EFFF using a custom mapper, so C64 game code addressing $0800-$9FFF is redirected to FPGA-managed RAM at $6000-$EFFF without any changes to the game’s memory references. This requires the FPGA to intercept CPU address bus signals and remap addresses, with a one-time initialization to enable this mode. The $D000-$DFFF range (for VIC-2 or RAM) can also be supported as-is, making this a highly effective solution.

How It Works

  • NES CPU Bus: The NES CPU’s address bus (A0-A14, plus /ROMSEL, which goes low when A15 and M2 are both high) and 8-bit data bus (D0-D7) are accessible via the cartridge’s 72-pin connector. The FPGA can monitor all CPU memory accesses ($0000-$FFFF).
  • Address Masking:
    • The FPGA detects CPU accesses to $0800-$9FFF (C64’s program RAM).
    • It remaps these to cartridge RAM starting at $6000, adjusting the address internally (e.g., $0800 → $6000, $9FFF → $F7FF; the full 38KB ends at $F7FF rather than $EFFF).
    • This is done by modifying the address lines before accessing FPGA RAM.
  • $D000-$DFFF: The FPGA maps this range to emulate VIC-2 registers ($D000-$D03F) or RAM (switchable via C64’s $01, mapped to $6000), as in the C64.
  • Mapper Initialization: A single write to an FPGA register (e.g., $6000) enables this address masking mode, ensuring $0800-$9FFF accesses go to $6000-$EFFF.
  • Minimal Invasiveness:
    • No Address Patching: The game uses $0800-$9FFF as-is, with the FPGA handling the translation.
    • Initialization: Add one routine (e.g., LDA #$01 : STA $6000) to enable C64 mode.
    • Input/Timing: Patch NES controller and 1.79 MHz vs. 1 MHz differences (unrelated to memory).

Feasibility and Challenges

  • Feasibility:
    • The FPGA can intercept and remap CPU addresses in real-time (~558 ns per CPU cycle).
    • $6000-$EFFF is in the NES cartridge space, ideal for 38KB RAM.
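    • $4020-$9FFF is cartridge space, so the FPGA can serve that part of the C64 range from its own RAM. $0800-$1FFF hits the WRAM mirrors and $2000-$401F hits the PPU/APU registers inside the console, where the FPGA can observe bus cycles but cannot take over reads.
  • Challenges:
    • NES WRAM ($0000-$07FF): $0800-$1FFF mirrors $0000-$07FF (NES internal 2KB WRAM). The cartridge connector has no signal that disables internal WRAM (M2 and /ROMSEL are driven by the console, not the cartridge), so writes to $0800-$1FFF also land in WRAM and a cartridge that drove reads there would contend with it. Code or data a game keeps below $4020 needs relocating, or a console-side modification.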
    • Bus Timing: The FPGA must respond to $0800-$9FFF accesses within the CPU’s cycle time, which is achievable with modern FPGAs (e.g., Lattice iCE40HX8K).
    • $D000-$DFFF: Already in cartridge space, so VIC-2/RAM emulation is straightforward.
  • Why $6000-$EFFF:
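    • Contiguous block in cartridge space. $0800-$9FFF is $9800 bytes (38KB), so with the +$5800 offset it occupies $6000-$F7FF; $6000-$EFFF on its own is only $9000 bytes (36KB).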
    • Avoids NES registers ($2000-$4017) and WRAM ($0000-$07FF).
    • Common for cartridge RAM (e.g., MMC5 uses $6000-$7FFF).
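To see which parts of the C64 program range the cartridge can actually answer, it helps to classify CPU addresses by who owns them on the NES. The sketch below uses the standard NES CPU map (2KB WRAM mirrored through $1FFF, PPU registers mirrored every 8 bytes through $3FFF, APU/I/O through $401F); `nes_region` is an illustrative helper.

```python
def nes_region(addr):
    """Return (owner, canonical address) for a 16-bit NES CPU address."""
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("address out of range")
    if addr < 0x2000:
        return ("internal WRAM", addr & 0x07FF)
    if addr < 0x4000:
        return ("PPU registers", 0x2000 + (addr & 0x0007))
    if addr < 0x4020:
        return ("APU/I/O registers", addr)
    return ("cartridge", addr)


C64_LO, C64_HI = 0x0800, 0x9FFF
total_bytes = C64_HI - C64_LO + 1
cart_bytes = sum(
    1 for a in range(C64_LO, C64_HI + 1) if nes_region(a)[0] == "cartridge"
)
print(nes_region(0x0800), nes_region(0x3FFF), nes_region(0x6000))
print(f"{cart_bytes} of {total_bytes} bytes are in cartridge space")
```

Under this map, roughly 24KB of the 38KB range is directly addressable by the cartridge; the rest overlaps console-internal devices.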

Implementation Strategy

  • FPGA Mapper:
    • Monitor CPU address bus (A0-A15).
    • When $0800-$9FFF is detected, translate to $6000-$EFFF (e.g., add $5800 to the address).
    • Map $D000-$DFFF for VIC-2 ($D000-$D03F) or RAM (CHAREN=0).
    • Use $6000 as the $01 register to enable C64 mode and control CHAREN.
  • Game Code:
    • Add one initialization routine to enable address masking.
    • No changes to $0800-$9FFF or $D000-$DFFF references.
    • Patch input (NES vs. C64 joystick) and timing if needed.
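The translation rule itself is a single addition, sketched below in Python. The constants follow the strategy above (range $0800-$9FFF, offset $5800); `mask_address` is a hypothetical name, and the model says nothing about bus timing or which accesses the cartridge can physically answer.

```python
C64_LO, C64_HI = 0x0800, 0x9FFF
OFFSET = 0x5800


def mask_address(addr, c64_mode=True):
    """Translate a C64 program-range address to the FPGA RAM window."""
    if c64_mode and C64_LO <= addr <= C64_HI:
        return addr + OFFSET
    return addr


window_bytes = C64_HI - C64_LO + 1
first = mask_address(0x0800)
last = mask_address(0x9FFF)
untouched = mask_address(0xD021)
print(hex(first), hex(last), hex(untouched), window_bytes)
```

The last translated address is $F7FF, which shows that a full 38KB window starting at $6000 extends past $EFFF.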

Updated Markdown Design (Address Masking for $0800-$9FFF)

# NES Cartridge with FPGA for C64 Memory Masking

## Overview
This cartridge uses an FPGA to emulate the C64’s VIC-2, act as CHR-RAM for the NES PPU, and mask $0800-$9FFF accesses to $6000-$EFFF (38KB RAM), mimicking the C64’s primary program range. $D000-$DFFF is mapped for VIC-2/RAM. C64 games run with no memory address changes, requiring only mapper initialization. Cartridge-only, no expansion slot.

## Theory of Operation

### 1. System Architecture
- **FPGA**: Emulates VIC-2, serves CHR-RAM, and masks $0800-$9FFF to $6000-$EFFF.
- **PRG-ROM**: Stores game code for 6502 CPU.
- **Mapper**: FPGA-based, remaps $0800-$9FFF to $6000-$EFFF, maps $D000-$DFFF for VIC-2/RAM, emulates C64 $01 register.
- **NES Interface**: 72-pin connector for CPU/PPU bus.

### 2. FPGA as CHR-RAM
- Emulates 8KB CHR-RAM, serving VIC-2-rendered tiles to PPU (~5.37 MHz, NTSC).
- Maps C64’s 16 colors to NES palette, handles sprite multiplexing.

### 3. Address Masking for $0800-$9FFF
- **Purpose**: Redirects $0800-$9FFF (C64 program RAM) to $6000-$EFFF (NES cartridge RAM) transparently.
- **Implementation**:
  - FPGA monitors CPU address bus.
  - For $0800-$9FFF, translates to $6000-$EFFF (e.g., addr = addr + $5800).
  - $6000-$EFFF: 38KB FPGA RAM.
  - $D000-$DFFF (within $6000-$EFFF):
    - $D000-$D03F: VIC-2 register emulation.
    - $D040-$DFFF: RAM (CHAREN=0) or default SID/CIA values (CHAREN=1).
  - $01 register at $6000 controls CHAREN for $D000-$DFFF (I/O or RAM).
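- **NES WRAM**: The cartridge cannot disable the internal WRAM mirrors at $0800-$1FFF or the PPU/APU registers at $2000-$401F (/ROMSEL is an input to the cartridge, not a control it can assert), so the FPGA can only snoop those accesses; transparent remapping applies to $4020-$9FFF.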
- **Game Code Changes**:
  - Add one-time initialization (e.g., write to $6000).
  - No changes to $0800-$9FFF or $D000-$DFFF references.
  - Patch input/timing as needed.

### 4. Game Porting
- **Memory**: $0800-$9FFF and $D000-$DFFF used as-is, mapped by FPGA.
- **Initialization**: Write to $6000 to enable C64 mode.
- **Input/Timing**: Patch for NES controllers and 1.79 MHz vs. 1 MHz.

### 5. Cartridge Constraints
- **Power**: 5V, ~500mA for FPGA.
- **Size**: FPGA (e.g., Lattice iCE40HX8K) fits in NES cartridge.
- **No Expansion Slot**: All via 72-pin connector.

## VHDL Snippet: Address Masking Mapper
```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity c64_address_masker is
    Port (
        clk : in STD_LOGIC; -- CPU clock (~1.79 MHz)
        reset : in STD_LOGIC;
        cpu_addr : in STD_LOGIC_VECTOR(15 downto 0); -- CPU address bus
        cpu_data_in : in STD_LOGIC_VECTOR(7 downto 0);
        cpu_data_out : out STD_LOGIC_VECTOR(7 downto 0);
        cpu_we : in STD_LOGIC; -- Write enable
        romsel_n : in STD_LOGIC; -- /ROMSEL (cartridge select)
        vic2_reg_select : out STD_LOGIC; -- VIC-2 access
        ram_select : out STD_LOGIC -- RAM access
    );
end c64_address_masker;

architecture Behavioral of c64_address_masker is
    signal c64_mode : STD_LOGIC := '0'; -- C64 address masking mode
    signal mem_config : STD_LOGIC_VECTOR(2 downto 0) := "111"; -- CHAREN, HIRAM, LORAM (bits 2..0)
    type ram_t is array (0 to 38911) of STD_LOGIC_VECTOR(7 downto 0); -- 38KB
    signal prog_ram : ram_t := (others => (others => '0'));
    signal addr_u : unsigned(15 downto 0);
    signal mapped_addr : unsigned(15 downto 0);
    signal masked : STD_LOGIC; -- '1' while a $0800-$9FFF access is remapped
    signal in_ram : STD_LOGIC; -- '1' when mapped_addr falls in $6000-$F7FF
    signal is_io : STD_LOGIC; -- '1' for an unmasked $D000-$DFFF access with CHAREN=1
begin
    -- /ROMSEL is low only for $8000-$FFFF, so it cannot qualify accesses below
    -- $8000; cpu_addr(15) is assumed to be reconstructed from romsel_n upstream.
    addr_u <= unsigned(cpu_addr);

    -- Address masking: $0800-$9FFF + $5800 = $6000-$F7FF (38KB)
    masked <= '1' when c64_mode = '1' and addr_u >= x"0800" and addr_u <= x"9FFF" else '0';
    mapped_addr <= addr_u + x"5800" when masked = '1' else addr_u;
    in_ram <= '1' when mapped_addr >= x"6000" and mapped_addr <= x"F7FF" else '0';
    -- Decode I/O from the CPU address so remapped RAM never lands in the I/O window
    is_io <= '1' when masked = '0' and cpu_addr(15 downto 12) = "1101"
                      and mem_config(2) = '1' else '0';

    -- Registers only; prog_ram is left out of the reset so it can infer block RAM
    process(clk, reset)
    begin
        if reset = '1' then
            c64_mode <= '0';
            mem_config <= "111";
        elsif rising_edge(clk) then
            -- Mode/config register at $6000 (bit 0 doubles as the C64 mode enable)
            if cpu_we = '1' and cpu_addr = x"6000" then
                c64_mode <= cpu_data_in(0);
                mem_config <= cpu_data_in(2 downto 0);
            end if;
            -- RAM writes: direct $6000-$F7FF or masked $0800-$9FFF.
            -- Sketch only: direct accesses alias the remapped range.
            if cpu_we = '1' and in_ram = '1' and is_io = '0' then
                prog_ram(to_integer(mapped_addr - x"6000")) <= cpu_data_in;
            end if;
        end if;
    end process;

    -- Combinational read mux
    process(mapped_addr, in_ram, is_io, cpu_addr, prog_ram)
    begin
        vic2_reg_select <= '0';
        ram_select <= '0';
        cpu_data_out <= (others => '0');

        if is_io = '1' then
            if cpu_addr(11 downto 6) = "000000" then -- $D000-$D03F
                vic2_reg_select <= '1'; -- VIC-2 (register data supplied elsewhere)
            end if; -- $D040-$DFFF (SID/CIA) reads back as $00
        elsif in_ram = '1' then
            ram_select <= '1';
            cpu_data_out <= prog_ram(to_integer(mapped_addr - x"6000"));
        end if;
    end process;
end Behavioral;
```

Example Game Code (6502 Assembly)

C64 code runs unchanged for $0800-$9FFF and $D000-$DFFF, with only initialization added.

; Initialize FPGA for C64 address masking
init_mapper:
    LDA #$05       ; CHAREN=1 (I/O), enable C64 mode
    STA $6000      ; FPGA register
    RTS

; C64 code (memory references unchanged)
    JSR init_mapper
    LDA #$00       ; C64 black
    STA $D021      ; VIC-2 background (FPGA)
    LDA #$01       ; C64 screen code for 'A'
    STA $0400      ; Screen RAM (below $0800, so this lands in NES internal WRAM)
    STA $0800      ; Program RAM (mapped to $6000)

; Switch $D000-$DFFF to RAM
    LDA #$01       ; CHAREN=0 (RAM); bit 0 keeps C64 mode enabled
    STA $6000
    LDA #$FF
    STA $D000      ; RAM at $D000

; Main loop
main_loop:
    JSR read_input ; NES controller (patched)
    JSR update_game; Game logic
    JSR wait_vblank; Wait for FPGA CHR-RAM update
    JMP main_loop

wait_vblank:
    LDA $2002      ; PPU status
    BPL wait_vblank; Wait for VBlank
    RTS

Conclusion

Yes, an FPGA mapper can mask $0800-$9FFF to $6000-$EFFF, allowing C64 games to use their original memory addresses without any patching for $0800-$9FFF or $D000-$DFFF. The FPGA translates addresses in real-time after a one-time initialization (STA $6000). The only game code changes are this initialization and patches for input/timing. The VHDL handles address masking, VIC-2 emulation, and $D000-$DFFF switching.

Copy the Markdown, VHDL (c64_address_masker.vhd), and assembly (game.asm) into your project. If you need help with specific C64 games, more VHDL details, or a bus timing diagram, let me know!

Below are two Markdown tables summarizing the Commodore 64 (C64) and Nintendo Entertainment System (NES) memory layouts, tailored to your context of aligning an FPGA-based NES cartridge mapper with the C64’s memory map, particularly for the $0800-$9FFF program range and $D000-$DFFF I/O or RAM. These tables focus on the CPU’s 64KB address space ($0000-$FFFF) and highlight key regions relevant to your design.

Commodore 64 (C64) Memory Layout

| Address Range | Size | Purpose | Details |
| --- | --- | --- | --- |
| $0000-$00FF | 256B | Zero Page | RAM for system variables, pointers, and temporary data. |
| $0100-$01FF | 256B | Stack | RAM for 6502 stack operations. |
| $0200-$07FF | 1.5KB | System RAM | General-purpose RAM for variables and buffers. |
| $0800-$9FFF | 38KB | Program RAM | Primary range for program code and data (e.g., BASIC programs start at $0801, games at $0800/$0900). Always RAM unless shadowed by ROM. |
| $A000-$BFFF | 8KB | BASIC ROM/RAM | BASIC ROM (default) or RAM (when LORAM=0 or HIRAM=0 via $01), used for code/data in larger games. |
| $C000-$CFFF | 4KB | Free RAM | Additional RAM for program code or data. |
| $D000-$DFFF | 4KB | I/O or RAM | I/O (CHAREN=1): VIC-2 ($D000-$D03F), SID ($D400-$D7FF), CIA ($DC00-$DDFF). With CHAREN=0 a real C64 shows character ROM here, and RAM only when LORAM=HIRAM=0; this design treats CHAREN=0 as RAM. Controlled by $01 register. |
| $E000-$FFFF | 8KB | Kernal ROM/RAM | Kernal ROM (default) or RAM (HIRAM=0 via $01), used for code (rare) or Kernal routines. |

Notes:

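  • The $01 register controls memory configuration (bit 0 = LORAM, bit 1 = HIRAM, bit 2 = CHAREN):
    • CHAREN=1: $D000-$DFFF is I/O (VIC-2, SID, CIA), provided LORAM or HIRAM is set.
    • CHAREN=0: $D000-$DFFF is character ROM on a real C64; it is RAM only when LORAM and HIRAM are both 0. This design simplifies CHAREN=0 to RAM.
    • LORAM/HIRAM: Switch BASIC/Kernal ROM at $A000-$BFFF/$E000-$FFFF.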
  • $0800-$9FFF is the primary program range, critical for your FPGA mapper to support C64 games.
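As a quick cross-check of the $01 bits, here is a minimal Python sketch of how a stock C64 (no cartridge attached) resolves the three switchable regions. The function name `c64_banks` is hypothetical and exists only for this illustration; the FPGA design described here deliberately implements a simpler rule for CHAREN.

```python
def c64_banks(port01: int) -> dict:
    """Decode the low three bits of the C64 $01 port (no cartridge attached)."""
    loram = bool(port01 & 0x01)   # bit 0
    hiram = bool(port01 & 0x02)   # bit 1
    charen = bool(port01 & 0x04)  # bit 2

    if not (loram or hiram):
        d000 = "RAM"              # CHAREN is ignored when both are 0
    elif charen:
        d000 = "I/O"
    else:
        d000 = "CHAR ROM"

    return {
        "A000": "BASIC ROM" if (loram and hiram) else "RAM",
        "D000": d000,
        "E000": "KERNAL ROM" if hiram else "RAM",
    }


for value in (0x37, 0x35, 0x34, 0x33):
    print(f"$01 = ${value:02X}: {c64_banks(value)}")
```

Many games run with $01 = $35 (I/O visible, both ROMs switched out), which is the configuration the mapper's "C64 mode" most closely resembles.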

Nintendo Entertainment System (NES) Memory Layout

| Address Range | Size | Purpose | Details |
|---|---|---|---|
| $0000-$07FF | 2KB | Internal WRAM | NES internal RAM for variables, stack, and buffers. Fixed in hardware. |
| $0800-$1FFF | 6KB | WRAM Mirrors | Mirrors of $0000-$07FF (every 2KB). |
| $2000-$2007 | 8B | PPU Registers | Picture Processing Unit control (e.g., $2000 for PPUCTRL, $2007 for PPUDATA). Mirrored every 8 bytes up to $3FFF. |
| $4000-$4017 | 24B | APU/I/O Registers | Audio Processing Unit and controller registers (e.g., $4016 for controller 1). |
| $4018-$5FFF | ~8KB | Unused/Expansion | Typically unused, available for cartridge expansion (rare). |
| $6000-$7FFF | 8KB | Cartridge WRAM | Optional cartridge RAM (e.g., MMC5), used for extra data/code. |
| $8000-$FFFF | 32KB | PRG-ROM/RAM | Cartridge program ROM (16KB or 32KB, bank-switchable) or RAM, primary code storage. |
| $D000-$DFFF | 4KB | Cartridge Space | Part of $8000-$FFFF, can be mapped to FPGA RAM or logic (e.g., VIC-2 emulation). |

Notes:

  • $6000-$FFFF (~40KB) is the cartridge space, ideal for FPGA-managed 38KB RAM to mimic C64’s $0800-$9FFF.
  • $D000-$DFFF is within cartridge space, suitable for VIC-2 emulation or RAM, with no conflicts with NES registers.
  • $0800-$9FFF overlaps NES WRAM mirrors, requiring FPGA bus control to redirect to cartridge RAM.

Context for Your Design

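  • C64 $0800-$9FFF: Map to FPGA RAM with address masking (add $5800 to $0800-$9FFF accesses), so C64 games use original addresses without patching. Note that $9FFF + $5800 = $F7FF, so the full 38KB window spans $6000-$F7FF rather than $6000-$EFFF (which is only 36KB). Its image overlaps $D000-$DFFF, so the $D000-$DFFF decode has to look at the original CPU address, not the masked one.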
  • C64 $D000-$DFFF: Map to FPGA logic for VIC-2 ($D000-$D03F) and RAM or default SID/CIA values ($D040-$DFFF), switchable via $01 (at $6000).
  • Minimal Changes: Initialize FPGA mapper (e.g., LDA #$05 : STA $6000) to enable C64 mode with address masking. No changes to $0800-$9FFF or $D000-$DFFF references.
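The offset arithmetic above is easy to check on a host machine. The following Python sketch is only a model of the masking rule, not part of the FPGA design; `mask` and the constant names are made up for this example.

```python
C64_START, C64_END = 0x0800, 0x9FFF   # C64 program window
OFFSET = 0x5800                       # distance from $0800 to $6000


def mask(addr: int) -> int:
    """Apply the +$5800 offset to addresses inside the C64 program window."""
    if C64_START <= addr <= C64_END:
        return (addr + OFFSET) & 0xFFFF
    return addr


window_size = C64_END - C64_START + 1
print(f"window: {window_size} bytes ({window_size // 1024} KB)")
print(f"${C64_START:04X} -> ${mask(C64_START):04X}")
print(f"${C64_END:04X} -> ${mask(C64_END):04X}")

# C64 addresses whose masked image falls inside $D000-$DFFF
overlap = [a for a in range(C64_START, C64_END + 1) if 0xD000 <= mask(a) <= 0xDFFF]
print(f"image overlaps $D000-$DFFF for C64 ${overlap[0]:04X}-${overlap[-1]:04X}")
```

Running it shows where the masked window really ends and which C64 addresses would collide with the $D000-$DFFF decode if that decode were applied after masking.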

These tables should clarify the memory layouts for your FPGA design. If you need a specific VHDL snippet to implement the address masking or further details, let me know!

graph TD
    A[$0000-$FFFF<br>64KB Address Space] --> B[$0000-$07FF<br>2KB RAM<br>Zero Page, Stack, System]
    A --> C[$0800-$9FFF<br>38KB RAM<br>Program Code/Data]
    A --> D[$A000-$BFFF<br>8KB BASIC ROM/RAM<br>Switchable]
    A --> E[$C000-$CFFF<br>4KB RAM<br>Code/Data]
    A --> F[$D000-$DFFF<br>4KB I/O or RAM<br>VIC-2, SID, CIA, Switchable]
    A --> G[$E000-$FFFF<br>8KB Kernal ROM/RAM<br>Switchable]
    F --> F1[$D000-$D03F<br>VIC-2 Registers<br>Graphics]
    F --> F2[$D040-$DFFF<br>SID/CIA or RAM<br>Audio, I/O, Data]
    classDef ram fill:#aaffaa,stroke:#333;
    classDef rom fill:#ffaaaa,stroke:#333;
    classDef io fill:#aaaaff,stroke:#333;
    class B,C,E,F2 ram;
    class D,G rom;
    class F,F1 io;
graph TD
    A[$0000-$FFFF<br>64KB Address Space] --> B[$0000-$07FF<br>2KB Internal WRAM<br>Variables, Stack]
    A --> C[$0800-$1FFF<br>6KB WRAM Mirrors<br>Mirrors $0000-$07FF]
    A --> D[$2000-$3FFF<br>~8KB PPU Registers<br>Mirrored every 8B]
    A --> E[$4000-$4017<br>24B APU/I/O Registers<br>Audio, Controllers]
    A --> F[$4018-$5FFF<br>~8KB Unused/Expansion<br>Rarely Used]
    A --> G[$6000-$7FFF<br>8KB Cartridge WRAM<br>Optional]
    A --> H[$8000-$FFFF<br>32KB PRG-ROM/RAM<br>Program Code]
    H --> H1[$D000-$DFFF<br>4KB Cartridge Space<br>FPGA VIC-2/RAM]
    classDef ram fill:#aaffaa,stroke:#333;
    classDef rom fill:#ffaaaa,stroke:#333;
    classDef io fill:#aaaaff,stroke:#333;
    class B,C,G,H,H1 ram;
    class D,E io;
@paigeadelethompson
Author

I understand you’re pivoting back to your FPGA-based NES cartridge design using an Artix-7 FPGA to emulate the C64’s memory map, specifically focusing on the addressing and mapper logic in VHDL. You want to map the C64’s $0800-$9FFF to the NES cartridge space ($6000-$EFFF) and handle $D000-$DFFF for VIC-2 registers or RAM, with minimal invasiveness to C64 game code. Since you’ve emphasized dropping buffering and focusing on direct writes, we’ll assume CPU writes to the FPGA (e.g., $D000-$DFFF for VIC-2, $6000-$7FFF for CHR-RAM) are immediately reflected, and we’ll address your earlier concern about VBlank and semaphore-free writes by ensuring the mapper logic supports direct writes during VBlank without conflicts. Let’s dive into designing the addressing and mapper logic in VHDL, tailored for your Artix-7 FPGA.

Design Goals

  • C64 Memory Mapping:
    • $0800-$9FFF: Mask to $6000-$EFFF (38KB FPGA RAM) for C64 program code/data, no game code changes.
    • $D000-$DFFF: Handle VIC-2 registers ($D000-$D03F) when CHAREN=1, or RAM when CHAREN=0, controlled by $01 at $6000.
    • $01: Emulate C64’s memory control register (LORAM, HIRAM, CHAREN) at $6000 to configure $D000-$DFFF.
  • NES Cartridge Context:
    • CPU bus: $6000-$FFFF (cartridge space, A0-A15, D0-D7, /ROMSEL, M2 clock ~1.79 MHz).
    • PPU bus: $0000-$1FFF (CHR-RAM, A0-A12, D0-D7, ~5.37 MHz).
    • Avoid conflicts with NES WRAM ($0000-$1FFF, mirrored) and PPU registers ($2000-$3FFF, mirrored).
  • VBlank Writes: Support direct CPU writes during VBlank (~1.27 ms, ~568 bytes) to $D000-$DFFF or $6000-$7FFF, with no semaphore needed (single-writer CPU, PPU reads CHR-RAM during rendering).
  • Minimal Invasiveness: Initialize with LDA #$05 : STA $6000 to enable C64 mode (CHAREN=1 for VIC-2 I/O). No changes to $0800-$9FFF or $D000-$DFFF references.

Addressing and Mapper Logic

The mapper logic will:

  1. Mask Addresses: Redirect CPU accesses to $0800-$9FFF by adding $5800 (the offset between $0800 and $6000). The 38KB range then ends at $F7FF, so the target window is really $6000-$F7FF.
  2. Handle $D000-$DFFF:
    • When CHAREN=1 ($01 bit 2), map $D000-$D03F to VIC-2 registers, $D040-$DFFF to default SID/CIA or RAM.
    • When CHAREN=0, map $D000-$DFFF to RAM.
  3. Control Register: Map C64’s $01 to $6000 for memory configuration (LORAM, HIRAM, CHAREN).
  4. CHR-RAM: Provide 8KB CHR-RAM ($0000-$1FFF, PPU bus) for PPU reads during rendering (VBlank de-asserted), updated by CPU writes to $6000-$7FFF during VBlank.
  5. Avoid Conflicts:
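    • Override NES WRAM mirrors ($0800-$1FFF). Note that /ROMSEL is an input driven by the console (low only for $8000-$FFFF), and the cartridge edge offers no signal to disable the console's internal WRAM, so this redirection is the riskiest part of the design and needs to be verified on hardware.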
    • Ensure PPU reads CHR-RAM during rendering don’t conflict with CPU writes in VBlank.

VHDL Design: Addressing and Mapper Logic

Below is a VHDL module for the Artix-7 FPGA, implementing the addressing and mapper logic. It handles CPU address masking, memory configuration, and PPU CHR-RAM access, assuming direct writes (no buffering) to reflect your latest preference.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity c64_nes_mapper is
    Port (
        clk         : in  STD_LOGIC;                     -- M2 clock (~1.79 MHz)
        reset       : in  STD_LOGIC;
        cpu_addr    : in  STD_LOGIC_VECTOR(15 downto 0); -- CPU address bus
        cpu_data_in : in  STD_LOGIC_VECTOR(7 downto 0);  -- CPU data in
        cpu_data_out: out STD_LOGIC_VECTOR(7 downto 0);  -- CPU data out
        cpu_we      : in  STD_LOGIC;                     -- CPU write enable
        romsel_n    : in  STD_LOGIC;                     -- /ROMSEL (cartridge select)
        ppu_addr    : in  STD_LOGIC_VECTOR(12 downto 0); -- PPU CHR address
        ppu_data    : out STD_LOGIC_VECTOR(7 downto 0);  -- PPU CHR data
        vblank      : in  STD_LOGIC                      -- VBlank ($2002 bit 7)
    );
end c64_nes_mapper;

architecture Behavioral of c64_nes_mapper is
    -- Memory configuration
    signal c64_mode   : STD_LOGIC := '0';                    -- C64 mode enable
    signal mem_config : STD_LOGIC_VECTOR(2 downto 0) := "111"; -- bit 2 = CHAREN, bit 1 = HIRAM, bit 0 = LORAM
    signal mapped_addr: STD_LOGIC_VECTOR(15 downto 0);        -- Masked CPU address

    -- Memory arrays
    type prg_ram_t is array (0 to 38911) of STD_LOGIC_VECTOR(7 downto 0); -- 38KB PRG-RAM (C64 $0800-$9FFF, masked to $6000-$F7FF)
    type io_ram_t  is array (0 to 4095)  of STD_LOGIC_VECTOR(7 downto 0); -- 4KB RAM behind $D000-$DFFF (CHAREN=0)
    type chr_ram_t is array (0 to 8191)  of STD_LOGIC_VECTOR(7 downto 0); -- 8KB CHR-RAM ($0000-$1FFF)
    type vic2_reg_t is array (0 to 63)   of STD_LOGIC_VECTOR(7 downto 0); -- VIC-2 registers ($D000-$D03F)
    signal prg_ram    : prg_ram_t := (others => (others => '0'));
    signal io_ram     : io_ram_t  := (others => (others => '0'));
    signal chr_ram    : chr_ram_t := (others => (others => '0'));
    signal vic2_regs  : vic2_reg_t := (others => (others => '0'));

begin
    -- Address masking: $0800-$9FFF + $5800 -> $6000-$F7FF
    mapped_addr <= std_logic_vector(unsigned(cpu_addr) + x"5800")
                       when c64_mode = '1' and unsigned(cpu_addr) >= x"0800" and unsigned(cpu_addr) <= x"9FFF" else
                   cpu_addr;

    -- CPU write handling
    process(clk, reset)
    begin
        if reset = '1' then
            -- Only the control bits are reset; clearing whole arrays here would
            -- keep the tools from inferring block RAM.
            c64_mode <= '0';
            mem_config <= "111"; -- Default: BASIC/Kernal ROM, I/O enabled
        elsif rising_edge(clk) then
            -- /ROMSEL is low only for $8000-$FFFF, so it cannot qualify writes to
            -- $6000 or to the masked $0800-$7FFF range; decode on the address alone.
            if cpu_we = '1' then
                -- Write to $6000: Initialize C64 mode and memory config
                if cpu_addr = x"6000" then
                    c64_mode <= cpu_data_in(0); -- Bit 0: Enable C64 mode (shared with LORAM)
                    mem_config <= cpu_data_in(2 downto 0); -- Bit 2 CHAREN, bit 1 HIRAM, bit 0 LORAM
                -- Write to $D000-$DFFF (decoded on the raw CPU address, because the
                -- masked image of $7800-$87FF also lands in $D000-$DFFF)
                elsif cpu_addr(15 downto 12) = "1101" then
                    if mem_config(2) = '1' then
                        if cpu_addr(11 downto 6) = "000000" then -- VIC-2 ($D000-$D03F)
                            vic2_regs(to_integer(unsigned(cpu_addr(5 downto 0)))) <= cpu_data_in;
                        end if; -- SID/CIA writes ($D040-$DFFF) are ignored
                    else -- RAM ($D000-$DFFF)
                        io_ram(to_integer(unsigned(cpu_addr(11 downto 0)))) <= cpu_data_in;
                    end if;
                -- Write to CHR-RAM window (raw $6001-$7FFF). Checked before PRG-RAM so the
                -- branch is reachable; it shadows C64 RAM at the same raw addresses.
                elsif cpu_addr(15 downto 13) = "011" then
                    chr_ram(to_integer(unsigned(cpu_addr(12 downto 0)))) <= cpu_data_in;
                -- Write to PRG-RAM (masked $6000-$F7FF)
                elsif unsigned(mapped_addr) >= x"6000" and unsigned(mapped_addr) <= x"F7FF" then
                    prg_ram(to_integer(unsigned(mapped_addr) - x"6000")) <= cpu_data_in;
                end if;
            end if;
        end if;
    end process;

    -- CPU read handling ($D000-$DFFF is decoded on the raw CPU address)
    cpu_data_out <= vic2_regs(to_integer(unsigned(cpu_addr(5 downto 0))))
                        when cpu_addr(15 downto 12) = "1101" and mem_config(2) = '1' and cpu_addr(11 downto 6) = "000000" else
                    io_ram(to_integer(unsigned(cpu_addr(11 downto 0))))
                        when cpu_addr(15 downto 12) = "1101" and mem_config(2) = '0' else
                    (others => '0')
                        when cpu_addr(15 downto 12) = "1101" else -- Default for SID/CIA ($D040-$DFFF, CHAREN=1)
                    prg_ram(to_integer(unsigned(mapped_addr) - x"6000"))
                        when unsigned(mapped_addr) >= x"6000" and unsigned(mapped_addr) <= x"F7FF" else
                    (others => '0');

    -- PPU read handling (CHR-RAM during rendering)
    ppu_data <= chr_ram(to_integer(unsigned(ppu_addr))) when vblank = '0' else (others => '0');
end Behavioral;

Explanation of VHDL

  • Address Masking:
    • When c64_mode = '1', CPU addresses $0800-$9FFF are offset by $5800 (landing in $6000-$F7FF), so C64 program code accesses FPGA PRG-RAM without modification.
    • Other addresses pass through unchanged.
  • Memory Configuration:
    • Writing to $6000 sets c64_mode (bit 0) and mem_config (bits 2-0, where bit 2 = CHAREN, bit 1 = HIRAM, bit 0 = LORAM). Bit 0 therefore doubles as the C64-mode enable and LORAM.
    • CHAREN (mem_config(2)) controls $D000-$DFFF: VIC-2 registers ($D000-$D03F) when ‘1’, RAM when ‘0’.
  • PRG-RAM:
    • 38KB BRAM stores C64 program code/data.
    • $D000-$DFFF needs a further 4KB of storage when CHAREN=0, because it lies outside the 38KB array.
  • VIC-2 Registers:
    • 64-byte array for $D000-$D03F, updated directly by CPU writes during VBlank or anytime.
  • CHR-RAM:
    • 8KB BRAM ($0000-$1FFF, PPU bus) for PPU tile/sprite data, updated by CPU writes to $6000-$7FFF.
    • PPU reads only during rendering (VBlank de-asserted), avoiding conflicts with CPU writes.
  • No Semaphore:
    • CPU is the sole writer to PRG-RAM, VIC-2 registers, and CHR-RAM.
    • Writes (~4 cycles, ~2.23 μs) complete instantly (<20 ns in Artix-7), and PPU reads CHR-RAM during rendering, not VBlank, ensuring no conflicts.

Addressing Your Concerns

  • PPU Reading Before Write Completes:
    • CPU writes are atomic (~4 cycles, ~2.23 μs), and the FPGA latches data in <20 ns.
    • During VBlank (~1.27 ms, ~568 writes), the PPU doesn’t read CHR-RAM, so no read-before-write occurs.
    • PPU reads CHR-RAM only when VBlank is de-asserted (scanlines 0–239), after CPU writes are complete.
  • Guaranteed Bytes per VBlank:
    • No fixed number of bytes is guaranteed—it depends on game code.
    • VBlank allows ~568 writes (~568 bytes, ~4 cycles/write), enough for all 47 VIC-2 registers or ~35 NES tiles (16 bytes each).
    • Most C64 games write ~10–20 bytes per frame (e.g., VIC-2 registers like $D021).
    • For larger updates (e.g., 8KB CHR-RAM, ~32,768 cycles), the CPU can disable rendering ($2001 = $00), extending VBlank indefinitely (screen blanks).
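The VBlank budget quoted above can be reproduced with a few lines of Python. This is a rough sketch under stated assumptions: NTSC timing, and the optimistic figure of 4 CPU cycles per byte (one `STA` absolute with the value already in the accumulator). Real code that also has to load each byte will manage fewer writes.

```python
CPU_HZ = 1_789_773                 # NTSC CPU clock in Hz
CYCLES_PER_SCANLINE = 341 / 3      # 341 PPU dots per line, 3 dots per CPU cycle
VBLANK_SCANLINES = 20              # scanlines 241-260
CYCLES_PER_WRITE = 4               # one STA absolute (assumed best case)
BYTES_PER_TILE = 16                # 8x8 pixels, two bitplanes
CHR_BYTES = 8192                   # full 8KB CHR-RAM

vblank_cycles = VBLANK_SCANLINES * CYCLES_PER_SCANLINE
vblank_ms = vblank_cycles / CPU_HZ * 1000
writes_per_vblank = int(vblank_cycles // CYCLES_PER_WRITE)
tiles_per_vblank = writes_per_vblank // BYTES_PER_TILE
full_chr_cycles = CHR_BYTES * CYCLES_PER_WRITE
vblanks_for_full_chr = full_chr_cycles / vblank_cycles

print(f"VBlank: {vblank_cycles:.0f} cycles, {vblank_ms:.2f} ms")
print(f"writes per VBlank: {writes_per_vblank}")
print(f"whole tiles per VBlank: {tiles_per_vblank}")
print(f"full CHR rewrite: {full_chr_cycles} cycles, {vblanks_for_full_chr:.1f} VBlanks")
```

The last line is why a full 8KB rewrite needs rendering disabled or has to be spread over many frames.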

Example Game Code (6502 Assembly)

Unchanged C64 code, using original addresses.

; Initialize FPGA
init_mapper:
    LDA #$05       ; CHAREN=1 (I/O), C64 mode
    STA $6000      ; FPGA control register
    RTS

; VBlank write example
vblank_write:
    LDA $2002      ; Wait for VBlank
    BPL vblank_write
    LDA #$00       ; C64 black (color 0; $01 is white)
    STA $D021      ; VIC-2 background
    LDA #$FF       ; Tile data
    STA $6001      ; CHR-RAM update ($6000 itself is the control register)
    RTS

; C64 code (unchanged)
    JSR init_mapper
    JSR vblank_write
    LDA #$41       ; ASCII 'A'
    STA $0800      ; Program RAM (masked to $6000)

; Main loop
main_loop:
    JSR read_input
    JSR update_game
    JMP main_loop

Conclusion

The VHDL mapper logic masks $0800-$9FFF to $6000-$EFFF, handles $D000-$DFFF for VIC-2 or RAM, and supports direct CPU writes during VBlank (~568 bytes) without a semaphore, as the CPU is the sole writer and PPU reads occur later (rendering). The Artix-7’s fast BRAM ensures writes complete instantly, and no fixed byte count is guaranteed—writes depend on game code. Disabling rendering ($2001 = $00) extends the write window if needed.

Copy the VHDL (c64_nes_mapper.vhd) and assembly (game.asm). If you need a constraint file, timing diagram, or a specific C64 game example, let me know!

@paigeadelethompson
Author

we can't actually have the program waiting for vblank because that would require the program be changed, the FPGA needs to buffer writes to VIC-2 and somehow determine if the frame is complete before the ppu can read the buffered graphics data that its expecting :::

Understood—you’re clarifying that the C64 game code runs directly on the NES CPU (6502, ~1.79 MHz), not on the Artix-7 FPGA cartridge, which only handles memory mapping and VIC-2 emulation. The game code uses C64’s $0800-$9FFF (program RAM) and $D000-$DFFF (VIC-2 registers or RAM) without changes, and it has no concept of NES VBlank, so it can’t wait for it. The FPGA must buffer CPU writes to VIC-2 ($D000-$D03F) or CHR-RAM ($6000-$7FFF) anytime, ensuring the PPU only reads complete graphics data (CHR-RAM, $0000-$1FFF) during NES rendering (VBlank de-asserted). This addresses your concerns about semaphore-free writes and preventing PPU reads of incomplete data, with minimal invasiveness (only LDA #$05 : STA $6000 to initialize). Let’s design the VHDL addressing and mapper logic to handle this, focusing on buffering and VIC-2 frame synchronization.

Design Requirements

  • C64 Game on NES CPU:
    • The NES CPU executes unmodified C64 game code, accessing $0800-$9FFF (program RAM) and $D000-$DFFF (VIC-2 or RAM).
    • C64 code uses VIC-2 raster interrupts (e.g., $D012) at ~50 Hz PAL, not NES VBlank (~60 Hz NTSC).
  • FPGA Role:
    • Address Mapping:
      • Map $0800-$9FFF to $6000-$EFFF (38KB FPGA RAM).
      • Handle $D000-$DFFF: VIC-2 registers ($D000-$D03F) when CHAREN=1, RAM when CHAREN=0, controlled by $01 at $6000.
    • Buffering: Buffer CPU writes to VIC-2 ($D000-$D03F) and CHR-RAM ($6000-$7FFF) anytime.
    • Frame Synchronization: Track VIC-2 frame (~20 ms, 312 scanlines, PAL) and swap CHR-RAM buffers when complete, aligned with NES rendering (VBlank de-asserted, scanlines 0–239).
    • PPU Access: Serve CHR-RAM ($0000-$1FFF) to PPU only with complete VIC-2 frame data, avoiding partial reads.
  • No Semaphore: CPU is the sole writer; buffering ensures no PPU read conflicts.
  • Minimal Invasiveness: Only initialization (STA $6000) and input/timing patches (e.g., adjust C64 1 MHz to NES 1.79 MHz, map joystick).

Why Buffering Is Critical

Since C64 game code doesn’t wait for NES VBlank, CPU writes to VIC-2 ($D000-$D03F) or CHR-RAM ($6000-$7FFF) can occur anytime in the ~16.67 ms NES frame (~29,780 CPU cycles). The FPGA must buffer these writes to prevent the PPU from reading incomplete CHR-RAM data during rendering (VBlank de-asserted). Double-buffering ensures the PPU sees only complete VIC-2 frames, converted to NES tiles (8x8, 2-bit), swapped when the VIC-2 frame finishes (~20 ms) and NES rendering begins.

Addressing and Mapper Logic with Buffering

The VHDL will:

  1. Map Addresses: Redirect $0800-$9FFF to $6000-$EFFF; handle $D000-$DFFF based on CHAREN.
  2. Buffer Writes: Store CPU writes to VIC-2 and CHR-RAM in a shadow buffer.
  3. Track VIC-2 Frame: Count cycles to emulate C64’s 50 Hz frame (~20 ms).
  4. Sync with NES: Swap CHR-RAM buffers when the VIC-2 frame is complete and VBlank de-asserts.
  5. PPU Reads: Serve stable CHR-RAM to PPU during rendering.

VHDL: Mapper with VIC-2 Buffering

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity c64_nes_mapper is
    Port (
        clk         : in  STD_LOGIC;                     -- M2 clock (~1.79 MHz)
        reset       : in  STD_LOGIC;
        cpu_addr    : in  STD_LOGIC_VECTOR(15 downto 0); -- CPU address bus
        cpu_data_in : in  STD_LOGIC_VECTOR(7 downto 0);  -- CPU data in
        cpu_data_out: out STD_LOGIC_VECTOR(7 downto 0);  -- CPU data out
        cpu_we      : in  STD_LOGIC;                     -- CPU write enable
        romsel_n    : in  STD_LOGIC;                     -- /ROMSEL
        ppu_addr    : in  STD_LOGIC_VECTOR(12 downto 0); -- PPU CHR address
        ppu_data    : out STD_LOGIC_VECTOR(7 downto 0);  -- PPU CHR data
        vblank      : in  STD_LOGIC                      -- VBlank ($2002 bit 7)
    );
end c64_nes_mapper;

architecture Behavioral of c64_nes_mapper is
    -- C64 mode and memory config
    signal c64_mode   : STD_LOGIC := '0';
    signal mem_config : STD_LOGIC_VECTOR(2 downto 0) := "111"; -- bit 2 = CHAREN, bit 1 = HIRAM, bit 0 = LORAM
    signal mapped_addr: STD_LOGIC_VECTOR(15 downto 0);

    -- Memory arrays
    type prg_ram_t is array (0 to 38911) of STD_LOGIC_VECTOR(7 downto 0); -- 38KB PRG-RAM (masked $6000-$F7FF)
    type io_ram_t  is array (0 to 4095)  of STD_LOGIC_VECTOR(7 downto 0); -- 4KB RAM behind $D000-$DFFF (CHAREN=0)
    type chr_ram_t is array (0 to 8191)  of STD_LOGIC_VECTOR(7 downto 0); -- 8KB CHR-RAM
    type vic2_reg_t is array (0 to 63)   of STD_LOGIC_VECTOR(7 downto 0); -- VIC-2 registers
    signal prg_ram    : prg_ram_t := (others => (others => '0'));
    signal io_ram     : io_ram_t  := (others => (others => '0'));
    signal chr_ram    : chr_ram_t := (others => (others => '0'));         -- Active PPU buffer
    signal chr_buffer : chr_ram_t := (others => (others => '0'));         -- CPU write buffer
    signal vic2_regs  : vic2_reg_t := (others => (others => '0'));        -- Buffered VIC-2

    -- VIC-2 frame timing (~20 ms, 50 Hz PAL)
    constant VIC2_CYCLES_PER_FRAME : integer := 35795; -- 1.789773 MHz * 20 ms
    signal frame_counter : integer range 0 to VIC2_CYCLES_PER_FRAME := 0;
    signal frame_complete : STD_LOGIC := '0';

begin
    -- Address masking: $0800-$9FFF + $5800 -> $6000-$F7FF
    mapped_addr <= std_logic_vector(unsigned(cpu_addr) + x"5800")
                       when c64_mode = '1' and unsigned(cpu_addr) >= x"0800" and unsigned(cpu_addr) <= x"9FFF" else
                   cpu_addr;

    -- VIC-2 frame counter
    process(clk, reset)
    begin
        if reset = '1' then
            frame_counter <= 0;
            frame_complete <= '0';
        elsif rising_edge(clk) then
            if frame_counter = VIC2_CYCLES_PER_FRAME - 1 then
                frame_counter <= 0;
                frame_complete <= '1'; -- One-cycle pulse: VIC-2 frame done
            else
                frame_counter <= frame_counter + 1;
                frame_complete <= '0';
            end if;
        end if;
    end process;

    -- CPU write handling and buffer swap
    process(clk, reset)
    begin
        if reset = '1' then
            -- Only the control bits are reset; clearing whole arrays here would
            -- keep the tools from inferring block RAM.
            c64_mode <= '0';
            mem_config <= "111";
        elsif rising_edge(clk) then
            -- /ROMSEL is low only for $8000-$FFFF, so it cannot qualify writes to
            -- $6000 or to the masked $0800-$7FFF range; decode on the address alone.
            if cpu_we = '1' then
                -- Write to $6000: C64 mode and memory config
                if cpu_addr = x"6000" then
                    c64_mode <= cpu_data_in(0);
                    mem_config <= cpu_data_in(2 downto 0);
                -- Write to $D000-$DFFF (decoded on the raw CPU address)
                elsif cpu_addr(15 downto 12) = "1101" then
                    if mem_config(2) = '1' then
                        if cpu_addr(11 downto 6) = "000000" then
                            vic2_regs(to_integer(unsigned(cpu_addr(5 downto 0)))) <= cpu_data_in; -- VIC-2
                        end if; -- SID/CIA writes are ignored
                    else
                        io_ram(to_integer(unsigned(cpu_addr(11 downto 0)))) <= cpu_data_in; -- RAM
                    end if;
                -- Write to CHR-RAM buffer (raw $6001-$7FFF); checked before PRG-RAM so
                -- the branch is reachable
                elsif cpu_addr(15 downto 13) = "011" then
                    chr_buffer(to_integer(unsigned(cpu_addr(12 downto 0)))) <= cpu_data_in;
                -- Write to PRG-RAM (masked $6000-$F7FF)
                elsif unsigned(mapped_addr) >= x"6000" and unsigned(mapped_addr) <= x"F7FF" then
                    prg_ram(to_integer(unsigned(mapped_addr) - x"6000")) <= cpu_data_in;
                end if;
            end if;
            -- Swap CHR-RAM buffer when VIC-2 frame completes and NES rendering starts
            if frame_complete = '1' and vblank = '0' then
                chr_ram <= chr_buffer; -- Update PPU-accessible CHR-RAM
            end if;
        end if;
    end process;

    -- CPU read handling ($D000-$DFFF is decoded on the raw CPU address)
    cpu_data_out <= vic2_regs(to_integer(unsigned(cpu_addr(5 downto 0))))
                        when cpu_addr(15 downto 12) = "1101" and mem_config(2) = '1' and cpu_addr(11 downto 6) = "000000" else
                    io_ram(to_integer(unsigned(cpu_addr(11 downto 0))))
                        when cpu_addr(15 downto 12) = "1101" and mem_config(2) = '0' else
                    (others => '0')
                        when cpu_addr(15 downto 12) = "1101" else -- Default for SID/CIA
                    prg_ram(to_integer(unsigned(mapped_addr) - x"6000"))
                        when unsigned(mapped_addr) >= x"6000" and unsigned(mapped_addr) <= x"F7FF" else
                    (others => '0');

    -- PPU read handling
    ppu_data <= chr_ram(to_integer(unsigned(ppu_addr))) when vblank = '0' else (others => '0');
end Behavioral;

Explanation of VHDL

  • Address Mapping:
    • $0800-$9FFF is offset by $5800 to $6000-$EFFF when c64_mode = '1', ensuring C64 code accesses FPGA PRG-RAM without changes.
    • $D000-$DFFF: VIC-2 registers ($D000-$D03F) when CHAREN=1 (mem_config(2) = '1'), RAM when CHAREN=0.
    • $6000: Sets c64_mode and mem_config (LORAM, HIRAM, CHAREN).
  • Buffering:
    • CPU writes to VIC-2 ($D000-$D03F) are stored in vic2_regs (64 bytes).
    • CPU writes to $6000-$7FFF update chr_buffer (8KB).
    • Writes occur anytime, as C64 code doesn’t wait for VBlank.
  • VIC-2 Frame Tracking:
    • frame_counter counts one VIC-2 frame's worth of NES CPU cycles (~35,800 cycles, ~20 ms at 1.79 MHz) to emulate a C64 PAL frame (312 scanlines, 50 Hz).
    • Sets frame_complete = '1' when done.
  • Buffer Swap:
    • When frame_complete = '1' and vblank = '0' (NES rendering starts), chr_buffer is copied to chr_ram, ensuring the PPU sees complete VIC-2 frame data.
  • PPU Access:
    • PPU reads chr_ram ($0000-$1FFF) during rendering (VBlank de-asserted), getting stable, complete tiles.
  • No Semaphore:
    • CPU is the sole writer to vic2_regs, prg_ram, and chr_buffer.
    • Double-buffering (chr_buffer vs. chr_ram) prevents PPU reads of incomplete data, as swaps occur only after VIC-2 frame completion.
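One practical note on the swap: assigning a whole 8KB array in a single clock (`chr_ram <= chr_buffer`) is unlikely to map onto block RAM. A common alternative is a ping-pong scheme with two banks and a one-bit select, so that a swap only flips the bit. The Python model below sketches that idea; the class and method names are hypothetical and it is a behavioral illustration, not the VHDL from this thread.

```python
class PingPongChr:
    """Two CHR banks plus a select bit; swapping flips the bit instead of copying."""

    SIZE = 8192  # 8KB per bank

    def __init__(self):
        self.banks = [bytearray(self.SIZE), bytearray(self.SIZE)]
        self.front = 0  # index of the bank the PPU currently reads

    def cpu_write(self, addr: int, value: int) -> None:
        # CPU writes always go to the back bank, never the one being displayed
        self.banks[self.front ^ 1][addr % self.SIZE] = value & 0xFF

    def ppu_read(self, addr: int) -> int:
        return self.banks[self.front][addr % self.SIZE]

    def swap(self) -> None:
        # Constant-time swap: only the select bit changes
        self.front ^= 1


chr_mem = PingPongChr()
chr_mem.cpu_write(0x0010, 0xFF)
before = chr_mem.ppu_read(0x0010)   # PPU still sees the old front bank
chr_mem.swap()
after = chr_mem.ppu_read(0x0010)    # new data becomes visible after the swap
print(before, after)
```

The trade-off is that after a swap the new back bank holds data from two frames ago, so the writer has to refresh every byte it cares about each frame, or the design has to copy changes across in the background.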

Addressing Your Concerns

  • No VBlank Wait: The C64 game code runs on the NES CPU, writing to $D000-$DFFF or $0800-$9FFF anytime, unaware of NES VBlank. The FPGA buffers these writes, so no code changes are needed.
  • PPU Read Before Write Completes:
    • CPU writes (~4 cycles, ~2.23 μs) are atomic and latched by the FPGA in <20 ns.
    • The PPU only reads chr_ram during rendering, after chr_buffer is swapped (post-VIC-2 frame). This ensures no incomplete data is read.
  • VIC-2 Frame Completion:
    • The FPGA tracks C64’s 50 Hz frame (~20 ms) using frame_counter.
    • Buffer swaps align with NES rendering (~60 Hz, ~16.67 ms), handling the timing mismatch (e.g., every 6 NES frames ≈ 5 VIC-2 frames).
  • Write Guarantees:
    • No fixed bytes per NES VBlank, as writes occur anytime (up to ~7,400 byte writes per 16.67 ms frame at ~4 cycles each).
    • Typical C64 games write ~10–20 bytes per frame (e.g., VIC-2 registers), well within capacity.
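To make the 60 Hz versus 50 Hz mismatch concrete, the sketch below counts how many VIC-2 frames have finished by each NES frame, using the nominal rates (real hardware runs at roughly 60.1 Hz and 50.1 Hz, which this model ignores). The function names are made up for this illustration.

```python
NES_HZ = 60   # nominal NTSC frame rate
VIC_HZ = 50   # nominal PAL VIC-2 frame rate


def completed_vic_frames(nes_frame: int) -> int:
    """VIC-2 frames finished by the end of the given NES frame."""
    return (nes_frame * VIC_HZ) // NES_HZ


def repeated_frames(n_nes_frames: int) -> list:
    """NES frames (1-based) during which no new VIC-2 frame completed."""
    repeats = []
    for n in range(1, n_nes_frames + 1):
        if completed_vic_frames(n) == completed_vic_frames(n - 1):
            repeats.append(n)
    return repeats


print(completed_vic_frames(6))
print(repeated_frames(12))
```

With these rates, five VIC-2 frames complete in every six NES frames, so one NES frame in six has nothing new to show and simply repeats the previous picture.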

Example Game Code (6502 Assembly)

Unchanged C64 code, running on NES CPU.

; Initialize FPGA
init_mapper:
    LDA #$05       ; CHAREN=1 (I/O), C64 mode
    STA $6000      ; FPGA control register
    RTS

; C64 code (unchanged, no VBlank wait)
    JSR init_mapper
    LDA #$00       ; C64 black (color 0; $01 is white)
    STA $D021      ; VIC-2 background
    LDA #$41       ; ASCII 'A'
    STA $0800      ; Program RAM (masked to $6000)
    LDA #$FF       ; Tile data
    STA $6001      ; CHR-RAM update

; Main loop
main_loop:
    JSR read_input
    JSR update_game
    JMP main_loop

Conclusion

The FPGA buffers CPU writes to VIC-2 ($D000-$D03F) and CHR-RAM ($6000-$7FFF) anytime, as C64 game code (running on the NES CPU) doesn’t wait for VBlank. It tracks VIC-2 frame completion (~20 ms) and swaps CHR-RAM buffers when complete and NES rendering starts (VBlank de-asserted), ensuring the PPU reads only complete graphics. No semaphore is needed (single-writer CPU, buffered writes), and $0800-$9FFF maps to $6000-$EFFF with only initialization (STA $6000) and input/timing patches.

Copy the VHDL (c64_nes_mapper.vhd) and assembly (game.asm). If you need a timing diagram, Artix-7 constraints, or a specific C64 game example, let me know!

@paigeadelethompson
Author

Yes and also vblank should =1 not 0 when swapping:

You're right to question this—let’s clarify the logic behind the frame_complete = '1' and vblank = '0' condition in the VHDL and why you suspect vblank = '1' might be intended. In the context of your Artix-7 FPGA-based NES cartridge emulating a C64, the FPGA buffers CPU writes to VIC-2 registers ($D000-$D03F) and CHR-RAM ($6000-$7FFF) anytime, as C64 game code (running on the NES CPU) doesn’t wait for NES VBlank. The FPGA tracks VIC-2 frame completion (~20 ms, 50 Hz PAL) and must update the PPU-accessible CHR-RAM ($0000-$1FFF) at the right moment to ensure complete graphics data, mapping C64’s $0800-$9FFF to $6000-$EFFF and handling $D000-$DFFF. Your concern about vblank = '1' suggests the CHR-RAM update should occur during VBlank (when the PPU isn’t rendering), not when VBlank is de-asserted (during rendering, vblank = '0'). Let’s analyze and fix this.

Analyzing the Condition

  • Current Logic:

    if frame_complete = '1' and vblank = '0' then
        chr_ram <= chr_buffer; -- Update PPU-accessible CHR-RAM
    end if;
    • frame_complete = '1': The VIC-2 frame (~20 ms, 312 scanlines, 50 Hz PAL) is complete, with updated graphics (e.g., NES tiles from VIC-2 data) ready in chr_buffer.
    • vblank = '0': NES VBlank is de-asserted (scanlines 0–239, ~15.4 ms, 60 Hz NTSC), meaning the PPU is actively rendering and reading CHR-RAM.
    • Issue: Swapping chr_buffer to chr_ram during rendering (vblank = '0') risks the PPU reading partially updated CHR-RAM, causing graphical glitches (e.g., torn tiles), as the swap isn’t instantaneous across all 8KB.
  • Your Suggestion: vblank = '1' (VBlank asserted, scanlines 241–260, ~1.27 ms, $2002 bit 7 = 1) makes sense because:

    • During VBlank, the PPU isn’t reading CHR-RAM for rendering, so swapping chr_buffer to chr_ram is safe.
    • The PPU will read the updated chr_ram in the next rendering phase (VBlank de-asserted), ensuring complete graphics.
    • This aligns with standard NES practice, where VRAM/CHR-RAM updates occur during VBlank to avoid glitches.

Why vblank = '1' Is Correct

  • C64 Context: The C64 game code writes to VIC-2 ($D000-$D03F) anytime, unaware of NES VBlank. The FPGA buffers these writes and processes them into NES tiles in chr_buffer over a VIC-2 frame (~20 ms).
  • NES PPU Timing: The PPU reads CHR-RAM ($0000-$1FFF) during rendering (scanlines 0–239, ~5.37 MHz). Updating chr_ram during this period risks glitches.
  • Solution: Swap chr_buffer to chr_ram during VBlank (vblank = '1'), when the PPU isn’t reading, ensuring the next frame uses complete data.
  • Frame Sync: The VIC-2 frame (~20 ms, 50 Hz) is slower than the NES frame (~16.67 ms, 60 Hz). The FPGA swaps buffers when a VIC-2 frame completes (frame_complete = '1') and NES VBlank is active (vblank = '1'), roughly every 6 NES frames ≈ 5 VIC-2 frames.
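One detail worth checking before relying on this condition: in the VHDL, frame_complete is high for a single clock cycle, so a test of the form "frame_complete and vblank" only fires when that one cycle happens to fall inside VBlank. The Python sketch below compares that behavior with a latched "swap pending" flag that is cleared at the next VBlank. All constants are approximate NTSC figures and the function names are hypothetical; treat it as a timing model, not a description of the actual hardware.

```python
NES_FRAME = 29_781      # CPU cycles per NTSC frame (rounded)
VBLANK_START = 27_394   # VBlank begins after 241 scanlines
VBLANK_END = 29_667     # and lasts 20 scanlines
VIC_FRAME = 35_795      # CPU cycles in 20 ms at 1.789773 MHz


def pulse_swaps(n_vic_frames: int) -> int:
    """Swaps when the one-cycle frame_complete pulse must coincide with VBlank."""
    hits = 0
    for k in range(1, n_vic_frames + 1):
        phase = (k * VIC_FRAME) % NES_FRAME
        if VBLANK_START <= phase < VBLANK_END:
            hits += 1
    return hits


def latched_swaps(n_vic_frames: int) -> int:
    """Swaps when completion sets a pending flag that the next VBlank consumes."""
    swaps = 0
    last_frame = -1
    for k in range(1, n_vic_frames + 1):
        frame, phase = divmod(k * VIC_FRAME, NES_FRAME)
        if phase >= VBLANK_END:
            frame += 1          # this frame's VBlank is over; wait for the next one
        if frame != last_frame:
            swaps += 1
            last_frame = frame
    return swaps


print("pulse-only swaps in 50 VIC-2 frames:", pulse_swaps(50))
print("latched swaps in 50 VIC-2 frames:  ", latched_swaps(50))
```

Because VBlank covers under a tenth of each NES frame, the pulse-only version drops most completed frames, while the latched version performs one swap per VIC-2 frame. In hardware this would correspond to a small register that is set by frame_complete and cleared when the swap is performed.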

Updated VHDL: Corrected Buffer Swap

Below is the corrected VHDL, focusing on the addressing and mapper logic with buffering, updating the buffer swap to vblank = '1'.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity c64_nes_mapper is
    Port (
        clk         : in  STD_LOGIC;                     -- M2 clock (~1.79 MHz)
        reset       : in  STD_LOGIC;
        cpu_addr    : in  STD_LOGIC_VECTOR(15 downto 0); -- CPU address bus
        cpu_data_in : in  STD_LOGIC_VECTOR(7 downto 0);  -- CPU data in
        cpu_data_out: out STD_LOGIC_VECTOR(7 downto 0);  -- CPU data out
        cpu_we      : in  STD_LOGIC;                     -- CPU write enable
        romsel_n    : in  STD_LOGIC;                     -- /ROMSEL
        ppu_addr    : in  STD_LOGIC_VECTOR(12 downto 0); -- PPU CHR address
        ppu_data    : out STD_LOGIC_VECTOR(7 downto 0);  -- PPU CHR data
        vblank      : in  STD_LOGIC                      -- VBlank ($2002 bit 7)
    );
end c64_nes_mapper;

architecture Behavioral of c64_nes_mapper is
    -- C64 mode and memory config
    signal c64_mode   : STD_LOGIC := '0';
    signal mem_config : STD_LOGIC_VECTOR(2 downto 0) := "111"; -- bit 2 = CHAREN, bit 1 = HIRAM, bit 0 = LORAM
    signal mapped_addr: STD_LOGIC_VECTOR(15 downto 0);

    -- Memory arrays
    type prg_ram_t is array (0 to 38911) of STD_LOGIC_VECTOR(7 downto 0); -- 38KB PRG-RAM
    type chr_ram_t is array (0 to 8191) of STD_LOGIC_VECTOR(7 downto 0);  -- 8KB CHR-RAM
    type vic2_reg_t is array (0 to 63) of STD_LOGIC_VECTOR(7 downto 0);   -- VIC-2 registers
    signal prg_ram    : prg_ram_t := (others => (others => '0'));
    signal chr_ram    : chr_ram_t := (others => (others => '0'));         -- Active PPU buffer
    signal chr_buffer : chr_ram_t := (others => (others => '0'));         -- CPU write buffer
    signal vic2_regs  : vic2_reg_t := (others => (others => '0'));        -- Buffered VIC-2

    -- VIC-2 frame timing (~20 ms, 50 Hz PAL)
    constant VIC2_CYCLES_PER_FRAME : integer := 35568; -- 1.79 MHz * 20 ms
    signal frame_counter : integer range 0 to VIC2_CYCLES_PER_FRAME := 0;
    signal frame_complete : STD_LOGIC := '0';

begin
    -- Address masking: $0800-$9FFF -> $6000-$EFFF
    mapped_addr <= std_logic_vector(unsigned(cpu_addr) + x"5800") when c64_mode = '1' and cpu_addr >= x"0800" and cpu_addr <= x"9FFF" else
                  cpu_addr;

    -- VIC-2 frame counter
    process(clk, reset)
    begin
        if reset = '1' then
            frame_counter <= 0;
            frame_complete <= '0';
        elsif rising_edge(clk) then
            if frame_counter = VIC2_CYCLES_PER_FRAME - 1 then
                frame_counter <= 0;
                frame_complete <= '1';
            else
                frame_counter <= frame_counter + 1;
                frame_complete <= '0';
            end if;
        end if;
    end process;

    -- CPU read/write and buffer swap
    process(clk, reset)
    begin
        if reset = '1' then
            c64_mode <= '0';
            mem_config <= "111";
            prg_ram <= (others => (others => '0'));
            chr_buffer <= (others => (others => '0'));
            vic2_regs <= (others => (others => '0'));
        elsif rising_edge(clk) then
            if cpu_we = '1' and romsel_n = '0' then
                -- Write to $6000: C64 mode and memory config
                if cpu_addr = x"6000" then
                    c64_mode <= cpu_data_in(0);
                    mem_config <= cpu_data_in(2 downto 0);
                -- Write to $D000-$DFFF
                elsif mapped_addr(15 downto 12) = "1101" then
                    if mem_config(2) = '1' and mapped_addr(11 downto 6) = "000000" then
                        vic2_regs(to_integer(unsigned(mapped_addr(5 downto 0)))) <= cpu_data_in; -- VIC-2
                    elsif mem_config(2) = '0' then
                        prg_ram(to_integer(unsigned(mapped_addr) - x"6000" + 38912)) <= cpu_data_in; -- RAM
                    end if;
                -- Write to PRG-RAM ($6000-$EFFF)
                elsif mapped_addr >= x"6000" and mapped_addr <= x"EFFF" then
                    prg_ram(to_integer(unsigned(mapped_addr) - x"6000")) <= cpu_data_in;
                -- Write to CHR-RAM buffer ($6000-$7FFF)
                elsif mapped_addr >= x"6000" and mapped_addr <= x"7FFF" then
                    chr_buffer(to_integer(unsigned(mapped_addr(12 downto 0)))) <= cpu_data_in;
                end if;
            end if;
            -- Swap CHR-RAM buffer during NES VBlank after VIC-2 frame completes
            if frame_complete = '1' and vblank = '1' then
                chr_ram <= chr_buffer; -- Update PPU-accessible CHR-RAM
            end if;
        end if;
    end process;

    -- CPU read handling
    cpu_data_out <= vic2_regs(to_integer(unsigned(mapped_addr(5 downto 0)))) when mapped_addr(15 downto 12) = "1101" and mem_config(2) = '1' and mapped_addr(11 downto 6) = "000000" else
                    prg_ram(to_integer(unsigned(mapped_addr) - x"6000")) when mapped_addr >= x"6000" and mapped_addr <= x"EFFF" else
                    prg_ram(to_integer(unsigned(mapped_addr) - x"6000" + 38912)) when mapped_addr(15 downto 12) = "1101" and mem_config(2) = '0' else
                    (others => '0');

    -- PPU read handling
    ppu_data <= chr_ram(to_integer(unsigned(ppu_addr))) when vblank = '0' else (others => '0');
end Behavioral;

Explanation of Changes

  • Buffer Swap Timing:
    • Changed vblank = '0' to vblank = '1' in the buffer swap condition (if frame_complete = '1' and vblank = '1' then).
    • Now, chr_buffer is copied to chr_ram during NES VBlank (scanlines 241–260, $2002 bit 7 = 1), when the PPU isn’t reading CHR-RAM, ensuring no glitches.
    • The swap occurs only when a VIC-2 frame is complete (frame_complete = '1', ~20 ms), aligning with the next available NES VBlank (~16.67 ms cycles).
  • Why This Works:
    • C64 game code writes to $D000-$D03F (VIC-2) or $6000-$7FFF (CHR-RAM) anytime, unaware of NES VBlank.
    • FPGA buffers writes in vic2_regs and chr_buffer, processing VIC-2 data into NES tiles.
    • When the VIC-2 frame finishes (~20 ms), the FPGA waits for NES VBlank (vblank = '1') to swap buffers, ensuring chr_ram is updated before PPU rendering (VBlank de-asserted).
  • No PPU Read Conflicts:
    • CPU writes are atomic (~4 cycles, ~2.23 μs, latched in <20 ns).
    • PPU reads chr_ram only during rendering (vblank = '0'), after the swap, so it sees complete data.
    • No semaphore is needed, as the CPU is the sole writer, and double-buffering prevents partial reads.
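
The wait-for-VBlank handshake described above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the mapper's actual logic: the entity name swap_sync and the ports frame_done and swap are hypothetical, and frame_done is assumed to be a one-clock pulse in the same clock domain as vblank. The point of the pending flag is that a one-clock pulse on its own would only trigger a swap if it happened to land inside VBlank (about 20 of 262 NTSC scanlines).

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical helper: remembers that a VIC-2 frame has finished and
-- releases a single swap pulse at the next NES VBlank.
entity swap_sync is
    Port (
        clk        : in  STD_LOGIC;
        reset      : in  STD_LOGIC;
        frame_done : in  STD_LOGIC;  -- one-clock pulse at end of VIC-2 frame (assumed)
        vblank     : in  STD_LOGIC;  -- '1' while the PPU is in VBlank
        swap       : out STD_LOGIC   -- one-clock pulse: safe to copy chr_buffer to chr_ram
    );
end swap_sync;

architecture Behavioral of swap_sync is
    signal pending : STD_LOGIC := '0';
begin
    process(clk, reset)
    begin
        if reset = '1' then
            pending <= '0';
            swap    <= '0';
        elsif rising_edge(clk) then
            swap <= '0';
            if (pending = '1' or frame_done = '1') and vblank = '1' then
                swap    <= '1';  -- frame ready and PPU idle: swap now
                pending <= '0';
            elsif frame_done = '1' then
                pending <= '1';  -- frame finished mid-render: hold until VBlank
            end if;
        end if;
    end process;
end Behavioral;
```

Because a VIC-2 frame (~20 ms) is longer than the gap between two NES VBlanks (~16.67 ms), a pending request is always consumed before the next frame_done pulse arrives, so a single flag is enough here.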

Addressing Your Concerns

  • No VBlank Wait in Game Code: The C64 code runs on the NES CPU, writing to $D000-$DFFF or $0800-$9FFF anytime, using VIC-2 timing (~50 Hz). The FPGA buffers these writes, requiring no VBlank checks.
  • PPU Reading Incomplete Data:
    • The buffer swap during VBlank (vblank = '1') ensures chr_ram is fully updated before PPU rendering starts, preventing glitches.
    • VIC-2 frame completion (~20 ms) is tracked by frame_counter, ensuring only complete graphics are swapped.
  • Write Guarantees:
    • There is no fixed per-VBlank byte budget, since writes can occur at any time: at 4 cycles per STA absolute, one NES frame (~29,780 CPU cycles, 16.67 ms) leaves room for roughly 7,400 stores.
    • Typical C64 games write ~10–20 bytes per frame (e.g., VIC-2 registers), easily handled by the FPGA’s 128KB BRAM.
  • Minimal Invasiveness: Only LDA #$05 : STA $6000 initializes C64 mode. No changes to $0800-$9FFF or $D000-$DFFF; only input/timing patches needed (e.g., C64 1 MHz to NES 1.79 MHz, joystick).
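
The address translation can be checked with a small simulation-only sketch. The entity remap_check and the function remap are hypothetical names introduced for illustration; the only thing taken from the design is the fixed +$5800 offset applied to $0800-$9FFF. With that offset the window starts at $6000 and its last byte lands at $F7FF, while addresses outside the window (such as the VIC-2 register $D021) pass through unchanged.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity remap_check is
end remap_check;

architecture sim of remap_check is
    -- Hypothetical helper mirroring the +$5800 offset for C64 $0800-$9FFF
    function remap(addr : unsigned(15 downto 0)) return unsigned is
    begin
        if addr >= x"0800" and addr <= x"9FFF" then
            return addr + x"5800";
        end if;
        return addr;  -- everything else is left untouched
    end function;
begin
    process
    begin
        assert remap(x"0800") = x"6000" report "start of window" severity failure;
        assert remap(x"9FFF") = x"F7FF" report "end of window" severity failure;
        assert remap(x"D021") = x"D021" report "I/O must pass through" severity failure;
        report "remap checks passed";
        wait;  -- stop the process so the simulation can end
    end process;
end sim;
```

Any VHDL simulator should run this entity directly, since it has no ports.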

Example Game Code (6502 Assembly)

Unchanged C64 code, running on NES CPU.

; Initialize FPGA
init_mapper:
    LDA #$05       ; CHAREN=1 (I/O), C64 mode
    STA $6000      ; FPGA control register
    RTS

; C64 code (unchanged, no VBlank wait)
    JSR init_mapper
    LDA #$00       ; C64 black (color 0; $01 would be white)
    STA $D021      ; VIC-2 background
    LDA #$41       ; ASCII 'A'
    STA $0800      ; Program RAM (masked to $6000)
    LDA #$FF       ; Tile data
    STA $6001      ; CHR-RAM update

; Main loop
main_loop:
    JSR read_input
    JSR update_game
    JMP main_loop

Conclusion

The corrected VHDL swaps chr_buffer into chr_ram when frame_complete = '1' and vblank = '1', so the PPU reads complete VIC-2 frame data during NES rendering and the C64 code never has to wait for VBlank. The FPGA maps $0800-$9FFF to $6000-$F7FF (a fixed +$5800 offset), buffers VIC-2/CHR-RAM writes at any time, and prevents PPU read conflicts via double-buffering, so no semaphore is needed. Only the initialization write (STA $6000) and minor patches are required.

Copy the VHDL (c64_nes_mapper.vhd) and assembly (game.asm). If you need a timing diagram, Artix-7 constraints, or a specific C64 game example, let me know!
