Tag Archives: HDL

FPGA – Xilinx JTAG to AXI Master from XSDB and Python

One of the most annoying things about working on an early design on an FPGA development kit is that there is no run-time register interface unless you put a lot of effort into building one.

While looking for an interface that would work on basically any Vivado-supported Xilinx FPGA, I came across the JTAG to AXI Master core supplied by Xilinx. Unfortunately it has a cumbersome interface that is meant to be driven from Vivado’s TCL console, which is not always convenient. Others have been looking for a C API to interact with the hw_server directly. Someone appears to have put together a C library, but I was unable to get the files. I wanted something easier to use anyway, so I began to look elsewhere for a solution.

Accessing JTAG2AXI from XSDB

I remembered that XSDK, XSCT and XSDB can read and write memory on the Xilinx SoCs, so I decided to try the mrd (memory read) and mwr (memory write) commands in XSDB.

Running xsdb in a terminal.

$> xsdb

Connecting to the hw_server and JTAG cable.

xsdb% connect
tcfchan#0

Searching for debug target.

xsdb% targets
 1  APU
    2  ARM Cortex-A9 MPCore #0 (Running)
    3  ARM Cortex-A9 MPCore #1 (Running)
 4  xc7z010
    5  Legacy Debug Hub
       6  JTAG2AXI

The listing above shows the JTAG2AXI core from our design, which has already been programmed onto the board, so we select it.

xsdb% target 6 

Trying the mrd command results in a valid read!

xsdb% mrd 0
      0:   0000000A
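If you capture this human-readable output from a script, it can be turned back into numbers with a few lines of Python. The following is a minimal sketch, not part of XSDB or pysct; `parse_mrd_output` is a hypothetical helper, and it assumes each output line holds one `address: value` pair in hex, as in the transcript above.

```python
def parse_mrd_output(text):
    """Parse default (human-readable) mrd output into {address: value}.

    Assumption: every data line looks like '      0:   0000000A',
    i.e. a hex address, a colon, and one hex word.
    """
    result = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank lines or anything that is not a data line
        addr, _, value = line.partition(":")
        result[int(addr.strip(), 16)] = int(value.strip(), 16)
    return result


if __name__ == "__main__":
    print(parse_mrd_output("      0:   0000000A"))
```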

Accessing JTAG2AXI from Python

Being able to read and write memory from XSDB instead of the TCL commands in Vivado is great, but I wanted a way to interact with the JTAG2AXI bridge from other software. While looking for a solution I found pysct, a Python interface to XSDB and Vivado!

After installing pysct, connecting to XSDB is as easy as starting xsdb in a terminal then creating a server and connecting to it from Python.

$> xsdb

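xsdb% xsdbserver start -port 3010

Then, from Python:

from pysct.core import Xsct

# Connect to the xsdbserver started above.
xsct = Xsct('localhost', 3010)

# Attach to the hw_server and select the JTAG2AXI target (ID 6 in the listing above).
xsct.do("connect")
xsct.do("target 6")

# Read one word at address 0; -value returns only the data.
print(xsct.do("mrd -value 0"))

# xsct.do("mrd -value 0 256") performs a read burst of 256 words instead of 1.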

By default the mrd command returns data formatted for human reading, with addresses and data in hex. This formatting slows transfers down considerably, so using the -value or -bin option is recommended for higher speed.
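To keep the command strings in one place, the reads and writes can be wrapped in two small helpers. This is a sketch under assumptions: `read_words` and `write_words` are hypothetical names, `do` is any callable that sends a command and returns the reply string (for example `xsct.do`), and the `mwr <address> {values} <num>` form follows my reading of the XSDB command help. The number base of the `-value` reply is not stated in this post, so it is a parameter here; the default of 10 is a guess that should be checked against a register with a known value.

```python
def read_words(do, address, count=1, base=10):
    """Read `count` words starting at `address` and return them as ints.

    Assumption: the reply to 'mrd -value' is a whitespace-separated list
    of numbers; `base` must match how the tool prints them.
    """
    reply = do(f"mrd -value 0x{address:X} {count}")
    return [int(tok, base) for tok in reply.split()]


def write_words(do, address, values):
    """Write a list of ints starting at `address` using a Tcl list."""
    tcl_list = " ".join(f"0x{v:X}" for v in values)
    do(f"mwr 0x{address:X} {{{tcl_list}}} {len(values)}")


if __name__ == "__main__":
    # Stand-in for xsct.do so the sketch runs without hardware.
    def fake_do(command):
        print("would send:", command)
        return "10 11 12"

    print(read_words(fake_do, 0x10, 3))
    write_words(fake_do, 0x0, [1, 255])
```

With a live connection, `read_words(xsct.do, 0, 256)` would issue the same 256-word burst mentioned above.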

I noticed some issues in pysct and had to modify the recv() function in the Xsct class to use a much larger buffer size. Setting it to 32768 allowed AXI4 bursts of 256 words to work.
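An alternative to picking a larger fixed buffer is to keep reading until a complete reply has arrived, so the burst length no longer matters. The sketch below is not pysct's code; `recv_line` is a hypothetical helper, and it assumes each reply from the server ends with a newline and that only one reply is outstanding at a time.

```python
import socket


def recv_line(sock, bufsize=4096):
    """Read from `sock` until a newline arrives and return the line.

    Assumption: one newline-terminated reply per command. Anything
    received after the first newline is discarded.
    """
    buf = bytearray()
    while b"\n" not in buf:
        data = sock.recv(bufsize)
        if not data:
            raise ConnectionError("socket closed before a full reply arrived")
        buf.extend(data)
    line, _, _rest = bytes(buf).partition(b"\n")
    return line.rstrip(b"\r").decode("utf-8")


if __name__ == "__main__":
    # Local loopback demo: a long reply read through a deliberately tiny buffer.
    a, b = socket.socketpair()
    a.sendall(b"okay " + b" ".join([b"0000000A"] * 256) + b"\r\n")
    reply = recv_line(b, bufsize=64)
    print(len(reply.split()))
    a.close()
    b.close()
```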

Performance Testing

With the JTAG cable on the Digilent Zybo board set to 30 MHz I ran some performance tests.

With an AXI4-Lite variant of the core, single-word reads from Python reach about 9 kilobytes/s.

If we use the AXI4 variant of the core and use mrd -value 0 256 to perform maximum-length bursts, we get about 1.2 megabytes/s of read transfers! Pretty decent!
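For anyone who wants to repeat the measurement, a timing loop along these lines would do. It is a sketch, not the script used for the numbers above; `measure_read_throughput` is a hypothetical helper, and the 4 bytes per word is an assumption that holds only for a 32-bit data width.

```python
import time


def measure_read_throughput(read_fn, words_per_read, n_reads, bytes_per_word=4):
    """Call `read_fn` n_reads times and return the rate in bytes per second.

    Assumption: each call transfers words_per_read words of
    bytes_per_word bytes (4 for a 32-bit AXI data bus).
    """
    start = time.perf_counter()
    for _ in range(n_reads):
        read_fn()
    elapsed = time.perf_counter() - start
    total_bytes = words_per_read * bytes_per_word * n_reads
    return total_bytes / elapsed


if __name__ == "__main__":
    # Stand-in read that just sleeps, so the sketch runs without hardware.
    rate = measure_read_throughput(lambda: time.sleep(0.001), 256, 20)
    print(f"{rate / 1e6:.2f} MB/s")
```

On real hardware the callable would be something like `lambda: xsct.do("mrd -value 0 256")` with `words_per_read=256`, or `lambda: xsct.do("mrd -value 0")` with `words_per_read=1` for the single-word case.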

FPGA – LittleRiscy RISC-V RV32I Emulator and HDL Core

LittleRiscy is an RV32I RISC-V emulator and HDL core that I have decided to release as an open source project. The project is a work in progress.

LittleRiscy’s Git repository contains an instruction set emulator written in C++ and a CPU core written in SystemVerilog. The emulator has been validated against some simple test binaries, and the SystemVerilog code has been converted to C++ using Verilator to check that it behaves identically when running the same test binaries.

At this time, LittleRiscy has been deployed on a Xilinx 7 Series FPGA using Xilinx Vivado and has blinked some LEDs with a simple binary.

The goal of LittleRiscy is to create a simple CPU core for a unique purpose. It will be a classic RISC pipeline with no debugger, no interrupts, no ability to load new code at runtime, and limited peripherals. Inspired by CHIPS2.0 and the PIOs in the RP2040, it will be used to process AXI-Streams for packet processing or digital signal processing when data rates are low enough that custom RTL logic is not required and a CPU is both smaller and simpler. To further simplify creation of software for the core, the AXI-Stream input interface could stall the CPU on read if empty and the output interface could stall the CPU on write if full, negating the requirement for the software to check flags during execution. For some demanding tasks, it should be easy to add a handful of custom instructions to improve throughput.
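The stall-on-empty and stall-on-full idea can be illustrated with a small cycle-level model. This is a hypothetical Python sketch, not LittleRiscy code: `StallingFifo` and `simulate` are made-up names, and the modeled "program" is simply a loop that reads one word from the input stream, adds 1, and writes it to the output stream without ever checking a flag.

```python
from collections import deque


class StallingFifo:
    """Bounded FIFO whose accessors report 'would stall' instead of failing."""

    def __init__(self, depth):
        self.depth = depth
        self.items = deque()

    def try_read(self):
        # (False, None) means the FIFO is empty and the CPU would stall.
        if not self.items:
            return False, None
        return True, self.items.popleft()

    def try_write(self, value):
        # False means the FIFO is full and the CPU would stall.
        if len(self.items) >= self.depth:
            return False
        self.items.append(value)
        return True


def simulate(inputs_by_cycle, tx_depth, drain_every, cycles):
    """Run the read/add/write loop for `cycles` cycles.

    Returns (words seen downstream, number of stalled CPU cycles).
    """
    rx = StallingFifo(depth=16)
    tx = StallingFifo(depth=tx_depth)
    held = None  # value read from rx but not yet written to tx
    stalls = 0
    received = []
    for cycle in range(cycles):
        if cycle in inputs_by_cycle:      # upstream pushes a word
            rx.try_write(inputs_by_cycle[cycle])
        if cycle % drain_every == 0:      # downstream pops a word
            ok, value = tx.try_read()
            if ok:
                received.append(value)
        if held is None:                  # CPU: blocking stream read
            ok, value = rx.try_read()
            if ok:
                held = value + 1
            else:
                stalls += 1
        elif tx.try_write(held):          # CPU: blocking stream write
            held = None
        else:
            stalls += 1
    return received, stalls


if __name__ == "__main__":
    print(simulate({0: 10, 5: 20, 6: 30}, tx_depth=1, drain_every=1, cycles=20))
```

The software side of the model never polls for "empty" or "full"; the waiting shows up only as stalled cycles, which is the simplification the paragraph above describes.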