## Introduction

This page focusses on the possibilities to describe hardware structure in MyHDL. MyHDL uses classic functions for this purpose.

Before simulation or conversion to Verilog, a design is *elaborated* by running
the Python interpreter. This implies that all structure is flattened out. As a
consequence, the complexity of the structural description has no impact on the
possibility to convert to Verilog or VHDL.

As an example, this page shows how to describe recursive structures in MyHDL. Python is very well suited to do this elegantly. But after conversion to Verilog, all recursion is flattened out, so it doesn't matter whether the back-end language or tools support it or not.

## Specification

We will develop a circuit that sorts an array of integer values. More specifically, the array to be sorted is the input, and the sorted array the output. We are looking for a data-flow oriented circuit, where the output follows the input "combinatorially".

We will use the bitonic sort algorithm. We will not explain it here: for more info, read this web page that explains the bitonic sort algorithm.

For our purposes, it is sufficient to know that the bitonic algorithm is well suited for a hardware implementation, and that it is recursive in nature. To develop the MyHDL code, we will start from a reference software implementation.

## Reference implementation in Java

The web page mentioned earlier also contains an annotated reference implementation in Java. We will use this as a starting point to develop the MyHDL code. Therefore, the original Java code and annotations are quoted literally in the remainder of this section.

*(start quote)*

In the following, an implementation of bitonic sort in Java is given. The number of elements to be sorted must be a power of 2.

The program is assumed to be encapsulated in a class with the following elements:

private static int[] a; // the array to be sorted private final static boolean ASCENDING=true, DESCENDING=false; // sorting direction

A comparator is modelled by the procedure `compare`

, where the parameter `dir`

indicates the sorting direction. If `dir`

is `ASCENDING`

and `a[i] > a[j]`

is
true or `dir`

is DESCENDING and `a[i] > a[j]`

is false then `a[i]`

and `a[j]`

are interchanged.

private static void compare(int i, int j, boolean dir) { if (dir==(a[i]>a[j])) { int h=a[i]; a[i]=a[j]; a[j]=h; } }

The procedure `bitonicMerge`

recursively sorts a bitonic sequence in ascending
order, if `dir = ASCENDING`

, and in descending order otherwise. The sequence to
be sorted starts at index position `lo`

, the number of elements is `n`

.

private static void bitonicMerge(int lo, int n, boolean dir) { if (n>1) { int k=n/2; for (int i=lo; i<lo+k; i++) compare(i, i+k, dir); bitonicMerge(lo, k, dir); bitonicMerge(lo+k, k, dir); } }

Procedure `bitonicSort`

first produces a bitonic sequence by recursively
sorting its two halves in opposite directions, and then calls `bitonicMerge`

.

private static void bitonicSort(int lo, int n, boolean dir) { if (n>1) { int k=n/2; bitonicSort(lo, k, ASCENDING); bitonicSort(lo+k, k, DESCENDING); bitonicMerge(lo, n, dir); } }

When called with parameters `lo = 0`

, `n = a.length()`

and `dir = ASCENDING`

,
procedure `bitonicSort`

sorts the whole array `a`

.

public static void sort() { bitonicSort(0, a.length(), ASCENDING); }

*(end quote)*

## MyHDL implementation

Let's start with some general considerations.

First, note that the Java code consists of a number of procedures and procedure calls. In MyHDL, we can map each procedure to a function that models a hardware module and each procedure call to an instantiation. The Java procedures return nothing; they are run solely for their side effects. In MyHDL, we will build the hardware structure by returning the instances that compose each module.

Secondly, note that the Java code sorts the array in-place, and the procedure parameters consist of the indices of the array elements to be manipulated. In our data-flow oriented MyHDL solution, we will have to use a separate input and output array instead. The array will be represented by a list of signals. If a function performs several transformations, we have to introduce additional lists of signals internally.

Finally, in the Java code the recursion stops by "doing nothing". In the MyHDL code, the equivalent is to assign the input to the output signal. We will need a "feedthrough" circuit for this purpose.

Now we are ready to fill in the details, in the same order as in the original Java version.

We start by importing the `myhdl`

library, and by defining some boolean
constants:

from myhdl import * DESCENDING, ASCENDING = False, True

The `compare`

leaf module is a combinatorial circuit with two input and two
output signals. By default, the inputs are "fed through" to the outputs, but
under the same condition as in the Java code, they are interchanged:

def compare(a1, a2, z1, z2, dir): @always_comb def logic(): z1.next = a1 z2.next = a2 if dir == (a1 > a2): z1.next = a2 z2.next = a1 return logic

As mentioned, we will also need a feedthrough circuit, the equivalent of "doing nothing" in the Java code:

def feedthru(a, z): @always_comb def logic(): z.next = a return logic

Function `bitonicMerge`

describes a higher-level module that takes a list of
signals as its input and its output. Instead of manipulating array elements
based on indices as in Java, it generates the output from the input list of
signals. Note how an additional list of signals is introduced in the case of
`n > 1`

, and how the recursion stops with a `feedthru`

instance.

def bitonicMerge(a, z, dir): n = len(a) k = n//2 w = len(a[0]) if n > 1: t = [Signal(intbv(0)[w:]) for i in range(n)] comp = [compare(a[i], a[i+k], t[i], t[i+k], dir) for i in range(k)] loMerge = bitonicMerge(t[:k], z[:k], dir) hiMerge = bitonicMerge(t[k:], z[k:], dir) return comp, loMerge, hiMerge else: feed = feedthru(a[0], z[0]) return feed

Function `bitonicSort`

is written in a similar fashion:

def bitonicSort(a, z, dir): n = len(a) k = n//2 w = len(a[0]) if n > 1: t = [Signal(intbv(0)[w:]) for i in range(n)] loSort = bitonicSort(a[:k], t[:k], ASCENDING) hiSort = bitonicSort(a[k:], t[k:], DESCENDING) merge = bitonicMerge(t, z, dir) return loSort, hiSort, merge else: feed = feedthru(a[0], z[0]) return feed

Now we can define a top level module, for example to sort an array of 8 values. There is a slight restriction however. The Verilog convertor cannot handle lists of signals in a top level interface. (This is related to the fact that Verilog memories cannot be used as ports.) To make the design suited for conversion, we have to splice the lists into individual signals:

def Array8Sorter(a0, a1, a2, a3, a4, a5, a6, a7, z0, z1, z2, z3, z4, z5, z6, z7): a = [a0, a1, a2, a3, a4, a5, a6, a7] z = [z0, z1, z2, z3, z4, z5, z6, z7] sort = bitonicSort(a, z, ASCENDING) return sort

## Verification

Verifying the sorter design is easy enough. We set up a list of random data values, assign it to the inputs of the circuit, then sort the data list and compare it with the outputs.

from random import randrange from myhdl import * from bitonic import Array8Sorter def bench(): n = 8 w = 4 a0, a1, a2, a3, a4, a5, a6, a7 = inputs = \ [Signal(intbv(0)[w:]) for i in range(n)] z0, z1, z2, z3, z4, z5, z6, z7 = outputs = \ [Signal(intbv(0)[w:]) for i in range(n)] inst = Array8Sorter(a0, a1, a2, a3, a4, a5, a6, a7, z0, z1, z2, z3, z4, z5, z6, z7) @instance def check(): for i in range(100): data = [randrange(2**w) for i in range(n)] for i in range(n): inputs[i].next = data[i] yield delay(10) data.sort() assert data == outputs return inst, check def test_bench(): sim = Simulation(bench()) sim.run()

This test bench can be used with a unit test framework such as `py.test`

.

## Verilog generation and synthesis

The sorter design can be converted to Verilog as usual, with the `toVerilog`

function. See the Verilog code for the result. Note that it is a
"net list of always blocks", with all structure and recursion flattened out.

Of course, you will want to verify that the generated Verilog code is correct. This can be done by using Verilog co-simulation with the same test bench as used for the MyHDL code. You can find the details about the procedure here.

To get an idea of the hardware implementation characteristics, check out the synthesis results.