Designing with the NIOS (Part 1)


Second-Order, Closed-Loop Servo Control

    Exchanging gifts during the holiday season is common in many parts of the world. Did you receive any gifts last winter? Were any of them exactly what you wanted? Well, in any case, I would like to talk about the perfect microprocessor for your next project. Altera's NIOS has the exact number and type of peripherals you need. It also has all the computing power you require, along with custom instructions tailored to you and your application. Memory? Well, of course it has the proper amount and type. Speed? Certainly, it's just as fast as your next application requires. Whoa! Time to insert the low-pass marketing filter, you say. All of this can't be true. What's the catch?

    Well, I recently completed a design using the NIOS embedded processor. What Altera has accomplished is to put a RISC CPU into an FPGA device. The CPU is entirely IP, which means you can tailor the features to meet your requirements. The line between CPU and peripherals has become transparent because there is a host of common devices that you can build into the CPU. In addition, support is provided for inclusion of your custom/proprietary modules.

    You can select the 16- or 32-bit CPU, external bus size, hardware or software multiply, instruction queue size, big or little endian, internal stack support, and custom instructions. The memory can be internal or external to the FPGA.

    The NIOS peripheral library consists of all the peripherals you would ever need. Some of the more common ones, which are available as part of the tools, include: a UART, timer, parallel I/O (PIO), serial peripheral interface (SPI), direct memory access (DMA), memory interfaces, Ethernet port, and interface to user logic. The other more sophisticated devices are available as licensed cores. Devices such as encryption and FFT transforms can be included inside the device.

    And last but not least, you can choose from several families of FPGA devices to meet your speed, cost, and packaging requirements.  Refer to the Resources section at the end of this article for links to information about a wide range of development systems.

    You are wondering about the software. Well, the development kits include the following: GNU C compiler (GCC) and GNU C++ compiler (G++); GNU debugger (GDB) source and assembly-level debugger; GNU assembler (GAS); GNU linker (LD); Insight GUI for GNU debugger; GNU software code profiler (GPROF); and NIOS processor-specific binary utilities.

    Well, did I stretch the truth? Is this the next major advance in embedded system design, or is it just the latest twist and turn leading nowhere? In this two-part series I'll take you through a system design and let you draw your own conclusions.

SERVO CONTROL


    What to design? I wanted to select a design complicated enough to blow your socks off. How about a second order, closed-loop servo control? I’ll do a single axis and let you add additional axes to meet your specific requirements.

    Let's begin with a list of system requirements. I just had a meeting with marketing and this is what came out of that get-together: First, the component cost is $50, including the PCB, for the control portion (excluding the servo amp, motor, and encoder). Second, the control loop should operate at a minimum of 20 MHz. Third, the system should be field-reconfigurable. Fourth, the system should have one serial port and one parallel port for customer interface. Fifth, the system should have one serial port debug interface and a 5-V power source. Sixth, the system needs to operate between 0° and 70°C (an inside environment but no A/C). Seventh, the system might need an Ethernet interface. (The market will let you know.) Eighth, the system should have 12 D/A converters to interface to the servo amp. Ninth, the system should interface to quadrature encoder input. And tenth, we have a show to make in nine months. It must be done by then or else forget about it.

    Does this sound familiar? Scary, isn't it? But it's more like the real world. And as a responsible engineer, you would probably say yes to this request. The System-on-a-Chip is the main point of this article, so I won't go into great detail about the control system. But I hope to provide enough guidance so you can take what's  presented and run with it.

    The development tools basically consist of QuartusII, SOPC Builder, and a C++ compiler. Quartus is the tool (Quartus II Ver 3.0 is the latest) that actually converts your design to FPGA configuration data. SOPC Builder pulls together the CPU and peripherals that you select for your system. And the C compilers and assemblers generate code that runs on your specific device.

DEVELOPMENT OPTIONS

    Altera offers development systems with a prototyping board and software tools for $995. Let me also add that Altera has been extremely aggressive with its pricing. I've seen discounts on this system at seminars. So, ask Altera yourself before you write this number in your budget. 

    Altera offers a free student system and a free ’Net-based QuartusII design tool. (I haven't used it so I don't know if there is a turnaround time penalty.) So, it looks like Altera has you covered.

    I started in Quartus by creating a project in which I generated a schematic called TopLevelPins (see Figure 1). Don't you like that name? Very descriptive. I've seen FPGA designs that actually hide the device pinouts, so you have to spend hours looking for them. It's interesting to note that Altera does not select page sizes such as A, B, C, and D. It has you place symbols and then size for the printer and paper size available. It's a little odd at first, but it saves time over the life of the project. Also note that Altera refers to this file as a schematic or block diagram type. The extension is bdf. The movement is away from the detailed gate placement toward selecting larger blocks from a list. Let the FPGA compiler edit out what isn't used. On this top level, I placed a title block to identify the document.

NIOS CPU


    Let's use SOPC Builder to create a NIOS CPU. SOPC Builder is a graphical interface that lets you select features for the CPU and peripherals. When I went to select the NIOS processor, I noticed you could also pick an 8051, Z80, or 6811. What's the world coming to? A word of caution: these are not free devices; they require some sort of licensing.

    Take a look at Photo 1 to see what the SOPC Builder interface has to offer. The System Contents tab is selected. Only a few of the components are visible in the left window. In the main window, you can see that I targeted this design for an ACEX1K device with a 20-MHz clock frequency. Also, notice the following modules: NIOS CPU, tri-state bridge, SRAM interface, UART0, UART1, Timer0, SetPos PIO, SetTgt PIO, SetCtrl PIO, ReadPOS PIO, ReadStatus PIO, and flash memory interface.

    The More NIOS Settings tab lets you further define the CPU as a 32-bit unit with a 20-bit external address bus and 16-bit external memory bus. Because I was building a motion system, I suspected I would need CPU multiplies on the fast side. So, I selected hardware support in the FPGA for multiplies. This used more gates but executed faster. I selected modules that matched the designs for the development boards for the SRAM and flash memory interfaces. I had used them before, and they worked just fine. At this point, I ignored the detailed settings for the interrupt vectors, restart vector, vector tables, and other such entries.

    The UARTs have a fixed data rate of 38,400 bps and 8N1. You have the option of making these parameters fixed (less FPGA resources) or variable (more resources) from the CPU. Every time you select an option on the menu, the number of logic elements appears at the bottom of that window. This gives you some idea about the consequences of your choices.

    I decided on three output and two input PIO ports for the CPU to interface the motion system. The basic axis design has a 32-bit target register set by the CPU and a 32-bit position register controlled by the encoder inputs. These are both 32-bit registers that can be set using the PIO outputs from the CPU. I added a 32-bit input register so the CPU could read the Position register. I then added two 16-bit registers. One is used for control output and one for status input. You'll need inputs (e.g., home and limits) and outputs (e.g., zero the position register, zero the target register, and output LEDs) to indicate status.
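
    As a rough sketch of how the software might see these five ports, the C fragment below groups them into one structure per axis. The port names follow the SetPos, SetTgt, SetCtrl, ReadPOS, and ReadStatus modules listed earlier, but the structure layout, the bit assignments, and the helper function names are all assumptions made for illustration. They are not what SOPC Builder generates.

```c
#include <stdint.h>

/* Hypothetical software view of the five PIO ports for one axis.
 * Field order and bit assignments are illustrative assumptions. */
typedef struct {
    volatile uint32_t set_pos;     /* out: value loaded into the position register */
    volatile uint32_t set_tgt;     /* out: value loaded into the target register   */
    volatile uint16_t set_ctrl;    /* out: control bits                            */
    volatile uint32_t read_pos;    /* in:  current position register               */
    volatile uint16_t read_status; /* in:  home and limit switches                 */
} axis_pio_t;

/* Assumed control bits (outputs). */
#define CTRL_ZERO_POS   0x0001u
#define CTRL_ZERO_TGT   0x0002u

/* Assumed status bits (inputs). */
#define STAT_HOME       0x0001u
#define STAT_LIMIT_POS  0x0002u
#define STAT_LIMIT_NEG  0x0004u

/* Write a signed destination to the target register. */
void axis_set_target(axis_pio_t *axis, int32_t target)
{
    axis->set_tgt = (uint32_t)target;
}

/* Read the position counter back as a signed count. */
int32_t axis_position(const axis_pio_t *axis)
{
    return (int32_t)axis->read_pos;
}

/* Nonzero if either limit input is active. */
int axis_at_limit(const axis_pio_t *axis)
{
    return (axis->read_status & (STAT_LIMIT_POS | STAT_LIMIT_NEG)) != 0;
}

/* Request that both the position and target registers be zeroed. */
void axis_request_zero(axis_pio_t *axis)
{
    axis->set_ctrl |= (uint16_t)(CTRL_ZERO_POS | CTRL_ZERO_TGT);
}
```

    On real hardware, each port would sit at whatever address the system assigns it. Passing a pointer keeps the helpers independent of that choice and lets you exercise them on a PC against an ordinary variable.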

    I added a timer module because there is probably a real-time requirement in the design. I configured the timer as free running. I have described in previous articles how to derive timers without interrupts from such a module.
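
    One common way to build software timers on top of a free-running counter, without interrupts, is to compare tick counts using unsigned subtraction. The sketch below assumes a 32-bit up-counter that the software can read at any time. The function name and the counter width are assumptions, and this is not necessarily the approach from those earlier articles.

```c
#include <stdint.h>

/* Returns nonzero once at least `interval` ticks have passed since `start`.
 * Unsigned subtraction wraps modulo 2^32, so the comparison stays correct
 * when the free-running counter rolls over between `start` and `now`. */
int timer_elapsed(uint32_t now, uint32_t start, uint32_t interval)
{
    return (uint32_t)(now - start) >= interval;
}
```

    In a polling loop you would record the counter value as `start`, then call `timer_elapsed()` with a fresh reading on each pass. If the counter runs from the 20-MHz system clock, 20 ticks correspond to 1 µs.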

    I locked the RAM at location 0x00000000, the peripherals above the RAM, and the flash memory at location 0x00100000. The system automatically assigns these locations. The Generate button at the bottom of the window is the last step.

    I hope you can see how straightforward it was to generate a custom CPU. This particular first pass does not meet all of the requirements. But I wanted to show you the process and then explain how easy it is to handle changes. Also, it's advisable to get a reading on the number of logic elements.

MOVING ON

    The basic system design I had in mind included a position register, a target register, encoder inputs, D/A outputs, and some sort of velocity feedback. Well, in the time it took to draw the block diagram I could have designed the FPGA hardware. This is a hierarchical design. The main block for each axis contains an AQUADB block and an error-generation block (see Figure 2).

    Here is a hint as to what took place. I designed one axis and then made a component out of it. I placed that component in the design once for every axis I required.

    The AQUADB block reads the A and B encoder inputs and generates an Up or Down signal along with a Count Enable signal (see Figure 3). The CPU sets the target register (32 bits), and that's the destination I wanted the motion system to achieve. The position register (32 bits) is a counter that counts up or down as driven by the AQUADB block (see Table 1).
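
    The behavior of such a decoder can be modeled in a few lines of C, which is handy for checking encoder sequences before committing to hardware. This is only a software sketch of a standard 4x quadrature decoder, not the AQUADB logic itself. The type and function names are hypothetical, and the choice of which rotation direction counts up is an assumption.

```c
#include <stdint.h>

/* Lookup table indexed by (previous state << 2) | current state,
 * where a state is (A << 1) | B.
 * +1 = count up, -1 = count down, 0 = no change or an invalid jump
 * (both inputs changed at once). Assumed "up" sequence: 00, 01, 11, 10. */
static const int8_t quad_step[16] = {
    /* prev 00 */  0, +1, -1,  0,
    /* prev 01 */ -1,  0,  0, +1,
    /* prev 10 */ +1,  0,  0, -1,
    /* prev 11 */  0, -1, +1,  0
};

typedef struct {
    uint8_t prev;      /* last sampled A/B state              */
    int32_t position;  /* models the 32-bit position register */
} quad_decoder_t;

/* Sample the A and B inputs once per clock and update the position. */
void quad_sample(quad_decoder_t *d, int a, int b)
{
    uint8_t curr = (uint8_t)(((a ? 1 : 0) << 1) | (b ? 1 : 0));
    int8_t step = quad_step[(d->prev << 2) | curr];

    /* A nonzero step plays the role of Count Enable; its sign is Up or Down. */
    d->position += step;
    d->prev = curr;
}
```

    Feeding the sequence 01, 11, 10, 00 into a decoder that starts in state 00 advances the position by four counts. Running the sequence backward brings it down again.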

    The error-generation block (32 bits) takes the difference between the position and the target. I assumed I had a 12-bit D/A converter to drive the analog portion of the design. If the error is larger than 11 bits (2047) or less than –11 bits (–2047), you need to limit the difference to those numbers. So, a difference of more than 2047 counts will have a D/A output of 2047 counts. Then, as the position nears the target and the difference is less than 2047 counts, the actual difference will be output to the D/A converter. My design used two.
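
    The limiting rule can be written as a short C function that mirrors what the hardware block computes. Treat it as a model under stated assumptions: the function name is hypothetical, the sign convention (target minus position) is a guess, and the 64-bit intermediate is a software convenience to avoid overflow rather than a description of the 32-bit hardware subtractor.

```c
#include <stdint.h>

#define ERR_LIMIT 2047  /* largest magnitude passed to the 12-bit converter */

/* Difference between target and position, clamped to +/-ERR_LIMIT. */
int16_t servo_error(int32_t target, int32_t position)
{
    /* Widen first so that extreme 32-bit inputs cannot overflow. */
    int64_t diff = (int64_t)target - (int64_t)position;

    if (diff > ERR_LIMIT) {
        return ERR_LIMIT;
    }
    if (diff < -ERR_LIMIT) {
        return -ERR_LIMIT;
    }
    return (int16_t)diff;
}
```

    For example, a target 100,000 counts ahead of the position yields 2047, while a target 10 counts behind yields –10.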

    The Altera Quartus tools have built in macros for common functional blocks. In Figure 2, I used a D flip-flop that is 32 bits wide with a synchronous load and clear for the target register. I also used a macro for the 32-bit up/down counter with count enable and synchronous load and clear.

    In Figure 4, I used a macro for the 32-bit difference function and macros for the limit comparisons. So, if the error is within +/-11 bits, the error amount is passed through the mux. If the error amount is greater than 11 bits, the mux outputs the 11-bit limit. And it's the same with the –11-bit limit.

    As you can see, I started out with a 32-bit design. That's a lot of travel: 2**31, or 2,147,483,648, counts in either direction. I probably need only 24 bits, which gives 2**23, or 8,388,608, counts in either direction. It would be easy to edit the Altera macros and select 24 bits instead of 32.

    Another thing that's going to happen is that the output from Figure 2 is 32 bits, but I'm going to hook up to a 12-bit D/A converter. So, some bits will go nowhere. The tools will start ripping out unused logic. I expect Err[31..0] will be reduced to Err[11..0], and Err[31..12] will be removed. If that's the case, the mux driving Err[31..12] also will be reduced, and the process will continue back until it encounters gates that are needed.

    Another important note: this design is synchronous. The position counter counts one count every clock. The position output to the diff and the diff to the limit checking and mux must take place in one clock cycle. If you use 20 MHz, it takes 50 ns to get through all the differencing and limit logic. This might cause a bottleneck. If it is, I would add a register to break up the timing. However, this would add another 50 ns of delay to the system. But let's wait and see what the timing produces.
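
    The arithmetic behind that budget is simple enough to capture in a few helper functions. The sketch below is illustrative only: the function names are hypothetical, and any path delay you pass in would have to come from the timing report rather than from this article.

```c
/* Clock period in nanoseconds for a frequency given in MHz. */
double clock_period_ns(double f_mhz)
{
    return 1000.0 / f_mhz;
}

/* Nonzero if a register-to-register path with the given delay
 * fits within one clock period at the given frequency. */
int path_fits(double path_delay_ns, double f_mhz)
{
    return path_delay_ns <= clock_period_ns(f_mhz);
}

/* Total latency through `stages` registered stages, one clock each. */
double pipeline_latency_ns(int stages, double f_mhz)
{
    return stages * clock_period_ns(f_mhz);
}
```

    At 20 MHz the period works out to 50 ns. Splitting the difference and limit logic into two registered stages would therefore give 100 ns from the position counter to the limited error output.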


DIRECTORY STRUCTURE

    Let's look at the project's directory structure. My drive looks something like what you see in Figure 5, where CircuitCellar is the customer, NIOS2Axis is the project, and V00-00 is the version. The next minor version would be V00-01, and I would copy all the files under V00-00 to V00-01. db is Quartus stuff. NIOS2Axis_sim is the simulation directory, and NIOS_0_sdk is the software development directory for the first NIOS CPU in the design. Yes, there could be more than one. I told you this was everything you could ever hope for.

    The lib directory contains all of the library files that you would link into the C code built specifically for your CPU. It also contains a version of these files built for debugging. These files support functions such as sprintf, getc, and putc.

    The source directory has some source files generated by the Quartus system targeted for the development boards as well as peripheral test files that you can use on your custom CPU. Figure 1 contains the NIOS CPU and one axis of motion control. I compiled this for the ACEX1K family and it fit as reported in Table 2.

    The maximum clock frequency is reported as 29.24 MHz, which is suspiciously low. When I targeted the design for the Cyclone family of devices, I got a maximum clock frequency of 70 MHz. But I used only 3000 to 4000 of the 6000 total elements. This is overkill. I looked for the longest delay path. It was in the hardware multiply. So, I changed that configuration from all hardware to part hardware and part software multiply. The design then fit into the EP1K100QC208-3 (the slowest and least expensive). The maximum frequency is 36 MHz.

    Next month I’ll add velocity feedback for a second-order system and describe the software.


George Martin began his career in the aerospace industry in 1969. After five years at a real job, he set out on his own and cofounded a design and manufacturing firm (www.embedded-designer.com). George’s designs typically include servo-motion control, graphical input and output, data acquisition, and remote control systems. George is a charter member of the Ciarcia Design Works Team. He’s currently working on a mobile communications system for the military. In his spare time, George has become a nationally ranked revolver shooter. You can reach him at george.martin@att.net.

RESOURCES
Information about development systems,
    www.parallax.com/html_pages/products/altera/smartpacks.asp
    www.altera.com/products/devices/NIOS/kits/nio-dev_kits.html
    www.jopdesign.com/cyclone/index.jsp
    www.cepdinc.com/altera/cas10.htm
    and www.microtronix.com/products/cycldevkit.html

SOURCE
NIOS embedded processor, SOPC
Builder, QuartusII design software
Altera Corp.
(408) 544-7000
www.altera.com



If you found this article interesting you will find much more like these at Circuit Cellar.  You can obtain the complete article including all source listings at Circuit Cellar.  Look in the Products and Services section for back issues in print and CD


Copyright © 2003-2004 by ESC Inc. All rights Reserved
Last update: November 9, 2004