The ZPU is a microprocessor stack machine designed by Norwegian company Zylin AS to run supervisory code in electronic systems that include a field-programmable gate array (FPGA).[1]

It was designed to require very little electronic logic, leaving more of the FPGA available for other purposes. To make it easily usable, it has a port of the GNU Compiler Collection (GCC), which makes it much easier to apply than CPUs without compilers. It sacrifices speed for small size: intermediate results of calculations are kept in memory, in a push-down stack, rather than in registers.[1]

Zylin AS made the ZPU open-source in 2008.[1]

Usage

Many electronic projects include electronic logic in an FPGA. Adding a separate microprocessor chip is wasteful, so it is commonplace to add a CPU to the logic in the FPGA instead. Often, a smaller, less expensive FPGA could be used if only the CPU used fewer resources. This is exactly the situation that the ZPU was designed to address.

The ZPU is designed to handle the miscellaneous tasks of a system that are best handled by software, for example, a user interface. The ZPU is very slow, but its small size leaves room in the FPGA for any high-speed algorithms that must be implemented in logic.

Another issue is that most CPUs for FPGAs are closed-source and available only from a particular maker of FPGAs. Occasionally a project needs a design that can be widely distributed, for security inspections, educational uses or other reasons. The licenses on these proprietary CPUs can prevent such uses. The ZPU is open-source, so it avoids these restrictions.

Some projects need code that must be small, but run on a CPU that inherently has larger code. Alternatively, a project may benefit from the wide selection of code, compilers and debugging tools for the GNU Compiler Collection. In these cases, an emulator can be written to implement the ZPU's instruction set on the target CPU, and the ZPU's compilers can be used to produce the code. The resulting system is slow, but packs code into less memory than many CPUs and enables the project to use a wide variety of compilers and code.[2]

Design features

The ZPU was designed explicitly to minimize the amount of electronic logic. It has a minimal instruction set, yet can still be targeted by the GNU Compiler Collection. It also minimizes the number of registers that must be in the FPGA, and therefore the number of flip-flops. Instead of registers, intermediate results are kept on the stack, in memory.[1]

It also has small code, saving on memory. Stack machine instructions do not need to contain register IDs, so the ZPU's code is smaller than that of other RISC CPUs, and is said to need only about 80% of the space of ARM Holdings' Thumb-2.[1] For example, the signed immediate instruction lets the ZPU store a 32-bit value in at most five bytes of instruction space, and in as few as one. Most RISC CPUs require at least eight bytes.
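The one-to-five-byte figure follows from each immediate instruction carrying seven bits of payload. The sketch below is illustrative only, not Zylin's assembler: the function names `encode_im` and `decode_im` are hypothetical, and it assumes that the first IM in a run pushes a sign-extended 7-bit value and that each following IM shifts the top of stack left by seven bits and appends its payload.

```python
def encode_im(value: int) -> bytes:
    """Encode a signed 32-bit constant as the shortest run of IM opcodes."""
    if not -(1 << 31) <= value < (1 << 31):
        raise ValueError("value must fit in a signed 32-bit integer")
    count = 1
    # Each IM carries 7 bits, so n of them hold a signed (7 * n)-bit number.
    while not -(1 << (7 * count - 1)) <= value < (1 << (7 * count - 1)):
        count += 1
    # Most significant chunk first; the top bit (0x80) marks the byte as IM.
    return bytes(0x80 | ((value >> (7 * i)) & 0x7F)
                 for i in reversed(range(count)))


def decode_im(code: bytes) -> int:
    """Replay a run of IM opcodes and return the signed 32-bit result."""
    first = code[0] & 0x7F
    tos = first - 0x80 if first & 0x40 else first  # sign-extend first chunk
    for byte in code[1:]:
        tos = (tos << 7) | (byte & 0x7F)  # assumed: later IMs append 7 bits
    tos &= 0xFFFFFFFF
    return tos - (1 << 32) if tos & 0x80000000 else tos
```

Under these assumptions, constants from -64 to 63 take a single byte, while values that need all 32 bits take five, because four IMs supply only 28 bits.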

Finally, about 2/3 of its instructions can be emulated by firmware implemented using the other 1/3 "required" instructions. Although the result is very slow, the resulting CPU can require as little as 446 lookup-tables (a measure of FPGA complexity, roughly equivalent to 1700 electronic logic gates).
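As an illustration of how optional instructions can be built from required ones, the following sketch expresses NEG, SUB and XOR using only ADD, AND, OR and NOT on 32-bit values. It is not the Zylin firmware: the function names are hypothetical, and the real emulation code operates on the stack rather than on function arguments.

```python
MASK32 = 0xFFFFFFFF  # the base ZPU has a 32-bit data path


# Stand-ins for four of the required instructions.
def op_add(a: int, b: int) -> int:
    return (a + b) & MASK32


def op_and(a: int, b: int) -> int:
    return a & b


def op_or(a: int, b: int) -> int:
    return a | b


def op_not(a: int) -> int:
    return ~a & MASK32


# Optional instructions expressed with the required ones only.
def emu_neg(a: int) -> int:
    # Two's complement negation: -a == ~a + 1
    return op_add(op_not(a), 1)


def emu_sub(a: int, b: int) -> int:
    # a - b == a + (-b)
    return op_add(a, emu_neg(b))


def emu_xor(a: int, b: int) -> int:
    # a ^ b == (a | b) & ~(a & b)
    return op_and(op_or(a, b), op_not(op_and(a, b)))
```

Each emulated operation costs several required instructions, which is consistent with the trade of speed for logic described above.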

The ZPU has a reset vector, consisting of 32 bytes of code space starting at location zero. It also has a single edge-sensitive interrupt, with a vector consisting of 32 bytes of code space beginning at address 32. Vectors 2 through 31 each have 32 bytes of space, but are reserved for code to emulate opcodes 34 through 63.
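The vector layout can be checked with a little arithmetic. The sketch below assumes the EMULATE_x encoding given in the instruction table (binary 001xxxxx, jumping to address x * 32); the function name `emulate_vector_address` is hypothetical.

```python
VECTOR_SIZE = 32  # bytes of code space per vector


def emulate_vector_address(opcode: int) -> int:
    """Return the code address that an EMULATE_x opcode jumps to."""
    if opcode >> 5 != 0b001:
        raise ValueError("not an EMULATE_x opcode (expected 001xxxxx)")
    vector = opcode & 0x1F  # the five x bits select the vector
    return vector * VECTOR_SIZE
```

For example, LOADH (binary 00100010, opcode 34) selects vector 2 at address 64, and CALLPCREL (binary 00111111, opcode 63) selects vector 31 at address 992.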

The base ZPU has a 32-bit data path. The ZPU also has a variant with a 16-bit-wide data path, to save even more logic.

Tools and resources

The ZPU has a well-tested port of the GNU Compiler Collection.[1] Enthusiasts and firmware engineers have ported eCos,[1] FreeRTOS[3] and μClinux.[4] At least one group of enthusiasts has copied the popular development environment of the Arduino and adapted it to the ZPU.[5]

There are now multiple models of the ZPU core. Besides the original Zylin cores,[1] there are also the ZPUino cores,[5] and the ZPUFlex core.[6] The Zylin core is designed for a minimal FPGA footprint, and includes a 16-bit version. The ZPUino has practical improvements for speed, can replace emulated instructions with hardware, and is embedded in a system-on-chip framework. The ZPUFlex is designed to use external memory blocks and can replace emulated instructions with hardware.

Academic projects include power efficiency studies and improvements,[7] and reliability studies.[8]

To improve speed, most implementors have implemented the emulated instructions, and added a stack cache.[5][6][7] Beyond this, one implementor said that a two-stack architecture would permit pipelining (i.e. improving speed to one instruction per clock cycle), but this might also require compiler changes.[7]

One implementor reduced power usage by 46% with a stack cache and automated insertion of clock gating.[7] The power usage was then roughly equivalent to that of the small open-source Amber core, which implements the ARMv2a architecture.

The parts of the ZPU that would be most aided by fault-tolerance are the address bus, stack pointer and program counter.[8]

Instruction set

"TOS" is an abbreviation of "top of stack". "NOS" is an abbreviation of "next to the top of stack", that is, the entry just below the TOS.

Required ZPU Instruction Set

| Name | Binary | Description |
|---|---|---|
| BREAKPOINT | 00000000 | Halt the CPU and/or jump to the debugger. |
| IM_x | 1xxxxxxx | Push or append a signed 7-bit immediate to the TOS. |
| STORESP_x | 010xxxxx | Pop the TOS and store it into the stack at an offset from the top. |
| LOADSP_x | 011xxxxx | Fetch from a value indexed in the stack and push it into the TOS. |
| EMULATE_x | 001xxxxx | Emulate an instruction with code at vector x (i.e. jump to address x * 32). |
| ADDSP_x | 0001xxxx | Fetch from a value indexed in the stack and add the value to the TOS. |
| POPPC | 00000100 | Pop an address from the TOS and store it to the PC. |
| LOAD | 00001000 | Pop an address and push the loaded memory value to the TOS. |
| STORE | 00001100 | Store the NOS into the memory pointed-to by the TOS. Pop both. |
| PUSHSP | 00000010 | Push the current SP into the TOS. |
| POPSP | 00001101 | Pop the TOS and store it to the SP. |
| ADD | 00000101 | Integer addition of TOS and NOS. |
| AND | 00000110 | Bitwise AND of the TOS and NOS. |
| OR | 00000111 | Bitwise OR of the TOS and NOS. |
| NOT | 00001001 | Bitwise NOT of the TOS. |
| FLIP | 00001010 | Reverse the bit order of the TOS. |
| NOP | 00001011 | No-Operation. (Usually used for delay loops or tables of code.) |
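To make the encodings above concrete, here is a minimal interpreter sketch for a handful of the required instructions (IM, ADD, AND, OR, NOT, FLIP, NOP and BREAKPOINT). It is a simplification under stated assumptions, not a faithful ZPU model: the stack is a Python list rather than memory addressed by SP, the function name `run` and the `idim` flag are hypothetical names, and it assumes that consecutive IM instructions append to the same value while any other instruction ends the run.

```python
MASK32 = 0xFFFFFFFF


def run(code: bytes) -> list[int]:
    """Execute a small subset of required ZPU opcodes; return the stack."""
    stack: list[int] = []
    idim = False  # assumed flag: True while inside a run of IM opcodes
    for op in code:
        if op & 0x80:  # IM_x: 1xxxxxxx
            chunk = op & 0x7F
            if idim:
                # Append 7 more bits to the value already on the TOS.
                stack[-1] = ((stack[-1] << 7) | chunk) & MASK32
            else:
                # Push the sign-extended 7-bit immediate.
                signed = chunk - 0x80 if chunk & 0x40 else chunk
                stack.append(signed & MASK32)
            idim = True
            continue
        idim = False
        if op == 0x00:    # BREAKPOINT: stop the sketch
            break
        elif op == 0x05:  # ADD
            stack.append((stack.pop() + stack.pop()) & MASK32)
        elif op == 0x06:  # AND
            stack.append(stack.pop() & stack.pop())
        elif op == 0x07:  # OR
            stack.append(stack.pop() | stack.pop())
        elif op == 0x09:  # NOT
            stack.append(~stack.pop() & MASK32)
        elif op == 0x0A:  # FLIP: reverse the 32-bit order of the TOS
            stack.append(int(f"{stack.pop():032b}"[::-1], 2))
        elif op == 0x0B:  # NOP
            pass
        else:
            raise NotImplementedError(f"opcode {op:#04x} not in this sketch")
    return stack


# IM 2, NOP, IM 3, ADD, BREAKPOINT
print(run(bytes([0x82, 0x0B, 0x83, 0x05, 0x00])))  # prints [5]
```

The NOP between the two IM instructions matters under the assumption above: without it, the second IM would extend the first constant instead of pushing a new one.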

Opcodes 34 to 63 may be emulated by code in vectors 2 through 31: LOADH and STOREH (16-bit memory access), LESSTHAN (comparisons set 1 for true, 0 for false), LESSTHANOREQUAL, ULESSTHAN, ULESSTHANOREQUAL, SWAP (TOS with NOS), MULT, LSHIFTRIGHT, ASHIFTLEFT, ASHIFTRIGHT, CALL, EQ, NEQ, NEG, SUB, XOR, LOADB and STOREB (8-bit memory access), DIV, MOD, EQBRANCH, NEQBRANCH, POPPCREL, CONFIG, PUSHPC, SYSCALL, PUSHSPADD, HALFMULT and CALLPCREL.[5][9]

Optional (emulated) Instructions

| Name | Binary | Description |
|---|---|---|
| LOADH | 00100010 | Load short |
| STOREH | 00100011 | Store short |
| LESSTHAN | 00100100 | Signed less-than comparison |
| LESSTHANOREQUAL | 00100101 | Signed less-than-or-equal comparison |
| ULESSTHAN | 00100110 | Unsigned less-than comparison |
| ULESSTHANOREQUAL | 00100111 | Unsigned less-than-or-equal comparison |
| SWAP | 00101000 | Swap TOS and NOS |
| MULT | 00101001 | Signed 32-bit multiplication |
| LSHIFTRIGHT | 00101010 | Logical shift right |
| ASHIFTLEFT | 00101011 | Arithmetic shift left |
| ASHIFTRIGHT | 00101100 | Arithmetic shift right |
| CALL | 00101101 | Call function |
| EQ | 00101110 | Comparison equal |
| NEQ | 00101111 | Comparison not equal |
| NEG | 00110000 | Negative |
| SUB | 00110001 | Subtract |
| XOR | 00110010 | Exclusive-OR |
| LOADB | 00110011 | Load byte |
| STOREB | 00110100 | Store byte |
| DIV | 00110101 | Signed 32-bit division |
| MOD | 00110110 | Signed 32-bit modulus |
| EQBRANCH | 00110111 | Branch if equal |
| NEQBRANCH | 00111000 | Branch if not equal |
| POPPCREL | 00111001 | Pop PC relative |
| CONFIG | 00111010 | Internal configuration |
| PUSHPC | 00111011 | Push current PC |
| SYSCALL | 00111100 | System Call |
| PUSHSPADD | 00111101 | Push SP + offset |
| HALFMULT | 00111110 | |
| CALLPCREL | 00111111 | Call relative function |

References

  1. "ZPU - the worlds [sic] smallest 32-bit CPU with GCC toolchain : Overview". opencores.org, Zylin Corp. Retrieved 7 February 2015.
  2. "ZOG - A ZPU processor core for Propeller with GNU C + Fortran". Parallax Forum. Parallax. Retrieved 6 September 2019.
  3. Antonio, Anton. "ZPUino-HDL/zpu/sw/freertos/". GitHub. Antonio Anton. Retrieved 7 February 2015.
  4. Lopes, Alvaro. "alvieboy/Linux". GitHub. Alvaro Lopes. Retrieved 7 February 2015.
  5. Lopes, Alvaro. "ZPUino". www.alvie.com. Archived from the original on 19 February 2014. Retrieved 7 February 2015.
  6. AMR. "ZPU Flex". Retro Ramblings. Retrieved 9 February 2015.
  7. Eriksen, Stein Ove. "Low Power microcontroller core". NTNU Open. Norges teknisk-naturvitenskapelige universitet. Archived from the original on 9 February 2015. Retrieved 9 February 2015.
  8. Zandrahimi, M. (2010). "An analysis of fault effects and propagations in ZPU: The world's smallest 32 bit CPU". 2nd Asia Symposium on Quality Electronic Design (ASQED). IEEE. pp. 308–313. doi:10.1109/ASQED.2010.5548320. ISBN 978-1-4244-7809-5. S2CID 20045721.
  9. zylin. "zpu/zpu/docs/zpu_arch.html at master · zylin/zpu". GitHub. Retrieved 10 June 2026.
