Control Flow

This section introduces the ideas you need to be familiar with to write good Spatial applications. Specifically, you will learn about

  • Control Level (inner vs outer)

  • Control Style (Pipe, Sequential, Parallel, and Stream annotations)

  • Retiming and Initiation Interval

  • Finite State Machine (FSM)



Spatial applications are composed of hierarchical control structures and primitives. Later tutorials cover specific syntax and constructs in the language; this section abstracts away most of those details to demonstrate the key concepts. Mastering these concepts is most of what it takes to squeeze the best performance out of the language, and the rest of the tutorials apply what is discussed here in a concrete way. Banking, buffering, and duplication are also important, but this section does not spend much time on their details.

To understand how you can apply knowledge from this section to build better apps, see the instrumentation guide.


Controller level


A controller in Spatial is either an “outer” controller or an “inner” controller. A controller in this context is essentially what the software world calls a loop. Each controller is instantiated as a counter chain and/or state machine, and encloses a collection of other nodes in the language. Specifically,

  • Inner controllers contain only primitives (e.g., arithmetic, memory access, muxing, etc.)

  • Outer controllers contain at least one other controller (e.g., Foreach, Reduce, FSM, etc.) and “transient” operations that do not consume FPGA resources (e.g., bit slicing, struct concatenation, etc.)

The accompanying diagram shows a specific example of a controller tree that we will use repeatedly in this tutorial. The snippet below is one example of Spatial code that could generate this tree:

Foreach(M by 1){j => 
  Foreach(K by 1){k => ... }
  Foreach(Q by 1){q => ... }
  Foreach(R by 1){r => ... }
}
[Figure: controller tree for the example above]
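
The inner/outer distinction can be illustrated with a small software model in plain Scala. This is only a sketch: `Node`, `Primitive`, `Controller`, and `ControlLevel` are hypothetical names that are not part of Spatial, and the primitives placed inside each child loop are placeholders, since the snippet above elides the loop bodies.

```scala
// Software model only; these types are hypothetical and not part of Spatial.
sealed trait Node
final case class Primitive(name: String) extends Node
final case class Controller(name: String, body: List[Node]) extends Node

object ControlLevel {
  // Inner: the body holds only primitives.
  def isInner(c: Controller): Boolean =
    c.body.forall(_.isInstanceOf[Primitive])

  // Outer: the body holds at least one other controller.
  def isOuter(c: Controller): Boolean = !isInner(c)

  // The example tree: one parent loop over j with three child loops.
  // The primitives are placeholders for the elided loop bodies.
  val k = Controller("Foreach k", List(Primitive("mul"), Primitive("add")))
  val q = Controller("Foreach q", List(Primitive("load")))
  val r = Controller("Foreach r", List(Primitive("store")))
  val j = Controller("Foreach j", List(k, q, r))
}
```

In this model, `ControlLevel.isOuter(ControlLevel.j)` holds because the loop over j encloses three controllers, while each of the three children is an inner controller.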

Control schedule

An outer controller can take one of five schedules. The accompanying animations demonstrate how these schedules work in the context of the previous example. The one exception is the FSM controller, which is explained in a later section.

  • Pipelined - Default schedule of controllers, if none is specified. The children of the controller will execute in a pipelined fashion. The counter for the controller only increments when all enabled children are complete. Each child observes a different counter value from the parent, based on its position in the sequence of children. The first child receives the latest value, and child n in the sequence observes the counter value from n iterations earlier. In other words, the initiation interval of the controller equals the runtime of the slowest child that was active in a given iteration. In steady state, the slowest child is entirely responsible for the runtime of the iteration. Note that this schedule has consequences for loop-carried dependencies and area usage, which are discussed in the Performance tutorials.
    To forcibly use this control schedule, annotate the controller with Pipe.<control>

  • Sequenced - The children of the controller will execute one at a time. Only after the last child finishes iteration i will the parent counter increment and issue iteration i+1 to the first child. In other words, the initiation interval of the controller equals the sum of the runtimes of all of its children. While it is slower than Pipe, it consumes fewer resources and could be a solution to loop-carried dependency issues. Each child is equally culpable for the total runtime of the controller.
    To forcibly use this control schedule, annotate the controller with Sequential.<control>

  • ForkJoin - The children of the controller will all execute in parallel. Each child will maintain its own copy of the parent’s counter, and increment that counter for its own local usage when it is complete. The runtime of the controller is dictated by the longest child.
    To forcibly use this control schedule, annotate the controller with Parallel.<control>. Note that simply using Parallel{} is shorthand for creating a loop that runs for one iteration and whose children will execute with this schedule.

  • Fork - Only one child of the controller will execute at a time. The only way to create a controller with this kind of schedule right now is to use if-then-else, which creates a controller that will only run for one iteration. You should think of this as one-hot mux encoding of the children.

  • Stream - This schedule refers to a controller that is pipelined at the “data” level. It is similar to a ForkJoin controller, except rather than allowing all of the children to execute freely, a child can only execute when all of its outbound streams are “ready” and all of its inbound streams are “valid.” The “ready” signal means that the outbound stream is ready to receive at least one element of data, and the “valid” signal means that the inbound stream has at least one element of data that has not been dequeued yet. Nodes that conform to this interface include FIFO-like memories and Buses that communicate with a peripheral like DRAM or other AXI-like interface.
    While this is the trickiest schedule to understand and use properly, it is probably the most important one in order to get optimal performance out of Spatial.
    To forcibly use this control schedule, annotate the controller with Stream.
    Note that the last child in the animation has no inbound or outbound streams, so it is allowed to run freely while the parent is enabled.
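
The schedules above can be compared with an idealized timing model in plain Scala. This is a sketch under stated assumptions, not how the compiler computes runtimes: it ignores controller overhead, treats each child's runtime as a fixed number of cycles, and `ScheduleModel` and its methods are hypothetical names that are not part of Spatial. The last method captures only the firing rule of the Stream schedule.

```scala
// Idealized software model; names are hypothetical and not part of Spatial.
object ScheduleModel {
  private def check(iters: Int, child: Seq[Int]): Unit =
    require(iters > 0 && child.nonEmpty, "need at least one iteration and one child")

  // Sequenced: children run one at a time, so each parent iteration costs the sum.
  def sequenced(iters: Int, child: Seq[Int]): Int = {
    check(iters, child)
    iters * child.sum
  }

  // ForkJoin: each child steps through all iterations on its own counter copy,
  // so the slowest child dictates the total.
  def forkJoin(iters: Int, child: Seq[Int]): Int = {
    check(iters, child)
    iters * child.max
  }

  // Pipelined: in step s, child c works on iteration (s - c). The parent counter
  // advances only when every active child is done, so a step costs the max.
  def pipelined(iters: Int, child: Seq[Int]): Int = {
    check(iters, child)
    val steps = iters + child.length - 1
    (0 until steps).map { s =>
      child.zipWithIndex
        .collect { case (cycles, c) if s - c >= 0 && s - c < iters => cycles }
        .max
    }.sum
  }

  // Stream: a child may fire only if every inbound stream is valid and every
  // outbound stream is ready. A child with no streams is always free to run.
  def canFire(inboundValid: Seq[Boolean], outboundReady: Seq[Boolean]): Boolean =
    inboundValid.forall(identity) && outboundReady.forall(identity)
}
```

For example, with hypothetical child runtimes of 10, 30, and 20 cycles and 4 parent iterations, the model gives 240 cycles for Sequenced, 150 for Pipelined (including pipeline fill and drain), and 120 for ForkJoin.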


Latency and Interval


The previous section discusses scheduling options for outer controllers. Since these schedules do not really map to inner controllers (except, loosely, for Sequential and Pipelined, which are discussed later), we describe inner controllers by their latency and initiation interval. The Spatial compiler includes a retiming pass which allows the generated design to meet higher clock rates. Each node has a latency model associated with it, and the compiler uses these models to compute the retiming on the dataflow graph and automatically inject appropriate delay lines.

  • Latency - The number of cycles that it takes for the rising edge of the datapath enable signal to reach the last primitive in the retimed controller body. For a controller that runs for only one iteration, this will be equal to the number of cycles that the controller runs for.

  • Initiation Interval - The number of clock cycles the controller must wait before it can increment the counter and issue the next iteration of the controller body. For a body that contains no dependency cycles, the initiation interval will be set to 1. This means that the controller can increment its counter and issue this new counter value to the first layer of primitives every clock cycle. The rest of the body is fully pipelined, so after a number of clock cycles equal to the latency, the body will generate one result per clock cycle. For controllers where there is a dependency cycle (e.g., a read from a memory followed by a write to that memory), the compiler determines the minimum number of clock cycles required for this dependency to resolve, with respect to the access pattern on the memory.

If you tag an inner controller as Sequential, this will automatically force the initiation interval of that controller to be equal to the latency. You can also use the Pipe(ii=desired_cycs) tag to force the initiation interval of the controller to be some desired number of cycles. You cannot control the latency of the controller, as this is calculated from the latency model and the nodes within a controller.
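
The relationship between latency, initiation interval, and total runtime of an inner controller can be sketched in plain Scala. This is an idealized model that follows from the definitions above (fill the pipeline once, then issue one iteration every initiation interval); it ignores stalls and controller overhead, and `InnerTiming` and its methods are hypothetical names that are not part of Spatial.

```scala
// Idealized software model; names are hypothetical and not part of Spatial.
object InnerTiming {
  // Fill the pipeline once (latency), then issue one new iteration every `ii` cycles.
  def cycles(iters: Int, latency: Int, ii: Int): Int = {
    require(iters > 0 && latency > 0 && ii > 0, "all arguments must be positive")
    latency + (iters - 1) * ii
  }

  // An inner controller tagged as Sequential has its II forced to its latency.
  def sequentialCycles(iters: Int, latency: Int): Int =
    cycles(iters, latency, ii = latency)
}
```

With a hypothetical latency of 12 cycles and 100 iterations, the model gives 111 cycles at an initiation interval of 1, 309 cycles at an initiation interval of 3, and 1200 cycles when the controller is Sequential. For a single iteration the result equals the latency, which matches the definition of latency above.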


Finite State Machine (FSM)


Spatial also supports arbitrary FSM controllers. These are useful if you want to create a loop that behaves like a software while loop, or take certain actions based on the value of a state. These kinds of controllers can be either inner controllers (both the action logic and next state logic are composed of only primitives) or outer controllers (there is at least one controller in the action logic or next state logic).

For outer controllers, the behavior of the FSM is similar to the Sequential schedule. For inner controllers, the behavior of the FSM is similar to an inner pipe with its initiation interval set equal to its latency. The next state evaluation does not begin until all of the action primitives have finished.
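
The while-loop behavior described above can be modeled in plain Scala. This is a software sketch of the semantics only, not the Spatial FSM syntax: `FsmModel.run` and its parameter names are hypothetical, and the model captures just the ordering rule that the next state is evaluated only after the action for the current state has finished.

```scala
// Software model of FSM semantics; names are hypothetical and not part of Spatial.
object FsmModel {
  // Repeats action-then-next while notDone(state) holds.
  // Returns the final state and the number of iterations that ran.
  def run[S](init: S)(notDone: S => Boolean)(action: S => Unit)(next: S => S): (S, Int) = {
    var state = init
    var iterations = 0
    while (notDone(state)) {
      action(state)       // all action logic finishes first
      state = next(state) // only then is the next state evaluated
      iterations += 1
    }
    (state, iterations)
  }
}
```

For example, `FsmModel.run(0)(_ < 4)(s => println(s))(_ + 1)` visits states 0 through 3 and returns `(4, 4)`. Under the same idealization as the inner-controller discussion, an inner FSM with its initiation interval equal to its latency would take roughly the number of iterations multiplied by the latency.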