Parallel .ALTER Statements in SmartSpice


Parallel .ALTER

The .ALTER statement allows a SPICE input deck to be re-run with a change in a single parameter. This feature is especially useful in characterization, where users typically run a single deck many times over, changing one parameter each time. The individual runs are generally short, taking a few minutes to a few hours to complete. Often the circuit itself is not large, but because the runs execute one after another, the characterization as a whole takes a long time.
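To make the run count concrete, the following minimal Python sketch counts how many simulation runs a deck implies: the base run plus one per .ALTER. The deck text is a hypothetical illustration written in common SPICE style (details such as parameter syntax may differ in a real SmartSpice deck), and `count_runs` is a helper invented for this example, not part of SmartSpice.

```python
def count_runs(deck_text):
    """Return the number of runs a deck implies: the base run plus one per .ALTER."""
    alters = sum(
        1
        for line in deck_text.splitlines()
        # SPICE keywords are case-insensitive, so compare in lower case.
        if line.strip().lower().startswith(".alter")
    )
    return alters + 1


# Hypothetical deck, for illustration only: one base run and three .ALTERs,
# each changing the single parameter cload.
deck = """\
* RC delay characterization (illustrative only)
.param cload=10f
v1 in 0 pulse(0 1 0 1n 1n 10n 20n)
r1 in out 1k
c1 out 0 cload
.tran 0.1n 40n
.alter
.param cload=20f
.alter
.param cload=50f
.alter
.param cload=100f
.end
"""

print(count_runs(deck))  # prints 4
```

Run sequentially, these four simulations take roughly four times as long as one, even though each is small.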

It is increasingly popular to run such simulations on shared-memory multi-processor machines with anywhere from two to twelve CPUs. Characterization runs could be sped up significantly if the .ALTERs were spread over all the CPUs.


Separate Command

SmartSpice's '-P <n>' option, in conjunction with its 'separate' command (developed exclusively for SmartSpice), allows the user to distribute a deck with one or more .ALTERs over several CPUs. Operating in batch mode, SmartSpice takes an input deck containing .ALTER statements and farms out each .ALTER to a separate CPU. The user specifies the number of CPUs (<n>) to be used. To maximize efficiency, SmartSpice runs no more than <n> .ALTERs at a time. As each CPU finishes, SmartSpice gives it another .ALTER, until all have been processed.
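The dispatch policy described above (at most <n> runs active, each freed CPU receiving the next pending run) can be modeled with a small Python sketch. This is only a model of the policy as stated in the text, not SmartSpice's implementation; the function name `simulate_dispatch` and the run times are assumptions made for illustration.

```python
import heapq


def simulate_dispatch(run_times, n_cpus):
    """Model handing out runs in deck order to whichever CPU frees up first.

    Returns (total wall-clock time, list of CPU indices, one per run).
    At most n_cpus runs are active at any moment.
    """
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    # Min-heap of (time at which the CPU becomes free, CPU index).
    free_at = [(0.0, cpu) for cpu in range(n_cpus)]
    heapq.heapify(free_at)
    assignment = []
    makespan = 0.0
    for duration in run_times:
        start, cpu = heapq.heappop(free_at)   # earliest available CPU
        finish = start + duration
        assignment.append(cpu)
        makespan = max(makespan, finish)
        heapq.heappush(free_at, (finish, cpu))
    return makespan, assignment


# Hypothetical workload: four runs of 60 seconds each.
runs = [60.0, 60.0, 60.0, 60.0]
for n in (1, 2, 3, 4):
    total, cpus = simulate_dispatch(runs, n)
    print(n, total, cpus)
```

Under these assumptions, four equal runs take 240 s on one CPU, 120 s on two or three CPUs, and 60 s on four. The third CPU does not help in this model because the fourth run must wait for a CPU to become free.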

Each .ALTER statement produces a separate RAW file and a separate output file. Their names are identical to that of the original input deck, except that the extension is replaced by a suffix containing #, where # is the number of the .ALTER statement in the input deck.


Figure 1. Relative simulation time for a deck containing three .ALTER statements
as a function of the number of CPUs used. Actual performance of SmartSpice is almost
ideal. (The test machine is a four-CPU Sun machine running Solaris2.
Each .ALTER took about one minute to run.)




The resulting performance improvement is significant. For most decks the parallel speed-up is close to linear: given at least four runs of similar length, using four CPUs can make the overall simulation finish about four times faster. Figure 1 shows the execution time for an input deck with three .ALTERs (and therefore four separate input decks) as a function of the number of CPUs used, together with the theoretical ideal behavior. SmartSpice performance is close to ideal for decks taking longer than a few seconds to run.

By specifying the number of CPUs over which the decks are processed, a user can, for example, let a small number of CPUs share the available memory if the circuit is very large, or run one .ALTER on each CPU if the circuit is small. Managing multiple CPUs carries a small overhead (typically a few seconds per .ALTER), which should affect performance only on the shortest, smallest decks. For most realistic decks it is negligible.
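The effect of the per-.ALTER overhead can be estimated with a rough model. The sketch below assumes equal-length runs, a sequential baseline with no overhead, and a fixed overhead added to every run in the parallel case. The function `relative_time` and the numbers used (60 s and 2 s runs, 3 s overhead) are illustrative assumptions, not measured SmartSpice data.

```python
import math


def relative_time(n_runs, n_cpus, run_time, overhead=0.0):
    """Parallel wall-clock time divided by sequential wall-clock time.

    Assumes equal-length runs processed in ceil(n_runs / n_cpus) waves,
    with a fixed overhead (seconds) added to each run in the parallel case.
    """
    waves = math.ceil(n_runs / n_cpus)
    parallel = waves * (run_time + overhead)
    sequential = n_runs * run_time
    return parallel / sequential


# Four runs on four CPUs; the ideal relative time is 0.25.
print(relative_time(4, 4, 60.0, overhead=3.0))  # one-minute runs
print(relative_time(4, 4, 2.0, overhead=3.0))   # two-second runs
```

With one-minute runs the assumed 3 s overhead moves the relative time only from 0.25 to about 0.26, whereas with two-second runs it rises to about 0.63, which is consistent with the overhead mattering only for the shortest decks.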