# Multiple Linear Solvers Introduced in *SmartSpice*

Acknowledging the need for more flexibility, **SmartSpice**
now provides three numerical methods for linear system solution. The additional
solvers increase capacity by minimizing memory requirements
and reduce the overall simulation time.

Historically, spice simulation tool vendors were locked into using the Berkeley Sparse linear system solver in their products due to its tight integration with the simulation engine, and sometimes even with the implementation of certain analyses and models.

Therefore, adding a better linear solver to a spice
package has proven extremely difficult unless "reverse integration"
is accomplished: the solver must be decoupled from the rest of the
spice package so that it presents a very clean interface to the simulator.
Silvaco has succeeded with this decoupling in **SmartSpice**
and avoided the runtime overhead that usually comes with this
kind of restructuring. Instead of introducing a new interface layer between the solver
and the simulator, Silvaco stripped away the old interface altogether
and replaced it with a clean one.

With this decoupling in place, a multitude of different solvers can be plugged in through the common solver interface without disturbing what customers are used to from previous versions of **SmartSpice**. This new approach to solving spice linear systems yielded immediate results.

What is to become the default *SmartSpice* solver is an enhanced version of the Berkeley Sparse1.3 solver. It reduces overall simulation time by 10% on average. This gain is achieved merely by using a more compact data structure than the original one provided by Berkeley Sparse. The main objective is to improve memory access patterns by placing data accessed around the same time in memory locations close to one another, a property known as spatial locality. This technique was coupled with a cache blocking approach that increases the percentage of useful data loaded into the processor's cache with each load request from main memory. Future versions of this solver library are expected to be faster still, as there is room for improvement, for instance by exploiting temporal locality.

Numerically well-conditioned circuits can also take advantage of a much faster solver that **SmartSpice** provides, called Speeds.

The Berkeley Sparse1.3 solver is still available for
completeness and backward compatibility. Our goal is to provide the matrix
inversion method best suited for each type of circuit. We will briefly
discuss each method's speed improvement and provide a broad guideline
for when each method is appropriate. The linear solver method can be chosen
from within the **SmartSpice** input deck using the statement:

.OPTION SOLVER=<method>

where <method> is

- speeds: to use the fast solver

- sparse: to revert to Berkeley Sparse1.3

- default: to use the default *SmartSpice* solver

1) The Default SmartSpice Solver

While retaining the same numerical properties as Berkeley
Sparse1.3, the default **SmartSpice** solver has been optimized
for speed. This solver relies on the stability of the structure of the
circuit matrix to optimize memory layout and data transfer in memory.
A special memory layout of the sparse matrix elements has been devised
to optimize data access during the linear solver phase. This layout is
used throughout the simulation in order to minimize cache misses during
the LU factorization and back-solve phases. The average gain in simulation
time with respect to the solver in previous releases of *SmartSpice* ranges between 5% and 15%. Whenever the matrix structure itself changes during the simulation, the improved matrix layout needs to be re-evaluated to make sure it is still optimal. In the very rare situations where this happens often during one simulation, the overhead becomes significant and the new solver will not show a large speed improvement.
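The benefit of a compact layout that is built once and reused while the matrix structure stays fixed can be illustrated with a minimal sketch. The article does not describe the actual SmartSpice data structure, so the code below uses the standard compressed sparse row (CSR) format purely as a stand-in; the function names `to_csr` and `csr_matvec` are hypothetical.

```python
# Illustrative only: CSR is a stand-in for a compact layout, not the
# actual SmartSpice data structure.

def to_csr(entries, n):
    """Pack {(row, col): value} into three contiguous arrays.

    The pattern (row_ptr, col_idx) is built once and can be reused as long
    as the matrix structure does not change; only `values` needs refreshing.
    """
    row_ptr = [0] * (n + 1)
    col_idx = []
    values = []
    for (r, c) in sorted(entries):        # row-major order keeps each row contiguous
        col_idx.append(c)
        values.append(entries[(r, c)])
        row_ptr[r + 1] += 1
    for r in range(n):                    # prefix sum turns per-row counts into offsets
        row_ptr[r + 1] += row_ptr[r]
    return row_ptr, col_idx, values


def csr_matvec(row_ptr, col_idx, values, x):
    """Multiply the CSR matrix by x, reading the arrays front to back."""
    y = [0.0] * (len(row_ptr) - 1)
    for r in range(len(y)):
        for k in range(row_ptr[r], row_ptr[r + 1]):
            y[r] += values[k] * x[col_idx[k]]
    return y


# A small 3x3 matrix given as scattered entries.
A = {(0, 0): 2.0, (0, 1): -1.0,
     (1, 0): -1.0, (1, 1): 3.0, (1, 2): -1.0,
     (2, 1): -1.0, (2, 2): 2.0}
row_ptr, col_idx, values = to_csr(A, 3)
y = csr_matvec(row_ptr, col_idx, values, [1.0, 1.0, 1.0])
```

Entries that are used together sit next to each other in `values`, which is the spatial locality the text refers to. If the sparsity pattern changes, `to_csr` has to be run again, which mirrors the re-evaluation overhead mentioned above.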

2) The Speeds Solver

One of the most time-consuming operations in the default
**SmartSpice** solver is the search for the best pivot element
at every step of the LU decomposition.
This choice of pivot element is meant to minimize the fill-in of the
resulting matrix, thereby minimizing memory usage, and to improve the
stability of the LU decomposition. With some circuits, this is overkill,
which led Silvaco to implement a much faster solver that takes advantage
of a circuit's stability to speed up the simulation.

With the Speeds solver, the "pivrel" and "pivtol" parameters are used to check that the pivot is a good pivot, while bypassing the time-consuming pivot search. This solver proves to be efficient in two types of situations:

- on circuits with a structurally stable and well-conditioned matrix,

- in cases where matrix reordering and factorization take more than 50% of the total simulation time.
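The idea of validating a pivot instead of searching for one can be sketched as follows. This is not the Speeds implementation, which the article does not detail: it is a small dense LU factorization that always takes the diagonal entry as pivot and only checks it. The thresholds follow the usual SPICE meaning of these options (pivtol as an absolute minimum, pivrel as a minimum ratio to the largest entry in the column), and the default values are those of Berkeley SPICE3, which may differ in SmartSpice. The function names and the error handling are hypothetical.

```python
# Illustrative only: a dense LU factorization that keeps the diagonal as
# pivot and merely validates it, instead of searching for the best pivot.
# Threshold semantics and defaults are assumed from Berkeley SPICE3.

def lu_fixed_pivots(a, pivrel=1e-3, pivtol=1e-13):
    """Factor `a` in place (L below the diagonal, U on and above it)."""
    n = len(a)
    for k in range(n):
        pivot = abs(a[k][k])
        col_max = max(abs(a[i][k]) for i in range(k, n))
        # Accept the pivot only if it is large enough in absolute terms
        # and relative to the largest entry left in its column.
        if pivot < pivtol or pivot < pivrel * col_max:
            raise ArithmeticError(f"pivot {k} rejected; fall back to a pivoting solver")
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    return a


def lu_solve(lu, b):
    """Forward and back substitution using the factors from lu_fixed_pivots."""
    n = len(lu)
    x = list(b)
    for i in range(n):                    # solve L y = b (unit diagonal)
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):          # solve U x = y
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


well_conditioned = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
x = lu_solve(lu_fixed_pivots(well_conditioned), [3.0, 2.0, 3.0])
```

On a diagonally dominant matrix like this one every diagonal pivot passes the check. A matrix with a zero or tiny diagonal entry raises the error, which corresponds to the case where a solver with a full pivot search is needed.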

2-1) For general circuits with well-conditioned matrices, the time gain varies between 5% and 25% of the total simulation time. Here are some examples of execution times with the two solvers on two benchmarks.

| Circuit | Speeds | default | ratio (%) |
|---|---|---|---|
| bench_44.sp | 28.88 sec. | 37.36 sec. | 77.30 |
| bench_62.sp | 326.50 sec. | 378.01 sec. | 86.37 |
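The ratio column is the Speeds time divided by the default time, expressed as a percentage. The short sketch below recomputes it from the timings in the table and derives the corresponding time gain; the helper name `ratio_and_gain` is only illustrative.

```python
# Times in seconds, copied from the table above: (Speeds, default).
timings = {"bench_44.sp": (28.88, 37.36), "bench_62.sp": (326.50, 378.01)}


def ratio_and_gain(speeds, default):
    """Return (Speeds/default ratio in %, time gain in %), rounded to 2 places."""
    ratio = 100.0 * speeds / default
    return round(ratio, 2), round(100.0 - ratio, 2)


results = {name: ratio_and_gain(*t) for name, t in timings.items()}
```

Both gains, roughly 22.7% and 13.6%, fall inside the 5% to 25% range quoted for well-conditioned circuits.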

2-2) In some cases, the Speeds solver can provide dramatic time gains because it completely bypasses the matrix reordering phase. For circuits that spend more than 50% of simulation time in matrix reordering or factorization, this can lower computation time considerably. These are sometimes circuits with many independent voltage sources that are connected to many devices.

Using a ".option acct" statement, one can determine which phase of the simulation takes the largest part of the total simulation time. Example output for a transient analysis:

<.option solver=default acct>

Total user time : 1993.330 seconds

Total system time : 1.220 seconds

<...>

equations (Circuit Equations) = 2540

loadtime (Load time) = 186.83

lutime (L-U decomposition time) = 310.83

reordertime (Matrix reordering time) = 1291.49

solvetime (Matrix solve time) = 23.36

transpoints (Transient timepoints) = 2061

Note that with the Speeds solver, the reordering and factorization phases take only 20% of the total simulation time. As a side effect, the total number of Newton iterations is halved, which accounts for the halved "loadtime".
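To apply the 50% guideline to an accounting listing, the share of each phase can be computed directly. The sketch below uses the figures from the ".option acct" listing above for the default solver; the variable and function names are illustrative and not part of SmartSpice.

```python
# Values copied from the ".option acct" listing above (seconds).
total_user_time = 1993.330
phases = {
    "loadtime": 186.83,
    "lutime": 310.83,
    "reordertime": 1291.49,
    "solvetime": 23.36,
}


def phase_shares(phases, total):
    """Return each phase's share of the total time, in percent."""
    return {name: 100.0 * t / total for name, t in phases.items()}


shares = phase_shares(phases, total_user_time)
reorder_and_factor = shares["reordertime"] + shares["lutime"]

# Rule of thumb from the text: above 50%, the Speeds solver is worth trying.
try_speeds = reorder_and_factor > 50.0
```

For this listing, reordering and LU decomposition together account for about 80% of the total user time, so the circuit is a clear candidate for the Speeds solver.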

3) The Berkeley Sparse1.3 solver

The Berkeley Sparse1.3 solver is retained for possible cases when:

- reordering and factorization phases take more than 50% of total simulation time,

- AND the Speeds solver fails to solve because the matrix is ill-conditioned.

Conclusion

**SmartSpice** now provides three numerical
methods for matrix inversion, allowing greater flexibility in adapting
the simulator to each input deck. In addition to the new *SmartSpice* default solver, stable circuits can benefit from a faster solver, which bypasses the reordering phase of the sparse matrix. The extraction of the linear solver functionality from the core spice simulator allows great flexibility for future developments. This flexibility paves the way for other linear solvers that scale much better with circuit size than current solvers. With proper preconditioning techniques, iterative solvers may well become the solution to the bottleneck on the way to a true multi-million transistor spice simulator.