Hints, Tips and Solutions - February 2007

Q. How Can I Significantly Reduce Circuit Parasitics Netlist Extraction Time?

A. SILVACO has recently released a new suite of parasitic extraction tools to meet the demands of state-of-the-art designers at the cell, circuit and chip levels. Having demonstrated the accuracy of these tools [1] [2] [3], SILVACO is now focusing on reducing simulation time by taking advantage of multi-CPU computing architectures.

The results presented here were obtained with STELLAR, a 3D field-solver-based full-chip capacitance extractor. The software uses an advanced numerical method, the fictitious domain method, which is based on decomposing the simulation domain into sub-domains. The parallel version of STELLAR accepts a command-line option -P n, which runs the m sub-domain simulations in parallel on n CPUs (n being the number of requested CPUs). The simulations were performed by STMicroelectronics, Crolles, France, on a 16-CPU Sun-Fire-V890 running SunOS 5.8. The layout used for this study had the following characteristics: 106 x 230 µm² area, 7 metal layers and 6 via layers. In the following text and figures, CPU time is the total on-CPU time as measured by the UNIX ps or top command, while wall-clock time is the real elapsed time, as measured by a watch.
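The difference between CPU time and wall-clock time can be illustrated with a minimal Python sketch. It is not part of STELLAR and does not reproduce the benchmark: the `measure` helper and the `workload` function are hypothetical stand-ins, and the timings come from Python's standard `time` module rather than from ps or top.

```python
import time

def measure(func):
    """Return (cpu_seconds, wall_seconds) spent running func()."""
    wall_start = time.perf_counter()   # real elapsed time
    cpu_start = time.process_time()    # CPU time of this process (user + system)
    func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu, wall

def workload():
    # Hypothetical stand-in for a solver run: some computation, then idle waiting.
    sum(i * i for i in range(200_000))
    time.sleep(0.2)  # sleeping consumes wall-clock time but almost no CPU time

cpu_s, wall_s = measure(workload)
print(f"CPU time: {cpu_s:.3f} s, wall-clock time: {wall_s:.3f} s")
```

In this single-process sketch the wall-clock time exceeds the CPU time because of the idle wait. In a parallel run the relationship is typically reversed: the CPU time summed over all processes can exceed the wall-clock time, which is why the two quantities are reported separately.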

The decomposition algorithm is robust: the extracted capacitance varies by only about 3% with the number of sub-domains, which is set by the decomposition step d (Figure 1). High values of d correspond to a low number of sub-domains. It is also clear from Figure 1 that CPU time increases with d. Consequently, d was set to 1.91, corresponding to 90 sub-domains. It was also verified that, for a given decomposition, the capacitance does not vary with the number of requested CPUs.

Figure 1. Capacitances and CPU time as a function of decomposition step d (1 CPU used).


As can be seen in Figure 2, the parallelization is very good (near the theoretical limit) for around 8 requested CPUs. Running on 12 CPUs reduces the run time by a factor of 9. The layout used for this benchmark was relatively small; the benefit of parallelizing over a larger number of CPUs may be even greater for larger structures. This result is achieved easily (no preliminary optimization runs) provided the decomposition step d is chosen such that a large number of sub-domains is obtained (90 in this case). In other words, the parallelization is fully exploited only if n is significantly lower than m.
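The arithmetic behind these statements can be sketched in a few lines of Python. The only figures taken from the text are the gain of 9 on 12 CPUs and the 90 sub-domains. The function names are illustrative, and the load-balance bound rests on a simplifying assumption not stated in the study: that all m sub-domains are independent and cost the same, so the busiest CPU processes ceil(m / n) of them.

```python
import math

def parallel_efficiency(gain, n_cpus):
    """Fraction of the ideal (linear) speed-up actually achieved."""
    return gain / n_cpus

def max_gain_equal_subdomains(m_subdomains, n_cpus):
    """Upper bound on the time gain, ASSUMING equal-cost independent sub-domains.

    The run finishes when the busiest CPU is done, and that CPU
    handles ceil(m / n) sub-domains.
    """
    return m_subdomains / math.ceil(m_subdomains / n_cpus)

# Reported result: gain of 9 on 12 CPUs.
print(parallel_efficiency(9.0, 12))        # 0.75

# With m = 90 sub-domains, the bound stays close to n while n << m ...
print(max_gain_equal_subdomains(90, 12))   # 11.25
# ... but falls well below n as n approaches m.
print(max_gain_equal_subdomains(90, 60))   # 45.0
```

Under this assumption, 60 CPUs could give a gain of at most 45 for 90 sub-domains, because some CPUs process two sub-domains while others sit idle after one. This is one way to read the requirement that n be significantly lower than m.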

Figure 2. Time Gain versus number of CPUs. The dashed lines show an ideal parallelization



SILVACO wishes to thank STMicroelectronics, Crolles, France, for its collaboration on this work.



  1. STELLAR: Process Based Parasitics Capacitance Extraction on Large Custom Cells, Simulation Standard Volume 14, Number 5, May 2004.
  2. QUEST: Inductance Optimization Using 3D Field Solver based on DoE Approach, Simulation Standard Volume 16, Number 2, February 2006.
  3. QUEST: Simulation and Characterization of High Frequency of Advanced MIM capacitors, Simulation Standard Volume 15, Number 10, November 2005.
