



# Timing Analysis and Characterization for Full Custom IP-blocks

Sergey Gavrilov IPPM RAS / MIET

#### Outline

- 1. Contemporary technologies & IP blocks design problems
- 2. Deterministic and statistical timing analysis
- 3. Digital noise analysis problems
- 4. Logic cell characterization
- 5. Memory cell characterization
- 6. Decomposition problems for IP blocks
- 7. Logic correlation analysis for timing and noise estimation
- 8. Input stimulus generation for IP blocks
- 9. IP blocks characterization speed-up
- 10. Future technologies problems

#### Outline

#### 1. Contemporary technologies & IP blocks design problems

- 2. Deterministic and statistical timing analysis
- 3. Digital noise analysis problems
- 4. Logic cell characterization
- 5. Memory cell characterization
- 6. Decomposition problems for IP blocks
- 7. Logic correlation analysis for timing and noise estimation
- 8. Input stimulus generation for IP blocks
- 9. IP blocks characterization speed-up
- 10. Future technologies problems



#### **Transistor Scaling Trends**



Moore's Law: transistor density approximately doubles every two years.

 Smaller transistors give improved performance, reduced power and lower cost per transistor.



# Interconnects Dominate the Transistors: Delay (ITRS Data)

| Technology         | Gate / transistor delay | Interconnect delay (L_int = 1 mm) |
|--------------------|-------------------------|-----------------------------------|
| 1.0 µm (Al, SiO2)  | ~ 20 ps                 | ~ 1 ps                            |
| 100 nm (Cu)        | ~ 5 ps                  | ~ 30 ps                           |
| 35 nm (Cu)         | ~ 2.5 ps                | ~ 250 ps                          |



### **Data ITRS, Delay**



## **Buffered Interconnects:**

Critical path without buffered interconnects



Metal layers versus technology node:

| Metal layer | 90 nm | 65 nm | 45 nm | 32 nm |
|-------------|-------|-------|-------|-------|
| m3          | 0.43  | 0.24  | 0.14  | 0.08  |
| m6          | 1     | 0.56  | 0.32  | 0.19  |

#### **Parameter Variations in the Nanometer Range**



### **Threshold Voltage and its Variations Scaling**



#### **Leff and its Variations Scaling**



| Year            | 1997 | 1999  | 2002  | 2005 | 2006 |
|-----------------|------|-------|-------|------|------|
| Leff (nm)       | 250  | 180   | 130   | 100  | 70   |
| 3σ(Leff) (nm)   | 80   | 60    | 50    | 40   | 30   |
| 3σ / Leff (%)   | 32%  | 33.3% | 38.5% | 40%  | 43%  |

# **Transistor Width W and its Variation Scaling**



| Year       | 1997 | 1999 | 2002 | 2005 | 2006  |
|------------|------|------|------|------|-------|
| W (µm)     | 0.8  | 0.55 | 0.5  | 0.4  | 0.3   |
| 3σ(W) (µm) | 0.2  | 0.17 | 0.14 | 0.12 | 0.1   |
| 3σ / W (%) | 25%  | 31%  | 28%  | 30%  | 33.3% |

#### **Tox and its Variation Scaling**



| Year          | 1997 | 1999 | 2002  | 2005 | 2006 |
|---------------|------|------|-------|------|------|
| Tox (nm)      | 5.0  | 4.50 | 4.00  | 3.50 | 3.0  |
| 3σ(Tox) (nm)  | 0.4  | 0.36 | 0.39  | 0.42 | 0.48 |
| 3σ / Tox (%)  | 8%   | 8%   | 9.75% | 12%  | 16%  |

# IP-Blocks Timing Analysis and Characterization Trends



### Outline

- 1. Contemporary technologies & IP blocks design problems
- 2. Deterministic and statistical timing analysis
- 3. Digital noise analysis problems
- 4. Logic cell characterization
- 5. Memory cell characterization
- 6. Decomposition problems for IP blocks
- 7. Logic correlation analysis for timing and noise estimation
- 8. Input stimulus generation for IP blocks
- 9. IP blocks characterization speed-up
- 10. Future technologies problems



## **Static Timing Analysis (STA)**

#### Two passes:

- Forward propagation (from primary inputs to primary outputs):
  - LAT (Latest Arrival Time), for setup constraint analysis
  - EAT (Earliest Arrival Time), for hold constraint analysis
- Backward propagation (from primary outputs to primary inputs):
  - LRT (Latest Required Time)
  - ERT (Earliest Required Time)

The interval **[EAT, LAT]** is the actual arrival window.

The interval **[ERT, LRT]** is the required window.

output[j].LAT = MAX over i (input[i].LAT + gate.delay[i][j])

input[i].LRT = MIN over j (output[j].LRT - gate.delay[i][j])
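The two propagation passes can be sketched in a few lines of Python. This is a minimal illustration on a hypothetical four-arc delay graph; the function name `sta`, the node names and the delay values are assumptions made for the example, and ERT is omitted for brevity.

```python
from collections import defaultdict

def sta(arcs, arrival, required):
    """arcs: (src, dst, delay) edges of an acyclic delay graph.
    arrival: arrival time at each primary input.
    required: latest required time (LRT) at each primary output."""
    succ, pred, nodes = defaultdict(list), defaultdict(list), set()
    for u, v, d in arcs:
        succ[u].append((v, d))
        pred[v].append((u, d))
        nodes.update((u, v))
    # Topological order (Kahn's algorithm).
    indeg = {n: len(pred[n]) for n in nodes}
    ready = [n for n in nodes if indeg[n] == 0]
    order = []
    while ready:
        n = ready.pop()
        order.append(n)
        for v, _ in succ[n]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    lat, eat, lrt = {}, {}, {}
    for n in order:                       # forward pass: LAT and EAT
        if pred[n]:
            lat[n] = max(lat[u] + d for u, d in pred[n])
            eat[n] = min(eat[u] + d for u, d in pred[n])
        else:
            lat[n] = eat[n] = arrival[n]
    for n in reversed(order):             # backward pass: LRT
        if succ[n]:
            lrt[n] = min(lrt[v] - d for v, d in succ[n])
        else:
            lrt[n] = required[n]
    return lat, eat, lrt

# Hypothetical circuit: inputs a, b; internal node n1; output y.
arcs = [("a", "n1", 2.0), ("b", "n1", 3.0), ("n1", "y", 1.0), ("b", "y", 4.0)]
lat, eat, lrt = sta(arcs, arrival={"a": 0.0, "b": 0.0}, required={"y": 5.0})
slack = {n: lrt[n] - lat[n] for n in lat}   # setup slack per node
```

A negative entry in `slack` would indicate a setup violation on some path through that node.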



# Circuit Example(a), Delay Graph (b), Modified Delay Graph (c) – splitting fall / rise




#### **Block Oriented Statistical Timing Analysis**

X is a normally distributed quantity (delay, slope, etc.) with mean $m_X$ and variance $\sigma_X^2$:

 $X = m_X + \sigma_X \Delta X$

 $\Delta X$ is a standard normal random variable (mean 0, standard deviation 1).

Linear approximation, assuming the variation sources are uncorrelated:

$$A = a_0 + \sum_{i=1}^n a_i \Delta X_i + r_a \Delta R_a$$

 $a_{0}$ is the mean (nominal) value,

 $\Delta X_i$, *i=1,...,n*, and $\Delta R_a$ are standard normal random variables.



#### **Block Oriented Statistical Timing Analysis**

Sum & Max operations:

1) $C = A + B$: $c_0 = a_0 + b_0$, $c_i = a_i + b_i$, $r_c = \sqrt{r_a^2 + r_b^2}$

2) $C = \max(A, B)$

$$\sigma_{A} = \sqrt{\sum_{i=1}^{n} a_{i}^{2} + r_{a}^{2}} \qquad \sigma_{B} = \sqrt{\sum_{i=1}^{n} b_{i}^{2} + r_{b}^{2}}$$

Correlation coefficient (for independent $\Delta X_i$, $\Delta R_a$ and $\Delta R_b$):

$$\rho = \frac{1}{\sigma_A \sigma_B} \sum_{i=1}^n a_i b_i$$



#### **Block Oriented Statistical Timing Analysis**

Output mean value for C = max(A, B):

$$c_0 = a_0 T + b_0 (1 - T) + \theta \varphi \left( \frac{a_0 - b_0}{\theta} \right)$$

where $\theta = \sqrt{\sigma_A^2 + \sigma_B^2 - 2 \rho \sigma_A \sigma_B}$, $T = \Phi\left(\frac{a_0 - b_0}{\theta}\right)$ is the probability that A exceeds B, and $\varphi$, $\Phi$ are the standard normal density and cumulative distribution functions.

Variance of C:

$$\sigma_C^2 = (\sigma_A^2 + a_0^2)T + (\sigma_B^2 + b_0^2)(1 - T) + (a_0 + b_0)\theta\varphi\left(\frac{a_0 - b_0}{\theta}\right) - c_0^2$$

Coefficient recalculation for C:

$$c_{i} = a_{i}T + b_{i}(1-T), \quad i=1,...,n, \qquad r_{c} = \sqrt{\sigma_{C}^{2} - \sum_{i=1}^{n} c_{i}^{2}}$$
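The Sum and Max operations on the linear form can be sketched as follows. This is a minimal illustration, not the author's implementation: the class name `Canonical` and the function names are assumptions, the max uses the usual moment-matching (Clark) formulas with $\theta = \sqrt{\sigma_A^2 + \sigma_B^2 - 2\rho\sigma_A\sigma_B}$ and $T = \Phi((a_0 - b_0)/\theta)$, and the variance is taken as the second moment minus $c_0^2$.

```python
import math
from dataclasses import dataclass

@dataclass
class Canonical:
    mean: float   # a_0
    sens: list    # a_1..a_n, sensitivities to the shared sources dX_i
    rand: float   # r_a, coefficient of the independent part dR_a

    @property
    def sigma(self):
        return math.sqrt(sum(a * a for a in self.sens) + self.rand ** 2)

def pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ssta_sum(A, B):
    return Canonical(A.mean + B.mean,
                     [a + b for a, b in zip(A.sens, B.sens)],
                     math.hypot(A.rand, B.rand))

def ssta_max(A, B):
    sa, sb = A.sigma, B.sigma
    cov = sum(a * b for a, b in zip(A.sens, B.sens))   # rho * sa * sb
    theta = math.sqrt(max(sa * sa + sb * sb - 2.0 * cov, 0.0))
    if theta == 0.0:          # A - B is deterministic: pick the larger mean
        return A if A.mean >= B.mean else B
    x = (A.mean - B.mean) / theta
    T = cdf(x)
    c0 = A.mean * T + B.mean * (1.0 - T) + theta * pdf(x)
    var = ((sa * sa + A.mean ** 2) * T + (sb * sb + B.mean ** 2) * (1.0 - T)
           + (A.mean + B.mean) * theta * pdf(x) - c0 ** 2)
    sens = [a * T + b * (1.0 - T) for a, b in zip(A.sens, B.sens)]
    rand = math.sqrt(max(var - sum(c * c for c in sens), 0.0))
    return Canonical(c0, sens, rand)

# Two independent standard normal delays: E[max] = 1/sqrt(pi), Var = 1 - 1/pi.
M = ssta_max(Canonical(0.0, [0.0], 1.0), Canonical(0.0, [0.0], 1.0))
S = ssta_sum(Canonical(1.0, [0.3], 0.4), Canonical(2.0, [0.1], 0.3))
```

The check against the known closed-form result for two independent standard normal variables is a quick sanity test for any implementation of the max operation.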



# Increased Characterization Effort for Block Oriented Statistical Timing Analysis

$$D = D_0 + \sigma_D^X \cdot \Delta X$$

 $\sigma_D^X$ is the delay sensitivity to a given parameter X,

 $D_0$ is the nominal gate delay,

 $\Delta X$ is the normalized random variation of X.

$$\sigma_D^X = \frac{D(X_{max}) - D(X_{min})}{2k}$$

 $X_{min}$, $X_{max}$ are the end points of the interval $[-k\sigma, +k\sigma]$ around the nominal value of X.
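The sensitivity can be estimated from two extra characterization runs per parameter. The sketch below is illustrative only: `delay_fn` stands in for a circuit simulation, the linear delay model and its numbers are hypothetical, and the difference is taken as $D(X_{max}) - D(X_{min})$ so that a positive result means the delay grows with X.

```python
def delay_sensitivity(delay_fn, mean, sigma, k=3.0):
    """Delay change per one sigma of parameter X, from runs at mean +/- k*sigma."""
    d_max = delay_fn(mean + k * sigma)
    d_min = delay_fn(mean - k * sigma)
    return (d_max - d_min) / (2.0 * k)

# Hypothetical gate: delay (ps) grows by 0.2 ps per nm of channel length.
gate_delay = lambda leff_nm: 10.0 + 0.2 * (leff_nm - 100.0)
sens = delay_sensitivity(gate_delay, mean=100.0, sigma=5.0)   # ps per sigma
```

Each varied parameter adds two simulations per table point, which is the source of the extra characterization cost.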



## Outline

- 1. Contemporary technologies & IP blocks design problems
- 2. Deterministic and statistical timing analysis
- 3. Digital noise analysis problems
- 4. Logic cell characterization
- 5. Memory cell characterization
- 6. Decomposition problems for IP blocks
- 7. Logic correlation analysis for timing and noise estimation
- 8. Input stimulus generation for IP blocks
- 9. IP blocks characterization speed-up
- 10. Future technologies problems

#### Interconnect Extraction and Cross Coupling Noise

- Aggressor nets affect victim net through coupling capacitances
- Functional Noise: changes logic state of the victim net
- Delay Noise: affects signal propagation delay
- Different types of functional noise:
  - victim state and aggressor switching direction
  - Low/High Overshoot/Undershoot



# **Conservative Coupling Noise Analysis**

- All aggressor nets switch simultaneously in the same direction
- All aggressor noises combine to create maximum noise
- Aggressors switching times align to inject maximum noise



- Ignores correlation between circuit signals and may overestimate noise
- ✓ May produce *false noise violations*
- New method to reduce false noise violations by using logic implications



# **Cross Coupling Delay Noise**

- ✓ Aggressor nets affect victim net through coupling capacitances
- ✓ Functional Noise: changes logic state of the victim net
  - Affects victim net when it is in a stable state
- ✓ Delay Noise: changes signal propagation delay
  - Affects victim net when it transitions
  - Delay changes accumulate along the signal propagation paths



# **Signals Correlation and False Noise**

- ✓ Timing correlation:
  - nets switch at different clock cycles, etc.
- ✓ Logic correlation:
  - circuit logic prohibits some combinations of net signals
  - it prohibits some aggressor nets from simultaneous switching



- Ignoring signal correlation overestimates noise and results in false noise violations
  - makes it difficult to recognize actual noise violations
  - diminishes trust in noise analysis results
- ✓ False noise analysis is needed



## **Delay Noise Model**

Linear approximation for small impulses:

$$\Delta D = \frac{\partial D}{\partial h_n} \cdot \Delta h_n + \frac{\partial D}{\partial w_n} \cdot \Delta w_n + \dots$$

The total delay increment is the sum of the independent aggressor increments:

$$\Delta D = \sum \Delta D_i$$

The total delay increment along the path from input to output:

$$\Delta D_p = \sum_{i \in P} \sum_{j \in A_i} \Delta D_{i,j}$$

 $\Delta D_{i,j}$ is the delay increment for the *i*-th victim in path *P* due to noise from the *j*-th aggressor; $A_i$ is the set of aggressors of the *i*-th victim.



## Outline

- 1. Contemporary technologies & IP blocks design problems
- 2. Deterministic and statistical timing analysis
- 3. Digital noise analysis problems
- 4. Logic cell characterization
- 5. Memory cell characterization
- 6. Decomposition problems for IP blocks
- 7. Logic correlation analysis for timing and noise estimation
- 8. Input stimulus generation for IP blocks
- 9. IP blocks characterization speed-up
- 10. Future technologies problems



#### **Non-Linear Delay Model (NLDM)**

Table characterization

$$D_{out}(S_{inp}^{k}, C_{out}^{l}), \quad k \in [1:N_s],\ l \in [1:N_c]$$

$$S_{out}(S_{inp}^{k}, C_{out}^{l}), \quad k \in [1:N_s],\ l \in [1:N_c]$$

for each delay graph arc $arc_{ij} = (inp_i, out_j)$, where switching of input $inp_i$ causes switching of output $out_j$.

Simplified NLDM input capacitances:

 $C_{inp}^r$, $C_{inp}^f$ for each input; they can differ for the $0 \Rightarrow 1$ (r) and $1 \Rightarrow 0$ (f) transitions.
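Between the characterized grid points, a timing tool has to interpolate the table. A minimal sketch with bilinear interpolation is shown below; the interpolation scheme, the function name `nldm_lookup` and the table values are assumptions made for illustration.

```python
from bisect import bisect_right

def nldm_lookup(slews, caps, table, s, c):
    """Bilinear interpolation in table[k][l], indexed by slews[k] and caps[l].
    Points outside the grid are extrapolated from the nearest cell."""
    def bracket(axis, x):
        i = min(max(bisect_right(axis, x) - 1, 0), len(axis) - 2)
        t = (x - axis[i]) / (axis[i + 1] - axis[i])
        return i, t
    k, ts = bracket(slews, s)
    l, tc = bracket(caps, c)
    d0 = table[k][l] * (1 - tc) + table[k][l + 1] * tc
    d1 = table[k + 1][l] * (1 - tc) + table[k + 1][l + 1] * tc
    return d0 * (1 - ts) + d1 * ts

# Hypothetical 2x2 delay table (ps) over input slope (ps) and load (fF).
slews, caps = [10.0, 50.0], [1.0, 5.0]
delay = [[20.0, 40.0],
         [30.0, 60.0]]
d_mid = nldm_lookup(slews, caps, delay, 30.0, 3.0)
```

The same lookup serves both the $D_{out}$ and $S_{out}$ tables, since they share the $(S_{inp}, C_{out})$ axes.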



#### **Results of Characterization**




#### **Composite Current Source (CCS)**

CCS driver model:

$$I_{out} = F(t, S_{inp}, C_{out})$$

CCS effective input capacitances for different arcs $arc_{ij}$:

$$C_{1}^{(r/f)}(S_{inp}, C_{out}), \quad C_{2}^{(r/f)}(S_{inp}, C_{out})$$

$C_1$ is the table of capacitances for the first half of the transition; $C_2$ is the table of capacitances for the second half of the transition.



#### **Efficient Current Source Model (ECSM)**

1) ECSM table: $V_{out} = G(t, S_{inp}, C_{out})$, with $V_{out}(t) \in [0 + \varepsilon, Vdd - \varepsilon]$, where $\varepsilon$ is a constant. $V_{out}(t)$ is normalized by $\frac{1}{Vdd}$ to the interval (0, 1).

2) ECSM input caps: $C_{inp}^{(r|f)}(S_{inp}, C_{out})$

In theory, the CCS and ECSM models are equivalent:

$$I_{out}(t) = C_{out} \cdot \frac{dV_{out}(t)}{dt}$$

In practice, the results differ.



# **Logic Characterization Input Data**

Example for AND2:

| Script line                             | Arguments    |
|-----------------------------------------|--------------|
| `f_loop_set slew_low_threshold`         | `0.2*$vdd`   |
| `f_loop_set slew_upper_threshold`       |              |
| `# duration of the input signal period` |              |
| `f_loop_set time_slice`                 |              |
| `# test description`                    |              |
| `f_loop_testcase delay test1`           |              |
| `# input waveform description`          |              |
| `f_loop_waveform i1`                    | `"rfr11frf"` |
| `f_loop_waveform i2`                    | `"11frf0rf"` |
| `# delay measurements`                  |              |
| `f_loop_measure delay`                  | `i1 r 1 x r` |
| `f_loop_measure delay`                  | `i1 f 2 x f` |
| `f_loop_measure delay`                  | `i2 r 4 x r` |
| `f_loop_measure delay`                  | `i2 f 5 x f` |
| `f_loop_measure delay`                  | `i1 r 7 x r` |
| `f_loop_measure delay`                  | `i2 r 7 x r` |
| `f_loop_measure delay`                  | `i1 f 8 x f` |
| `f_loop_measure delay`                  | `i2 f 8 x f` |
| `# input capacitance measurements`      |              |
| `f_loop_measure cap`                    | `i1 1`       |
| `f_loop_measure cap`                    | `i2 4`       |
| `f_loop_measure cap`                    | `i1 2`       |
| `f_loop_measure cap`                    | `i2 5`       |
| `f_loop_end`                            |              |
| `# write results to the output file`    |              |
| `f_macro_write dotlib`                  |              |
| `f_loop_destroy`                        |              |


## Outline

- 1. Contemporary technologies & IP blocks design problems
- 2. Deterministic and statistical timing analysis
- 3. Digital noise analysis problems
- 4. Logic cell characterization
- 5. Memory cell characterization
- 6. Decomposition problems for IP blocks
- 7. Logic correlation analysis for timing and noise estimation
- 8. Input stimulus generation for IP blocks
- 9. IP blocks characterization speed-up
- 10. Future technologies problems



## **Setup & Hold Characterization**

#### **Restriction control:**

- ✓ Correct output switching
- ✓ Delay degradation control



# **Different Types of Setup & Hold Characterization**

✓ Independent Setup

✓ Independent Hold

✓ Dependent Setup (Hold first)

✓ Dependent Hold (Setup first)

✓ Minimal SUM = Setup + Hold

✓ 3D interdependent characterization

Delay(Setup,Hold)
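An independent setup characterization with delay degradation control can be sketched as a bisection on the data-to-clock skew. Everything specific here is an assumption for illustration: `clk_to_q` is a made-up analytical stand-in for a SPICE run, the 10% degradation criterion is a commonly used choice rather than one stated in the slides, and the search assumes that the pass/fail outcome is monotonic in the skew.

```python
import math

def clk_to_q(skew_ps):
    """Hypothetical stand-in for a simulation: returns the clk->q delay (ps),
    or None if the output fails to switch correctly."""
    if skew_ps < 20.0:
        return None
    return 100.0 * (1.0 + math.exp(-(skew_ps - 20.0) / 10.0))

def setup_time(sim, lo, hi, degradation=0.10, tol=0.01):
    """Smallest skew in [lo, hi] with correct switching and limited delay pushout."""
    nominal = sim(hi)                     # delay with a generous setup margin
    limit = nominal * (1.0 + degradation)

    def passes(skew):
        d = sim(skew)
        return d is not None and d <= limit   # both restriction checks

    assert passes(hi) and not passes(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if passes(mid):
            hi = mid
        else:
            lo = mid
    return hi

t_setup = setup_time(clk_to_q, lo=0.0, hi=200.0)
```

Dependent and interdependent characterization repeat the same search while the other constraint (hold or setup) is held at a chosen value.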



#### **Express Analysis of Setup and Hold**



#### Outline

- 1. Contemporary technologies & IP blocks design problems
- 2. Deterministic and statistical timing analysis
- 3. Digital noise analysis problems
- 4. Logic cell characterization
- 5. Memory cell characterization
- 6. Decomposition problems for IP blocks
- 7. Input stimulus generation for IP blocks
- 8. Logic correlation analysis for timing and noise estimation
- 9. IP blocks characterization speed-up
- 10. Future technologies problems

## **Characterization of the Full Custom IP block**

Find input stimulus for maximal delay from a given primary input to a given primary output



# **Decomposition Approach (DCCC = DSN = CCC...)**



#### Decomposition for Full Custom IP-block: DCCC ≠ Gate for Pass Transistors & Domino Logic



# **Decomposition Problem: (1) Cinp Error**




# **Decomposition Problem: (2) Coupling Cap Noise**




# **Decomposition Problem: (3) IR-drop**



#### **Modified DCCC Decomposition**



#### **Decomposition and Correlations**

- ✓ Delay: true path analysis; logic correlations result in false paths
- ✓ Input stimulus for DCCC: correlations between DCCC inputs
- ✓ Coupling capacitances: correlations between aggressors and victim
- ✓ IR drop: maximum current estimation; correlations between the switching of different DCCCs



#### Outline

- 1. Contemporary technologies & IP blocks design problems
- 2. Deterministic and statistical timing analysis
- 3. Digital noise analysis problems
- 4. Logic cell characterization
- 5. Memory cell characterization
- 6. Decomposition problems for IP blocks
- 7. Input stimulus generation for IP blocks
- 8. Logic correlation analysis for timing and noise estimation
- 9. IP blocks characterization speed-up
- 10. Future technologies problems



# **Input Stimulus Generation for the Full Custom IP**

Find input stimulus for maximal delay from a given primary input to a given primary output



# **Timing Analysis for the Full Custom IP blocks**

Full custom IP block:

- Logic function of a DCCC component is unknown
- Library-less analysis is required
- Only the transistor netlist is available

Input stimulus search for maximal delay:

- Increasing number of DCCC inputs
- Input logic correlation restrictions
- Available methods: full simulation; BDD / ADD approaches; critical path search



#### **Critical Path and Side Load Conflict**

 Critical path without side loads can be different from critical path with side loads

✓ Input logic correlation restrictions result in additional problems in critical path / side load analysis



## **Stimulus Search: Proposed Approach**

- ✓ Input data should combine both logic and transistor data
- ✓ Generate *PU/PD-SP-DAG* for pull-up and pull-down networks from the transistor netlist (logic extraction)
- ✓ Store the history of node resolutions (the "resolution list")
- ✓ Form the equivalent $\pi$-model in terms of Elmore delay
- ✓ Delay analysis for partially defined inputs
- ✓ Branch and bound approach for the maximum delay search



# Logic Extraction and SP-DAG: CMOS element AND3



#### Logic Extraction and SP-DAG: CMOS element XOR2



#### **Gauss Elimination for non-SP Structure:**

#### **Before b resolution**



#### **Pi-model in Gauss Elimination Approach (~Ticer)**

$$Y_{k} = \sum_{i} y_{ki} = \frac{B_{k}}{s} + G_{k} + s \cdot C_{k}$$

$$y_{ki} = \frac{1}{s} (b_{i} + s \cdot g_{i} + s^{2} \cdot c_{i}) \qquad y_{kj} = \frac{1}{s} (b_{j} + s \cdot g_{j} + s^{2} \cdot c_{j})$$

$$y_{ij} = \frac{1}{s^{2} \cdot Y_{k}} \cdot (b_{i}b_{j} + s \cdot (b_{i}g_{j} + b_{j}g_{i}) + s^{2}(g_{i}g_{j} + b_{i}c_{j} + b_{j}c_{i}) + s^{3}(g_{i}c_{j} + g_{j}c_{i}) + s^{4}c_{i}c_{j})$$

$$y_{ij} = \frac{1}{G_{k}} \cdot (g_{i}g_{j} + s(c_{i}g_{j} + c_{j}g_{i}) + ...)$$

$$y_{ij} = \frac{1}{s \cdot B_{k}} \cdot (b_{i}b_{j} + s(b_{i}g_{j} + b_{j}g_{i}) + s^{2}(g_{i}g_{j} + b_{i}c_{j} + b_{j}c_{i}) + ...)$$

$$y_{ij} = \frac{1}{C_{k}} \cdot (c_{i}c_{j})$$
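For the purely conductive case, eliminating node *k* replaces its star of branches by direct branches $g_{ij} = g_i g_j / G_k$ between every pair of its neighbors. The sketch below illustrates only this special case (no $b$ or $c$ terms); the data layout and the function name `eliminate_node` are assumptions.

```python
def eliminate_node(g, k):
    """g: {frozenset({u, v}): conductance}. Returns the network with node k removed."""
    # Conductances of the branches attached to k, keyed by the neighbor node.
    star = {next(iter(e - {k})): val for e, val in g.items() if k in e}
    total = sum(star.values())                        # G_k
    out = {e: val for e, val in g.items() if k not in e}
    nbrs = sorted(star)
    for a in range(len(nbrs)):
        for b in range(a + 1, len(nbrs)):
            i, j = nbrs[a], nbrs[b]
            e = frozenset({i, j})
            # New branch in parallel with any existing i-j branch.
            out[e] = out.get(e, 0.0) + star[i] * star[j] / total
    return out

# Two 2 S conductances in series through node k collapse to a single 1 S branch.
series = {frozenset({"in", "k"}): 2.0, frozenset({"k", "out"}): 2.0}
reduced = eliminate_node(series, "k")

# A star of three 1 S branches becomes a triangle of 1/3 S branches.
star3 = {frozenset({"k", n}): 1.0 for n in ("p", "q", "r")}
delta = eliminate_node(star3, "k")
```

Repeating the elimination for every internal node leaves only the terminal nodes, from which the equivalent model is read off.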

#### **Elmore Delay Estimation**

Elmore delay:

$$d = C/g$$

C – equivalent ground capacitance; g=1/R – equivalent internal conductance.

✓ Calculate logic states for internal and external nodes before and after switching;

✓ Calculate equivalent conductances for *pull-up* and *pull-down* networks;

✓ Calculate equivalent load capacitances for fall and rise switching;

✓ Estimate switching delays (fall delay and rise delay).

 $\max(d) = \max(C) / \min(g)$ 
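The bound $\max(d) = \max(C) / \min(g)$ can be illustrated on a small series-parallel pull-down network. The sketch is an assumption-laden toy: every on-transistor has the same conductance, the load capacitance is fixed, and the worst input vector is found by exhaustive enumeration rather than by the branch and bound search named in these slides.

```python
from itertools import product

def conductance(net, state, g_on=1.0):
    """net is an input name (one transistor) or ("s" | "p", sub-networks...)."""
    if isinstance(net, str):
        return g_on if state[net] else 0.0
    kind, *parts = net
    gs = [conductance(p, state, g_on) for p in parts]
    if kind == "p":                          # parallel: conductances add
        return sum(gs)
    if any(g == 0.0 for g in gs):            # series with an open branch
        return 0.0
    return 1.0 / sum(1.0 / g for g in gs)    # series: resistances add

def worst_case_delay(net, inputs, c_load):
    """Largest d = C/g over all input vectors for which the network conducts."""
    best = None
    for bits in product([0, 1], repeat=len(inputs)):
        state = dict(zip(inputs, bits))
        g = conductance(net, state)
        if g > 0.0 and (best is None or c_load / g > best[0]):
            best = (c_load / g, state)
    return best

# Pull-down network of y = not(a*b + c): (a in series with b) in parallel with c.
pull_down = ("p", ("s", "a", "b"), "c")
d_max, stimulus = worst_case_delay(pull_down, ["a", "b", "c"], c_load=2.0)
```

Here the slowest discharge path is the series pair with `c` off, which is the kind of stimulus the search has to find.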



#### **Prototypes vs PU/PD-SP-DAG**

SP-DAG - [R.E. Bryant, Algorithmic Aspects of Symbolic Switch Network Analysis]

BDD - [R.E. Bryant, Graph-Based Algorithms for Boolean Function Manipulation]



#### **PU/PD-SP-DAG examples**



#### **Input Stimulus Generation for Characterization**

✓ Equivalent Pi-model estimation
✓ Max/Min estimation (min G, max C) max(D) = max(C) / min(g)



#### **Example of Branch and Bound Search**

Account for input correlations


Inputs: clk1, clk2, data





Inputs: clk, a, b, a\_b, b\_b. Logic restrictions: inverse a a\_b, inverse b b\_b





#### Outline

- 1. Contemporary technologies & IP blocks design problems
- 2. Deterministic and statistical timing analysis
- 3. Digital noise analysis problems
- 4. Logic cell characterization
- 5. Memory cell characterization
- 6. Decomposition problems for IP blocks
- 7. Input stimulus generation for IP blocks
- 8. Logic correlation analysis for timing and noise estimation
- 9. IP blocks characterization speed-up
- 10. Future technologies problems



# **Cross Coupling Noise**

- Aggressor nets affect victim net through coupling capacitances
- Functional Noise: changes logic state of the victim net
- Delay Noise: affects signal propagation delay
- Different types of functional noises:
  - victim state and aggressor switching direction
  - Low/High Overshoot/Undershoot



# **Conservative Coupling Noise Analysis**

- All aggressor nets switch simultaneously in the same direction
- All aggressor noises combine to create maximum noise
- Aggressors switching times align to inject maximum noise



- Ignores correlation between circuit signals and may overestimate noise
- ✓ May produce false noise violations
- New method to reduce false noise violations by using logic implications



# Simple Logic Implication (SLI)

#### Problem:

compute logic correlation & maximum realizable aggressors set

#### Approach:

express logic correlation as simple logic implications

Used in logic synthesis & peak current estimation: [G. Hachtel 1988], [W. Kunz 1994], [R.I. Bahar 1996], [W. Long 2000], [S. Bobba 1998]

build constraint graph & find maximum realizable aggressors set

An SLI $(a = V_a) \rightarrow (x = V_x)$ means: if net *a* is at $V_a$, then net *x* is at $V_x$.

- No timing information
- Conservative only for glitch-free circuits



# **SLI (Simple Logic Implication) Approach**

Simple Logic Implication (SLI) for 2 nodes *a*, *b*:

$$(a=0) \Longrightarrow (b=1)$$

Initial notation

$$(x = v) \Longrightarrow (y = u)$$

Short notation:

$$x^{v} \Longrightarrow y^{u}$$

where $x^{v}$, $y^{u}$ denote the literals ($x$ or $\overline{x}$, $y$ or $\overline{y}$):

$$x^{v} = \begin{cases} \overline{x} & \text{for } v = 0\\ x & \text{for } v = 1 \end{cases}$$



# **SLI (Simple Logic Implication) Approach**

Equivalent notations:

$$(a=0) \Rightarrow (b=0) \Leftrightarrow \overline{a} \Rightarrow \overline{b}$$
  

$$(a=0) \Rightarrow (b=1) \Leftrightarrow \overline{a} \Rightarrow b$$
  

$$(a=1) \Rightarrow (b=0) \Leftrightarrow a \Rightarrow \overline{b}$$
  

$$(a=1) \Rightarrow (b=1) \Leftrightarrow a \Rightarrow b$$

Implication set for a gate

$$y = nand2(a,b) = \overline{a \cdot b}$$

$$(a=0) \Rightarrow (y=1) \Leftrightarrow \overline{a} \Rightarrow y$$
$$(b=0) \Rightarrow (y=1) \Leftrightarrow \overline{b} \Rightarrow y$$
$$(y=0) \Rightarrow (a=1) \Leftrightarrow \overline{y} \Rightarrow a$$
$$(y=0) \Rightarrow (b=1) \Leftrightarrow \overline{y} \Rightarrow b$$



# **SLI (Simple Logic Implication) Approach**

Implication set for a gate  $y = nor2(a,b) = \overline{a+b}$ 

$$(a=1) \Rightarrow (y=0) \Leftrightarrow a \Rightarrow \overline{y}$$
$$(b=1) \Rightarrow (y=0) \Leftrightarrow b \Rightarrow \overline{y}$$
$$(y=1) \Rightarrow (a=0) \Leftrightarrow y \Rightarrow \overline{a}$$
$$(y=1) \Rightarrow (b=0) \Leftrightarrow y \Rightarrow \overline{b}$$



# **Simple Logic Implication (SLI)**

Compact representation used in implementation

Four implication lists $H^{a}_{H}$, $H^{a}_{L}$, $L^{a}_{H}$, $L^{a}_{L}$ for each circuit net **a**

Implication list $L_{H}^{a}$ consists of the nets $b_{i}$ for which the SLI $(b_{i} = 1) \rightarrow (a = 0)$ holds



Implication lists: $H^{n1}_{L} = \{ n3, n4, n8, n9 \}$, $H^{n1}_{H} = \{ n11 \}$



#### **SLI Generation**

- Compute SLIs for individual gate in circuit
- Propagate SLIs across the circuit
- Based on laws:
  - transitive: $(a = V_a) \rightarrow (b = V_b),\ (b = V_b) \rightarrow (c = V_c) \ \Longrightarrow\ (a = V_a) \rightarrow (c = V_c)$
  - contrapositive: $(a = V_a) \rightarrow (x = V_x) \ \Longleftrightarrow\ (x = \overline{V_x}) \rightarrow (a = \overline{V_a})$

**Basic operations:** 

- Implications lists union and intersection
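SLI generation can be sketched as follows: per-gate implications are collected and then closed under the contrapositive and transitive laws. The representation (a literal is a `(net, value)` pair) and the function names are assumptions for illustration; a production tool would propagate implication lists instead of running this naive fixed-point loop.

```python
def gate_slis(kind, a, b, y):
    """Direct implications of a 2-input gate as ((net, val), (net, val)) pairs."""
    if kind == "nand2":      # y = not (a and b)
        return {((a, 0), (y, 1)), ((b, 0), (y, 1))}
    if kind == "nor2":       # y = not (a or b)
        return {((a, 1), (y, 0)), ((b, 1), (y, 0))}
    raise ValueError(kind)

def contrapositive(imp):
    (p, pv), (q, qv) = imp
    return ((q, 1 - qv), (p, 1 - pv))

def close(slis):
    """Closure of a set of SLIs under the contrapositive and transitive laws."""
    s = set(slis) | {contrapositive(i) for i in slis}
    changed = True
    while changed:
        changed = False
        for p, q in list(s):
            for q2, r in list(s):
                if q == q2 and p != r and (p, r) not in s:
                    s.add((p, r))
                    s.add(contrapositive((p, r)))
                    changed = True
    return s

# Hypothetical circuit: n1 = nand2(a, b); y = nor2(n1, c).
slis = gate_slis("nand2", "a", "b", "n1") | gate_slis("nor2", "n1", "c", "y")
closure = close(slis)
```

For this circuit the closure contains the derived implication (a=0) -> (y=0) and its contrapositive (y=1) -> (a=1), neither of which belongs to a single gate.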



#### **Lateral Propagation of SLI**

- ✓ Used in logic optimization [R.I.Bahar 1996], [W.Long 2000]
- Based on contra positive law
- ✓ AND gate:
  - implication (a=1&x=0)->b=0
  - lateral implication lists propagation







#### **Constraint Graph Construction from SLIs & MWIS Analysis**



## **False Noise Analysis Data Flow**



# **Problems of SLI and Other SAT-Based Solutions**

- ✓ Logic correlation
  - SAT problem
  - NP-complete
- ✓ Full analysis [A. Rubio, et al. 1997], [P. Chen, K. Keutzer, 1999]
  - For circuits of ~100-300 nodes
- ✓ SLI heuristic approach [A. Glebov, S. Gavrilov, et al.]
  - Fast but incomplete
  - Pairwise correlations only
  - Ignores correlations among 3, 4 or more signals
  - Logic extraction is required






#### **Logic Constraints Representation**

• System of equations or DNF:

$$\left. \begin{array}{c} \overline{a} \cdot \overline{b} \cdot \overline{c} = 0 \\ \overline{a} \cdot \overline{b} \cdot d = 0 \\ \vdots \end{array} \right\} \quad \longleftrightarrow \quad \overline{a} \cdot \overline{b} \cdot \overline{c} + \overline{a} \cdot \overline{b} \cdot d + \dots = 0$$

- ✓ Set of conjunctive terms:  $\overline{a} \cdot b \cdot \overline{c}, \overline{a} \cdot \overline{b} \cdot d...$
- Each conjunctive term prohibits one signal combination

- Term:  $a \cdot \overline{b} \cdot \overline{c} \cdot \overline{d} = 0$ 

prohibits: *a*=1, *b*=0, *c*=0, *d*=0 and prohibits signals *b*, *c*, *d* from simultaneous switching if a=1

Example: constraints for a gate with inputs *a*, *b* and output *x* (equivalent to $x = a \cdot b$):

$$\bar{x} \cdot a \cdot b + x \cdot \bar{a} + x \cdot \bar{b} = 0$$

#### **Resolution Method**

- Automatic theorem proving and SAT problem:
  - deriving new logic relations by the **Resolution Rule**:

$$a+B=1,\ \overline{a}+C=1 \longrightarrow B+C=1$$

or $a+B,\ \overline{a}+C \implies B+C$

Resolution rule for logic constraints (derivation of constraints, i.e. false sentences):

$$a \cdot B = 0,\ \overline{a} \cdot C = 0 \implies B \cdot C = 0$$

or  $a \cdot B, \overline{a} \cdot C \implies B \cdot C$
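The resolution rule for constraints is easy to express on sets of literals. In this sketch a term is a `frozenset` of `(name, polarity)` pairs and stands for a prohibited signal combination (term = 0); the representation and the function names are assumptions for illustration. The example resolves the terms $b \cdot y$ and $a \cdot x \cdot \overline{y}$, which appear in the static NAND2 derivation later in this deck.

```python
def resolve(t1, t2, var):
    """Resolvent B*C of a*B and not(a)*C on variable `var`, or None."""
    if (var, 1) in t1 and (var, 0) in t2:
        r = (t1 - {(var, 1)}) | (t2 - {(var, 0)})
    elif (var, 0) in t1 and (var, 1) in t2:
        r = (t1 - {(var, 0)}) | (t2 - {(var, 1)})
    else:
        return None                           # the terms do not clash on var
    if any((n, 1 - p) in r for n, p in r):    # holds x and not(x): trivially 0
        return None
    return frozenset(r)

def covers(t1, t2):
    """a*b covers a*b*c: the shorter constraint makes the longer one redundant."""
    return t1 <= t2

def term(*lits):
    return frozenset(lits)

t_by = term(("b", 1), ("y", 1))               # b * y = 0
t_axy = term(("a", 1), ("x", 1), ("y", 0))    # a * x * not(y) = 0
derived = resolve(t_by, t_axy, "y")           # a * b * x = 0
```

Dropping trivial resolvents and covered terms after every step keeps the constraint set from growing without bound.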



#### **Transistor Level Logic Constraints Generation**

• Initial logic constraints for transistors:



Deriving constraints for DCCCs (gates) at transistor level

- compute constraints by resolution rule
- try to eliminate variables not involved in noise clusters
- remove tautologies:  $a \cdot \overline{a} \cdot B$
- remove constraints covered by other ones ($a \cdot b$ covers $a \cdot b \cdot c$)



#### **Logic Constraints Derivation**

Logic constraints for static NAND2



 $b \cdot y, \ a \cdot x \cdot \overline{y} \to a \cdot b \cdot x$ 

Logic constraints for dynamic NAND2





#### **Constraints Derivation at Logic Level**




### **Characteristic ROBDD Construction**

- Create root for victim
- ✓ Try v=0,1 assignments
- Make all conclusions from constraints
  - if constraints are satisfied create arc to 1
  - if constraints are at conflict create arc to 0
  - otherwise create arc to next aggressor vertex and repeat the analysis
- Repeat the procedure for aggressors

#### Low overshoot noise at v:



Constraints for low overshoot noise at *v*:  $\overline{v} \cdot \overline{a}_4, \ \overline{v} \cdot \overline{a}_5,$  $\overline{a}_1 \cdot \overline{a}_4, \ \overline{a}_2 \cdot \overline{a}_4, \ \overline{a}_2 \cdot \overline{a}_5, \ \overline{a}_3 \cdot \overline{a}_5,$  $a_1 \cdot a_2 \cdot a_4, \ a_2 \cdot a_3 \cdot a_5$ 





#### **Maximum Realizable Noise Calculation**



# **Cross Coupling Delay Noise**

- Aggressor nets affect victim net through coupling capacitances
- Functional Noise: changes logic state of the victim net
  - Affects victim net when it is in a stable state
- Delay Noise: changes signal propagation delay
  - Affects victim net when it transitions
  - Delay changes accumulate along the signal propagation paths



# **Signal Correlation and False Noise Analysis**

- ✓ Timing correlation:
  - nets switch at different clock cycles, etc.
- Logic correlation:
  - circuit logic prohibits some combinations of nets signals
  - it prohibits some aggressor nets from simultaneous switching



Two problems of false noise analysis:

- Computing signal correlations
- Computing the worst possible noise and aggressors set injecting it
- Difficult optimization problem

#### **Logic Constraints Representation and Derivation**

- ✓ Set of conjunctive terms:  $a \cdot b \cdot c$,  $a \cdot b \cdot d$, ...
- Each term prohibits a signal combination
  - $a \cdot \overline{b} \cdot \overline{c} \cdot \overline{d}$  prohibits: a=1, b=0, c=0, d=0
- **Resolution technique**
  - deriving sentences by resolution rule
  - can handle multiple constraints
  - can build approximate solutions
  - works even at transistor level
- Simple Logic Implication (SLI)
  - binary constraints only
  - simpler implementation

 $\overline{x} \cdot a \cdot b,\ x \cdot \overline{a},\ x \cdot \overline{b}$

 $a \cdot B, \overline{a} \cdot C \implies B \cdot C$



#### **Transistor Level Logic Constraints Generation**

- Set constraints for transistors
- ✓ Apply resolution rule
  - eliminate variables not involved in noise clusters
  - remove tautologies  $a \cdot \overline{a} \cdot B$
  - remove constraints covered by other ones ($a \cdot \overline{b}$ covers $a \cdot \overline{b} \cdot \overline{c}$)







#### **Logic Constraints Derivation at Gate Level**



#### **Linear Delay Noise Model**

✓ Need for simple model to estimate each aggressor impact

Actual delay variation is verified by SPICE simulations




#### **Computation of Linear Noise Model for Noise Cluster**

- Compute noise pulse height $h_j$ of each aggressor
- Compute total noise pulse height: $H = \sum_{j \in Net\_Aggressors} h_j$
- Compute total net delay variation: $\Delta D_{Net}$
- Estimate delay variation due to each aggressor:



Error of delay noise additive model

# **Constraint Graph and Hyper-Graph**

- Vertices are aggressor nets
  - weight is injected noise
- Edges / hyper-edges are constraints
- Maximum weight independent set (MWIS) of vertices
  - does not have any edge/hyper-edge as subset



Constraints / hyper-edges: $\{a_1a_2a_3, a_4a_5, a_2a_3a_5\}$; *MWIS* = $\{a_1, a_2, a_5\}$, w=0.65



Constraints (SLI) / edges: $\{a_1a_2, a_2a_3, a_3a_5, a_2a_5, a_4a_5, a_1a_3\}$; *MWIS* = $\{a_1, a_4\}$, w=0.35
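For a handful of aggressors the MWIS can be found by plain enumeration, which makes the definition concrete. The vertex weights below are hypothetical (the per-aggressor weights of the figure are not recoverable from the text), so the numbers are not expected to reproduce the slide, and exhaustive search stands in for the branch and bound algorithm used for realistic cluster sizes.

```python
from itertools import combinations

def mwis(weights, forbidden):
    """Maximum-weight vertex set that contains no forbidden (hyper-)edge."""
    nodes = sorted(weights)
    best, best_w = frozenset(), 0
    for r in range(1, len(nodes) + 1):
        for subset in combinations(nodes, r):
            s = frozenset(subset)
            if any(edge <= s for edge in forbidden):
                continue                      # realizes a prohibited combination
            w = sum(weights[n] for n in s)
            if w > best_w:
                best, best_w = s, w
    return best, best_w

# Hypothetical injected noise per aggressor, in mV.
noise = {"a1": 30, "a2": 25, "a3": 20, "a4": 5, "a5": 10}

hyper_edges = [frozenset(e) for e in
               (("a1", "a2", "a3"), ("a4", "a5"), ("a2", "a3", "a5"))]
pair_edges = [frozenset(e) for e in
              (("a1", "a2"), ("a2", "a3"), ("a3", "a5"),
               ("a2", "a5"), ("a4", "a5"), ("a1", "a3"))]

set_hyper, w_hyper = mwis(noise, hyper_edges)
set_pairs, w_pairs = mwis(noise, pair_edges)
```

The returned weight is the maximum realizable noise under the given constraints; without any constraints it would simply be the sum of all aggressor contributions.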



#### **Branch and Bound Algorithm**



# **Data Flow in Branch and Bound Algorithm**



Input data for recursive calls of B&B algorithm

#### **Delay Noise Analysis Data Flow**



# Outline

- 1. Contemporary technologies & IP blocks design problems
- 2. Deterministic and statistical timing analysis
- 3. Digital noise analysis problems
- 4. Logic cell characterization
- 5. Memory cell characterization
- 6. Decomposition problems for IP blocks
- 7. Input stimulus generation for IP blocks
- 8. Logic correlation analysis for timing and noise estimation
- 9. IP blocks characterization speed-up
- 10. Future technologies problems



#### **Analysis of the Dependences and Decomposition**



 $D(S,C) = D(S_0, C_0) + (D_{in}(S) - D_{in}(S_0)) + (D_{out}(C) - D_{out}(C_0))$ 



# **IP blocks Characterization Speed-up**

#### **Reduction of repeating Simulations**

✓ Output DCCC cascades (1 or 2) are simulated *M* times, for different output loads.

✓ Input DCCC cascades (2 or 3) are simulated *N* times, for different input slopes (transition times).

✓ The full circuit is simulated in a single pass.
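With the separable form $D(S,C) = D(S_0, C_0) + (D_{in}(S) - D_{in}(S_0)) + (D_{out}(C) - D_{out}(C_0))$, the full N×M table follows from one full simulation plus N input-stage and M output-stage simulations. The sketch below only shows the bookkeeping; the function name and all delay values are made up for illustration.

```python
def build_delay_table(d_ref, d_in, d_out, s0, c0):
    """d_ref: full-circuit delay at (s0, c0).
    d_in:  {input slope: delay of the input cascades}
    d_out: {output load: delay of the output cascades}"""
    return {(s, c): d_ref + (d_in[s] - d_in[s0]) + (d_out[c] - d_out[c0])
            for s in d_in for c in d_out}

# Hypothetical partial simulation results (ps), N = 3 slopes and M = 2 loads.
d_in = {10: 30.0, 20: 34.0, 40: 41.0}
d_out = {1: 50.0, 2: 58.0}
table = build_delay_table(d_ref=100.0, d_in=d_in, d_out=d_out, s0=10, c0=1)
```

Six table entries are obtained here from one full run and five short partial runs instead of six full-circuit simulations.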



### **AlphaSim vs Standard Data Flow**



### **Parallel Simulations for Different C / S**



### **Analysis of Input Driver**

The input driver is used to generate a smooth input waveform.

A voltage repeater is required to exclude direct contact between Cinp and the driver output.

The set of capacitances {Ck1} is chosen to generate the required input slopes (transition times) {Skinp}.





# **Preliminary Driver Characterization and Speed-up**



✓ Normal approach: Nand3 + 3 drivers: 4 + 3\*10 = 34 elements

✓ Modified approach: Nand3: 4+3 new characterized sources = 7 elements



#### **Driver Characterization**

$$y = \frac{m_1 \cdot r_1^3}{6h_i} + \frac{m_2 \cdot r_2^3}{6h_i} + \left(f_1 - \frac{m_1 \cdot h_i^2}{6}\right) \cdot \frac{r_1}{h_i} + \left(f_2 - \frac{m_2 \cdot h_i^2}{6}\right) \cdot \frac{r_2}{h_i}$$

Where :  $\{x_i\}$  – node argument values,  $\{f_i\}$  – node function values,

 $r_1 = x_i - x$,  $r_2 = x - x_{i-1}$,  $f_1 = f_{i-1}$,  $f_2 = f_i$,  $h_i = x_i - x_{i-1}$;  $m_1$,  $m_2$ are the spline coefficients for the i-th interval.
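Evaluating one spline interval is a direct transcription of the formula, reading $r_1 = x_i - x$ and $r_2 = x - x_{i-1}$ as in the standard cubic spline segment. The function name and the sample numbers below are assumptions; computing the coefficients $m_1$, $m_2$ themselves (the tridiagonal solve) is not shown.

```python
def spline_segment(x, x_prev, x_i, f_prev, f_i, m_prev, m_i):
    """Cubic spline value on [x_prev, x_i]; m_prev, m_i play the role of m_1, m_2."""
    h = x_i - x_prev
    r1, r2 = x_i - x, x - x_prev
    return (m_prev * r1 ** 3 / (6.0 * h) + m_i * r2 ** 3 / (6.0 * h)
            + (f_prev - m_prev * h * h / 6.0) * r1 / h
            + (f_i - m_i * h * h / 6.0) * r2 / h)

# The segment reproduces the node values at both ends of the interval.
y_left = spline_segment(1.0, 1.0, 3.0, 5.0, 9.0, 0.7, -0.2)
y_right = spline_segment(3.0, 1.0, 3.0, 5.0, 9.0, 0.7, -0.2)
# With zero coefficients it degenerates to linear interpolation.
y_linear = spline_segment(2.0, 1.0, 3.0, 5.0, 9.0, 0.0, 0.0)
```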



# Outline

- 1. Contemporary technologies & IP blocks design problems
- 2. Deterministic and statistical timing analysis
- 3. Digital noise analysis problems
- 4. Logic cell characterization
- 5. Memory cell characterization
- 6. Decomposition problems for IP blocks
- 7. Input stimulus generation for IP blocks
- 8. Logic correlation analysis for timing and noise estimation
- 9. IP blocks characterization speed-up
- 10. Future technologies problems



# Thank You!

