Journal of Zhejiang University-SCIENCE C (Computers & Electronics) ISSN 1869-1951 (Print); ISSN 1869-196X (Online) www.zju.edu.cn/jzus; www.springerlink.com E-mail: jzus@zju.edu.cn



## Scratch-concerned yield modeling for IC manufacturing involved with a chemical mechanical polishing process

Jiao-jiao ZHU, Xiao-hua LUO<sup>†‡</sup>, Li-sheng CHEN, Yi YE, Xiao-lang YAN

(Institute of VLSI Design, Zhejiang University, Hangzhou 310027, China) <sup>†</sup>E-mail: luoxh@vlsi.zju.edu.cn Received Aug. 22, 2011; Revision accepted Jan. 6, 2012; Crosschecked Apr. 9, 2012

**Abstract:** In existing integrated circuit (IC) fabrication methods, the yield is typically limited by defects generated in the manufacturing process. In fact, the yield often shows a good correlation with the type and density of the defect. As a result, an accurate defect limited yield model is essential for accurate correlation analysis and yield prediction. Since real defects exhibit a great variety of shapes, to ensure the accuracy of yield prediction, it is necessary to select the most appropriate defect model and to extract the critical area based on the defect model. Considering the realistic outline of scratches introduced by the chemical mechanical polishing (CMP) process, we propose a novel scratch-concerned yield model. A linear model is introduced to model scratches. Based on the linear model, the related critical area extraction algorithm and defect model enable a more accurate prediction of the yield loss caused by scratches, and hence a more accurate total product yield prediction than the traditional circular model.

**Key words:** Chemical mechanical polishing (CMP), Scratch, Defect, Yield model, Critical area

**doi:** 10.1631/jzus.C1100242 **Document code:** A **CLC number:** TN4

### 1 Introduction

During the manufacturing of integrated circuits (IC), functional yield loss is caused mainly by defects introduced by the environment, tools, or processes like implantation, etching, planarization, cleaning, and lithography. Thus, the determination of defects and yield, and an appropriate yield model to analyze their correlation, are essential components for assessing yield prediction and improvement.

As a global planarization technology, chemical mechanical polishing (CMP) has been used extensively in inter-level dielectric (ILD), inter-metal dielectric (IMD), copper damascene, and shallow trench isolation (STI) planarization in IC fabrication (Luo and Dornfeld, 2004). However, the CMP process inevitably introduces a significant proportion of scratches because of the mechanical abrasion between wafer and pad (Jung et al., 2001). A scratch is generated when large or agglomerated particles in the slurry, or foreign particles on the polishing pad, come into contact with the wafer surface (Huang et al., 1999; Aytes et al., 2003). In fact, as device geometries shrink into the deep sub-micron regime, scratches are becoming a major source of the defects that cause circuit failure and yield loss (Park and Kim, 2001). Fig. 1 shows some missing/extra material defects in poly gates caused by scratches introduced by the STI CMP process. Extra/missing material around scratches is the main cause of electrical short/open faults. Thus, a scratch-concerned yield model is important for analyzing faults in IC manufacturing that involves CMP.

<sup>‡</sup> Corresponding author

<sup>©</sup> Zhejiang University and Springer-Verlag Berlin Heidelberg 2012

Traditional yield models are based on the assumption that a defect is a circular disc (Hess and Stroele, 1994). However, this does not correspond to the realistic outline of real defects in most situations. A typical scratch (Fig. 2a) generally has four features:

1. A clear maximum extension *l*.

2. An extension l', which is perpendicular to l and smaller than the minimum feature size.

3. A high aspect ratio between l and l'.

4. An orientation of the maximum extension, denoted by  $\theta$ .



Fig. 1 Defects caused by scratches introduced by the STI CMP process: (a) missing material defects; (b) extra material defects

Thus, a circular model with only the radius parameter (Fig. 2b) does not portray these features well. Since l' is smaller than the minimum feature size, it contributes very little to causing a bridge or a break in ICs. Therefore, it is reasonable to introduce a new linear model to approximate a scratch. This linear model uses l and  $\theta$  to represent a scratch's size and orientation, respectively (Fig. 2c).

Based on the linear defect model, a novel scratch-concerned yield model is proposed here for IC manufacturing involving the CMP process. This new yield model can significantly improve the yield prediction by separately considering the yield losses caused by scratches.



Fig. 2 A typical scratch and two different defect models: (a) outline of a typical scratch; (b) circular model with a radius of l/2; (c) linear model with length *l* and orientation  $\theta$

### 2 Yield modeling considering scratches in IC manufacturing

A defect limited yield model is used to express the complex relationship among the yield, defect density D (the average number of defects per unit area), and the average critical area A. It is usually presented as

$$Y = f(D, A), \tag{1}$$

where Y is the defect limited yield. If there are M defect types (like metal shorts, metal opens, and contact/via opens) and they are independent of each other, the total defect limited yield can be described as the product of the yield for each type of defect (Huang *et al.*, 1999):

$$Y = \prod_{i=1}^{M} Y_i = \prod_{i=1}^{M} f(D_i, A_i), \tag{2}$$

where the subscript i indicates the defect type. For each type of defect, we can further classify those having the four features mentioned above as scratches and model them with the linear defect model, and classify the rest as particles and model them with the circular defect model. Assuming the defect density is constant under the same process, the yield can be represented using a Poisson model (Stapper, 1984):

$$Y_i = Y_{\rm s} Y_{\rm p} = {\rm e}^{-A_{\rm s} D_{\rm s}} {\rm e}^{-A_{\rm p} D_{\rm p}}, \tag{3}$$

where 's' and 'p' indicate the scratch and the particle, respectively. Since defect density is process-related and can be measured with wafer inspection tools (Skumanich and Cai, 1999; Maeda *et al.*, 2001; Shankar and Zhong, 2005), the focal point of analysis for yield prediction is the average critical area. The critical area extraction algorithm and defect density distribution differ according to the different defect models being used. Based on the circular model, the average critical area is calculated by

$$A = \int_0^\infty A(l) f(l) \mathrm{d}l,\tag{4}$$

where A(l) and f(l) are the critical area and defect density for defects with size l, respectively. As to the linear defect model, the average critical area for all defect sizes and orientations is obtained by

$$A = \int_0^{2\pi} \int_0^\infty A(l,\theta) f(l,\theta)\, \mathrm{d}l\, \mathrm{d}\theta, \tag{5}$$

where  $A(l, \theta)$  and  $f(l, \theta)$  are the critical area and defect density for defects with size *l* and orientation angle  $\theta$ , respectively.

### 3 Average critical area extraction for scratches

Critical area is defined as the region where the center of a defect must fall to cause a failure in ICs (Stapper, 1983), and it reflects a layout's sensitivity to defects. Thus, besides defect model selection, critical area is also related to circuit geometry and pattern density (May and Spanos, 2006). According to related research, critical area can be obtained by certain shape operations, the so-called 'geometric method' (Allan and Walton, 1998).

Approaches to critical area estimates vary with the defect types. In this work, we focus on the missing and extra material defects caused by scratches.

#### 3.1 Critical area of missing material defects caused by scratches

A missing material defect forms an electrically insulating region which may cause open circuits if it occurs in conductive materials (Walker and Director, 1986).

Assume there is a long conductor whose length L is much greater than its width W, and that electrical current should flow from one end of the conductor to the other. Also assume that an open fault (a hard open) occurs only when the line is completely broken.

The critical area of missing material defects is generated using a polygon shrink operation and calculated from the self-intersection regions of a polygon shrunk by half of the defect size (Allan and Walton, 1998).

To compare our linear defect model with the circular model, the missing critical area is generated based on both models, as denoted by the dark grey region in Fig. 3.

Assume the conductor length is much greater than the line width or space. In reality, large defects are very unlikely, so h can be taken to be much smaller than L. Neglecting the influence of h on the critical area calculation, the so-called 'end effect' (Stapper, 1984), the critical area based on the circular model can be derived geometrically from Fig. 3a as

$$A(l) = \begin{cases} 0, & 0 < l < W, \\ L(l-W), & l \ge W, \end{cases} \tag{6}$$

which is a function of size *l*. Fig. 3b shows the critical area generated by the linear defect model, which is given geometrically by

$$A(l,\theta) = (L-h)(l|\sin\theta| - W). \tag{7}$$

Neglecting the 'end effect', the critical area for scratches with size l and orientation  $\theta$  is

$$A(l,\theta) = \begin{cases} 0, & 0 < l|\sin\theta| < W, \\ L(l|\sin\theta| - W), & l|\sin\theta| \ge W, \end{cases} \tag{8}$$

which is a function of size l and orientation  $\theta$ .
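As a quick illustration of how orientation enters Eq. (8), consider a scratch of length $l = 2W$ (a value chosen here only for illustration) at two orientations:

$$A(2W, 90^\circ) = L(2W \cdot 1 - W) = LW, \qquad A(2W, 30^\circ) = L\left(2W \cdot \tfrac{1}{2} - W\right) = 0.$$

A circular defect of the same size gives $A(2W) = LW$ from Eq. (6) whatever its orientation, so under these assumptions the two models agree only when the projection $l|\sin\theta|$ takes its largest value.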



Fig. 3 Critical area of missing material defects in a long conductor extracted with the circular model (a) and the linear model (b)

#### 3.2 Critical area of extra material defects caused by scratches

Trenches in oxide formed by scratches are filled with metal materials during the CMP process, which is the main cause of extra material defects (Ollendorf *et al.*, 2004).

The critical area of extra material defects is the region where the center of a defect must fall to short separate electrical nodes. Fig. 4 shows how to extract the critical area of extra material defects between two parallel conductors based on the circular model and the linear model, respectively.

Assume there are two long conductors of length L, width W, and space S. The critical area of extra material defects is calculated from the intersection regions of polygon nodes expanded by half of the defect size (Lauther, 1981; Allan and Walton, 1997), as denoted by the dark grey region in Fig. 4. Ignoring the 'end effect', the critical area of extra material defects calculated based on the circular model is

$$A(l) = \begin{cases} 0, & 0 < l < S, \\ L(l-S), & l \ge S, \end{cases} \tag{9}$$

and the critical area extracted based on the linear model is

$$A(l,\theta) = \begin{cases} 0, & 0 < l|\sin\theta| < S, \\ L(l|\sin\theta| - S), & l|\sin\theta| \ge S. \end{cases} \tag{10}$$

Fig. 4 Critical area of extra material defects between two conductors extracted with the circular model (a) and the linear model (b)

This has the same form as the missing material critical area of one long conductor except that S is used for space instead of W for line width. This complete duality between open and short models holds even for many complex circuits (Stapper, 1984).

#### 3.3 Defect density distribution based on two defect models

After critical area extraction, defect density distribution must be considered to calculate the average critical area. As mentioned before, defect density distribution is related to the defect model. As to the circular model, defect density distribution is a function of size l, denoted by  $f_l(l)$ . For the linear model, it is a joint density function of l and  $\theta$ , denoted by  $f(l, \theta)$ . Since  $\theta$  and l are independent of each other in a mature process, we have

$$f(l,\theta) = f_l(l)f_{\theta}(\theta), \tag{11}$$

where  $f_{\theta}(\theta)$  is the probability density function of  $\theta$ . In general, there should be no preferred orientation, so  $f_{\theta}(\theta)$  is a continuous uniform distribution:

$$f_{\theta}(\theta) = \frac{1}{2\pi}, \quad 0 \le \theta < 2\pi. \tag{12}$$

As to  $f_l(l)$ , the  $1/\text{size}^3$  distribution function introduced by Thomas and Stapper is used here as the size distribution:

$$f_{l}(l) = \begin{cases} l / l_{0}^{2}, & 0 \le l < l_{0}, \\ l_{0}^{2} / l^{3}, & l \ge l_{0}, \end{cases} \tag{13}$$

where  $l_0$  is the defect size with the peak density and typically less than the minimum feature size (Allan and Walton, 1997). Combining Eqs. (12) and (13), the density distribution function based on the linear model is

$$f(l,\theta) = \begin{cases} \frac{l}{2\pi l_0^2}, & 0 \le l < l_0, \\ \frac{l_0^2}{2\pi l^3}, & l \ge l_0. \end{cases} \tag{14}$$
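As a consistency check (not part of the original derivation), the size distribution of Eq. (13) integrates to one, and so does the joint density of Eq. (14):

$$\int_0^\infty f_l(l)\,\mathrm{d}l = \int_0^{l_0} \frac{l}{l_0^2}\,\mathrm{d}l + \int_{l_0}^\infty \frac{l_0^2}{l^3}\,\mathrm{d}l = \frac{1}{2} + \frac{1}{2} = 1, \qquad \int_0^{2\pi}\!\int_0^\infty f(l,\theta)\,\mathrm{d}l\,\mathrm{d}\theta = \frac{1}{2\pi} \cdot 2\pi \cdot 1 = 1.$$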

#### 3.4 Average critical area extraction for scratches

Substituting the critical area and defect density functions into the integrals of Eqs. (4) and (5), we obtain the average critical area for each of the two defect models. The average missing critical area of a long conductor based on the circular model is calculated by combining Eqs. (6) and (13):

$$\begin{aligned} A &= \int_{0}^{\infty} A(l) f(l)\,\mathrm{d}l \\ &= \int_{0}^{l_{0}} 0 \cdot \frac{l}{l_{0}^{2}}\,\mathrm{d}l + \int_{l_{0}}^{W} 0 \cdot \frac{l_{0}^{2}}{l^{3}}\,\mathrm{d}l + \int_{W}^{\infty} L(l-W) \frac{l_{0}^{2}}{l^{3}}\,\mathrm{d}l \\ &= \frac{L l_{0}^{2}}{2W}. \end{aligned} \tag{15}$$
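The last integral can be evaluated term by term. The step below is spelled out for convenience and assumes $W \ge l_0$, as the split of the integration range already implies:

$$\int_{W}^{\infty} L(l-W)\frac{l_0^2}{l^3}\,\mathrm{d}l = L l_0^2 \left[-\frac{1}{l} + \frac{W}{2l^2}\right]_{W}^{\infty} = L l_0^2 \left(\frac{1}{W} - \frac{1}{2W}\right) = \frac{L l_0^2}{2W}.$$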

Combining Eqs. (8) and (14), the average missing critical area for a long conductor based on the linear model is obtained:

$$\begin{aligned} A &= \int_{0}^{2\pi} \int_{0}^{\infty} A(l,\theta) f(l,\theta)\,\mathrm{d}l\,\mathrm{d}\theta \\ &= \int_{0}^{2\pi} \int_{0}^{l_{0}} 0 \cdot \frac{l}{2\pi l_{0}^{2}}\,\mathrm{d}l\,\mathrm{d}\theta + \int_{0}^{2\pi} \int_{l_{0}}^{W/|\sin\theta|} 0 \cdot \frac{l_{0}^{2}}{2\pi l^{3}}\,\mathrm{d}l\,\mathrm{d}\theta \\ &\quad + \int_{0}^{2\pi} \int_{W/|\sin\theta|}^{\infty} L(l|\sin\theta| - W) \frac{l_{0}^{2}}{2\pi l^{3}}\,\mathrm{d}l\,\mathrm{d}\theta \\ &= \frac{L l_{0}^{2}}{4W}. \end{aligned} \tag{16}$$
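The intermediate steps behind Eq. (16), written out here for convenience, are the inner integral over $l$ followed by the integral of $\sin^2\theta$ over a full period:

$$\int_{W/|\sin\theta|}^{\infty} L(l|\sin\theta| - W)\frac{l_0^2}{2\pi l^3}\,\mathrm{d}l = \frac{L l_0^2}{2\pi}\left(\frac{\sin^2\theta}{W} - \frac{\sin^2\theta}{2W}\right) = \frac{L l_0^2 \sin^2\theta}{4\pi W}, \qquad \int_0^{2\pi} \sin^2\theta\,\mathrm{d}\theta = \pi.$$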

The average extra critical area of two conductors can be easily obtained by substituting W in Eqs. (15) and (16) by S.
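The closed forms in Eqs. (15) and (16) can be cross-checked numerically. The following is a minimal sketch, not the authors' extraction tool: it integrates Eqs. (4) and (5) with a midpoint rule, assumes $l_0 \le W$ so that only sizes $l \ge l_0$ contribute, and uses hypothetical values for `L`, `W`, and `l0`. All function names are illustrative.

```python
import math

def critical_area_circular(l, L, W):
    """Eq. (6): open critical area for a circular defect of diameter l."""
    return L * (l - W) if l >= W else 0.0

def critical_area_linear(l, theta, L, W):
    """Eq. (8): open critical area for a scratch of length l at angle theta."""
    proj = l * abs(math.sin(theta))  # the l|sin(theta)| term of Eq. (8)
    return L * (proj - W) if proj >= W else 0.0

def size_density(l, l0):
    """Eq. (13): 1/size^3 size distribution with its peak at l0."""
    return l / l0**2 if l < l0 else l0**2 / l**3

def average_area_circular(L, W, l0, n_u=2000):
    """Midpoint-rule estimate of Eq. (4), assuming l0 <= W.

    Only l >= l0 contributes, so substitute l = l0/u with u in (0, 1);
    the Jacobian of this substitution is dl = (l0 / u**2) du.
    """
    h = 1.0 / n_u
    total = 0.0
    for k in range(n_u):
        u = (k + 0.5) * h
        l = l0 / u
        total += critical_area_circular(l, L, W) * size_density(l, l0) * (l0 / u**2) * h
    return total

def average_area_linear(L, W, l0, n_theta=180, n_u=1000):
    """Midpoint-rule estimate of Eq. (5) with the joint density of Eq. (14)."""
    h_t = 2.0 * math.pi / n_theta
    h_u = 1.0 / n_u
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * h_t
        for k in range(n_u):
            u = (k + 0.5) * h_u
            l = l0 / u
            joint = size_density(l, l0) / (2.0 * math.pi)  # Eq. (14)
            area = critical_area_linear(l, theta, L, W)
            total += area * joint * (l0 / u**2) * h_u * h_t
    return total

# Hypothetical geometry in arbitrary length units: line length, line width,
# and peak defect size. These are not values from the experiment.
L, W, l0 = 1000.0, 0.1, 0.05
a_circ = average_area_circular(L, W, l0)
a_lin = average_area_linear(L, W, l0)
print(a_circ, L * l0**2 / (2 * W))  # numerical estimate vs Eq. (15)
print(a_lin, L * l0**2 / (4 * W))   # numerical estimate vs Eq. (16)
```

Replacing `W` by the spacing `S` in the calls gives the corresponding check for extra material defects.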

Comparing Eqs. (15) and (16), the circular model yields an average critical area twice that of the linear model, i.e., a more pessimistic estimate, because it corresponds less closely to realistic scratches. The overestimate is even larger when a layout is composed primarily of short lines, since the 'end effect' can then no longer be neglected. In the next section, we compare the two models using fabrication experiment data.

### 4 Experiment principle and method

To assess the accuracy of these two yield models, test chips are designed to give a real product yield. Assume there are N test chips, and U of them have short/open failures. According to the law of large numbers, when N is large enough, the measured yield (denoted by  $Y_{\rm m}$ ) approximates the real yield (denoted by Y):

$$Y_{\rm m} = \frac{N - U}{N}. \tag{17}$$

Statistical methods are available to determine the value of *N*. According to the de Moivre–Laplace theorem, if we want the margin of error to be less than  $\varepsilon$  with a given confidence level of  $1-\alpha$ , which means

$$P\left\{\left|\frac{N-U}{N}-Y\right|<\varepsilon\right\}>1-\alpha, \tag{18}$$

sample size N must satisfy

$$N > \frac{Y(1-Y)}{\varepsilon^2} u_{1-\alpha/2}^2, \tag{19}$$

where  $u_{1-\alpha/2}$  is the critical value of the standard normal distribution, i.e., its  $1-\alpha/2$  quantile (Sheng *et al.*, 2008).
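To make the sample-size condition concrete, the sketch below evaluates the usual normal-approximation bound $N > Y(1-Y)\,u_{1-\alpha/2}^2/\varepsilon^2$ with Python's standard library. The function name and the input values (an anticipated yield of 99%, a margin of 1%, and $\alpha = 0.05$) are assumptions chosen for illustration, not figures taken from the experiment.

```python
import math
from statistics import NormalDist

def required_sample_size(expected_yield, margin, alpha):
    """Smallest integer N satisfying N > Y(1-Y) * u^2 / eps^2."""
    # Two-sided critical value of the standard normal distribution.
    u = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    bound = expected_yield * (1.0 - expected_yield) * u**2 / margin**2
    return math.floor(bound) + 1  # strict inequality: go one past the floor

# Hypothetical planning inputs (not from the paper).
n_chips = required_sample_size(expected_yield=0.99, margin=0.01, alpha=0.05)
print(n_chips)
```

With these assumed inputs the bound comes out at a few hundred chips. Because the bound depends on the unknown yield Y, in practice a conservative guess for Y has to be supplied.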

Defect density and critical area of test chips are measured to calculate model yields. The yield calculated based on the linear model is

$$Y_{\rm l} = Y_{\rm s} Y_{\rm p} = {\rm e}^{-A_{\rm s} D_{\rm s}} {\rm e}^{-A_{\rm p} D_{\rm p}}, \tag{20}$$

where classified defect density and critical area for both particles and scratches are needed. The yield calculated based on the circular model is

$$Y_{\rm c} = {\rm e}^{-A_{\rm p}(D_{\rm s}+D_{\rm p})}, \tag{21}$$

where all defects are treated as particles and modeled with the circular model.
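The two model yields of Eqs. (20) and (21) can be sketched as follows. The defect densities are the metal-1 values reported in Section 5. The critical areas `a_p` and `a_s` are hypothetical, with `a_s` set to half of `a_p` purely as an illustrative assumption suggested by the ratio of Eqs. (16) and (15).

```python
import math

def yield_linear(a_s, d_s, a_p, d_p):
    """Eq. (20): scratches use the linear model, particles the circular model."""
    return math.exp(-a_s * d_s) * math.exp(-a_p * d_p)

def yield_circular(a_p, d_s, d_p):
    """Eq. (21): every defect is treated as a circular particle."""
    return math.exp(-a_p * (d_s + d_p))

d_s, d_p = 0.7, 1.1   # defects per mm^2, metal-1 layer (Section 5)
a_p = 0.010           # mm^2, hypothetical average critical area for particles
a_s = 0.5 * a_p       # mm^2, hypothetical average critical area for scratches

y_lin = yield_linear(a_s, d_s, a_p, d_p)
y_circ = yield_circular(a_p, d_s, d_p)
print(f"linear model: {y_lin:.4%}, circular model: {y_circ:.4%}")
```

Whenever `a_s` is smaller than `a_p`, the linear-model yield exceeds the circular-model yield, which is the direction of the difference examined in Section 5.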

The test vehicles used in industry to detect open/short faults are predominantly snake/comb conductive lines (Khare *et al.*, 1994). Here we use a hybrid comb-snake structure (Fig. 5) to detect both open and short defects.



Fig. 5 Layout of a comb-snake test structure. Dummy bars are intended to create the same external environment for border lines

By measuring the current between different pads, short/open faults are detected (Fig. 6).

To detect line-to-line short faults, the voltage is applied to both ends of the snake-line, and the leakage current between the snake-line and comb-lines is measured at the comb ends (Fig. 6a). For detecting line opens, the voltage is applied to one end of the snake-line with the other end grounded. The current flow through the snake-line is measured at the ground end (Fig. 6b). The comb-lines are disconnected during the open test.



Fig. 6 Diagram of fault detection with electrical measurement: (a) short detection; (b) open detection

### 5 Fabrication experiment results and statistical analyses

Fabrication experiments were performed using the Semiconductor Manufacturing International Corporation (SMIC) 65 nm low leakage very high speed (VHS) regular voltage threshold (RVT) logic process. Four comb-snake structures with different line widths/spaces formed a 2.18 mm×0.38 mm test chip. Test structures were placed in both the metal-1 and metal-2 layers because of their high density of interconnection lines and high sensitivity to CMP scratch defects. One 2×16 pad group was shared by the test structures in the two layers for electrical measurements (Fig. 7).



Fig. 7 Layout and wiring of the test chip designed for the experiment

'M1-' and 'M2-' indicate the test structure in the metal-1 or metal-2 layer, respectively, and letters a-d represent the four comb-snake test structures. A total of 420 copies of the test chip were fabricated, corresponding to 3360 test structures and 6720 electrical measurements.

A 1.5 V voltage source was used in electrical measurements. The current data collected for each structure (not all data is valid for analysis) is plotted into a cumulative probability graph. A cumulative probability graph for structure 'M1-a' is shown in Fig. 8. Fig. 8a shows the current data from short fault detection. Hypothetically, if the metal lines are disconnected, there should be no electrical current. In real applications, however, there exists a very small (close to zero) leakage current attributed to the resistance of the dielectric between metal lines. When the metal lines are connected, the currents are in the order of 0–2 mA (Fig. 8b), almost 10 orders of magnitude larger than the currents when there are metal breaks. The counts of short/open failures are easily found from the current cumulative probability graph.



Fig. 8 Cumulative probability graphs of currents measured in short (a) and open (b) fault detection for structure 'M1-a'

Defect densities were measured using KLA wafer inspection tools after the CMP process. Due to different polishing and cleaning parameter settings,  $D_s$  is 0.7/mm<sup>2</sup> and  $D_p$  is 1.1/mm<sup>2</sup> for the metal-1 layer, while  $D_s$  is 0.8/mm<sup>2</sup> and  $D_p$  is 0.9/mm<sup>2</sup> for the metal-2 layer.  $A_s$  and  $A_p$  are calculated for four test structures using the geometric method addressed in Section 3. Table 1 lists the experimental data.

Table 1 In-line experimental data from 420 test chips

| Structure | Fault type | N   | U | $Y_{\rm m}$ (%) | $Y_{\rm c}$ (%) | $Y_{\rm l}$ (%) |
|-----------|------------|-----|---|-----------------|-----------------|-----------------|
| M1-a      | S          | 407 | 8 | 98.03           | 96.77           | 97.39           |
|           | O          | 410 | 8 | 98.03           | 97.13           | 97.64           |
| M1-b      | S          | 406 | 3 | 99.26           | 98.49           | 98.89           |
|           | O          | 406 | 3 | 99.26           | 98.52           | 98.98           |
| M1-c      | S          | 408 | 2 | 99.51           | 98.96           | 99.26           |
|           | O          | 408 | 2 | 99.51           | 99.06           | 99.34           |
| M1-d      | S          | 405 | 1 | 99.75           | 99.50           | 99.62           |
|           | O          | 406 | 1 | 99.75           | 99.61           | 99.69           |
| M2-a      | S          | 410 | 9 | 97.78           | 96.77           | 97.39           |
|           | O          | 407 | 8 | 98.03           | 97.13           | 97.64           |
| M2-b      | S          | 411 | 4 | 99.01           | 98.49           | 98.79           |
|           | O          | 406 | 4 | 99.01           | 98.52           | 98.88           |
| M2-c      | S          | 405 | 3 | 99.26           | 98.96           | 99.17           |
|           | O          | 406 | 3 | 99.26           | 99.06           | 99.24           |
| M2-d      | S          | 409 | 1 | 99.75           | 99.50           | 99.62           |
|           | O          | 412 | 1 | 99.75           | 99.61           | 99.69           |
| Mean (%)  |            |     |   | 99.06           | 98.51           | 98.83           |

S: short faults; O: open faults. N: sample number; U: failure number.  $Y_{\rm m}$ : measured yield;  $Y_{\rm c}$ : yield calculated using the circular model;  $Y_{\rm l}$ : yield calculated using the linear model

Many assessment criteria can be used to evaluate the fit accuracy of a model. One such criterion is the correlation coefficient: the closer it is to 1, the better the fit. The correlation coefficient corr(X, Y) between two variables or two sets of data, X and Y, with expected values  $\mu_X$  and  $\mu_Y$  and standard deviations  $\sigma_X$  and  $\sigma_Y$ , is defined as

$$\operatorname{corr}(X,Y) = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y} = \frac{\sum_{i=1}^n (X_i - \overline{X})(Y_i - \overline{Y})}{\sqrt{\sum_{i=1}^n (X_i - \overline{X})^2} \sqrt{\sum_{i=1}^n (Y_i - \overline{Y})^2}}, \tag{22}$$

where E is the expected value operator and n is the number of observations, which is 16 in our case. Substituting X and Y with the experimental data in Table 1, we obtain

$$\operatorname{corr}(Y_{\mathrm{m}}, Y_{\mathrm{c}}) = \frac{\sum_{i=1}^{n} (Y_{\mathrm{m}i} - \overline{Y_{\mathrm{m}}})(Y_{\mathrm{c}i} - \overline{Y_{\mathrm{c}}})}{\sqrt{\sum_{i=1}^{n} (Y_{\mathrm{m}i} - \overline{Y_{\mathrm{m}}})^{2}} \sqrt{\sum_{i=1}^{n} (Y_{\mathrm{c}i} - \overline{Y_{\mathrm{c}}})^{2}}} = 0.985, \tag{23}$$

$$\operatorname{corr}(Y_{\mathrm{m}}, Y_{\mathrm{l}}) = \frac{\sum_{i=1}^{n} (Y_{\mathrm{m}i} - \overline{Y_{\mathrm{m}}})(Y_{\mathrm{l}i} - \overline{Y_{\mathrm{l}}})}{\sqrt{\sum_{i=1}^{n} (Y_{\mathrm{m}i} - \overline{Y_{\mathrm{m}}})^{2}} \sqrt{\sum_{i=1}^{n} (Y_{\mathrm{l}i} - \overline{Y_{\mathrm{l}}})^{2}}} = 0.992. \tag{24}$$

Corr( $Y_{m}$ ,  $Y_{l}$ ) is closer to 1 than corr( $Y_{m}$ ,  $Y_{c}$ ). This implies that the yield predicted by our linear model has a higher agreement with the manufacturing yields as compared to the traditional circular model.
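The correlation coefficients can be recomputed directly from the yields in Table 1. The sketch below implements the sample form of Eq. (22) in plain Python; the lists are transcribed from the table in row order, and the function name `pearson` is only an illustrative choice.

```python
import math

def pearson(x, y):
    """Sample correlation coefficient, second form of Eq. (22)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Yields in percent, transcribed from Table 1 (M1-a S, M1-a O, ..., M2-d O).
y_m = [98.03, 98.03, 99.26, 99.26, 99.51, 99.51, 99.75, 99.75,
       97.78, 98.03, 99.01, 99.01, 99.26, 99.26, 99.75, 99.75]
y_c = [96.77, 97.13, 98.49, 98.52, 98.96, 99.06, 99.50, 99.61,
       96.77, 97.13, 98.49, 98.52, 98.96, 99.06, 99.50, 99.61]
y_l = [97.39, 97.64, 98.89, 98.98, 99.26, 99.34, 99.62, 99.69,
       97.39, 97.64, 98.79, 98.88, 99.17, 99.24, 99.62, 99.69]

corr_c = pearson(y_m, y_c)  # circular model vs measurement
corr_l = pearson(y_m, y_l)  # linear model vs measurement
print(round(corr_c, 3), round(corr_l, 3))
```

Since the table values are rounded to two decimals, the recomputed coefficients may differ from Eqs. (23) and (24) in the last digit.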

Another way to assess the fitness of a yield model is the error between model yields and measured yields: the smaller the error, the better the model. The absolute errors between model yields and measured yields are calculated as

$$\operatorname{Err}_{i} = |Y_{i} - Y_{\rm m}|, \tag{25}$$

where *i* indicates the model type. The results are listed in Table 2. The yield errors resulting from the linear model are smaller than those from the circular model.

Table 2 Yield errors between yields obtained from the linear/circular model and measurements

| Structure | Fault type | Err<sub>c</sub> (%) | Err<sub>l</sub> (%) | Diff (%) |
|-----------|------------|---------------------|---------------------|----------|
| M1-a      | S          | 1.26                | 0.64                | 0.62     |
|           | O          | 0.90                | 0.39                | 0.51     |
| M1-b      | S          | 0.77                | 0.37                | 0.40     |
|           | O          | 0.74                | 0.28                | 0.46     |
| M1-c      | S          | 0.55                | 0.25                | 0.30     |
|           | O          | 0.45                | 0.17                | 0.28     |
| M1-d      | S          | 0.25                | 0.13                | 0.12     |
|           | O          | 0.14                | 0.06                | 0.08     |
| M2-a      | S          | 1.01                | 0.39                | 0.62     |
|           | O          | 0.90                | 0.39                | 0.51     |
| M2-b      | S          | 0.52                | 0.22                | 0.30     |
|           | O          | 0.49                | 0.13                | 0.36     |
| M2-c      | S          | 0.30                | 0.09                | 0.21     |
|           | O          | 0.20                | 0.02                | 0.18     |
| M2-d      | S          | 0.25                | 0.13                | 0.12     |
|           | O          | 0.14                | 0.06                | 0.08     |
| Mean (%)  |            | 0.55                | 0.23                | 0.32     |

S: short faults; O: open faults

To further assess whether the improvement achieved using our model is statistically significant or whether  $\text{Err}_1$  is significantly smaller than  $\text{Err}_c$ , a statistical test of difference between yield errors was carried out. To begin with, the difference is calculated as

$$\mathrm{Diff} = \mathrm{Err}_{\rm c} - \mathrm{Err}_{\rm l}. \tag{26}$$

The differences calculated are listed in Table 2.

As yield errors arise in pairs from the same test structure, we performed a paired *t*-test (Zimmerman, 1997) to determine whether our model improves yield prediction significantly. The *t*-statistic is given by

$$t = \frac{\overline{\mathrm{Err}_{\mathrm{c}}} - \overline{\mathrm{Err}_{\mathrm{l}}}}{s_{\mathrm{D}} / \sqrt{n}} = \frac{\overline{\mathrm{Diff}}}{s_{\mathrm{D}} / \sqrt{n}}, \tag{27}$$

where  $\overline{\mathrm{Diff}}$  and  $s_{\rm D}$  are the mean and standard deviation of the differences between Err<sub>c</sub> and Err<sub>l</sub>, respectively, and *n* is the number of observations (O'Mahony, 1986; Press *et al.*, 1992). The test results are listed in Table 3.

Table 3 Paired *t*-test of yield errors

| Parameter                  | Value    |
|----------------------------|----------|
| Mean of Diff               | 0.32%    |
| Standard deviation of Diff | 0.18%    |
| *t*-statistic              | 6.99     |
| Degrees of freedom         | 15       |
| Single-tailed $p(T \le t)$ | 2.18E-06 |
| t critical single-tail     | 1.75     |
| Two-tailed $p(T \le t)$    | 4.36E-06 |
| t critical two-tail        | 2.13     |

$\mathrm{Diff} = \mathrm{Err}_{\rm c} - \mathrm{Err}_{\rm l}$

The calculated *t*-statistic is about 6.99. Looking up the *t*-table, we know that the single-tailed *p*-value is about  $2.18 \times 10^{-6}$  and the two-tailed *p*-value is about  $4.36 \times 10^{-6}$ , both far less than the statistical significance threshold (here 0.05). This provides evidence that the improvement has statistical significance.
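The *t*-statistic of Eq. (27) can be reproduced from the error columns of Table 2. The following is a minimal sketch using only the standard library; the error lists are transcribed from Table 2, the critical value is the single-tail figure quoted in Table 3, and the variable names are illustrative. Computing the *p*-values would additionally require the Student *t* distribution, which is not attempted here.

```python
import math
from statistics import mean, stdev

# Yield errors in percent, transcribed from Table 2 (M1-a S, ..., M2-d O).
err_c = [1.26, 0.90, 0.77, 0.74, 0.55, 0.45, 0.25, 0.14,
         1.01, 0.90, 0.52, 0.49, 0.30, 0.20, 0.25, 0.14]
err_l = [0.64, 0.39, 0.37, 0.28, 0.25, 0.17, 0.13, 0.06,
         0.39, 0.39, 0.22, 0.13, 0.09, 0.02, 0.13, 0.06]

diff = [c - l for c, l in zip(err_c, err_l)]  # Eq. (26), one value per row
n = len(diff)

# Eq. (27): mean difference over its standard error (sample std, n - 1).
t_stat = mean(diff) / (stdev(diff) / math.sqrt(n))
dof = n - 1

t_crit_single_tail = 1.75  # from Table 3, 15 degrees of freedom at 0.05
significant = t_stat > t_crit_single_tail
print(f"t = {t_stat:.2f}, dof = {dof}, significant = {significant}")
```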

Note that all preceding analyses are based on one layer product yield. Considering the wide application of the CMP process in IC fabrication, the total yield, which is the product of multi-layer yields, will have an even larger difference between these two models. Taken together, we conclude that our linear model has significantly improved the yield prediction.
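The growth of the model gap with the number of layers can be illustrated with a short calculation. This sketch makes two assumptions that are not in the experiment: every layer has the same single-layer yield, taken here as the Table 1 mean, and the layer count `n_layers` is a hypothetical value.

```python
def stacked_yield(single_layer_yield, n_layers):
    """Total yield as the product of n identical, independent layer yields."""
    return single_layer_yield ** n_layers

# Mean single-layer yields from Table 1, expressed as fractions.
y_m, y_c, y_l = 0.9906, 0.9851, 0.9883
n_layers = 8  # hypothetical number of CMP-processed layers

gap_c_single = y_m - y_c
gap_l_single = y_m - y_l
gap_c_total = stacked_yield(y_m, n_layers) - stacked_yield(y_c, n_layers)
gap_l_total = stacked_yield(y_m, n_layers) - stacked_yield(y_l, n_layers)

print(f"single layer: circular gap {gap_c_single:.4f}, linear gap {gap_l_single:.4f}")
print(f"{n_layers} layers: circular gap {gap_c_total:.4f}, linear gap {gap_l_total:.4f}")
```

Under these assumptions both gaps grow with the layer count, and the circular-model gap stays the larger of the two.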

### 6 Conclusions

Errors in yield prediction are generally attributed to the wrong selection of models for describing the defect and to incorrect critical area calculations (Hess and Weiland, 1996). When there is a significant proportion of non-circular defects, traditional circular defect models will result in inaccurate critical area estimates and hence poor yield prediction. Considering the large proportion of scratches among the defects introduced by the CMP process and their realistic outline, we propose a scratch-concerned yield model for IC manufacturing involving the CMP process. A new linear defect model is introduced to model scratches. Based on the linear defect model, the defect limited yield caused by scratches has been calculated. This involves a different critical area extraction algorithm and a different defect density distribution from those of the traditional circular model. The total defect limited yield is then obtained by multiplying all defect limited yields.

Since the linear defect model has a higher correspondence with the real outline of scratches, the yield model based on it will result in a more accurate IC yield prediction than the traditional circular model. This has been confirmed by experimental results.

### References

- Allan, G.A., Walton, A.J., 1997. Efficient Critical Area Estimation for Arbitrary Defect Shapes. Proc. IEEE Int. Symp. on Defect and Fault Tolerance in VLSI Systems, p.20-28.
- Allan, G.A., Walton, A.J., 1998. Critical area extraction for soft fault estimation. *IEEE Trans. Semicond. Manuf.*, 11(1):146-154. [doi:10.1109/66.661294]
- Aytes, S.D., Armstrong, J.S., Mortensen, K.A., Mortensen, K.A., Russell, C.W., Ross, K.A., Giraud, J.E., Hooper, D.H., Alexander, H.M., Nelson, M.M., *et al.*, 2003. Experimental Investigation of the Mechanism for CMP Micro-scratch Formation. Proc. 15th Biennial Microelectronics Symp., p.107-109.
- Hess, C., Stroele, A.P., 1994. Modeling of real defect outlines and parameter extraction using a checkerboard test structure to localize defects. *IEEE Trans. Semicond. Manuf.*, 7(3):284-292. [doi:10.1109/66.311331]
- Hess, C., Weiland, L.H., 1996. Issues on the Size and Outline of Killer Defects and Their Influence on Yield Modeling. IEEE/SEMI Advanced Semiconductor Manufacturing Conf., p.423-428.
- Huang, J., Chen, H.C., Wu, J.Y., Lur, W., 1999. Investigation of CMP Micro-Scratch in the Fabrication of Sub-quarter Micron VLSI Circuit. Proc. Chemical Mechanical Polishing–Multilevel of Interconnection Conf., p.77-79.
- Jung, S.M., Uom, J.S., Cho, W.S., Bae, Y.J., Chung, Y.K., Yu, K.S., Kim, K.Y., Kim, K.T., 2001. A Study of Formation and Failure Mechanism of CMP Scratch Induced Defects on ILD in a W-damascene Interconnect SRAM Cell. IEEE 39th Annual Int. Reliability Physics Symp., p.42-47.
- Khare, J.B., Maly, W., Thomas, M.E., 1994. Extraction of defect size distributions in an IC layer using test structure data. *IEEE Trans. Semicond. Manuf.*, 7(3):354-368. [doi:10.1109/66.311339]

- Lauther, U., 1981. An O(NlogN) Algorithm for Boolean Mask Operations. Proc. 18th Design Automation Conf., p.555-560. [doi:10.1109/DAC.1981.1585410]
- Luo, J.F., Dornfeld, D.A., 2004. Integrated Modeling of Chemical Mechanical Planarization for Sub-micron Integrated Circuit Fabrication. Springer, NY, USA.
- Maeda, S., Oka, K., Shibata, Y., Yoshida, M., 2001. Defect Inspection Method and Apparatus. US Patent No. 6169282B1.
- May, G.S., Spanos, C.J., 2006. Fundamentals of Semiconductor Manufacturing and Process Control. John Wiley & Sons, Inc., Hoboken, New Jersey. [doi:10.1002/0471790281]
- Ollendorf, H., Cabral, S., Fuller, R., 2004. Reduction of CMP μ-Scratch Induced Metal Shorts by Introduction of a Post CMP Tungsten Plasma Clean Process in a High Volume DRAM Manufacturing Environment. IEEE Advanced Semiconductor Manufacturing Conf. and Workshop, p.5-8.
- O'Mahony, M., 1986. Sensory Evaluation of Food: Statistical Methods and Procedures. CRC Press, FL, USA, p.487.
- Park, S.W., Kim, S.Y., 2001. Reduction of Micro-defects in the Inter-Metal Dielectrics (IMD) Chemical Mechanical Polishing (CMP) for ULSI Applications. Proc. Int. Symp. on Electrical Insulating Materials, p.63-66.
- Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P., 1992. Numerical Recipes in C: the Art of Scientific Computing. Cambridge University Press, Cambridge, UK, p.616.
- Shankar, N.G., Zhong, Z.W., 2005. Defect detection on semiconductor wafer surfaces. *Microelectron. Eng.*, 77(3-4):337-346. [doi:10.1016/j.mee.2004.12.003]
- Sheng, Z., Xie, S.Q., Pan, C.Y., 2008. Probability Theory and Mathematical Statistics. China Higher Education Press, Beijing, China, p.129-136 (in Chinese).
- Skumanich, A., Cai, M.P., 1999. CMP process development based on rapid automatic defect classification. SPIE, 3743:76-88. [doi:10.1117/12.346901]
- Stapper, C.H., 1983. Modeling of integrated circuit defect sensitivities. *IBM J. Res. Devel.*, 27(6):549-557. [doi:10.1147/rd.276.0549]
- Stapper, C.H., 1984. Modeling of defects in integrated circuit photolithographic patterns. *IBM J. Res. Devel.*, 28(4):461-475. [doi:10.1147/rd.284.0461]
- Walker, H., Director, S.W., 1986. VLASIC: a catastrophic fault yield simulator for integrated circuits. *IEEE Trans. Comput.-Aided Des. Integr. Circ. Syst.*, 5(4):541-556. [doi:10.1109/TCAD.1986.1270225]
- Zimmerman, D.W., 1997. A note on interpretation of the paired-samples t test. *J. Educ. Behav. Stat.*, 22(3):349-360. [doi:10.2307/1165289]