

Abstract: When IC devices are produced and shipped to end customers, it is important that they function as specified in the application environment. This paper outlines strategies and practices used to statistically sample devices and predict how they will operate over time. The practices outlined are believed to be best-in-class techniques for a successful product launch. These strategies will most likely point to sensitivities in devices that cause intermittent failures, or to process weaknesses that cause hard failures. If any of the outlined methods are not completed in the pre-production phase, failures may have to be analyzed later to prevent such occurrences in the future.

## **Design For Test (DFT):**

At eSilicon, we insert Design for Test (DFT) structured logic to detect silicon defects and reduce DPPM levels. We develop a composite chip test coverage figure that combines DFT with the other test methods employed. The composite chip coverage helps to predict expected DPPM levels before measured data is obtained, and can help to find coverage "holes" in the chip design before tapeout. High test coverage from DFT is important because coverage appears in the exponent of the DPPM equation, so relatively small changes at the high end of test coverage can result in large changes in DPPM.
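The paper does not state which DPPM equation it uses. As an illustration only, the sketch below assumes the widely cited Williams-Brown defect level model, $DL = 1 - Y^{(1 - T)}$, where $Y$ is yield and $T$ is test coverage. The function name and the 80% yield figure are hypothetical.

```python
def defect_level_dppm(yield_fraction: float, coverage: float) -> float:
    """Estimate shipped defect level in DPPM with the Williams-Brown model.

    DL = 1 - Y ** (1 - T), where Y is process yield and T is test coverage,
    both expressed as fractions between 0 and 1.
    """
    if not 0.0 < yield_fraction <= 1.0:
        raise ValueError("yield_fraction must be in (0, 1]")
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be in [0, 1]")
    return (1.0 - yield_fraction ** (1.0 - coverage)) * 1e6


if __name__ == "__main__":
    assumed_yield = 0.80  # hypothetical yield, not taken from the paper
    for coverage in (0.95, 0.98, 0.99, 0.995, 0.999):
        dppm = defect_level_dppm(assumed_yield, coverage)
        print(f"coverage={coverage:.3f}  estimated DPPM={dppm:,.0f}")
```

Because coverage sits in the exponent, each step toward 100% coverage in the loop removes a large share of the remaining estimated DPPM, which is the sensitivity described above.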

We also add DFT logic to actively improve the IC yield. This is typically done with memory Built-In Self-Test (BIST) and fuse-based repair. In this scenario, we detect faulty memory bit-cells with memory BIST, and then map in new memory cells with laser-blown fuses. Memory repair can result in dramatic improvements in yield, which is reflected in the IC cost.
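To give a rough feel for why repair helps, the sketch below assumes a simple Poisson defect model in which each spare element can repair exactly one defective cell. This is a simplification chosen for illustration, not the model used by eSilicon: it ignores defects in the spares themselves, fuse failures, and defect clustering. The function name and the defect density are hypothetical.

```python
import math


def memory_yield(mean_defects: float, spares: int = 0) -> float:
    """Probability that a memory is usable under a Poisson defect model.

    The memory is usable when the number of defective cells does not exceed
    the number of spare elements (assumption: one spare repairs one defect).
    """
    if mean_defects < 0 or spares < 0:
        raise ValueError("mean_defects and spares must be non-negative")
    return sum(
        math.exp(-mean_defects) * mean_defects ** k / math.factorial(k)
        for k in range(spares + 1)
    )


if __name__ == "__main__":
    mean_defects = 0.5  # hypothetical average defects per memory instance
    for spares in (0, 1, 2):
        print(f"spares={spares}  yield={memory_yield(mean_defects, spares):.4f}")
```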

**Device and Process Characterization:** Fully characterizing a device in the fabrication process is typically used to understand how fab process corner conditions interact with the IC design, ATE test and application function. Units are obtained from each fab process corner extreme and from the center point (the typical process conditions expected), then tested across the desired voltage and temperature ranges. This methodology captures the boundaries of the wafer fabrication process in order to statistically map the variation that may be experienced in production.
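The characterization matrix described above can be sketched as a cross product of process, voltage and temperature conditions. The corner names and the ±10% supply range are assumptions made for illustration. The temperature list is borrowed from the Electrical Characterization row of the qualification table later in this paper.

```python
from itertools import product

# Hypothetical corner names; real corner lots are defined with the foundry.
PROCESS_CORNERS = ("slow", "typical", "fast")
# Assumed supply range of nominal Vdd +/- 10%.
VDD_SCALE = (0.9, 1.0, 1.1)
# Temperatures listed for electrical characterization in the qualification table.
TEMPERATURES_C = (-40, 0, 25, 75, 85)


def build_pvt_matrix(corners=PROCESS_CORNERS, vdd_scales=VDD_SCALE,
                     temps_c=TEMPERATURES_C):
    """Return every process/voltage/temperature combination to be tested."""
    return [
        {"process": p, "vdd_scale": v, "temp_c": t}
        for p, v, t in product(corners, vdd_scales, temps_c)
    ]


if __name__ == "__main__":
    matrix = build_pvt_matrix()
    print(f"{len(matrix)} test conditions")
    for condition in matrix[:3]:
        print(condition)
```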



## **Example of process characterization analysis:**



**eSilicon** "Strategies to Prevent IC Failures in Volume Production" 041806\_rev 4.0

By completing the characterization step, device and fab process corner factors are evaluated in relation to voltage and temperature in both the application environment and the ATE test environment. If this step is omitted from a production launch, process, voltage and temperature (PVT) sensitivities may be missed, which will impact the production ramp. Without this characterization, yield and performance may not be optimally centered for the production ramp, resulting in yield loss or unstable yield. At this stage the ATE test limits are fully understood and the test program limits are finalized.
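One common way to check whether a parameter is centered within its test limits is the process capability index Cpk. The paper does not name a specific metric, so the sketch below is only an assumed example. The measurements, corner names and limits are hypothetical.

```python
from statistics import mean, stdev


def cpk(samples, lower_limit, upper_limit):
    """Process capability index: distance from the mean to the nearest
    limit, expressed in units of three sample standard deviations."""
    mu = mean(samples)
    sigma = stdev(samples)
    if sigma == 0:
        raise ValueError("samples have zero spread; Cpk is undefined")
    return min(upper_limit - mu, mu - lower_limit) / (3 * sigma)


if __name__ == "__main__":
    # Hypothetical propagation delay readings (ns) per process corner.
    readings_by_corner = {
        "slow": [5.8, 5.9, 6.1, 6.0, 5.9],
        "typical": [5.0, 5.1, 4.9, 5.0, 5.2],
        "fast": [4.2, 4.1, 4.3, 4.2, 4.0],
    }
    lower, upper = 3.5, 6.5  # hypothetical ATE test limits (ns)
    for corner, readings in readings_by_corner.items():
        print(f"{corner:8s} Cpk = {cpk(readings, lower, upper):.2f}")
```

A corner with a noticeably lower Cpk than the others suggests the distribution sits close to one limit, which is the kind of off-center condition that can show up as yield loss during the ramp.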

**Customer Application Correlation:** Once the ATE test process is defined and its parameters are set, it is also important that a thorough correlation is completed in the application setting. It is recommended that units from all process corners be evaluated and that a true correlation between the ATE test program and the application functions be performed.

When this process is completed, any units that pass the ATE test but fail in the application when exposed to other factors point to areas where ATE test coverage can be improved. Where the ATE environment is not able to duplicate an application condition, it is important to identify and, if possible, implement test parameters that can be linked to the application function.
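The correlation described above can be summarized as a two-by-two comparison of ATE results against application results for the same units. The following is a minimal sketch under assumed data. The unit identifiers and the tuple layout are hypothetical.

```python
from collections import Counter


def correlate(results):
    """Compare ATE and application outcomes for the same units.

    results: iterable of (unit_id, ate_pass, app_pass) tuples.
    Returns (counts, escapes, overkill) where escapes pass ATE but fail in
    the application, and overkill fail ATE but work in the application.
    """
    counts = Counter()
    escapes, overkill = [], []
    for unit_id, ate_pass, app_pass in results:
        counts[(ate_pass, app_pass)] += 1
        if ate_pass and not app_pass:
            escapes.append(unit_id)
        elif app_pass and not ate_pass:
            overkill.append(unit_id)
    return counts, escapes, overkill


if __name__ == "__main__":
    # Hypothetical correlation lot with units drawn from several corners.
    lot = [
        ("slow-01", True, True),
        ("slow-02", True, False),
        ("typ-01", True, True),
        ("fast-01", False, False),
        ("fast-02", False, True),
    ]
    counts, escapes, overkill = correlate(lot)
    print("ATE pass / app pass:", counts[(True, True)])
    print("Test escapes (coverage gap candidates):", escapes)
    print("ATE fail / app pass (possible overkill):", overkill)
```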

**Product/Process Qualification:** To fully understand the reliability of the wafer fabrication, package assembly and device design, reliability tests are performed. These tests are designed to test the robustness of the combination of manufacturing processes, product and design. A series of tests is outlined and performed to emulate the device under stress over a period of time.

Upon completion of these tests, predictions can be made about how well a device will perform as it is aged in a system or application environment. Examples of these tests are High Temperature Operating Life, ESD, Latch Up, Temperature Cycling, Highly Accelerated Stress Test and many other tests depending on the end application market. Conditions are set during these tests to best duplicate the stress a device will endure during the application operation and life expectancy predictions are made based on the ability to meet such testing conditions.

| Stress/Test Description | Test Code | Applicable | JEDEC Test Conditions | Sample Size/Rejects (SS/REJ) | Test Stress/Duration (Cond.) | Test Stress/Duration (Full) | Result |
|-------------------------|-----------|------------|-----------------------|------------------------------|------------------------------|-----------------------------|--------|
| **Electrical Characterization** | EC | $\square$ | -40/0/25/75/85°C | N/A | | | Completed |
| High Temperature Operating Life | HTOL | $\square$ | 125°C, 1.1 × Vdd | 80/0 | 500 hrs | | PASS |
| | | | | | | 1000 hrs | PASS |
| Human Body Model ESD | HBM | $\square$ | ±2000 V | 5/0 | ±2000 V | | PASS |
| Machine Model ESD | MM | | ±300 V | 5/0 | ±300 V | | PASS |
| Device Latch-Up | DLU | $\square$ | ±200 mA | 5/0 | ±200 mA | | PASS |
| Preconditioning | PRE | $\square$ | Level 3, 220°C | 345/0 | 3 cycles | | PASS |
| Moisture Sensitivity Level | MSL | $\boxtimes$ | Level 3, 220°C | 25/0 | 3 cycles | | PASS |
| Unbiased HAST | UHST | $\boxtimes$ | 130°C/85%RH | 80/0 | 100 hours | | PASS |
| Temperature Cycling | TMCL | | -65°C to +150°C | 80/0 | 100 cycles | | PASS |
| | | | | | | 1000 cycles | PASS |
| High Temperature Storage Life | HTSL | | 150°C, unbiased | 80/0 | 500 hrs | | PASS |
| | | | | | | 1000 hrs | PASS |
| Supplier Provided Data (On file) | N/A | $\boxtimes$ | JEDEC/MIL | (file) | YES | YES | PASS |
| Other Public Generic Data (On file) | N/A | $\boxtimes$ | JEDEC/MIL | (file) | YES | YES | PASS |
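As an illustration of how a life prediction can be derived from a result such as the HTOL row above (read here as 80 samples, 0 rejects, 125°C, 1000 hours), the sketch below combines an Arrhenius temperature acceleration factor with a zero-failure upper bound on the failure rate. The activation energy of 0.7 eV, the 55°C use temperature and the 60% confidence level are assumptions, not values from this paper, and the voltage acceleration from the 1.1 × Vdd stress is ignored.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_af(ea_ev: float, t_use_c: float, t_stress_c: float) -> float:
    """Temperature acceleration factor between use and stress conditions."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use_k - 1.0 / t_stress_k))


def fit_upper_bound_zero_failures(sample_size: int, stress_hours: float,
                                  accel_factor: float,
                                  confidence: float = 0.60) -> float:
    """Upper bound on failure rate in FIT (failures per 1e9 device-hours)
    when no failures were observed.

    For zero failures the chi-square quantile with 2 degrees of freedom is
    -2 * ln(1 - confidence), so the bound is -ln(1 - confidence) divided by
    the equivalent device-hours at use conditions.
    """
    equivalent_hours = sample_size * stress_hours * accel_factor
    return -math.log(1.0 - confidence) / equivalent_hours * 1e9


if __name__ == "__main__":
    af = arrhenius_af(ea_ev=0.7, t_use_c=55.0, t_stress_c=125.0)  # assumed Ea, T_use
    fit = fit_upper_bound_zero_failures(sample_size=80, stress_hours=1000.0,
                                        accel_factor=af)
    print(f"Acceleration factor: {af:.1f}")
    print(f"Failure rate upper bound: {fit:.0f} FIT at 60% confidence")
```

A single 80-unit lot gives a fairly loose bound. Tighter estimates generally require more device-hours, which is one reason supplier and generic data are kept on file alongside product-specific results.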

501 Macara Ave, Sunnyvale, CA 94085



The ATE test environment is used before and after the reliability stress tests and environmental exposure extremes to ensure that device performance is intact and the device remains functional. It is important at this phase of pre-production that the ATE test program be "Production Ready" to prevent discrepancies at production release.

**Intermittent Failure Conditions:** Understanding an intermittent failure requires the necessary and sufficient information to establish a strong relationship between the failure characteristics and the application environment. In most cases IC devices are tested in an ATE environment that, where possible, duplicates the worst case of the environment in which they are expected to operate.

When a device displays a failure that is not catastrophic and has some intermittency, it can be very difficult to reproduce and understand. There are preventative measures and techniques used at the pre-production phase that assist in the correlation between the ATE test environment and the system application.

This necessary and sufficient information could include test data, IV curves, shmoo plots, parametric data logs, environmental history, etc., and could therefore be either electrical or physical in nature. The scope of application may be time-based, lot-based, package-based, design-based, application-setting-based, etc. Exposure to temperature or voltage extremes can commonly induce or duplicate an application failure. An ATE test margin review of the parameter of interest may explain how a device could sometimes pass in the ATE or application environment and fail at other times.
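A shmoo plot sweeps two conditions, such as supply voltage and clock frequency, and records pass or fail at each point, which makes thin margins near the operating point visible. The sketch below renders a text shmoo for a toy device model. The pass/fail rule, voltage steps and frequency steps are all hypothetical and stand in for real ATE measurements.

```python
def shmoo(test_fn, voltages, frequencies):
    """Return one text row per voltage (highest first); 'P' marks a pass
    and '.' marks a fail at each frequency step."""
    rows = []
    for vdd in sorted(voltages, reverse=True):
        cells = "".join("P" if test_fn(vdd, freq) else "." for freq in frequencies)
        rows.append(f"{vdd:5.2f} V | {cells}")
    return rows


def toy_device(vdd: float, freq_mhz: float) -> bool:
    """Hypothetical device: the maximum passing frequency rises linearly
    with supply voltage. A real shmoo would call the ATE test instead."""
    return freq_mhz <= 400.0 * vdd - 200.0


if __name__ == "__main__":
    voltages = [0.90, 0.95, 1.00, 1.05, 1.10]
    frequencies = [150, 170, 190, 210, 230, 250]  # MHz, left to right
    for row in shmoo(toy_device, voltages, frequencies):
        print(row)
```

If the specified operating point sits on the edge of the passing region, small shifts in voltage or temperature can move a unit between pass and fail, which is consistent with the intermittent behavior described above.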

If a device exhibits an intermittent failure mode, it is unlikely that physical damage will be seen on the device using traditional FA techniques. It is also possible that a parameter in the ATE test program does not have adequate test coverage to fully duplicate an application condition. In these cases, correlation between the ATE test and the application again becomes very important.

**Valid or Hard Device Failures:** If at any time there is a failure that causes the device not to operate at all in the application or in the ATE test environment, it is important to understand the mechanisms that cause the failure. This method of analysis by inference may be applied to customer returns, failures from quality conformance testing, reliability failures, qualification failures, and devices from engineering experiments and yield issues.

Using Failure Analysis does not necessarily imply that the root cause is known or understood, nor does it necessarily imply that a corrective action will or should take place, but FA can be used in conjunction with other programs that address these needs when a hard failure is seen. Many physically destructive tests are available at laboratories that can identify where there is breakdown in the silicon or circuit that would cause a failure.

Examples of Failure Analysis methods are:

- Basic failure mode verification: Electrical Test
- Non-destructive inspection
  - External Visual Inspection (EVI)
  - Real-Time X-Ray (RTX)
  - Scanning Acoustic Microscopy/Tomography (SAM/SAT)
- Mechanical/chemical decapsulation/delid

- Internal visual inspection (IVI)
- Basic defect visual localization
- Basic defect visual characterization (if exposed)
- Electrical overstress/electrostatic discharge analysis (EOS/ESD)
- Root cause analysis (RCA)
  - Final FA report including corrective action (CA) initiation

## **Examples of failure analysis performed:**

Figure #1: BGA interconnect with a printed circuit board failure (labeled feature: solder pad)

Photos 2 & 3: De-cap and SEM analysis of wire bond and solder ball failures



More extensive failure analysis techniques are used if the root cause cannot be identified with these methods, but most failures can be found without resorting to more expensive and time-consuming techniques.

**Summary:** eSilicon partners with customers to ensure a mutually successful production ramp with minimal risk of yield loss, application failures or field failures. In order to fully understand the process, product and application variables, all of the aforementioned process steps are required. eSilicon strongly believes that all of the production release steps are essential to fully understand the design or process limitations, risks, and product performance aspects; otherwise the risk of failure is high.

Releasing a device to volume production is an important step in gaining market share and credibility with the customer base. Introducing a product to the field without any performance, delivery or reliability problems greatly increases the success of the program launch. The strategies outlined in this paper, if followed, will lead to a successful launch to production with minimal risk and maximum product performance.

April 2006 Publication

Author: Donna Black, Sr. Director, Corporate Quality & Reliability, eSilicon Corporation

eSilicon Corporation Confidential