How TSMC Keeps Getting Better

Captions
In 2021, TSMC fabbed about 12 to 13 million 300-millimeter wafers. Assuming each die is about 100 square millimeters large, that is about 8 billion chips. 8 billion. Semiconductor manufacturing is the most sophisticated, unforgiving, high-volume production technology that has ever been done successfully. You need a lot of practice, and the more chips that TSMC makes, the better it gets at it. In this video, I want to talk about how a fab like TSMC improves its operations: how they think about yield, speed up throughput, and in general get better at wafer fabrication.

But first, let me thank our sponsors at the Asianometry newsletter. I write a lot of exclusive content for it. This includes profiles on Taiwan's startup space and exclusive videos like this one here. The link is in the video description below. I try to put one out every week, maybe two. Alright, back to the show.

Wafer manufacturing is hard. To break it down, there are mask layers, where a mask is used to reproduce a pattern onto a substrate. Each mask layer requires about 15 to 25 process steps. These steps include oxidation, photolithography, etching, ion implantation, washing, and so on. A simple 180-nanometer process chip, which was leading edge some 20 years ago, has about 20 mask layers. That means a single wafer can take up to 600 steps. Remember, this is a chip from like 20 years ago. Today's leading edge processes, which TSMC calls N14, N16, and N10, can have up to 60, 70, and even 80 mask layers.

You can't pass bad wafers to your customers. Bad dies that cannot be repurposed somehow have to be thrown away, with their cost distributed amongst the good dies. So if your yield is 50 percent, then the impact is that the cost per good unit is twice as high as it would be at full yield.

Broadly speaking, yield is defined as the fraction of total input transformed into a shippable output. But that is composed of several components. For instance, line yield is the fraction of wafers that make it to the final wafer electrical test. Die yield is the fraction of dies on
good wafers that make it through to the assembly and final testing stage. And final test yield is what makes it through that assembly and final testing stage. Independent fabs have the most direct control over the first two, line and die yield.

Maintaining good final yields across some 500 to 600 steps is super hard. Defects need to be tested and detected as fast as possible so that technicians can quickly respond to them and fix the line before more wafers have to be scrapped. On a process with 400 steps, even if each step has an average individual yield of 98 percent, the die yield of your wafer will be just 0.03 percent, which means 3 good dies out of 10,000. You need 95 percent.

One TSMC customer, a Chinese company called Canaan, says that their yields using TSMC's 16-nanometer process node were about 95 to 96 percent over the past few years. TSMC 16-nanometer has some 60 mask layers, which means about 800 to 1,200 steps. So imagine doing something that has to succeed 99 point whatever percent of the time, 800 to 1,200 times, at high speed. It gives you a sense of the challenge these fabs are grappling with.

So what causes yields to decline? It's a billion dollar question. The reality is that almost anything can affect yields in a negative way. Here I will go through just a few of my favorites.

There is the concept of killer particles. These are anything large enough to cause defects in semiconductor production, in the same way dust on a negative causes a visual defect. They can come from anywhere: defects in the starting materials or ultra pure water, for instance. Certain processes like sputtering and plasma etch can generate killer particles too, impacting the overall yield. As process nodes and feature sizes advance, the bar for what defines a killer particle ratchets lower. Recent discussions in the ultra pure water industry are implying a killer particle size under 10 nanometers, which is roughly the diameter of DNA. These particles can attract static charges, causing additional direct damage to the wafer.
These electrostatic discharges are just 1 to 10 nanoseconds long but can cause melting damage. If the static charges reach and damage the mask, which as I mentioned contains the chip design pattern, then every silicon wafer produced thereafter is damaged too. A very big deal.

Lastly, to close, my absolute favorite yield flaw: stochastic effects in EUV lithography. EUV has made possible feature sizes so small that foundries have to start considering the probabilistic nature of atomic scale interactions. At this scale, there is only a probability that a photon will be absorbed at a certain location. Microbridges (lines where spaces should be) and broken lines (spaces where lines should be) are common printing errors largely due to such randomness. These have been observed before, particularly in immersion lithography, but were not a major concern back when the feature sizes were larger. That is no longer the case when talking about sub-10-nanometer process nodes.

Modern foundries use computational lithography techniques to compensate and print better features. This software now has to account for these probabilistic interactions. For instance, older versions were programmed to assume discrete edges. Now they have to be reprogrammed to replace those edges with gradual approximations on a continuum. Very cool stuff that I should do a video about in the future.

No foundry starts off a new process node at 95 percent. They have to get there through a process called yield learning. This is where the foundry goes through and eliminates individual sources of faults one by one. If you were to plot time against yield, you get a basic curve. This is how you want it to ideally go. At the start, your yields are at about 20 to 50 percent. You struggle to figure out what is going on for a while until you figure it out. Then you rapidly get to an acceptably high number, which plateaus thereafter. The time it takes to get to that high plateau number is called the time to yield. The shorter it is, the better, and the speed is directly
correlated to foundry profitability. Here is why. The price of a product gradually decreases after it is released to the market. So if you can ramp up the yield faster than the price of the product declines, that is when you earn the highest profits. Per an older research paper, TSMC seeks to speed up its learning rate for advanced nodes by about 18 percent a year, and for its mature nodes by 4 to 12 percent.

As semiconductors have gotten more complicated, the process of yield learning has gotten harder. Early on, most fabs relied on experienced engineers eyeballing the data, doing experiments, and adjusting the recipes based on that eyeball analysis. For instance, Motorola Semiconductors once had a situation where wafer yields on a mature process would subtly degrade by five percentage points, a significant deterioration, for no apparent reason before mysteriously recovering. These flawed wafers could only be detected in final testing due to the sensitivities required. It took five years and 30 experiments to figure out what was going wrong. Various teams exactly reproduced the wafer fab line, brought in scanning electron microscopes, and more. It took sheer luck in a four-factor experiment to discover what eventually led to a faulty emitter pipe.

Yield learning as it is today has become a far more data- and automation-heavy effort. The whole fabrication process generates and collects millions of pieces of data each day. Finding patterns in this data, with its hundreds of attributes, has scaled far beyond traditional statistical methods. Thus engineers have been throwing machine learning models on top of this data to help find new improvements.

For instance, let's take the photolithography process. It has many input parameters that are measured. Examples might include the time spent dunking the wafer into a specific chemical agent or how long you spend mixing some other chemical prior to its usage. This process inevitably results in faults like hot spots and overlay errors, but the correlation between them and the inputs
is non-linear and difficult to uncover without a more sophisticated tool. This is where the machine learning model is employed to help refine the photolithography recipe and achieve optimal results. Such big data methods would have helped pinpoint the aforementioned Motorola flaw in a few weeks rather than the five-plus years it took using traditional experiments. Thus TSMC employs hundreds of IT and AI engineers to run such data analyses.

There is a great deal of physical automation inside the fab. The less the material is handled by a human, the less likely killer particles are to get in. We get a lot of photographs of people in clean rooms, but nowadays people only go inside to do maintenance. Robots have replaced all the people, largely due to cleanliness and weight issues.

Over the years, wafers have become larger. The current standard wafer is 300 millimeters wide, roughly a foot. They are stored in things called a front opening unified pod, which I will refer to as FOUPs. Fully loaded to its 25-wafer capacity, a FOUP weighs about 19.8 pounds. Since it is physically difficult for an average semiconductor worker to move a fully loaded FOUP, fabs have to be configured to have robots do it. The robot takes the FOUP to a machine bay, usually via an overhead railway system. The machine opens up the FOUP and processes the wafers inside. Then it loads up the FOUP and closes it. The robot then takes it to the next machine.

Most of these machines are supplied from, where else, Japan. Muratec and Daifuku, which sounds like a mochi but isn't, are the leaders in machine tool technology and automation for your clean room. Studies have shown that the most successful fabs have the most useful automation systems inside their clean rooms. Having the flexibility to use robots to adjust the manufacturing process on the fly is critical in maximizing speed and throughput.

Fabs measure how fast you can work through a single mask layer using a metric called cycle time. TSMC is obsessed with
it. I briefly talked about this in a prior video, but they care for it more than the actual yield. This is because simply accelerating the cycle time allows for more work-in-progress turns, more data, a faster learning rate, and thus faster time to yield.

In the late 1990s, TSMC pioneered a dispatching system to maximize its equipment utilization rate and speed up cycle time. The system considers technical and business factors to decide which wafers to send to a station and when. A simple technical example involves a furnace and a wet bench (not a real bench, but an automated tool for wet chemical processing). The wet bench can batch four FOUPs at once. So when the furnace is about to finish, it coordinates the robots and wet bench to prepare for arrival.

Trying to figure out the "when" part is challenging. TSMC uses a remarkable theorem in queueing theory called Little's law. It is expressed as N equals lambda times T, where N, the long-term average inventory level of a queuing system, is the product of lambda, the arrival rate of the wafers, and T, the cycle time. The company is able to use a derivation of this formula to calculate the remaining cycle time and optimize or adjust its automated dispatch times accordingly.

TSMC and its customers are in a race to deliver good product to the market before the profit window closes. Countless fabless firms make it clear that their number one requirement isn't price but rather lead times. TSMC is able to leverage its customers' huge volume to generate more data and optimizations than any of its competitors. Each new customer who joins the roster adds to TSMC's edge in data volume and yield learning while simultaneously sharing in its operational benefits. Scale matters.

TSMC discusses their time to yield progress for certain leading edge nodes as part of their annual technology symposiums. The last one we got to see was N5, and that was looking quite good, likely generating a great deal of profit for the foundry. Next up is N3, dubbed the three-nanometer
process. Based on recent rumors, this one is having some issues. It is kind of fascinating to think about the countless hundreds of little experiments, adjustments, and improvements the company is running right now, as fast as it can, in order to get to an acceptable plateau.

Alright everyone, that's it for tonight. Thanks for watching. Subscribe to the channel, sign up for the newsletter, and I'll see you guys next time.
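Editor's note: the arithmetic quoted in the captions (the 400-step compound yield, the cost penalty at 50 percent yield, and Little's law) can be checked with a short Python sketch. This is only an illustration under stated assumptions, not anything TSMC is known to run. All function names are made up for this note, the 1,000-step count is picked from within the 800 to 1,200 range mentioned in the video, and the arrival rate and cycle time in the Little's law example are hypothetical numbers.

```python
def compound_yield(step_yield: float, steps: int) -> float:
    """Overall yield when every step must succeed independently."""
    return step_yield ** steps


def required_step_yield(target_yield: float, steps: int) -> float:
    """Per-step yield needed to reach a target overall yield."""
    return target_yield ** (1.0 / steps)


def cost_per_good_die(cost_per_die: float, yield_fraction: float) -> float:
    """Scrapped dies spread their cost over the good ones."""
    return cost_per_die / yield_fraction


def littles_law_wip(arrival_rate: float, cycle_time: float) -> float:
    """Little's law: N = lambda * T (average inventory in the system)."""
    return arrival_rate * cycle_time


if __name__ == "__main__":
    # 400 steps at 98 percent each, as in the video: roughly 0.03 percent.
    print(f"compound yield: {compound_yield(0.98, 400):.4%}")

    # Assumed 1,000 steps; per-step yield needed for 95 percent overall.
    print(f"needed per step: {required_step_yield(0.95, 1000):.5%}")

    # 50 percent yield doubles the cost per good die.
    print(f"cost multiplier: {cost_per_good_die(1.0, 0.5):.1f}x")

    # Hypothetical: 100 wafers/hour arriving, 48-hour cycle time.
    print(f"average WIP: {littles_law_wip(100.0, 48.0):.0f} wafers")
```

The compound-yield function assumes step failures are independent, which is a simplification: real defects cluster and some steps are far riskier than others.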
Info
Channel: Asianometry
Views: 123,523
Id: -DCZsT2plw8
Length: 14min 3sec (843 seconds)
Published: Sun May 29 2022