Six Sigma Full Course | Six Sigma Explained | Six Sigma Green Belt Training | Simplilearn

Captions
Hello guys, welcome to this Six Sigma full course video by Simplilearn. In this video we'll cover everything you need to know about Six Sigma and the major concepts of Lean Six Sigma. We will start with an animated video covering each of these concepts and then go into more detail on the DMAIC process. We will also cover the different Six Sigma belts and much more. So without further ado, let's jump right into it. Before we do that, don't forget to subscribe to our YouTube channel and hit that bell icon so that you never miss an update from Simplilearn. Now, over to our training experts.

Imagine you've been tasked with a really important project at work. The company you're working for produces luxury cars. The production numbers are going down, and fewer cars are getting manufactured each day. There also seems to be an issue with the quality of the windshield wipers that go on these cars. The question you are faced with: is there a way for the company to stop the stall in production and increase the production per day from 1,000 to 2,000 cars? Also, is there a way to find out what's causing the drop in wiper quality? There is: Six Sigma. Six Sigma gives you the tools and techniques to determine what's making the manufacturing process slow down, how you can eliminate the delays, improve the process, and fix further issues along the way. The concept was introduced in the 1980s by Bill Smith while working for Motorola, and since then Six Sigma has seen worldwide adoption. Six Sigma aims to reduce the time, defects, and variability experienced by processes in an organization. Thanks to Six Sigma, you can produce a defect-free product 99.99966 percent of the time, allowing only 3.4 defects per 1 million opportunities. Six Sigma also increases customer loyalty towards the brand and improves employee morale, leading to higher productivity.

Six Sigma has two major methodologies, DMAIC and DMADV. Let's look at the first methodology. DMAIC is an acronym for Define, Measure, Analyze, Improve, and Control. Let's have a look at each of these stages individually and how they relate to your earlier problem. In the Define phase, you determine what issues you're facing, what your opportunities for improvement are, and what the customer requires of you. Here you look at the process as a whole and determine the issues with the manufacturing process, in this case finding out why the cars had varying windshield wiper quality and how to optimize the current process to manufacture more cars. In the Measure phase, you determine how the process is performing currently, in its unaltered state. You determine the current number of cars that are manufactured in a day. In the current scenario, 1,000 cars are manufactured in a day, and each of these cars is outfitted with a pair of windshield wipers by one of the 30 machines used. Some of the metrics measured are: how many cars are produced in a day, the time taken to assemble a car, how many windshield wipers were attached in a day, the time it takes to do so, defects detected from each machine on assembly completion, and so on. Following this, in the Analyze phase, you determine what caused the defect or variation. On analyzing previous data, you find out that one of the machines that installed the windshield wipers was not performing as well as it was supposed to. Production was also taking longer because the car chassis was being moved across the different locations more slowly, as cranes had to individually pick and drop the frame; this was because the wheels were attached to the car only in the last stage. Next, in
the improve phase you make changes to the manufacturing process and ensure the defects are addressed you replaced the faulty machines that installed the windshield wiper with another one you also find a way to save time by attaching wheels on the frame in the initial stages of the manufacturing process unlike how it was done earlier now the car can be moved across the assembly area faster and finally in the control phase you make regular adjustments to control new processes and future performance based on the changes made the company was able to reduce production time and manufacture about 2 000 cars a day with a higher quality of output dmaic is one of the most commonly used methodologies in the world it focuses on improving the existing products of the organization the second methodology is dmadv which is short for define measure analyze design and verify it is used when the company has to create a new product or service from scratch it is also called dfss or design for six sigma let's take the scenario where the company decides to build a new model a sports car in the define phase you define the requirements of the customer based on inputs from customers historical data industry research you determine what you need to ensure your car becomes a success the data collected indicates customers are drawn to cars which can achieve more than 150 miles per hour customers are also more inclined towards cars which have v6 engines and an aerodynamic frame then in the measure phase you use the customer's requirements to create a specification this specification helps define the product in a measurable method so that data can be collected and compared with specific requirements some of the major specifications that you focus on are the top speed engine type and type of frame in the analyze phase you analyze the product to determine whether there are better ways to achieve the desired results areas of improvement are determined and tested based on the analysis of the prototype created in this phase you find that the product satisfies just about all of the customer requirements except the top speed so research begins on an aluminum alloy that could possibly meet the speed requirements of the customer following this the design phase based on the learnings from the analysis phase the new process or product is designed revisions are made to the model and the car is manufactured with the new material the analysis phase is repeated based on the new design you also bring a focus group and see how they receive it based on their feedback further changes are made and finally in the verify phase you check whether the end result meets or exceeds customer requirements once you launch your brand new sports card you collect customer feedback and incorporate it into future designs and guess what your customers are loving the new design and that is dmadv for you six sigma has also found success in a number of different industries the petrochemical healthcare banking government and software are some of the industries that have utilized the concepts of six sigma to achieve their business goals another commonly used methodology adopted by companies around the world is lean lean is a methodology that aims to remove any part of the process that does not bring value to the customer it means doing more with less while doing it better the philosophy behind lean comes from the japanese manufacturing industry by bob hartman who at the time was part of toyota since then across the world services and manufacturing organizations 
have incorporated lean within their businesses but what if you could have the best of both worlds a combination of both six sigma and length that's lean six sigma before we get into lean six sigma we've got a quiz for you what methodology is used by companies when they want to release a new product created from scratch a design measure analyze define verify b define measure analyze design verify c demand measure analyze improve control d define measure analyze improve control leave your answers in the comments section below for a chance to be one of three people to win an amazon voucher now let's get back to lean six sigma lean six sigma is a methodology that focuses on eliminating problems removing inefficiencies and waste while improving the working conditions to ensure the customer's needs are better satisfied it combines the tools methods and principles of lean and six sigma we'll have another video detailing the process of lean six sigma very soon imagine you're the manager of a supermarket chain you've noticed that two things need your immediate attention the first issue is how to handle the different kinds of waste that you encounter at your supermarket the next one requires you to address the supply chain issues at the supermarket which are causing delays to the morning delivery of milk leading to customer dissatisfaction and attrition these problems can be solved by incorporating two of the most popular quality management methodologies in the world lean and six sigma one famous for its ability to handle waste and another known for process improvement but what if there was a methodology that combined the concepts of both six sigma and lean one that could solve all your issues well there is lean six sigma before we dive into lean six sigma let's take a closer look at its parent methodologies first off lean is a methodology that focuses on providing value to the customer eliminating waste continuous improvement reducing cycle time lean and six sigma both aim to handle waste but what is this waste waste is any step or action in the process that a user does not gain any value from in short things that users wouldn't want to pay for why would a consumer want to pay extra for the additional truck that was required to deliver milk to the supermarket just because the other one broke down this waste can be divided into eight categories let's have a look at each of them one transportation this waste refers to the excess movement of people tools inventory equipment and other components of a process than it is required two inventory this waste occurs due to having more products and materials than required this can cause damage and effects to products or materials greater time for completion inefficient allocation of capital and so on three motion this refers to the time and effort wasted due to unnecessary movement of people equipment or machinery this could be sitting through inventory double data entry and so on 4. waiting this can be time wasted waiting on information instructions materials or equipment 5. over-production this is the waste created due to producing more products than required 6. it refers to more work more components or more steps in a product or service than required seven defects this is the waste originating from a product or service that fails to meet customer expectations 8. 
skills this waste refers to the waste of human potential under utilizing capabilities and delegating tasks to people with inadequate training for years now many systems have emerged that use the lean methodology to identify and handle the different kinds of waste some of the more popular and effective ones are jit or just in time 5s and kanban the jit methodology focuses on reducing the amount of time the production system takes to provide an output and the response time from suppliers to customers 5s is another methodology that focuses on cleanliness in organization while improving profits and efficiency kanban is also another popular methodology to achieve lean it is a visual method to manage tasks and workflows kanban enables users visualize the workflow to identify issues in the process and fix them these methodologies help in optimizing the waste production and are often used together to maximize results so that's the first problem solved now let's have a look at how you can improve the supermarket supply chain efficiency for that let's have a look at the other part of lean six sigma six sigma six sigma is a set of tools and techniques that are used for process improvement and removing defects let's see how six sigma makes that possible six sigma has two major methodologies dmaic and dmadv you can learn more about these two methodologies by checking out our six sigma in nine minutes video by clicking on the top right corner let's have a closer look at dmaic since lean six sigma uses the dmaic methodology of six sigma dmaic is an acronym for define measure analyze improve control it is used to improve existing products and processes so that it can meet the customer's requirements in the define phase you determine what the goals of the project are in this case you want to reduce the amount of time taken to deliver milk from the warehouse to the supermarket so that it is stocked on the supermarket shelves before 8 30 in the morning in the measure phase you measure the performance of the current unaltered process the milk truck leaves at 7 30 a.m in the morning and can take one of three routes a b and c route a is currently the preferred one as it takes only 60 minutes to reach the supermarket compared to the routes b and c which takes 70 and 80 minutes respectively in the analyze phase you find out why the defects exist since routes b and c were school bus routes by reducing the starting time by one hour at 6 30 instead of 7 30 meant avoiding the traffic routes b and c now take 40 to 45 minutes to reach the supermarket route a still takes the milk truck one hour to get to the supermarket even when the truck leaves at 6 30 a.m in the improved phase performance can be improved by addressing and eliminating the root causes now that you've realized that advancing the milk pickup by an hour and changing the route to route b can save time you change the process accordingly providing your workers with ample time to stock the milk into the shelves before the morning rush and finally in the control phase you make regular adjustments to control new processes and future performance you continue to monitor the delivery times and try out alternate routes to continually improve the process and ensure even faster delivery this process change led to reduced man hours and cost enhanced sales and customer retention the lean six sigma methodology offers many such benefits to businesses let's take a look at some of them one increase in profits two standardized and simplified process three reduced errors 
four employee development five value to customers and that is lean six sigma for you six sigma is a set of tools and techniques that have helped several companies around the world achieve business success hi guys i'm raut from simply learn and let's get started with our introduction to six sigma now let's understand this better with an example here let's talk about how things were before six sigma was introduced here jenny and james are having a conversation with each other jenny is james's manager and she's not happy at all she says james is in a lot of trouble this is because she found out that the customers weren't happy with the organization's service and the operational costs were way too high and as manager james had to make sure that this did not happen now let's have a look at the same scenario in present day here we have jenny congratulating james she's very impressed with his work but james says it's all thanks to six sigma methodology so jenny asks she wants to know more about six sigma so to understand six sigma here's what you need to know firstly we'll have to understand what is six sigma what its advantages are some of its methodologies what are the different roles in a six sigma team what is lean what is a lean process and what is lean six sigma so now let's get started with understanding what exactly is six sigma the six sigma methodology makes sure to find as well as eliminate any sort of defect or variation that could be affecting your product service or process now this methodology is statistics based is data driven and focused on continuous improvement now this means that there's no end goal in the horizon there's always another goal to reach there are three core ideologies behind six sigma the first one states that for any business to be successful there's continuous efforts that are required so that you can achieve stable as well as predictable process results the second ideology states that in any business or manufacturing process there are certain characteristics that can be defined measured analyzed and controlled the final ideology says that along with the rest of the organization the top level management plays a very important role to making sure that there's sustained quality now let's talk about the advantages of six sigma six sigma can help produce a road map or a path through which you can easily find and reduce any sort of organizational risk and reduce the operational costs another advantage is that it helps improve the efficiency of the process and making sure that it works in a timely manner it decreases defects improves the overall tracking and monitoring process and ensures that the products are aligned with the company's policies it is also reported that it greatly helps improve customer as well as vendor satisfaction it helps improve the cash flow and ensures that the products are complying with the regulations of the organization now let me tell you about the process of six sigma now six sigma projects are of basically two methodologies the dma ic and the dma dv now let's talk about dmaic in detail that's short for define measure analyze improve and control this is one of the most commonly used methodologies in the world this is commonly used by companies when they have to fix or improve an already existing product or process that does not meet the company's standards now let's have a look at the process first phase is the define phase in this phase you define the problem that the customers are facing you find out where you can improve and you 
clearly understand what the customers require of you the second phase is the measure phase now in this phase you actually identify how well the process is doing in its current unaltered state in the analyze phase you process the data that you get from the measure phase and determine what exactly is the cause of the delay or variation in the improve phase you start by making small changes to the business process and make sure that the problem you identified earlier is being taken care of and finally in the control phase you control the new process so that it doesn't go wrong and use the same knowledge for future processes now let's have a look at dmadv this is short for define measure analyze design and verify now this is also commonly known as dfss or the design for six sigma now this is commonly used by companies around the world when they have a new product that needs to be created all the way from scratch in the first phase which is the define phase you define what the goal of the project is and what the customers require of you in the measure phase you measure as well as determine what the customer needs and how they respond to your products in the analyze phase you perform analyses to determine how you can improve your product or service so they can better serve your customers in the design phase you set up process details and make optimizations to the design to make sure your customer is satisfied and finally in the verify phase you check how well the design is working out and how well it meets the customer's needs now before we go on let's talk about how six sigma was used in reference to the earlier example the situation that james was facing a survey conducted by the organization james was working for indicated that the customers weren't very happy with the organization so they decided to fix that with the help of six sigma so they decided that the dmaic methodology would be best suited to solve their problem so let's have a look at what they did firstly in the define phase they used a tool called the voice of the customer this tool represented the needs as well as requirements of the customer this showed that the customers expected prompt delivery the correct product selection and a knowledgeable distribution team from the company and now on to the measure phase the company wanted to know why the customers didn't like them so they performed some data collection from there they found out that they took 56 percent longer than other companies to deliver their product so they decided to reduce the amount of time it takes between order entry and the delivery of the product and now in the analyze phase here they knew what the issue was but they wanted to know what exactly made their products delivery so slow why were the customers receiving the product so late then they performed some analysis their analysis showed the possible causes it could have been inaccurate sales plans issues with their safety stock issues with their vendors delivery performance and falling behind on the manufacturing schedule further analysis also indicated that most of their sales almost 80 percent came from 30 of their products the issue was that they didn't have enough safety stock to satisfy the customers who required that 30 of products and now on to the improved phase so now that they knew what was causing their problem they wanted to solve it they began to have monthly reviews and tried to make sure that their in-demand products stayed in demand another thing that they wanted to focus on was to make sure 
that they could order and provide the customer with the products that they wanted and finally onto the control phase they began to set up plans so that they could monitor the sales of that 30 of products that were selling the most each year they would review how well a product was selling and replace it if it had fallen out of favor now let me tell you what a six sigma team consists of let's talk about the roles in a six sigma team first up is level seven now these are individuals who are at the novice level now these individuals don't know in great detail about what the project is but they have a basic understanding of the principles and the methodology behind the program now they usually support with smaller projects and with smaller issues but these individuals found the foundation for the people who decide where the program is going and now we're at level six now these are individuals who have a yellow belt certification now they are core members of the six sigma team who have an understanding of how the basic metrics work and how they can perform some sort of improvement now they have their own areas of expertise and they're required to determine certain processes that need to improve at the same time they're also in charge of smaller improvement projects now level five these are people who have a green belt certification now these individuals are usually part-time professionals who have a number of different duties to fulfill they focus on smaller six sigma projects they are usually involved with gathering data performing some sort of experiment and analyzing information they also assist with black belt projects and now we're at level four these are individuals who have a black belt certification they're usually team leaders of a six sigma project they complete four to six projects a year and are experts in the principles methodologies and lean concepts thanks to their understanding of statistical experimental design they can also understand the hidden reasons behind why a particular product failed and now we're at level three these are individuals who have a master black belt certification now these are individuals who are experts when it comes to resources the practices and the methodologies that are employed in six sigma their main emphasis is to coach train and certify black belts they also are involved with other six sigma leaders to ensure a company's goals are met now level two these individuals are called champions so they work really closely with the executives and usually have a role like a senior or a middle executive level role they also have a clear understanding of what exactly is the company's vision and mission they also understand metrics so that they can set up a six sigma project that lines up with the company's goals they're responsible for removing any sort of roadblock that could hamper the success of a project and finally we're at level one these are the executives now these individuals represent the highest level when it comes to a six sigma team now they have training as well as experience through which they can set up six sigma projects that clearly line up with the company's goals their main emphasis is to ensure that the project is able to add value to the organization and at the end of the day is successful now this is when jenny interjects she wants to know about lean john tells her that lean just like six sigma is another methodology so what exactly is lean now lean is a methodology that has a very important ideology to make sure that there's continuous 
optimization of the processes and there's an elimination of waste so what's waste so waste is basically any part of the process that the customer doesn't want to pay for it is a process that does not add any value to the customer now coming back to lean here are some of its characteristics whenever decisions are being made in a lean team the main emphasis is to understand how it exactly adds value to the customer every member in a lean team has a clear understanding of what exactly are the goals of the organization it also encourages employees to push for further success even if the organization is in a good place or is already doing well there's also an emphasis on cross-functional collaboration and communication lean focuses on answering the difficult question or the complex ones rather than employing short-term fixes and with lean you can easily prepare for issues that can come up in the future or improvise in unexpected circumstances so let's talk about how lean and six sigma are different from one another the lean methodology aims to reduce waste it does so by analyzing the workflow it also emphasizes on minimizing resource usage and improving customer value now let's talk about six sigma the aim of six sigma is to provide near-perfect results it wants to reduce costs and improve customer satisfaction basically both of them are moving towards the same goal to reduce the amount of waste and to create efficient processes now let's talk about the process of lean now there are five different steps let's start with the first one identifying value you need to identify value by determining what exactly is the problem you're trying to solve for the customer the second step is to map your value stream you need to map the workflow of your company you need to focus on the different actions and the people that are involved with the process you need to be able to identify which parts of the process are able to add value and the ones that don't the third step is to create a flow you need to break up your work into smaller silos and visualize the workflow so that you can easily identify problems that might show up later the next step is to establish pull you need to set up a system through which products are created only when there is a demand or a requirement for it through this you can optimize resource capacity and finally we're at the fifth step which is continuous improvement you need to ensure all your employees at all levels are involved in the continuous improvement of the process so what exactly is lean six sigma what if you could combine the best of both worlds the combination of six sigma and lean methodology led to the creation of lean six sigma lean six sigma is a methodology that aims to solve problems removes any form of waste or inefficiency and improving the working conditions of employees to make sure that they can serve the customers better now this is a combination of the tools methods and principles that are employed in lean and six sigma let's talk about some of its advantages it aims to provide customers with a better experience by streamlining the process with efficient power flows it aims to drive higher results it can reduce cost remove waste and prevent effects it can help the organization handle day-to-day problems the decreased lead times help increase capacity and profitability and finally it helps with people development and improving the morale of the organization this lesson provides an overview of the certified six sigma greenbelt or cssgb course a process is a 
series of steps designed to produce a product and/or service according to the requirements of the customer. A process mainly consists of four parts: input, process steps, output, and feedback. Input is something put into a process, or expended in its operation, to achieve an output or a result, for example man, material, machine, and management. Output is the final product delivered to an internal or external customer, for example a product or a service. It is important to understand that if the output of a process is an input for another process, the latter process is the internal customer. Each input can be classified as controllable (represented as C), non-controllable (represented as NC), noise (represented as N), or critical (represented as X). The most important aspect of the process is the feedback. As can be inferred from the image, any change in the inputs causes a change in the output; therefore Y = f(X). Feedback helps in process control because it suggests changes to the inputs. Let us learn about the process of Six Sigma in the next screen.

Let us understand how Six Sigma works in this screen. Six Sigma is successful because of the following reasons. Six Sigma is a management strategy: it creates an environment where the management supports Six Sigma as a business strategy, and not as a standalone approach or a program to satisfy some public relations need. Six Sigma mainly emphasizes the DMAIC method of problem solving. Focused teams are assigned well-defined projects that directly influence the organization's bottom line, with customer satisfaction and increased quality being byproducts. Six Sigma also requires extensive use of statistical methods. The next screen will focus on some key terms used in Six Sigma.

Let us look at the sigma level chart in this screen. As discussed earlier, Six Sigma quality means 3.4 defects in 1 million opportunities, or a process with a 99.99966 percent yield. The sigma level chart given on the screen shows the values for other sigma levels; please take a look at the values carefully. Let us understand the benefits of Six Sigma in the next screen.
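The chart values themselves are only shown on screen, but the relationship between defect rate and sigma level can be reproduced numerically. Below is a rough Python sketch, not part of the course material, that converts defects per million opportunities (DPMO) to a short-term sigma level using the conventional 1.5-sigma long-term shift; the function names and the SciPy dependency are illustrative choices.

```python
# Sketch: relate defects per million opportunities (DPMO) to a sigma level,
# assuming the conventional 1.5-sigma long-term shift used in Six Sigma tables.
from scipy.stats import norm

def sigma_level(dpmo: float, shift: float = 1.5) -> float:
    """Short-term sigma level for a given DPMO."""
    return norm.isf(dpmo / 1_000_000) + shift

def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """DPMO from observed defects, units inspected, and opportunities per unit."""
    return defects / (units * opportunities_per_unit) * 1_000_000

if __name__ == "__main__":
    print(round(sigma_level(3.4), 2))      # ~6.0 -> 3.4 DPMO, 99.99966% yield
    print(round(sigma_level(66_807), 2))   # ~3.0 -> a typical 3-sigma table value
```

Running the sketch reproduces the two anchor points mentioned in the narration: 3.4 DPMO corresponds to roughly six sigma, while a three-sigma process allows on the order of tens of thousands of defects per million opportunities.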
The organizational benefits of Six Sigma are as follows. A Six Sigma process eliminates the root cause of problems and defects in a process. Sometimes the solution is creating robust products and services that mitigate the impact of a variable input or output on a customer's experience. For example, many electrical utility systems have voltage variability up to, and sometimes exceeding, a 10 percent deviation from the nominal value; thus most electrical products are built to tolerate the variability, drawing more amperage without damage to any components or the unit itself. Using Six Sigma reduces variation in a process and thereby reduces waste in the process. It ensures customer satisfaction and provides process standardization. Rework is substantially reduced, because one gets it right the very first time. Further, Six Sigma addresses the key business requirements. Six Sigma can also be used by organizations to gain advantage and become world leaders in their respective fields. Ultimately, the whole Six Sigma process is there to satisfy customers and achieve organizational goals. In the next screen, let us understand Six Sigma and quality.

Taking a process to Six Sigma level ensures that the quality of the product is maintained. The primary goal of improved quality is increased profits for the organization. In very simple terms, quality is defined as the degree of excellence of a product or a service provided to the customer; it is conformance to customer requirements. If the customer is satisfied with the product or service, then the product or service is of the required quality. Let us look at the history of quality in the next screen.

In the mid-1930s, statistical process control (SPC) was developed by Walter Shewhart and used extensively during World War II to quickly expand the US's industrial capabilities. SPC is the application of statistical techniques to control any process. Walter Shewhart's work on the common cause of variation and the special (assignable) cause of variation has been used proactively in all Six Sigma projects. The approach to quality has varied from time to time. In the 1960s there were quality circles, which originated in Japan; the movement was started by Kaoru Ishikawa. Quality circles were self-improvement groups composed of a small number of employees belonging to a single department, and they brought in improvements with little or no help from the top management. In 1987, ISO 9000 was introduced; ISO stands for the International Organization for Standardization. ISO 9000 is a set of international standards on quality management and quality assurance to help organizations implement quality management systems, and it is still in effect. The Baldrige Award, now known as the Malcolm Baldrige National Quality Award, was established by the US Congress in 1987 to raise awareness of quality management systems as well as to recognize and award US companies that have successfully implemented quality management systems. In 1988, another quality approach was developed, known as benchmarking. In this approach, an organization measures its performance against the best organizations in its field, determines how such performance levels were achieved, and uses the information to improve itself. Then, in the 1990s, there was the balanced scorecard approach. It is a management tool that helps managers at all levels monitor their results in their key areas, so that one metric is not optimized while another is ignored. During the years 1996 through 1997, an approach known as re-engineering was developed. This approach involved the restructuring of an entire organization and its processes; integrating various functional tasks into cross-functional processes is one example of re-engineering. In the next screen, let us find out about the quality gurus and their contribution to the field of quality.

Let us focus on Six Sigma and the business system in this screen. Business systems are designed to implement a process or a set of processes. A business system ensures that process inputs are at the right place and at the right time, so that each step of the process has the resources it needs. A business system design should be responsible for collecting and analyzing data so that continual improvement of its processes, products, and services is ensured. A business system has processes, sub-processes, and steps as its subsets. Human resources, manufacturing, and marketing are some examples of processes in a business system. Six Sigma improves a business system by continuously removing the defects in its processes and also by sustaining the changes. A defective item is any product or service that a customer would reject. A customer can be the user of the ultimate product or service, or can be the next process downstream in the business system. Let us learn about Six Sigma projects and organizational goals in the following screen.

Let us understand the structure of the Six Sigma team in this screen. There are in total five levels in the Six Sigma structure. The first level consists of the top executives of the
organization these people lead change and provide direction as they own the vision of the organization for any improvement initiative to work it is important that top management of the organization be actively involved in its propagation the top executives own the six sigma initiatives next in the level are six sigma champions they identify and scope projects develop deployment and strategy and support cultural change they also identify and coach master black belts three to four master black belts work under every champion six sigma master black belts train and coach black belts green belts and various functional leaders of the organization they usually have at least three to four black belts under them the fourth level in six sigma structure is six sigma black belts they apply strategies to specific projects and lead and direct teams to execute projects finally there are six sigma green belts they support the black belt employees by participating in project teams green belts play a dual role they work on the project and perform day-to-day jobs related to their work area in the next screen we will understand the drivers and metrics of organizational strategy while financial accounting is useful to track physical assets the balanced scorecard or bsc offers a more holistic approach to strategy implementation and performance measurement by taking into account perspectives other than the financial one for an organization traditional strategic activities that concentrate only on financial metrics are not sufficient to predict future performance they are not sufficient to implement and control the strategic plan either bsc translates the organizational strategy into actionable objectives that can be met on an everyday basis and provides a framework for performance measurement the balanced scorecard helps clarify the organizational vision and mission to workable action items to be carried out and measured it also provides feedback on both internal business processes and external outcomes by doing so it enables continuous improvement in strategic performance toward achieving organizational goals the balanced scorecard achieves all this by integrating the organizational strategy with a limited number of key metrics from four major areas of performance finance customer relations internal processes and learning and growth many organizations in the world use balanced scorecard approaches and the number is increasing every day in the next screen we will describe the balanced scorecard framework we will learn about developing a balanced scorecard in this screen while applying the balanced scorecard in an organization care must be taken to account for interactions between different perspectives or strategic business units and avoid optimizing the results of one at the expense of another to outline the strategy a top-down approach is followed by determining the strategic objectives measures targets and initiatives for each perspective the strategic objectives refer to the strategy to be achieved in that perspective three or four leading objectives are agreed upon the progress towards strategic objectives is assessed using specific measures these measures should be closely related to the actual performance drivers this enables effectively evaluating progress high level metrics are linked to lower level operational measures the target values for each measure are set the initiatives required to achieve the target value are identified as already mentioned this exercise is carried out for all the perspectives 
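To make the four-perspective structure just described a little more concrete, here is a toy sketch of how a scorecard's objectives, measures, targets, and initiatives might be laid out per perspective. This is my own illustration rather than anything from the course, and every objective, measure, and number below is a placeholder.

```python
# Toy layout of a balanced scorecard: one entry per perspective, each with an
# objective, a measure, a target value, and an initiative. All values are placeholders.
scorecard = {
    "Financial": {"objective": "Grow operating margin",
                  "measure": "Operating margin (%)", "target": 15,
                  "initiative": "Reduce rework costs"},
    "Customer": {"objective": "Improve on-time delivery",
                 "measure": "Orders delivered on time (%)", "target": 98,
                 "initiative": "Redesign delivery routes"},
    "Internal Processes": {"objective": "Cut order-to-delivery cycle time",
                           "measure": "Cycle time (hours)", "target": 24,
                           "initiative": "DMAIC project on order handling"},
    "Learning and Growth": {"objective": "Build Six Sigma capability",
                            "measure": "Employees with Green Belt (%)", "target": 20,
                            "initiative": "Green Belt training program"},
}

for perspective, entry in scorecard.items():
    print(f"{perspective}: {entry['measure']} -> target {entry['target']}")
```

The point of the layout is simply that each perspective carries its own measurable objective, so no single metric is optimized while another is ignored.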
finally the scorecard is integrated into the management system in the next screen let us understand the change in the approach to the balance score card from the four box model of bsc to strategy maps in earlier approaches to the balanced scorecard the perspectives were presented in a four box model this kind of scorecard was more a comprehensive glance at the key performance indicators or metrics in different perspectives however the key performance indicators or metrics of different perspectives were viewed independent of each other which led to a silo based approach and lack of integration however modern scorecards place the focus on the interrelations between the objectives and metrics of different perspectives and how they support each other a well-designed balanced scorecard recognizes the influence of one perspective on another and the effect of these interactions on organizational strategy to achieve the objectives in one perspective it is necessary to achieve the objectives in another perspective in short the four perspectives form a chain of cause and effect relationships a map of interlinked objectives from each perspective is created these objectives represents the performance drivers which determine the effectiveness of strategy implementation this is called a strategy map the function of a strategy map is to outline what the organization wants to accomplish and how it plans to accomplish it the strategy map is one page view of how the organization can create value for example financial success is dependent on giving customers what they want which in turn depends on the internal processes and learning and growth at an individual level in the next screen we will look at the impact of the balanced scorecard on the organization the balance scorecard and strategy map force managers to consider cause and effect relationships which leads to better identification of key drivers and a more rounded approach to strategic planning the bsc enables the organization to improve in the following ways being a one-page document a strategy map can easily be communicated and facilitates understanding at all levels of the organization an organization is successful in meeting its objectives only when everyone understands the strategy the balance scorecard also forces an organization to measure what really matters and manage information better so that quality of decision making is higher creating performance reports against a balanced scorecard allows for a structured approach to reporting progress it also enables organizations to create reports and dashboards to communicate performance transparently and meaningfully as expected a balanced scorecard helps an organization to better align itself and its processes to the strategic goals outlined in the bsc the overall objectives of the bsc can be cascaded into each business unit to enable that unit to work toward the common organizational goal all the activities of the organization such as budgeting or risk management are automatically aligned to the strategic objectives to conclude the balance scorecard is a simple and powerful tool that when implemented correctly equips an organization to perform better let us proceed to the next topic of this lesson in the following screen in this topic we will look at what lean is and how lean is applied to a process let us start with the lean concepts in the next screen let us look at the process issues in this screen lean focuses on three major issues in a process known by their japanese names muda mura and marie 
muda refers to non-value-adding work mura represents unevenness and marie represents overburden together they represent the key aspects in lean let us look at the types of waste in the next screen there are seven types of muda or waste as per lean principles let us understand these seven types of muta overproduction this refers to producing more than is required for example a customer needed 10 products and 12 were delivered inventory in simple words this refers to stock the term inventory includes finished goods semi-finished goods raw materials supplies kept in waiting and some of the work in progress for example test scripts waiting to be executed by the testing team defects repairs rejects any product or service deemed unusable by the customer or any effort to make it usable to the original customer or a new customer for example errors found in the source code of a payroll module by quality control team motion a waste due to poor ergonomics of the workplace for example finance and account teams sit on the first floor but invoices to customers are printed on the ground floor causing unnecessary personnel movement over processing additional process on a product or service to remove unnecessary attribute or feature is over processing for example a customer needs a bottle and you deliver a bottle with extra plastic casing a customer needs abec3 bearing and your process is tuned to produce more precise abec 7 bearings taking more time for something the customer doesn't need waiting when a part waits for processing or the operator waits for work the wastage of waiting occurs for example improper scheduling of staff members transport when the product moves unnecessarily in the process without adding value for example a product is finished and yet it travels 10 kilometers to the warehouse before it gets shipped to the customer another example an electronic form is transferred to 12 people some of them seeing the form more than once that is the form is traveling over the same space multiple times next we will look at lean wastes other than the seven types of waste discussed some lean experts talk about additional areas of waste under utilized skills skills are underutilized when the workforce has capabilities that are not being fully used toward productive efforts people are assigned to jobs in which they do not fit underperforming processes automation of poorly performing process improving a process that should be eliminated if possible for example the product returns department or product discounts process a symmetry in processes that should be eliminated for example two signatures to approve a cost reduction and six signatures to reverse a cost reduction that created higher costs in other areas in the next screen we will look at an exercise on identifying the waste type we will cover each step of the lean process in the next few screens in this screen we will learn about the first step identify value to implement lean to a process it is important to find out what the customer wants once this is done the process should be evaluated to identify what it needs to possess to meet customer requirements the next screen will focus on the next step of the lean process value stream mapping in this screen we will discuss the differences between push and pull processes an organization can adopt either of these processes depending on the requirement contrary to a pull process in a push process the first step is to forecast the demand for a product or service the production line then begins to fill this 
demand, and produced parts are stocked in anticipation of customer demand. For example, a garments manufacturer produces 200 shirts based on expected demand and then waits for customer orders for them; note that the demand is expected, not actual. Discounts offered to customers by big retailers are examples of the push process. If the garment company adopted a pull process instead, it would start making the shirts only after receiving a confirmed demand from customers. Note that although the pull approach seems better, it is not applicable to all situations; for example, a pharmacy uses a push process. In the next screen we will learn about the theory of constraints.

Let us look at an example of the TOC methodology in this screen. The three sub-processes in the packing process are coding or printing, filling, and sealing. The data for the three sub-processes is observed and collected as the number of units produced in an hour. The data is as follows: coding or printing is 900 units per hour, filling is 720 units per hour, and sealing is 780 units per hour. How can you implement the TOC methodology in this example? Let us build the TOC map for this example. The first step in the TOC methodology is to identify the constraint. Looking at the data, the output per hour from the filling process is 720; this is the constraint in the system. In the second step, the constraint is exploited by analyzing the performance using data. To break the constraint, repair and maintenance personnel can be assigned to maintain the filling machine on a daily basis. In the third step, the other fixes in the repair and maintenance function are made as subordinate decisions to the one taken in step two, in this example, carrying out the maintenance of the filling machine. In the fourth step, the constraint is elevated by implementing the decisions, in this example, removing the damages from the filling machine. The next step is to go back to step one and identify the next system constraint. As per the data collected after implementation of the first cycle of the TOC, sealing can be identified as the next system constraint. Let us now analyze the data before and after TOC implementation in this example. The number of units produced per hour before implementing the TOC was 900 units for the coding or printing process, 720 units for the filling process, and 780 units for the sealing process. After implementing the TOC, the number of units produced per hour for the filling process increased to 840 from 720 units. Let us proceed to the next topic of this lesson.

In this topic we will discuss the concepts in Design for Six Sigma, or DFSS. Let us first understand DFSS in the next screen. DFSS, or Design for Six Sigma, is a business process methodology that ensures that any new product or service meets customer requirements and that the process for that product or service is already at Six Sigma level. DFSS uses tools such as quality function deployment (QFD) and failure mode and effects analysis (FMEA). DFSS can help a business system introduce an entirely new product or service for the customer. It can also be used to introduce a new category of product or service for the business system; for example, an FMCG company plans to make a new brand of hair oil, a type of product already in the market. DFSS also improves the product or service and adds to the current product or service lines. To implement DFSS, a business system has to know its customer requirements. DFSS can be used to design a new product or service, a new process for a new product or service, or the redesign of an existing product or service to
meet customer requirements let us learn about processes for dfss in the next screen the two major processes for dfss are idov and dmadv idov stands for identify design optimize and verify dmadb stands for define measure analyze design and verify in the idov process the first step involves identifying specific customer needs based on which the new product and business process will be designed the next step involves design which involves identifying functional requirements developing alternate concepts evaluating the alternatives selecting a best fit concept and predicting sigma capability tools such as fmea are used here the third step optimize uses a statistical approach to calculate tolerance with respect to the desired output when idov is implemented to design a process expected to work at six sigma level this is checked in the optimize phase if the process does not meet expectations the optimize phase helps in developing detailed design elements predicting performances and optimizing design the last stage of idov is to verify that is to test and validate the design and finally to check conformance to six sigma standards the other process dmadv has five stages the first stage is to define the customer requirements and goals for the process product or service next measure and match performance to customer requirements the third stage involves analysis and assessment of the design for the process product or service the next step is to design and implement the array of latest processes required for the new process product or service the final stage is to verify results and maintain performance in the next screen we will look at the differences between idov and dmadb the primary difference between idov and dmadb is that while i is used only to design a new product or service dmadb can be used to design either a new product or service or redesign an existing product or service idov involves design of a new process while dmadv involves redesigning an existing process in idov no analysis or measurement of existing process is done and the whole development is new the design step immediately follows the identification of customer requirements in contrast dmadv the existing product service or process is examined thoroughly before moving to the design phase the design stage comes only after defining requirements and analyzing the existing product service or process in the following screen we will learn about tool quality function deployment or qfd which is one of the dfss tools qfd also called voice of customer or voc or house of quality is a predefined method of identifying customer requirements it is a systematic process to understand the needs of the customer and convert them into a set of design and manufacturing requirements qfd motivates business to focus on its customers and design products that are competitive in lesser time and at lesser cost the primary learning from qfd includes which customer requirements are most important what the organization strengths and weaknesses are where an organization should focus their efforts and where most of the work needs to be done to learn from qfd the organization should ask relevant questions to customers and tabulate them to bring out a set of parameters critical to the design of the product apart from understanding customer requirements it is also important to know what would happen if a particular product or service fails when being used by a customer it is necessary to understand the effects of failure on the customer to ensure preventive actions 
are taken, and to be able to answer the customers in the event of failure. In the next screen we will look at another DFSS tool, failure modes and effects analysis, or FMEA.

Failure modes and effects analysis (FMEA) is a preemptive tool that helps any system identify potential pitfalls at all levels of a business system. It helps the organization identify and prioritize the different failure modes of its product or service and what effect each failure would have on the customer. It helps in identifying the critical areas in a system on which the organization's efforts can be focused. Note that while FMEA enables identification of critical areas, it does not offer solutions to the identified problems. We will look at the varieties of FMEA, such as DFMEA and PFMEA, in the next screen. PFMEA stands for process failure mode and effects analysis, and DFMEA stands for design failure mode and effects analysis. PFMEA is used on a new or existing process to uncover potential failures; it is done in the quality planning phase to act as an aid during production. A process FMEA can involve fabrication, assembly, transactions, or services. DFMEA is used in the design of a new product, service, or process to uncover potential failures. The purpose is to find out how failure modes affect the system and to reduce the effect of failure on the system. This is done before the product is sent to manufacturing, and all design deficiencies are sorted out at the end of this process. In the following screen we will understand the FMEA risk priority number.

The FMEA risk priority number, or RPN, is a measure used to quantify or assess the risk associated with a design or process. Assessing risk helps identify critical failure modes: the higher the RPN, the higher the priority the product or process receives. RPN is the product of three numbers: the severity of a failure, the occurrence of a failure, and the detectability of a failure. All these numbers are given a value on a scale of one to ten, so the minimum value of RPN is one and the maximum value is one thousand. A failure mode with a high occurrence rating means the failure mode occurs very frequently. A mode with a high severity rating means that the mode is really critical to the safety of operations. A mode with a high detection rating means that the current controls are not sufficient.
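As a minimal sketch of the arithmetic just described, and not something taken from the course itself, the following Python snippet computes RPN as severity times occurrence times detection and sorts failure modes by priority; the two failure modes and their ratings are made-up examples in the spirit of the ATM case discussed next.

```python
# Minimal sketch of the RPN arithmetic: RPN = severity x occurrence x detection,
# each rated 1-10, so RPN ranges from 1 to 1000. Ratings below are made-up examples.
from dataclasses import dataclass

@dataclass
class FailureMode:
    name: str
    severity: int    # 1 (negligible)          .. 10 (hazardous)
    occurrence: int  # 1 (rare)                .. 10 (very frequent)
    detection: int   # 1 (certain to detect)   .. 10 (cannot detect)

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection

modes = [
    FailureMode("ATM network outage",     severity=9, occurrence=4, detection=8),
    FailureMode("Cash dispenser misfeed", severity=9, occurrence=3, detection=3),
]

# Work on the highest-RPN failure modes first.
for m in sorted(modes, key=lambda m: m.rpn, reverse=True):
    print(f"{m.name}: RPN = {m.rpn}")
```

Note how a hard-to-detect failure (high detection rating) dominates the ranking even when severity is the same, which mirrors the reasoning used in the bank example below.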
In the next screen we will look at the FMEA table. The FMEA table helps plan improvement initiatives by underlining why and how failure modes occur, and it helps organizations plan for their prevention. Typically, FMEA is applied on the output of root cause analysis and is a better tool for focus or prioritization as compared to multi-voting. One important aspect of FMEA is that it does not need data; experts in a particular area can form the FMEA table without having to look at data from any source. In functions such as human resources, the FMEA table is very useful, as there might not be much data available to the problem-solving team. The sample FMEA table is given on the screen; please go through the contents for a better understanding.

In the following screen we will discuss the severity component of the risk priority number and its scale criteria. Let us first discuss severity. Severity refers to the seriousness of the effect of the failure mode, or how critical the failure mode is to the customer or the process. The severity of a failure mode is rated on a scale of 1 to 10 using a severity table; different industries follow different structures for the severity table. A high severity rating indicates a mode is critical to operational safety. For example, a team working on the FMEA of a radioactive plant may insert "fatal" as the effect, with a rating of 10. Another example is the severity table for a sports team: the team manager wants to rate the severity of a failure of the team in an upcoming game. She might rate it at nine, given that the team would lose a big sponsorship should they face defeat, which could in turn be hazardous to the team's future. Shown here is a generalized table of severity. The severity rating can never be changed; for example, if a mode has a rating of nine before improvement, it will continue to have a rating of nine after improvement, too.

Let us look at the occurrence component of RPN and its scale criteria. Occurrence is the probability that a specific cause will result in the particular failure mode. As with severity, occurrence is rated on a scale of one to ten based on a table like the severity table: the higher the occurrence of a failure, the higher its rating. Again, this table might vary depending on the industry and scenario. Sometimes the project team can use data here, if available; based on past data, the probability of occurrence of a failure can easily be rated. Shown here is a generalized table. Let us next look at the detection component of RPN and its scale criteria. Detection is the probability that a particular failure will be detected. The table shown here is again a generalized one. The rating here is a bit different from severity or occurrence: the higher the detectability of a failure, the lower its rating. This is because if the failure can easily be detected, then everyone would know of it, and therefore there would be less or no damage. For example, if detection is impossible, the failure is given a rating of ten. Please note that at the start of a Six Sigma project, the failure mode is given a relatively high detection rating. Let us look at an example of FMEA and RPN in the next screen.

In this example, a bank wants to recognize and prioritize the risks involved in the process of withdrawing cash from an ATM. It can be observed from the table that not having a control in place for network issues has the highest RPN; this is due to the detectability of a network issue being very low. The next set of information in the table shows the actions taken by the bank's management to address the failure modes. Following the implementation, the new RPN is calculated, retaining the severity level at nine; this is because the actions were not directed at reducing the severity but at the causes of failure. It can be seen that the new RPN is much lower and the risk for both causes has been mitigated.

This lesson will cover the details of the Define phase. Six Sigma can be applied to almost everything around us; it can be applied across almost 70 different sectors. However, it cannot be applied to all problems. The first step is to check if the project qualifies to be a Six Sigma project. The questions that need to be asked are as follows. Is there an existing process? To implement the DMAIC methodology of problem solving, a process needs to exist; the process should be in operation for the development of the product or service. Is there a problem in the process? Ideally, the process should not have any problem; if there is a problem in the process performance, the process needs to be improved. Is the problem measurable? The problem has to be measurable in order to assess the root cause and the impact of the problem on the process. Does the problem impact customer satisfaction? If the problem affects customer satisfaction, an action needs to be taken immediately, else the customer may start finding alternate products or switch to competitors' products. Does working
this lesson will cover the details of the define phase six sigma can be applied to almost everything around us it can be applied across almost 70 different sectors however it cannot be applied to all problems the first step is to check if the project qualifies to be a six sigma project the questions that need to be asked are as follows is there an existing process to implement the dmaic methodology of problem solving a process needs to exist the process should be in operation for the development of the product or service is there a problem in the process ideally the process should not have any problem if there is a problem in the process performance the process needs to be improved is the problem measurable the problem has to be measurable in order to assess the root cause and the impact of the problem on the process does the problem impact customer satisfaction if the problem affects customer satisfaction an action needs to be taken immediately else the customer may start finding alternate products or switch to competitors products does working on the problem impact profits of the company it is very essential to assess the impact of the project on the profits of the company if the project affects the profits of the company adversely then such a project is not feasible is the root cause of the problem unknown if the root cause of the problem is visible then a six sigma project is not required other problem solving techniques can be used in this case is the solution unknown if the solution to the problem is already known then there is no need for any project the company can directly implement the solution the define phase of dmaic will be introduced in the next screen the six sigma project process is known as the dmaic process define is the first phase in the six sigma project process in the define phase the problem is defined and the six sigma team is formed the objectives of the define phase are as follows clearly define the problem statement through customer analysis understand customer requirements and ensure that the six sigma project goals are aligned to these requirements define the objectives of the six sigma project plan the project in terms of time budget and resource requirements define the team structure for the project and establish roles and responsibilities in the next screen let us learn about benchmarking benchmarking is the process of comparing an organization's business processes practices and performance metrics with those of industry leaders there are various types of benchmarking let us briefly look at each type process benchmarking entails comparing specific processes to a leading company or an industry standard this is useful to obtain a simplified view of business operations and enables a focused review of major business functions process benchmarking includes comparisons of production processes data collection processes performance indicators and productivity and efficiency reviews financial benchmarking is performed to assess overall competitiveness and productivity it is done by running a detailed financial analysis and analyzing the result performance benchmarking involves comparison of products and services with those of competitors with the intention of evaluating the organization's competitiveness product benchmarking involves designing new products or services or upgrading existing products or services this can involve reverse engineering a competitor's products to study the strengths and weaknesses and modeling the new product on these findings strategic benchmarking refers to studying strategies and problem-solving approaches in other industries functional benchmarking is the focused analysis of a single function with the aim of improving it complex functions may need to be divided into processes before benchmarking is done competitive benchmarking includes standardizing organizational strategies processes products services and procedures against the competitors in the same industry collaborative benchmarking is a type of benchmarking where the standardization of various business parameters is carried out by a group of companies and the information is shared if the subsidiary units of a company or its various branches carry out the benchmarking it is called collaborative benchmarking let us take a look at best practices for benchmarking in the next screen best practice is a method that ensures continuous improvement leading to exceptional performance it is also a method to sustain and develop the process continuously some of the best practices in benchmarking are as follows increase the objectives or
scope of benchmarking set the standards and path to be followed at the initial stage reduce unnecessary effort and comply with the scope recognize the best in the industry to set a benchmark share the information derived from benchmarking in the next screen we will discuss project selection let us understand process business process and business system in this screen a process is a series of steps designed to produce a product or service to meet the requirement of a customer a process mainly consists of three elements input process and output a business process is a systematic organization of objects such as people machinery and materials into work activities designed to produce a required product or service as shown on the screen a process is a subset of a business process a business process is in turn a part of a business system a business system is a value-added chain of various business processes such as sales or finance for example payroll calculation is a process in the hr business process of an i t company which is a business system in the next screen we will look at the process elements let us discuss the challenges to business process improvement in this screen the improvement to a business process of an organization faces challenges due to the traditional business system structure because it is generally grouped around the functional aspect the main problem in a functionally grouped organization is the movement or flow of a product or service a product or service has to go through various functions and their functional elements to reach the customer or end user the other problem is management of the flow of products or services across various functional elements this is difficult as usually there is no one in charge these business process improvement problems can be solved using the project management approach to produce the product or service in the next screen we will learn about process owners and different stakeholders of a project the representation of where the process owner and stakeholders are placed in the organizational hierarchy is on the screen the process owner is a person at a senior level in the hierarchy he is the one who takes responsibility for the performance and execution of a process and also has the authority and the ability to make necessary changes on the other hand a stakeholder is a person group or organization which is affected or can affect an organization's actions businesses have many stakeholders like stockholders customers suppliers company management employees of an organization and their families the society etc let us discuss the effects of process failure on various stakeholders in the next screen while it is an absolute business necessity to keep one stakeholder satisfied at all times failure to meet one or more process objectives may result in negative effects on them in such situations for the stockholders the perceived value for the company gets reduced customers may seek other competitors for their deals while imposing penalties and finding recourse in legal action against the company suppliers may be on the losing front with delayed payments or not being paid at all company management may require cost cut down employees will receive diminishing wages the community and society will be affected due to pollution created by the organization in the next screen we will understand the relationship between business and the stakeholder in the diagram shown on the screen each stakeholder is both a supplier as well as a customer forming many closed 
loop processes that must be managed controlled balanced and optimized for the business to thrive communication is the key in such situations and is facilitated through internal company processes the next screen covers the importance and relevance of stakeholder analysis stakeholder analysis is an important task to be completed before doing a six sigma project a business has many stakeholders and any change to a business process affects some or all of them when a process does not meet its objectives it results in the stakeholders being negatively affected which in turn affects the organization's performance the six sigma team must factor in the reasons why a stakeholder may oppose the change effort let us proceed to the next topic of this lesson in the following screen in this topic we will discuss voice of the customer let us start with how to identify the customer in the following screen customers are the most important part of any business a customer is someone who decides to purchase pays consumes and gets affected by a particular product or service it is critical to identify and understand the customer requirements the products or services can be designed according to these requirements consequently the company is able to provide products or services the customers are willing to purchase there are two types of customers internal and external customers in the next screen we will learn about internal customers an internal customer can be defined as anyone within the business system who is affected by the product or the service while it is being developed most often internal customers are the employees of the company for example let us assume that there is a series of processes in a particular business system in such a scenario the second process is the internal customer for the first process the third process is an internal customer for the second process and so on the basic needs of an internal customer are to be provided proper tools and necessary equipment imparted proper training and given specific instructions to carry out their responsibilities however the needs are not limited to these alone other needs include the provision of company newsletters projects storyboards to display the letters etc team meetings to share business news and announcements staff meeting to share information and quality awards from suppliers an internal customer is important first of all the activities of an internal customer directly affect the final or ultimate customer secondly the activities of an internal customer affect the next process in the system finally an internal customer also affects the quality of the product developed or service provided when the needs of the internal customers in most cases employees are met they are more likely to have higher perceptions of quality and also contribute to greater productivity the satisfaction levels of the internal customers can be improved in various ways these include a higher amount of internal communication through company newsletters and team meetings recognitions for work quality awards etc constant training on how to be ahead and well equipped in a competitive environment is very essential too in the next screen we will learn about external customers this screen focuses on the positive effects of a project on the customers the most important aspect of any process improvement project is the customers internal customers are the ones who drive the project hence the effect of the project on internal customers is a critical factor that needs to be 
considered the positive impact of a project on the internal customers is as follows the project is driven by highly motivated individuals or internal customers who are aware of the project objectives and scope individuals belonging to a credible project understand the project deliverables and display high levels of job satisfaction these individuals go the extra mile to take up tasks beyond their job description such individuals make a highly motivated team focused on delivering their responsibilities in order to meet the customer requirements working together in a positive environment also improves team spirit and bonding the positive impact of a project on the external customers is as follows process improvement projects analyze the problems and come up with an effective solution consequently ensuring a better product a successful process improvement project assists the organization in effectively meeting customer expectations or requirements there is visible improvement in customer service good quality product and service ensures high customer satisfaction let us learn about different methods of customer data collection in the following screen once you begin to identify the customer types you need to look forward to collecting customer data collecting data from customers is very essential as it helps consider the levels at which these customers affect the business begin by collecting feedback from both internal as well as external customers customer feedback helps fill the gaps and improve the various business processes in the organization it helps define a good quality product as perceived by the customer and identify qualities that make the competitors products or service better it also helps identify factors which provide a competitive edge to the product or service on offer there are various methods to collect feedback from the customers many of you might be involved in a similar activity at some point or the other popular and common methods are surveys conducted through questionnaires focus groups individual interviews with the customers customer complaints received via call centers emails and feedback forms are also quite prevalent feedback received in this form are from dissatisfied customers in the next screen we will learn about questionnaires let us discuss the advantages and disadvantages of questionnaires in this screen the advantages of a questionnaire are that it costs less the phone response rate is high anywhere from around 70 to 90 percent and it produces faster results also analysis of male questionnaires requires few trained resources questionnaire is a method used to gather data however there are a few disadvantages associated with it there may be incomplete results and unanswered questions leading to a lack of clarity the response rate of mail surveys is about 20 to 30 percent only at times phone surveys can produce undesirable results as the interviewer can influence the person being interviewed we will differentiate between telephone survey and web survey in the next screen there are different methods to collect data for a survey the methods need to be based on the requirements and needs of the organization the popular methods of survey are the telephone survey and web survey both have their own drawbacks and benefits which are given on the screen the organization needs to choose a method of collecting data according to the situation it is recommended to go through the content for a better understanding in the next screen we will learn about focus groups let us 
now discuss the advantages and disadvantages of using a focus group for data collection the interaction in a focus group generates information provides in-depth responses and can address more complex questions or qualitative data it is an excellent platform to get critical to quality or ctq definitions as well on the other hand the disadvantages of focus groups are that the learning only applies to those within the group and it is not possible to generalize the information collected is more qualitative than quantitative which is another drawback additionally they can also generate a lot of information from anecdotes and incidents experienced by the individuals in the group in the next screen we will discuss the interview technique this screen discusses advantages and disadvantages of using the interview technique for data collection interviews have a capability to handle complex questions and a large amount of information they also allow us to use visual aids in the process it is a better method to be employed when people do not respond willingly and or accurately by phone or email however there are some shortcomings as well interviews are time consuming and the resources or interviewer needs to be trained and experienced to carry out the task let us discuss the importance and urgency of these inputs in the next screen the table shows the importance and urgency of different kinds of input to understand the kind of input to be chosen different kinds of methods for collecting data are identified telephone survey web survey and interview are the data collection methods identified to select the best methods the criteria or the factors which are important to the organization are listed the criteria are the factors based on which an organization is going to make decisions the list of factors is then given weightage based on the importance of each factor in decision making as seen cost is the most important criterion for which the weightage given is twenty response rate of the customer is the next important factor and the list follows visualizing feature and compiling and analyzing data are the factors which have the lowest impact on the decision of selecting the methods for data collection each of the data collecting methods is rated between 1 and 10 based on its impact on the listed factors with 10 being highly favorable to the organization and one being least favorable after rating all the methods with the factors listed the sum or total is calculated the calculation of the total involves multiplying each method's rating with the factor weightage and adding all the multiplied values of the column that is for the telephone survey the total is (8 × 12) + (8 × 6) + (3 × 20) + (5 × 5) + (3 × 5) + (7 × 15) + (1 × 10) + (7 × 3) + (0 × 2) + (3 × 2) + (1 × 10) + (7 × 5) + (8 × 5) which comes to 471 in a similar way calculate the total value for the remaining two methods the totals of the other two methods are 744 and 522 respectively looking at the overall totals of the methods 744 is the highest hence web survey is the best method for the organization to use for data collection
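Below is a minimal sketch of this weighted scoring calculation. The weights and the telephone survey ratings are the ones read out above; the transcript does not list every criterion name, so the vectors are kept positional rather than labeled, and only the first total is reproduced.

```python
# minimal sketch of the weighted decision matrix used to pick a data collection method
# weights = importance of each criterion, ratings = 1-10 scores for one method
# the criterion labels are omitted because the table does not name all of them here

weights          = [12, 6, 20, 5, 5, 15, 10, 3, 2, 2, 10, 5, 5]
telephone_survey = [ 8, 8,  3, 5, 3,  7,  1, 7, 0, 3,  1, 7, 8]

def weighted_total(ratings, weights):
    # multiply each rating by its factor weightage and add the results
    return sum(r * w for r, w in zip(ratings, weights))

print(weighted_total(telephone_survey, weights))  # 471, matching the worked example
```

Repeating the same call for the web survey and interview rating columns would reproduce the 744 and 522 totals, and the method with the highest weighted total is selected.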
let us look at the pros and cons of customer complaints data in the next screen there are pros and cons in gathering information from customer complaints advantages include availability of specific feedback directly from the customer and ease in responding appropriately to every customer on the contrary feedback in this method does not provide an adequate sample size and may lead to process changes based on one or two inputs from the customer the next screen will discuss the difference between product complaint and expedited service request product complaints and expedited service requests can act as inputs to the company for improving their process these details address the needs of the customer in an indirect way a product complaint means that the customer is not happy with the product that he has purchased from the company an expedited service request means a service request is being rushed if the customer requires the items immediately then an expedited service request is raised by the customer and the organization tries to fulfill it to please the customer product complaint implies that a product is not meeting customer specifications hence it has to be improved expedited service request implies that service timelines are not meeting customer requirements hence service has to be improved product complaint also implies that the customer's needs for the product are not completely identified whereas expedited service request implies that the customer timelines need to be recalculated let us discuss the importance and urgency of these inputs in the next screen the table shows the importance and urgency of different kinds of input to select the best methods the criteria or the factors which are important to the organization are listed the criteria are the factors based on which the organization is going to make decisions these factors are then given weightage based on the importance of each factor in decision making as seen cost involved and identification of customer need are the most important criteria for which the weightage given is 15 and the list follows time consumption and compiling and analyzing data are the factors which have the least impact on the decision of selecting the methods for data collection each method is rated between 1 and 10 based on its impact on the listed factors with 10 being highly favorable to the organization and one being least favorable to the organization after rating all the methods with the factors listed the sum or total is calculated the calculation of the total is derived by multiplying each method's rating with the factor weightage and adding all the multiplied values that is for product complaint the total is (8 × 15) + (4 × 15) + (3 × 2) + (1 × 10) + (1 × 10) + (1 × 10) + (1 × 8) + (1 × 10) + (4 × 2) + (1 × 8) + (1 × 10) which comes to 260 in a similar way calculate the total value for expedited service request the total for expedited service request is 817 and hence it is the more effective input for the organization let us discuss the key elements of data collection tools in the next screen data collection tools will be selected based on the type of data to be collected the key elements that make these tools effective are as follows data is collected directly from the primary source or customer hence there is no scope for miscommunication or loss of information data is collected exclusively for the stated purpose hence data is
highly reliable the data is captured is after understanding the organizational purpose this makes the data exclusively relevant and serves the purpose of the organization data is collected instantaneously when there is a requirement this ensures that the data is up to date hence the data is valid the tools accurately define customer requirements the customer requirements could be current needs or improvement to the product or service that they are currently using the tools help to get enough information about customer requirement through which the process for improving or creating the product or service that the customer requires can be developed in the next screen we will discuss how the collected data can be reviewed collated data must be reviewed to eliminate vagueness ambiguity and any unintended bias nutra worldwide buys laptops for its employees from a company that is into manufacture and sales of laptops the company also provides servicing and repairs for their products to the customers to understand the level of customer satisfaction in neutral worldwide and to improve its process the laptop company is conducting a survey the questionnaire which was prepared initially questionnaire before review had questions that led to ambiguity vagueness and unintended bias let us look at each item on the survey to understand the level of usage of the laptop and to know their customer better the survey is raising a question related to the occupation of the customer it gives the option of student or professional but with this low amount of information the company is neither able to gather the information nor will the given option cover the entire possible occupation in the market including an option of other please specify would help the customer to choose and provide the information if he does not belong to one of the two given groups hence the same is added in the review so that the customer will not be in any ambiguity while filling the questionnaire the question whether the sales executive was supportive with an option of yes or no is a question which leads again to ambiguity and unintended bias the customer might be partially happy or partially not happy but the choice does not let them inform their exact feeling if the customer selects no as the option then the company does not get enough information to understand where their sales executive went wrong hence in the reviewed questionnaire the customer is asked to rate the qualities of their sales executive which will provide better data to the company in order to improve the process next we will discuss a technique named voice of customer the voice of customer is a technique to organize analyze and profile the customer's requirements voice of the customer is an expression for listening to the external customer the table shows the customer requirements while purchasing an air conditioner in all cases the customer is purchasing for his or her domestic usage each customer is further categorized according to his needs and requirements when the customer says that he needs a silent air conditioner he needs sound sleep at night in the bedroom this is primarily to remain fresh the next morning and to get rid of the noisy ceiling fan being used currently in case the customer says that he needs an efficient ac he needs a machine which provides good cooling at night in the bedroom this is mainly because it gets extremely hot in summer also he currently uses a ceiling fan which is not so effective in summers on the other hand when the customer wants to 
buy an ac which is not too costly he has limited cash for the purchase he wants to purchase a low-cost ac let us discuss the importance of translating customer requirements in the next screen customer requirement is the data collected from customers that gives information about what they need or want from the process customer requirements are often high level vague and non-specific some customers may give you a set of specific requirements to the business but broadly customers requirements are a reflection of their experience customer requirements when translated into critical process requirements that are specific and measurable are called critical to quality ctq factors a fully developed ctq has four major elements output characteristic y metric target and specification or tolerance limits we will discuss the meaning of ctq in the next screen let us understand what quality function deployment is in this screen qfd is a process to ensure that the customers wants and needs are heard and translated into technical characteristics it is also known as the voice of the customer or house of quality qfd is a process to understand the customer's needs and translate them into a set of design and manufacturing requirements while motivating businesses to focus on their customers it also helps companies to design and build more competitive products in less time and lesser costs qfd helps in prioritizing customer requirements recognizing strengths and weaknesses of an organization and recognizing areas that need to be worked on and areas that need immediate focus of efforts qfd is carried out by asking relevant questions to the customers and tabulating them to bring out a set of parameters critical to the product design let us discuss phases of qfd in the next screen quality function deployment involves four phases phase one product planning in this phase the qfd team translates the customer requirements into product technical requirements phase two product design in this phase the qfd team translates the identified technical requirements into key part characteristics or systems phase three process planning in this phase the qfd team identifies the key process operations necessary to achieve the identified key part characteristics phase four production planning or process control in this phase the qfd team establishes process control plans maintenance plans and training plans to control operations next we will understand the structure of qfd let us see what happens after completing the hoq matrix completing one hoq matrix is not the end of the qfd process the output of the first hoq matrix can be the first stage of the second qfd phase as shown in the image the translation process is continued using linked hoq type matrices until the production planning targets are developed let us proceed to the next topic of this lesson in the following screen in this topic we will discuss the basics of project management let us start with a discussion on problem statement every six sigma project targets a problem that needs to be resolved the first step of project initiation is defining the problem statement a problem statement needs to describe the problem in a clear and concise manner a problem statement needs to identify and specify the observed gap in performance it should indicate the current performance state of a process and the required performance state completely derived from customer requirements a problem statement should be quantifiable this means it should have specified metrics including the 
respective units please note that the problem statement cannot contain solutions or causes for the problem in the next screen we will discuss the is or is not template the is or is not technique was first popularized by kepner trigo incorporated in the seventies it is a powerful tool that helps define the problem and gather required information an example of a problem statement of paper cup leaks is given on the screen the six sigma team has to answer what is the problem what isn't the problem where is it where isn't it when is it when isn't it the problem to what extent is it a problem and to what extent isn't it a problem the information is then used to fill the question areas in the is and is not issue template in the analysis phase if a cause cannot describe the is and the is not data then it's not likely the main cause in the next screen we will list the criteria for the project objectives the project objectives must meet the smarts criteria smarts is an acronym of the characteristics desired in project objectives specific measurable attainable relevant time based and stretch the project deliverables should be specific example hospitals maintain records of all patients often a few forms are rejected or missed due to errors in recording the id numbers in this case setting the objective as reduce form rejection is very vague instead reduce patient id errors in recording lab results is specific and effectively targets solving the problem the project objectives should be quantifiable example setting the objective as fewer form rejections is very vague instead reduce patient id errors by 30 percent sets a specific goal the project objectives should be achievable and practical the project objectives should be relevant to the problem the project objectives must specify a time frame within which they should be delivered the project objectives must not be easily achievable example most problems and errors can be reduced by creating awareness hence the objective must stretch beyond the easily attainable state in the next screen we will understand project documentation project documentation refers to creating documents to provide details about the project such documents are used to gain a better understanding of the project prevent and resolve conflict among stakeholders and share plans and status for the project documentation of a project is critical throughout the project some of the benefits achieved through project documentation are mentioned below documentation serves as written proof for execution of the project it helps teams achieve a common understanding of the requirements and the status of the project it removes personal bias as there is a documented history of discussions and decisions made for the project depending on the nature of the project each project produces a number of different documents some of these documents are the project charter project plan and its subsidiary plans other examples of project documentation include project status reports including key milestones report risk items and pending action items the frequency of these reports is determined by the need and complexity of the project these reports are sent to all stakeholders to keep them abreast of the status of the project another example of project documentation is the final project report this report is prepared at the end of the project and includes a summary of the complete project project storyboard inputs generated from statistical tools outputs from spreadsheets checklists and other miscellaneous 
documents are also classified as project documents in the next screen we will understand project charter we will list the project charter sections in this screen the major sections of a project charter are project name and description business requirements name of the project manager project purpose or justification including roi stakeholder and stakeholder requirements broad timelines major deliverables constraints and assumptions and the budget summary of the charter in the next screen we will understand the project plan a project plan is the final approved document which is used to manage and control the various processes within the project and ensure its seamless execution the project manager uses the project charter as an input to create a detailed project plan a project plan comprises various sections prominent among them being the project management approach the scope statement the work breakdown structure the cost estimates scheduling defining performance baselines marking major milestones to be achieved and the key members and required staff personnel for the project it also includes the various open and pending decisions related to the project and the key risks involved additionally it also contains references to other subsidiary plans for managing risk scope schedule etc in the next screen we will learn about project scope we will look at different techniques used for interpreting the project scope in this screen project scope can be interpreted from the problem statement and project charter using various tools like the burrito chart and the cypoc map the principle behind the burrito chart or the 80 20 principle as we know it is a vital few trivial many the burrito chart helps the teams to trim the scope of the project by identifying the causes which have a major impact on the outcome of the project the cypac map is a high level process map which helps all team members in understanding the process functions in terms of addressing questions like who are the suppliers what are the inputs they provide what are the outputs that can be obtained and who are the customers as discussed earlier cypoc stands for suppliers inputs process outputs and customers in the subsequent screen we will learn about process maps scipok is a macro level map that provides an overview of the business process where a process map is a micro level flow chart that provides an in-depth detail of a process the process map covers details at all levels and provides a walk through the current process experience the cypac map is used as a basis while drawing a process map a level one process map provides in-depth information but the final process map drills further into detail in the following screen we will understand the project metrics let us discuss consequential metrics in this screen consequential metrics measure any negative consequences these can be business metrics process metrics or both they measure the negative effects of improving the primary or key metrics they are used to measure the indemnity triggered by any damage in the project the inconsistent use of consequential metrics can lead to loss of opportunity and rework after a project ends consequential metrics help to understand the cause and effect relationship between the primary and the secondary metrics and the impact it has on the organization let us take a look at an example for consequential metrics in the next screen we will discuss the best practices in this screen the following are some of the best practices of consequential metrics 
setting consequential metrics during the measure phase and monitoring these metrics after finalizing the project will help to analyze whether the link between previous primary and secondary metrics has been established also linking consequential metrics with primary metrics and finally linking them with secondary metrics provides clarity on the impact of these metrics assessing and evaluating the cause and effect relationship between these metrics is helpful to the organization as a whole in the next screen we will list some project planning tools the project manager uses various tools to plan and control a project one of the tools which he uses is the pareto chart other prominent tools include the network diagram the critical path method also called cpm the program evaluation and review technique which is also known as pert gantt charts and the work breakdown structure also known as wbs in the next screen we will discuss the pareto chart a pareto chart is a histogram ordered by the frequency of occurrence of events it is also known as the 80 20 rule or vital few trivial many it helps project teams to focus on the issues which cause the highest number of defects or complaints to explain further the given chart plots all the causes for defects in a product or service the values are represented in descending order by bars and the cumulative total is represented by the line the pareto chart emphasizes that 80 percent of the effects come from 20 percent of the causes thus a pareto chart narrows the scope of the project or problem solving by identifying the major causes affecting quality pareto charts are useful only when required data is available if data is not available then other tools such as brainstorming and multi-voting should be used to find the root cause of any problem in the following screen we will continue to discuss the pareto chart with an example a hotel receives plenty of complaints from its customers and the hotel manager wishes to identify the key areas of complaints complaints were received in the following areas cleaning check-in pool timings minibar room service and miscellaneous cleaning and check-in can be noted as areas of concern with 35 and 19 complaints respectively the percentage is calculated for each cause of complaint and the cumulative percentage is derived a pareto chart is plotted using this data
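A minimal sketch of the Pareto calculation for this example follows. The cleaning (35) and check-in (19) counts come from the example above; the remaining complaint counts are assumed placeholders since they are not stated.

```python
# minimal sketch of the pareto calculation for the hotel complaints example
# cleaning (35) and check-in (19) are from the example above; the other
# counts are placeholders because the transcript does not give them

complaints = {
    "cleaning": 35,
    "check-in": 19,
    "pool timings": 8,
    "minibar": 6,
    "room service": 5,
    "miscellaneous": 3,
}

total = sum(complaints.values())
cumulative = 0.0
# sort causes in descending order of frequency, then accumulate the percentages
for cause, count in sorted(complaints.items(), key=lambda kv: kv[1], reverse=True):
    share = 100 * count / total
    cumulative += share
    print(f"{cause:15s} {count:3d}  {share:5.1f}%  cumulative {cumulative:5.1f}%")

# bars = counts in descending order, line = cumulative percentage; the causes
# that push the cumulative line to roughly 80 percent are the vital few
```

Plotting the sorted counts as bars and the cumulative percentages as a line gives the Pareto chart described in the example, with cleaning and check-in standing out as the vital few.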
in the next screen we will discuss network diagrams network diagrams are one of the tools used by the project manager for project planning they are also sometimes referred to as arrow diagrams because they use arrows to connect activities and represent precedence and interdependencies between activities of a project there are some assumptions that need to be made while forming the network diagram the first assumption is that before a new activity begins all pending activities have been completed the second assumption is that all arrows indicate logical precedence this means that the direction of the arrow represents the sequence that activities need to follow the last assumption is that a network diagram must start from a single event and end with a single event there cannot be multiple start and end points to the network diagram in the next screen let us discuss some terms related to the network diagram for the network diagram to calculate the total duration of the project the project manager needs to define four dates for each task the first two dates relate to the date by when the task can be started the first date is early start this is the earliest date by when the task can start the second date is late start this is the last date by when the task should start the second two dates relate to the dates when the task should be complete early finish is the earliest date by when the task can be completed late finish is the last date by when the task should be completed the duration of the task is calculated as the difference between the early start and early finish of the task the difference between the early start and late start of the task is called the slack time available for the task slack can also be calculated as the difference between the early finish and late finish dates of the task slack time or float time for a task is the amount of time the task can be delayed before it causes a delay in the overall project timeline in the next screen we will discuss critical path method critical path method also known as cpm is an important tool used by project managers to monitor the progress of the project and to ensure that the project is on schedule the critical path for a project is the longest sequence of tasks on the network diagram the critical path in the given network diagram is highlighted in orange the critical path is characterized by zero slack for all tasks on the sequence this means that the smallest delay in any of the tasks on the critical path will cause a delay in the overall timeline of the project this makes it very important for the project manager to closely monitor the tasks on the critical path and ensure that the tasks go smoothly if needed the project manager can divert resources from other tasks that are not on the critical path to tasks on the critical path to ensure that the project is not delayed when a project manager removes resources from such tasks he needs to ensure that the task does not become a critical path task because of the reduced number of resources during the execution of the project the critical path can easily shift because of multiple factors and hence needs to be constantly monitored by the project manager a complex project can also have multiple critical paths
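The forward and backward pass behind these dates can be sketched in a few lines. The tasks, durations, and dependencies below are illustrative assumptions, not a network from the course; the sketch just shows how early start, early finish, late start, late finish, and slack are derived, and how zero-slack tasks identify the critical path.

```python
# minimal sketch of the forward/backward pass used in the critical path method
# the tasks, durations, and dependencies are illustrative, not from the course

tasks = {            # task: (duration in days, list of predecessor tasks)
    "a": (3, []),
    "b": (5, ["a"]),
    "c": (2, ["a"]),
    "d": (4, ["b", "c"]),
}

# forward pass: early start (es) and early finish (ef)
es, ef = {}, {}
for t, (dur, preds) in tasks.items():          # dict order already respects precedence here
    es[t] = max((ef[p] for p in preds), default=0)
    ef[t] = es[t] + dur

project_end = max(ef.values())

# backward pass: late finish (lf) and late start (ls)
lf, ls = {}, {}
for t in reversed(list(tasks)):
    successors = [s for s, (_, preds) in tasks.items() if t in preds]
    lf[t] = min((ls[s] for s in successors), default=project_end)
    ls[t] = lf[t] - tasks[t][0]

for t in tasks:
    slack = ls[t] - es[t]                      # equivalently lf[t] - ef[t]
    flag = "critical" if slack == 0 else f"slack {slack}"
    print(f"{t}: es {es[t]} ef {ef[t]} ls {ls[t]} lf {lf[t]} ({flag})")
# tasks a, b and d have zero slack, so a -> b -> d is the critical path (12 days)
```

Task c carries slack, so resources could be diverted from it to the critical path without delaying the project, which is exactly the trade-off described above.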
in the next screen we will discuss the program evaluation and review technique we will understand the concept of risk in this screen risk is an uncertain event or consequence that may occur during a project the main objectives of any project are time cost quality and scope risk affects at least one of the four project objectives it is important to understand that risk can be both positive as well as negative a positive risk enhances the success of the project whereas a negative risk is a threat to a project's success some of the terms used in risk analysis and management are risk probability issue and risk consequences the likelihood that a risk will occur is called risk probability to assess any risk is to assess the probability and impact of the risk issue is the occurrence of a risk risk consequences are the effects on project objectives if there is an occurrence of a risk or issue in the subsequent screen we will understand the process of risk analysis and management we will list and understand some of the elements of risk analysis in this screen qualitative method qualitative methods like interviews checklists and brainstorming are used to identify potential risks quantitative method quantitative methods are data based and a computer is required to calculate and analyze these methods are used to evaluate the cost time and probabilistic combination of individual uncertainties feasibility feasibility is the study of the project risk this is usually carried out in the beginning of the project when the project is most flexible and risks can be reduced at a relatively low cost it helps in deciding different implementation options for the project potential impact once the potential risks are identified the impact of these on the project is determined using this data possible solutions for the risks are identified rpn the rpn of a failure is the product of its probability of occurrence severity and detectability a failure is prioritized based on its rpn value a high rpn indicates high risk rpn assists in prioritizing risks avoiding risk when potential risks are identified their impact in terms of cost time resources and objective perspective is calculated if the impact is huge then avoiding the risk is the best option mitigating risk mitigating is the second option when dealing with risks the loss that arises from mitigating a risk is much less than the loss that would arise if the risk were left unaddressed accepting the risk if a risk cannot be avoided or mitigated then it has to be accepted the risk will be accepted if it doesn't greatly impact the cost time and product objective in the following screen we will discuss benefits of risk analysis benefits of risk analysis are as follows once the risk has been identified it can be either mitigated transferred or accepted when risk is identified in a task slack time is provided as a buffer identifying risks also helps in setting up an actual timeline for a project slack time for an activity in a project could be the result of a risk identified proactively identifying risks helps in setting realistic expectations from the project by communicating the risk probability and consequence to stakeholders risk analysis also helps to identify and plan contingency activities if the risk becomes an issue the project team is then well prepared to work on the issue thereby reducing the impact of the risk in the following screen we will take a look at the risk assessment matrix the potential risks of a project are assessed using the risk assessment matrix it covers potential risk areas like project scope team personnel material facility and equipment and communication each of these areas is assessed in terms of risk of loss of money productivity resources and customer confidence in the subsequent screen we will discuss project closure by definition a project has a beginning and an end but without a formal closure process project teams can fail to recognize the end and then the project can drag on sometimes at great expense every project requires closure for larger complex projects it's a good idea to close each major project phase for example design code and test individually project closure ensures that outcomes match the stated goals of the project customers and stakeholders are happy with the results critical knowledge is captured the team feels a sense of completion and project resources are released for new projects in the next screen we will list the goals of a project closure report the project closure report is created to accomplish the following goals review and validate the success of the project confirm outstanding issues limitations and recommendations outline tasks and activities accomplished to complete the activity highlight the best practices for future projects provide the project report or summary provide a project background overview summarize the planned activities of a project evaluate project performance provide a synopsis of the process generate discussions and recommendations generate project closure recommendations in the
following screen we will list and understand project closure activities during project closure the project manager needs to take care of the following activities finalize the project documents much of a project's documentation is created during the life of the project document collection and update procedures are well established during the project lifeline capture the project knowledge project documents are helpful for future projects in troubleshooting the product or in a future audit set up a project library ideally the project library is set up at the beginning of the project and team members add documents as they produce them document the project learnings project learnings can be captured through team meetings meetings with stakeholders and sponsor and through feedbacks from consultants and vendors provide knowledge transfer the project manager needs to provide a summary of the project results to team members either as a presentation at a meeting or as a formal document consultants should not be relieved from their position until they have transferred all the important product maintenance knowledge to the team get a final sign off schedule a meeting with the project sponsor and key stakeholders to get their final sign off on the project close the project office if the project team used a project management office or a dedicated work area arrangements need to be made to return that space for general use recognize and reward the project manager has the best understanding of which of the team members have worked the best have transformed themselves with new skills and who might be ready for a new level of responsibility the project manager needs to report to the team superiors what each team member has brought to the project celebrate after completion of every project the team needs and deserves a celebration a team dinner a team outing gift certificates or other rewards are minor costs that generate a large return in terms of morale and job satisfaction make a public announcement an announcement to the organization is a good way to highlight the success of the project and its benefits to the company conclude formal project closure ensures that the team has met its objectives satisfied the customer captured important knowledge and been rewarded for their efforts let us proceed to the next topic of this lesson in the following screen in this topic we will discuss management and planning tools let us start with the discussion on affinity diagram in the next screen the affinity diagram method is employed by an individual or team to solve unfamiliar problems it is an effective medium where the consensus of the group is necessary the given affinity diagram is based on an organization where the employees are not satisfied to begin with each member writes down ideas and opinions on sticky notes each note can have a single idea the points which have surfaced during the brainstorming session are that the workers are unkind pay is low and it is difficult to survive on the pay structure working hours are too long etc in the next step all the sticky notes are pasted on a table or wall the sticky papers are arranged according to categories or thought patterns members happen to arrange their ideas based on affinity in case a particular idea is good to go into more than one category it is duplicated and added to several categories after the arrangement is done each category is named with a header card the header card captures the central idea of all the cards in that category and draws a boundary 
around them poor compensation combines ideas like low pay long working hours and complaints about wages poor work environment encompasses issues like poor lighting uncomfortable rooms and stuffy air similarly poor relationships prevail in the workspace as the workers are unkind and there is mutual dislike lack of motivation is due to repetitive work and no work related challenges you can see in the diagram on that slide that once all the ideas are grouped to the respective header cards a diagram is drawn and borders are placed around the group of ideas thus affinity diagram helps in grouping ideas with the common theme in the next screen we will discuss the interrelationship diagram during problem solving the inter-relationship diagram technique helps in identifying the relationship between problems and ideas in complex situations if the problem is really complex it may not be easy to determine the exact relationship between ideas given inter-relationship diagram is the result of a team brainstorming session which identified 10 major issues involved in developing an organization's quality plan initially the problem is defined and all the members put down their ideas on sticky notes each note contains only one idea all the sticky notes are put on a table for a random display in the next step the causes or areas of concern are identified and a cause effect arrangement of cards is constructed by drawing an arrow between the causes and effects of the cause this is done until all the ideas on the sticky notes are accounted for and made a part of the interrelationship diagram take a large sheet of paper and replicate the cause effect arrangement on it as depicted in the image a large number of outgoing arrows indicate the root cause whereas a higher number of incoming arrows indicates an outcome there are as many as six arrows originating from lack of quality strategy this leads us to understand that it is a root cause on the other hand there are three arrows ending with the idea lack of tqm commitment by managers making it an outcome in the next screen we will understand the tree diagram the tree diagram is a systematic approach to outline all the details needed to complete a given objective in other words it is a method used to identify the tasks and methods needed to solve a problem and reach a predefined goal it is mostly used while developing actions to execute a solution while analyzing processes in detail during the evaluation of implementation issues for several potential solutions and also as a communication tool to explain the details of a process to others the given tree diagram shows the plan of a coffee shop trying to set standards for the coffee it delivers first the objective is noted on a note card and placed on the far left side of the board the basic goal of the coffee shop is to provide a delightful cappuccino experience in the next step the coffee shop needs to determine the means required to achieve the goal and furnish three different solutions in other words the answers to the how or why questions of the objectives in this case the cappuccino needs to be at a comfortable temperature and it should have strong and pleasing coffee aroma with the right amount of sweetness in the next step the three issues mentioned in the second stage are addressed at length each issue is answered by maintaining the espresso and steamed milk temperature the cappuccino can be served at a palatable temperature strong flavored cappuccino can be prepared using a good amount of finely ground coffee 
beans and a good quality sweetener used in the right amount makes a great cappuccino thus the tree diagram can be used to achieve a goal or define a process in the following screen we will discuss prioritization matrices let us learn about matrix diagram in this screen matrix diagrams show the relationship between objectives and methods results and causes tasks and people etc their objective is to provide information about the relationship they provide importance of task and method elements of the subject they also help determine the strength of relationships between a grid of rows and columns they help in organizing a large amount of inter-process related activities let us discuss various types of matrices in the next screen let us learn about a process decision program chart in this screen process decision program chart or the pdpc method is used to chart the course of events from the beginning of a process till the goal while emphasizing the ability to identify the failure of important issues on activity plans the pdpc helps create appropriate contingency plans to limit the number of risks involved the pdpc is used before implementing a plan especially when the plan is large and complex if the plan must be completed on schedule or if the price of failure is quite high the given process decision program chart shows the process which can help in securing a contract the process starts when the seller receives an order request from a potential buyer this can lead to fixing an appointment with the buyer confirming the appointment date and meeting the buyer if the date is not fixed then buyers should be contacted till the meeting is confirmed without a meeting there is a risk of losing the order considering an optimistic scenario where a meeting is fixed with the buyer the seller describes the price of the product or service if the price is competitive the order is secured if the price is not competitive the seller may have to repeat the bid until the buyer agrees and the order is secure however the buyer may not agree to a revised bid either in which case the seller might lose the bid in such a scenario the seller can justify the pricing and pursue the buyer to agree to the bid it might work and the seller might secure the order in the next screen we will discuss the activity network diagram an activity network diagram is used to show the time required for solving a problem and to identify items that can be done in parallel it is used in scheduling and monitoring tasks within a complex project or process with interrelated tasks and resources moreover it is also used when you know the steps of the project or process their sequence and the time taken by each of the steps involved the original japanese name for this tool is arrow diagram the given activity network diagram shows a house construction plan and identifies the factors involved separately like the amount of time for each operation in one situation the relationship of work without time for each operation and an operation by itself the number of days is denoted by d so the time taken for an activity like foundation to scaffolding takes around five days plus four days which is nine days in total the line joining electrical work and interior walls is dotted this shows relation between them but without any timeline basically it means that electrical work has to be done before interior walls but the time is either not important or not available let us proceed to the next topic of this lesson in the following screen in this topic we will 
introduce business results for projects let us start with the discussion on defects per unit we will learn about throughput yield in this screen throughput yield or tpy is the number of acceptable pieces at the end of a process divided by the number of starting pieces excluding scrap and rework throughput yield is used to measure a single process if the dpu is known tpy can be easily calculated as e to the power of the negative of dpu here e is the mathematical constant and has a value of 2.7183 the expression can also be stated as dpu equals the negative of the natural logarithm of tpy in the next screen we will discuss rolled throughput yield rolled throughput yield or rty is the probability of the entire process producing zero defects rty is the true measure of process efficiency and is considered across multiple processes it is important as a metric when a process has excessive rework tdpu is total defects per unit and is defined for a set of processes when the total defects per unit is known rolled throughput yield is calculated using the expression e to the power of the negative of tdpu the expression can also be written as tdpu is equal to the negative of the natural logarithm of rty when the defectives are known rolled throughput yield can be calculated as the product of each process's first pass yield or fpy first pass yield is the number of products which pass without any rework over the total number of units first pass yield is calculated as total number of quality products over total number of units total number of quality products is total number of units minus total number of defective units in the following screen we will understand fpy and rty with an example
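A minimal sketch of these yield calculations is shown below. The formulas follow the definitions just given, while the unit and defect counts are illustrative assumptions rather than figures from the course.

```python
# minimal sketch of the yield metrics described above; the unit and defect
# counts are illustrative, not taken from the course material
import math

# a single process step: 1000 units inspected, 120 defects found across them
units, defects = 1000, 120
dpu = defects / units                 # defects per unit
tpy = math.exp(-dpu)                  # throughput yield, tpy = e^(-dpu)
print(f"dpu {dpu:.3f}  tpy {tpy:.3f}")          # dpu 0.120  tpy 0.887

# three process steps in sequence: rolled throughput yield from total defects per unit
dpus = [0.12, 0.05, 0.08]
tdpu = sum(dpus)
rty = math.exp(-tdpu)                 # rty = e^(-tdpu)
print(f"tdpu {tdpu:.2f}  rty {rty:.3f}")        # tdpu 0.25  rty 0.779

# when only defectives are known, rty is the product of each step's first pass yield
defectives = [(1000, 60), (940, 30), (910, 40)]   # (units entering, defective units)
fpys = [(n - d) / n for n, d in defectives]
rty_from_fpy = math.prod(fpys)
print(f"fpy per step {[round(f, 3) for f in fpys]}  rty {rty_from_fpy:.3f}")
```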
the cpk value will always be less than cp as long as the process mean is not at the center of the process tolerance range non-centering can happen when the process has not understood the customer expectations clearly or when the process is considered complete as soon as the output reaches a specific limit for example a shirt size of 40 has a target chest diameter of 40 inches but the process consistently delivers shirts with a mean of 41 inches as the chest diameter or a machine stops removing material as soon as the measured dimension is within the specified limit let us proceed to the next topic of this lesson in the following screen in this topic we will discuss team dynamics and performance let us start with the discussion on team stages there are five typical stages in the team building process each team passes through these stages as they start and proceed through the project the five stages in the team building process are as follows forming storming norming performing and adjourning in the next screen we will discuss the first stage forming the first stage in the team building process is called the forming stage in this stage the team comes together and begins to formulate roles and responsibilities the team leader is identified and he or she starts directing the team and assigning responsibilities to other team members most team members at this stage are generally enthusiastic and motivated by a desire to be accepted within the team the leader employs a directive style of management which includes delegating responsibility within the team providing a structure to the team and determining the processes needed for the smooth functioning of the team toward the end of this phase the team should achieve a commitment to the project and an acceptance of a common purpose in the next screen we will discuss the second stage storming the second phase in the team building process is called the storming stage as suggested by the name itself in this stage conflicts start to arise within the team team members often struggle over responsibilities and control within the project it is the responsibility of the team leader to coach and conciliate the team the leader employs a coaching style of management which is reflected through facilitating change managing conflict and mediating understanding between different parties towards the end of this phase team members need to learn to voice disagreement openly and constructively while staying focused on common objectives and areas of agreement in the next screen we will discuss the third stage norming the third stage in the team building process is called the norming stage in this stage people get along and the team develops a unified commitment toward the project goal the team leader promotes the team and participates in the team activities team members look to the leader to clarify their understanding as some leadership roles begin to shift within the lower rungs of the group the leader employs a participatory style of management through facilitating change working to build consensus and overseeing quality control toward the end of this phase team members need to accept individual responsibilities and work out agreements about team procedures in the next screen we will discuss the fourth stage performing the next stage in the team building process is called the performing stage this is the most productive stage for the project team in this stage team members manage complex tasks and work toward the common
goals of the project the leader employs a supervisory style of management by overseeing progress rewarding achievement and supervising process the team leader leads the project on more or less an automated mode when the project has completed successfully or when the end is in sight the team moves into the final stage in the next screen we will discuss the fifth stage adjourning the last stage of team building is called the adjourning stage in this stage the project is winding down and the goals are within reach the team members are dealing with their impending separation from the team the team leader provides feedback to the team the leader employs a supportive style of management by giving feedback celebrating accomplishments and providing closure the team leader needs to adopt a different style of leadership at every stage it is therefore important for a leader to understand these stages and identify the current stage that a team is in the success of the team depends on how well the leader can guide them through these phases in the next screen we will learn about negative dynamics team members can exhibit negative behavior in more than one way during the project life cycle this behavior has a negative effect on the dynamics of the team the first kind of negative participants fall in the category of overbearing participants these participants use their influence or expertise to take on a position of authority discounting contributions from other team members to cope with such participants team leaders must establish ground rules for participation and reinforce that the group has the right to explore any area pertinent to team goals and objectives another kind of negative participant is often referred to as the dominant participant these participants take up an excessive amount of group time by talking too much focusing on trivial concerns and otherwise preventing participation by others team leaders need to be able to control dominant participants without inhibiting their energy or enthusiasm some other participants are reluctant participants who feel intimidated and are not happy with the team process owing to their reluctance they miss opportunities to bring up data that is valuable to the project this can often lead to hostility within the team one way to deal with reluctant participants is to respond positively and with encouragement to any contribution from the team member teamwork is more than a natural consequence of working together team management is more than building a relationship with individual team members all teams face group challenges that need group based diagnosis and problem solving to ensure that negative participants are able to contribute and perform as part of the team in the next screen we will learn about group challenges six sigma team and their responsibilities are described here various roles assist the smooth execution of a six sigma project these roles are required to support the project by providing the information and resources that are needed to execute the project the first important member of the six sigma team is the executive sponsor sponsors are the source or conduit for project resources and they are usually the recipients of the benefits the project will produce the sponsor is responsible for setting the direction and priorities for the organization the sponsor may be a functional manager or an external customer the next important role is that of the process owners they work with the black belts to improve their respective processes they provide 
functional expertise about the process to the project usually this role is played by the functional managers in charge of specific processes the next role in the project is that of the champions they are typically upper level managers who control and allocate resources to promote process improvements they ensure the organization is providing necessary resources to the project and the project is fitting into the strategic plans of the organization the first role related to the execution of the project is the role of the master black belt this role acts as a consultant to team leaders and offers expertise in the use of six sigma tools and methodologies master black belts are experts in six sigma statistical tools and are qualified to teach high level six sigma methodologies and applications each master black belt will have multiple black belts under him black belts are the leaders of individual six sigma projects they lead project teams and conduct the detailed analysis required in six sigma methodologies black belts act as instructors and mentors for green belts and educate them in six sigma tools and methods they also protect the interests of the project by coordinating with functional managers green belts are trained in six sigma but typically lead project teams working in their own areas of expertise they are focused on the basic six sigma tools for acceleration of projects green belts work on projects on a part-time basis dividing time between project and functional responsibilities an executive is the person who manages and leads the team to ensure smooth working of tasks and has the power to execute decisions a coach takes on a number of roles he or she is the person who trains mentors teaches and guides the team when required coach also motivates and builds confidence of the members a facilitator is a guide for the team or group also known as a discussion leader facilitators help the group or team to understand their common objective and plan their activities a sponsor is a person who supports the event or the project by providing all the required resources a team member is an individual who belongs to a particular project team a team member contributes to the performance of the team and actively participates for fulfillment of the project objectives the progress achievements and the details of the project have to be effectively communicated to the team management customers and stakeholders we will learn about modes of communication in the next screen let us understand communication within the team in this screen the purpose of communication within the team and the modes of communication used are as follows meetings and emails are suitable to communicate the roles and responsibilities of the team members meetings memos and emails are used by the team to understand the project status workshops and meetings are conducted to identify the outstanding tasks risks and their corrective actions team meetings assist decision making meetings and emails ensure coordination and efficient work the next screen will focus on communication with stakeholders the purpose of communication with stakeholders and the modes of communication used are as follows meeting emails and events are suitable to convey project objectives and goals to stakeholders meetings emails and newsletters assist stakeholders in understanding project status workshops meetings and events help stakeholders to identify the adverse effects of a situation meetings with stakeholders assist decision making process in the next screen we 
will discuss the communication technique communication techniques can be grouped in various ways the first grouping of communication techniques is based on the direction in which communication flows vertical communication consists of two subtypes namely downward flow of communication and upward flow of communication in the downward flow of communication the managers must pass information and give orders and directives to the lower levels in the organization on the contrary upward communication consists of information relayed from the bottom or grassroot levels to the higher levels of the company horizontal communication refers to the sharing of information across the same levels of the organization this can be in the form of formal and informal communication formal communications are official company sanctioned methods of communicating to the employees the grapevine rumor mill etc are some of the means of informal communication in the organization the second grouping of communication techniques is based on the usage of words verbal communication includes use of words for communication via telephone face to face etc non-verbal communication conveys messages without the use of words through body language facial expressions etc the last grouping of communication techniques is based on participation of the people involved in communication one way communication happens when information is relayed from the sender to the receiver without the expectation of a response like memos and announcements two-way communication is a method in which both parties are involved in the exchange of information team tools are a part of the team dynamics and performance the various team tools that are used are brainstorming nominal group technique and multi-voting if getting your learning started is half the battle what if you could do that for free visit skillup by simply learn click on the link in the description to know more this lesson will cover the details of the measure phase the key objective of the measure phase is to gather as much information as possible on the current processes this involves three key tasks that is creating a detailed process map gathering baseline data and summarizing and analyzing the data let us understand process modeling in the following screen process modeling refers to the visualization of a proposed system layout or other change in the process process modeling and simulation can determine the effectiveness or ineffectiveness of a new design or process they can be done using process mapping and flow charts we will learn about these in the forthcoming screens let us understand process mapping in this screen process mapping refers to a workflow diagram which gives a clear understanding of the process or a series of parallel processes it is also known as process charting or flow charting process mapping can be done either in the measure phase or the analyze phase the features of process mapping are as follows process mapping is usually the first step in process improvement process mapping gives a wider perspective of the problems and opportunities for process improvement it is a systematic way of recording all the activities performed process mapping can be done by using any of the methods like flow charts written procedures and detailed work instructions let us learn about flowchart in the following screen a flowchart is a pictorial representation of all the steps of a process in consecutive order it is used to plan a project document processes and communicate the process 
methodology with others there are many symbols used in a flowchart and the common symbols are shown in the given table it is recommended you take a look at the symbols and their description for better understanding click the button to view an example of a flowchart the given flowchart shows the processes involved in software development the flowchart starts with the start box which connects to the design box in a software project a software design is followed by coding which is then followed by testing in the next step there is a check for errors in case of errors it is evaluated for the error type if it is a design error it goes back to the beginning of the design stage if it is not a design error it is then routed to the beginning of the coding stage on the contrary if there are no errors the flowchart ends let us learn about written procedures in this screen a written procedure is a step-by-step guide to direct the reader through a task it is used when the process of a routine task is lengthy and complex and it is essential for everyone to strictly follow the rules written procedures can also be used when you want to know what is going on during product or process development phases there are a number of benefits of documenting procedures writing procedures help you avoid mistakes and ensure consistency they streamline the process and help your employees take relevant decisions and save a lot of time written procedures help in improving quality they are simple to understand as they tend to describe the processes at a general level in the next screen we will discuss how work instructions are helpful in understanding the process in detail work instructions define how one or more activities involved in a procedure should be written in a detailed manner with the aid of technology or other resources like flowcharts they provide step-by-step details for a sequence of activities organized in a logical format so that an employee can follow it easily and independently for example in the internal audit procedure how to fill out the audit results report comes under work instructions selection of the three process mapping tools is based on the amount of detail involved in the process for a less detailed process you can select flowchart and for a detailed process with lots of instructions you can select work instructions click the button to view an example of work instructions this example shows the work instructions for shipping electronic instruments the company name is nutri worldwide inc the instructions are written by brianna scott and approved by andrew murphy it is a one-page instruction the work instructions are documented for the shipping of electronic instruments by the shipping department the scope of the project states that it is applicable to shipping operations the procedure is divided into three broad steps as a first step the order for the shipment must be prepared in this step the shipping person receives an order number from the sales department through an automatic order system the quantity of the instrument and its card number are looked up from the system file and the packaging is done as per the instructions on the card the next step is packaging special packing instructions must be checked the instruments are then marked as per the instructions on the card and packed in a special or standard container as per the requirement the order number is written in the shipping system and the packing list and shipping documentation are obtained finally the quantity of instruments and the 
documents is checked let us understand process input and output variables in this screen any improvement of a process has a few prerequisites to improve a process the key process output variables kpov and key process input variables kpiv should first be measured metrics for key process variables include percent defective operation cost elapsed time backlog quantity and documentation errors critical variables are best identified by the process owners process owners know and understand each step of a process and are in a better position to identify the critical variables initially once identified the relationship between the variables is depicted using tools such as sipoc and the cause and effect matrix the process input variable results are compared to determine which input variables have the greatest effect on the output variables let us proceed to the next topic of this lesson in the following screen in this topic we will discuss probability and statistics in detail let us learn about probability in the following screen probability refers to the chance of something occurring or happening an outcome is the result of a single trial of an experiment suppose there are n possible outcomes that are equally likely the probability that a specific type of event or outcome say f can occur is the number of specific outcomes divided by the total possible outcomes click the button to view an example of probability in the event of tossing a coin what is the probability of the occurrence of heads a single trial of tossing a coin has two outcomes heads and tails hence the probability of heads occurring is one divided by two the total number of outcomes let us look at some basic properties of probability in this screen there are three basic properties of probability click each property to know more property 1 states that the probability of an event is always between 0 and 1 both inclusive according to property 2 the probability of an event that cannot occur is zero in other words an event that cannot occur is called an impossible event property 3 states that the probability of an event that must occur is one in other words an event that must occur is called a certain event if e is an event then the probability of its occurrence is given by p of e it is also read as the probability of event e in this screen let us look at some common terms used in probability along with an example the commonly used terms in probability are sample space venn diagram and event sample space is the collection of all possible outcomes for a given experiment in the coin example discussed earlier the sample space consists of one instance each of heads and tails if two coins are tossed the sample space would be four in total a venn diagram shows all hypothetically possible logical relations between a finite collection of sets an event is a collection of outcomes for an experiment which is any subset of the sample space click the button to view an example of probability what is the probability of getting a three followed by a two when a dice is thrown twice when the dice is thrown twice the first throw can have any number from one to six similarly the second throw can also have any number from one to six so the total sample space is six times six that is thirty six the event in this case is three followed by two this can happen in only one way so the probability in the question is one divided by thirty-six let us discuss the basic concepts of probability in this screen some basic concepts of probability are independent event dependent event
mutually exclusive and mutually inclusive events click each concept to know more when the probability of occurrence of an event does not affect the probability of occurrence of another event the two events are said to be independent suppose you rolled a dice and flipped a coin at the same time the probability of getting any number on the dice in no way influences the probability of getting heads or tails on the coin when the probability of one event occurring influences the likelihood of the other event the events are said to be dependent events are said to be mutually exclusive if the occurrence of any one of them prevents the occurrence of all the others in other words only one event can occur at a time consider an example of flipping a coin when you flip a coin you will either get heads or tails but not both you can add the probabilities of these two events to prove they are mutually exclusive any two events wherein one event cannot occur without the other are said to be mutually inclusive events in this screen let us learn about the multiplication rules also known as and rules the multiplication rules or and rules depend on the event dependency for independent events that is if two events are independent of each other the special multiplication rule applies for mutually independent events the special multiplication rule is as follows if the events a b c and so on are independent of each other then the probability of a and b and c and so on is equal to the product of their individual probabilities click the button to view an example of this rule suppose there are three events which are independent of each other such as the event of flipping a coin and getting heads drawing a card and getting an ace and throwing a dice and getting a one what is the probability of occurrence of all these events the answer is the probability of a and b and c is equal to the product of their individual probabilities which is half multiplied by one-thirteenth multiplied by one-sixth the result is 0.0064 which is 0.64 percent hence there is a 0.64 percent probability of all of the events occurring we will continue the discussion on multiplication rules in this screen the multiplication rule for non-independent or conditional events which is also the general multiplication rule is as follows if a and b are two events then the probability of a and b is equal to the product of the probability of a and the probability of b given a alternatively we can say that for any two events their joint probability is equal to the probability that one of these events occurs multiplied with the conditional probability of the other event given the first event click the button to view an example of this rule a bag contains six golden coins and four silver coins two coins are drawn without replacement from the bag what is the probability that both of the coins are silver let a be the event that the first coin is silver and b be the event that the second coin is silver there are ten coins in the bag four of which are silver therefore p of a equals four divided by ten after the first selection there are nine coins in the bag three of which are silver therefore p of b given a equals three divided by nine therefore based on the rule of multiplication the probability of a intersection b equals four divided by ten multiplied by three divided by nine the answer is twelve divided by ninety which is zero point one three three three hence there is roughly a thirteen percent probability that both the coins are silver
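both multiplication rules are easy to verify in code here is a minimal python sketch that reruns the two examples above using exact fractions

from fractions import Fraction

# special multiplication rule for independent events
# coin shows heads, card drawn is an ace, dice shows a one
p_heads = Fraction(1, 2)
p_ace = Fraction(1, 13)
p_one = Fraction(1, 6)
p_all = p_heads * p_ace * p_one            # product of the individual probabilities
print(float(p_all))                        # about 0.0064, roughly 0.64 percent

# general multiplication rule for dependent events
# two silver coins drawn without replacement from six golden and four silver coins
p_a = Fraction(4, 10)                      # first coin drawn is silver
p_b_given_a = Fraction(3, 9)               # second coin is silver given the first was silver
p_both_silver = p_a * p_b_given_a
print(float(p_both_silver))                # 12/90, about 0.1333 or thirteen percent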
in this screen we will look at the definitions and formula of permutation and combination permutation is the total number of ways in which a set group or number of things can be arranged the order matters to a great extent in permutation the manner in which the objects or numbers are arranged will be considered in permutation the formula for permutation is npr equals p of n and r equals n factorial divided by n minus r factorial where n is the number of objects and r is the number of objects taken at a time the unordered arrangement of a set group or number of things is known as combination the order does not matter in combination the formula for combination is ncr equals c of n and r equals n factorial divided by r factorial multiplied by n minus r factorial where n is the number of objects and r is the number of objects taken at a time let us look at an example for calculating permutation and combination in the following screen from a group of ten employees a company has to select four for a particular project in how many ways can the selection happen given the following conditions when the arrangement of employees needs to be different when the arrangement of employees need not be different click the button to know the answer in the given example the values of n and r are ten and four respectively let us consider the first condition from a group of ten employees four employees need to be selected the arrangement needs to be different using the permutation formula n p r equals p of n and r equals n factorial divided by n minus r factorial 10 p 4 equals p of ten and four equals ten factorial divided by ten minus four factorial equals five thousand forty therefore the four employees can be selected in five thousand forty ways let us now consider the second condition from a group of ten employees four employees need to be selected the arrangement of employees need not be different using the combination formula n c r equals c of n and r equals n factorial divided by r factorial multiplied by n minus r factorial 10 c 4 equals c of ten and four equals ten factorial divided by four factorial multiplied by ten minus four factorial equals two hundred and ten therefore the four employees can be selected from a group of 10 employees in 210 different ways
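the employee selection example above can be checked in a couple of lines of python math.perm and math.comb are available from python 3.8 onward the factorial forms are shown alongside only to mirror the formulas

import math

n, r = 10, 4                                                   # ten employees, four to be selected

# permutation counts ordered arrangements, npr = n! / (n - r)!
npr = math.factorial(n) // math.factorial(n - r)

# combination counts unordered selections, ncr = n! / (r! * (n - r)!)
ncr = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(npr, math.perm(n, r))                                    # 5040 either way
print(ncr, math.comb(n, r))                                    # 210 either way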
let us understand the two types of statistics in this screen statistics refers to the science of collection analysis interpretation and presentation of data in six sigma statistical methods and principles are used to measure and analyze the process performance and improvements there are two major types of statistics descriptive statistics and inferential statistics descriptive statistics is also known as enumerative statistics and inferential statistics is also known as analytical statistics descriptive statistics include organizing summarizing and presenting the data in a meaningful way whereas inferential statistics includes making inferences and drawing conclusions from the data descriptive statistics describes what's going on in the data the main objective of inferential statistics is to make inferences from the data to more general conditions histograms pie charts box plots frequency distributions and measures of central tendency mean median and mode are all examples of descriptive statistics on the other hand examples of inferential statistics are hypothesis testing scatter diagrams etc the main objective of statistical inference is to draw conclusions on population characteristics based on the information available in the sample collecting data from a population is not always easy especially if the size of the population is big the easier way is to collect a sample from the population and from the sample statistic collected make an assessment about the population parameter click the button to see an example of statistical inference the management team of a cricket council wants to know if the team's performance has improved after recruiting a new coach the management conducts a test to prove this statistically let us consider y a and y b where y a stands for the efficiency of coach a and y b stands for the efficiency of coach b to conduct the test the basic assumption is that coach a and coach b are equally effective this basic assumption is known as the null hypothesis here let us assume the status quo is the null hypothesis hence the null hypothesis h o can be given by y a equals y b the management team also challenges their basic assumption by assuming the coaches are not equally effective this is their alternate hypothesis the alternate hypothesis states that the efficiencies of the two coaches differ if the null hypothesis is proven wrong the alternate hypothesis must be right hence the alternate hypothesis h1 can be given by y a is not equal to y b these hypothesis statements are used in a hypothesis test which will be discussed in the later part of the course in this screen we will learn about the types of errors when collecting data from a population as a sample and forming a conclusion on the population based on the sample you run the risk of committing errors there are two possible errors that can happen type 1 error and type 2 error the type 1 error occurs when the null hypothesis is rejected when it is in fact true type 1 error is also known as producer's risk the chance of committing a type 1 error is known as alpha alpha or significance level is the chance of committing a type one error and is typically chosen to be five percent this means the maximum amount of risk you have for committing a type one error is five percent let us consider the previous example arriving at a conclusion that coach b is better than coach a when in fact they are at the same level is a type one error the risk you have of committing this error is 5 percent which means there is a 5 percent chance your experiment can give wrong results the type 2 error occurs when the null hypothesis is accepted when it is in fact false also when you reject the alternate hypothesis when it is actually true you commit a type 2 error type 2 error is also referred to as consumer's risk in comparing the two coaches the coaches were actually different in their efficiencies but the conclusion was that they are the same the chance of committing a type 2 error is known as beta the maximum chance of committing a type 2 error is 20 percent in the next screen we will learn about central limit theorem the central limit theorem clt states that for a sample size greater than 30 the sample mean is very close to the population mean in simple words the sample mean approaches the normal distribution for example if you have sample 1 and its mean is mean 1 sample 2 and its mean is mean 2 and so on take the means of mean one mean two etc and you will find that it is the same as the population mean the population mean is the average of the sample means in such cases the standard error of the mean also known as sem that represents the variability between the sample means is very small the sem is often used to represent the standard deviation of the sample means the formula for sem is the population standard deviation divided by the square root of the sample size selecting a sample size also depends on the
concept called power also known as power of the test we will cover this concept in detail in the later part of the course let us look at the graphical representation of the central limit theorem in the following screen the plot of the three numbers 2 three and four looks as shown in the graph it is interesting to note that the total number of times each digit is chosen is six when the plot of the sample mean of nine samples of size two each is drawn it looks like the red line which is plotted in the figure the x-axis shows numbers of the mean which are two two point five three and four on the y-axis the frequency is plotted the point at which arrows from number two and three converge is the mean of two and three similarly the point at which arrows from two and four converge is the mean of the numbers two and four let us discuss the concluding points of the central limit theorem in the next screen the central limit theorem concludes that the sampling distributions are helpful in dealing with non-normal data if you take the sample data points from a population and plot the distribution of the means of the sample you get the sampling distribution of the means the mean of the sampling distribution also known as the mean of means will be equal to the population mean also the sampling distribution approaches normality as the sample size increases note that clt enables you to draw inferences from the sample statistics about the population parameters this is irrespective of the distribution of the population clt also becomes the basis for calculating confidence interval for hypothesis tests as it allows the use of a standard normal table let us proceed to the next topic of this lesson in the following screen in this topic we will cover the concept of statistical distributions let us start with discrete probability distribution in the following screen discrete probability distribution is characterized by the probability mass function it is important to be familiar with discrete distributions while dealing with discrete data some of the examples of discrete probability distribution are binomial distribution poisson distribution negative binomial distribution geometric distribution and hyper geometric distribution we will focus only on the two most useful discrete distributions binomial distribution and poisson distribution like most probability distributions these distributions also help in predicting the sample behavior that has been observed in a population let us learn about binomial distribution in the following screen binomial distribution is a probability distribution for discrete data named after the swiss mathematician jacob bernoulli it is an application of popular knowledge to predict the sample behavior binomial distribution also describes the discrete data as a result of a particular process like the tossing of a coin for a fixed number of times and the success or failure in an interview a process is known as bernoulli's process when the process output has only two possible values like defective or okay pass or fail and yes or no binomial distribution is used to deal with defective items defect is any non-compliance with a specification defective is a product or service with at least one defect binomial distribution is most suitable when the sample size is less than 30 and less than 10 percent of the population it is the percentage of non-defective items provided the probability of creating a defective item remains the same over a period let us look at the equation the probability of 
exactly r successes out of a sample size of n is denoted by p of r which is equal to ncr whole multiplied by p to the power of r and one minus p whole to the power of n minus r in the equation p is the probability of success r is the number of successes desired and n is the sample size to continue discussing the binomial distribution let us look at some of its key calculations in the following screen the mean of a binomial distribution is denoted by mu and is given by n multiplied by p the standard deviation of a binomial distribution is denoted by sigma which is equal to the square root of n multiplied by p multiplied by 1 minus p the method of calculating factorials is as follows the factorial of 5 is the product of five four three two and one which is equal to one hundred twenty similarly the factorial of four is the product of four three two and one which is equal to twenty four let us look at an example of calculating binomial distribution in the next screen suppose you wish to know the probability of getting heads five times in eight coin tosses you can use the binomial equation for the same click the answer button to see how this is done the tossing of a coin has only two outcomes heads and tails it means that the probability of each outcome is 0.5 and it remains fixed over a period of time additionally the outcomes are statistically independent in this case the probability of success denoted by p is 0.5 the number of successes desired is denoted by r which is five and the sample size is denoted by n which is eight therefore the probability of five heads is equal to factorial of eight divided by the product of factorial of five and factorial of eight minus five whole multiplied by zero point five to the power of five multiplied by one minus zero point five whole to the power of eight minus five this calculation gives a result of zero point two one eight seven which is equal to twenty one point eight seven percent let us learn about poisson distribution in this screen poisson distribution is named after simeon denis poisson and is also used for discrete data poisson distribution is an application of the population knowledge to predict the sample behavior it is generally used for describing the probability distribution of an event with respect to time or space some of the characteristics of poisson distribution are as follows poisson distribution describes the discrete data resulting from a process like the number of calls received by a call center agent or the number of accidents at a signal unlike binomial distribution which deals with binary discrete data poisson distribution deals with counts which can take any non-negative integer value poisson distribution is suitable for analyzing situations wherein the number of trials similar to the sample size in binomial distribution is large and tends towards infinity additionally it is used in situations where the probability of success in each trial is very small almost tending towards zero this is the reason why poisson distribution is applicable for predicting the occurrence of rare events like plane crashes car accidents etc and is therefore widely used in the insurance sector poisson distribution can be used for predicting the number of defects as well given a low defect occurrence rate let us look at the formula for calculating poisson distribution in the next screen the poisson distribution for a probability of exactly x occurrences is given by p of x equals lambda to the power of x multiplied with e to the power of minus lambda whole divided by factorial of x in this equation lambda is the mean number of occurrences during the interval x is the number of occurrences desired and e is the base of the natural logarithm which is equal to 2.71828 the mean of the poisson distribution is given by lambda and the standard deviation of a poisson distribution is given by sigma which is the square root of lambda
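both probability mass functions translate directly into a few lines of python this is an illustrative sketch that reruns the coin toss example above and previews the accident rate used in the example that follows

import math

def binomial_pmf(r, n, p):
    # probability of exactly r successes in n trials, ncr * p^r * (1 - p)^(n - r)
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)

def poisson_pmf(x, lam):
    # probability of exactly x occurrences when the mean rate is lam, lam^x * e^(-lam) / x!
    return lam ** x * math.exp(-lam) / math.factorial(x)

# five heads in eight coin tosses with p equal to 0.5
print(binomial_pmf(5, 8, 0.5))                     # 0.21875, the 21.87 percent quoted above

# mean and standard deviation of that binomial distribution
n, p = 8, 0.5
print(n * p, math.sqrt(n * p * (1 - p)))           # mean 4.0, standard deviation about 1.41

# the road junction example that follows uses a mean of five accidents per week
print(round(poisson_pmf(0, 5), 4))                 # about 0.0067, the chance of zero accidents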
let us look at an example to calculate poisson distribution in the next screen the past records of a road junction which is accident prone show a mean number of five accidents per week at this junction assume that the number of accidents follows a poisson distribution and calculate the probability of any number of accidents happening in a week click the button to know the answer given the situation you know that the value of lambda or the mean is five so p of zero that is the probability of zero accidents per week is calculated as five to the power of zero multiplied by e to the power of minus five whole divided by factorial of zero the answer is approximately 0.007 applying the same formula the probability of one accident per week is approximately 0.034 the probability of more than two accidents per week is one minus the sum of the probabilities of zero one and two accidents which is approximately 0.875 in other words the probability is about 87.5 percent let us learn about normal distribution in this screen the normal or gaussian distribution is a continuous probability distribution the normal distribution is represented as n and depends on two factors mu which stands for the mean and sigma which gives the standard deviation of the data points normal distribution normally has a higher frequency of values around the mean and lesser occurrences away from it it is often used as a first approximation to describe real valued random variables that tend to cluster around a single mean value the distribution is bell shaped and symmetrical the total area under the normal curve is one various types of data such as body weight height the output of a manufacturing device etc follow the normal distribution additionally normal distribution is continuous and symmetrical with the tails asymptotic to the x-axis which means they touch the x-axis at infinity let us continue to discuss normal distribution in the following screen in a normal distribution to standardize comparisons of dispersion across different measurement units like inches meters grams etc a standard z variable is used the uses of the z value are as follows the value of z or the number of standard deviations is unique for each probability within the normal distribution so it helps in finding probabilities of data points anywhere within the distribution it is dimensionless as well that is it has no units such as millimeters liters coulombs etc there are different formulas to arrive at the normal distribution we will focus on one commonly used formula for calculating the normal distribution which is z equals y minus mu whole divided by sigma here z is the number of standard deviations between y and the mean denoted by mu y is the value of the data point in concern mu is the mean of the population or data points and sigma is the standard deviation of the population or data points let us look at an example for calculating normal distribution in the following screen suppose the time taken to resolve customer problems follows a normal distribution with a mean of 250 hours and a standard deviation of 23 hours find the probability of a problem resolution taking more than 300 hours click the button to know the answer in this case y is equal to 300 mu
equals 250 and sigma equals 23. applying the normal distribution formula z is equal to 300 minus 250 whole divided by 23 the result is 2.17 when you look at the normal distribution table the z value of 2.17 covers an area of 0.98499 under itself this means the probability of a problem taking 0 to 300 hours to be resolved is 98.5 percent and therefore the chance of a problem resolution taking more than 300 hours is one point five percent let us understand the usage of the z table in this screen the graphical representation of z table usage is given here the total probability or area under the curve is one for an actual value one can identify the z score by using the z table as shown this probability is the area under the curve between zero and the point plus a using the actual data when you calculate the mean and standard deviation and the values are 25 and 5 respectively it is a normal distribution if the same data is standardized to a mean value of zero and a standard deviation value of one it is the standard normal distribution in the next screen we will take a look at the z table the z table gives the probability that z is between zero and a positive number there are different forms of normal distribution z tables followed globally the most common form of z table with positive z scores is shown here the value of a called the percentage point is given along the borders of the table in bold and is to two decimal places the values in the main table are the probabilities that z is between 0 and plus a note that the values running down the table are to one decimal place the numbers along the column change only for the second decimal place let us look at some examples and how to use a z table in the following screen let us find the value of p of z less than zero the table is not needed to find the answer once we know that the variable z takes a value less than or equal to zero first the area under the curve is one and second the curve is symmetrical about z equals zero hence there is a 0.5 or 50 percent chance of z falling above zero and a 0.5 or 50 percent chance of z falling below zero let us find the value of p of z greater than one point one two in this case we need the chance of z being greater than a number in this case 1.12 you can find this by using the following fact the opposite or complement of an event a is the event of not a that is the opposite or complement of event a occurring is the event a not occurring its probability is given by p of not a equals one minus p of a in other words p of z greater than 1.12 is 1 minus the opposite which is p of z less than 1.12 using the table p of z less than 1.12 equals 0.5 plus p of 0 less than z less than 1.12 equals 0.5 plus 0.3686 which is 0.8686 hence the answer is p of z greater than 1.12 equals 1 minus 0.8686 which is 0.1314 note the answer is less than 0.5 let us find the value of p of z lying between 0 and 1.12 in this case where z falls within an interval the probability can be read straight off the table p of z lies between 0 and 1.12 equals 0.3686
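a printed z table is not strictly necessary the standard normal cumulative probability can be computed from the error function here is a minimal python sketch that reproduces the numbers above

import math

def standard_normal_cdf(z):
    # cumulative probability of the standard normal variable being less than z
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# the z table examples above
print(standard_normal_cdf(0))                      # 0.5, half the area lies below zero
print(round(standard_normal_cdf(1.12) - 0.5, 4))   # 0.3686, p of z between 0 and 1.12
print(round(1 - standard_normal_cdf(1.12), 4))     # 0.1314, p of z greater than 1.12

# the problem resolution example with a mean of 250 hours and a standard deviation of 23 hours
z = (300 - 250) / 23                               # about 2.17
print(round(1 - standard_normal_cdf(z), 3))        # about 0.015, a 1.5 percent chance of exceeding 300 hours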
we will learn about chi-square distribution in this screen the chi-square distribution is also written as the chi squared distribution chi squared with k degrees of freedom is the distribution of a sum of the squares of k independent standard normal random variables the chi-square distribution is one of the most widely used probability distributions in inferential statistics and is commonly used in hypothesis testing when used in hypothesis tests it only needs one sample for the test to be conducted conventionally the degree of freedom is k minus one where k is the sample size for example if w x y and z are four random variables with standard normal distributions then the random variable f which is the sum of w square x square y square and z square has a chi-square distribution the degrees of freedom of the distribution df equals the number of normally distributed variables used in this case df is equal to four let us look at the formula to calculate chi-square distribution in the following screen chi-square calculated also called the chi-square index equals the sum over all categories of f of o minus f of e whole squared divided by f of e here f of o stands for an observed frequency and f of e stands for an expected frequency determined through a contingency table let us understand t distribution in the next screen the t distribution method is the most appropriate method to be used in the following situations when you have a sample size of less than 30 when the population standard deviation is not known when the population is approximately normal unlike the normal distribution a t distribution is lower at the mean and higher at the tails as seen in the image t distribution is used for hypothesis testing also as seen in the image the t distribution is symmetrical in shape but flatter than the normal distribution as the sample size increases the t distribution approaches normality for every possible sample size or degrees of freedom there is a different t distribution let us learn about f distribution in the following screen the f distribution is a ratio of two chi-square distributions a specific f distribution is denoted by the ratio of the degrees of freedom for the numerator chi-square and the degrees of freedom for the denominator chi-square the f-test is performed to calculate and observe if the standard deviations or variances of two processes are significantly different the project teams are usually concerned about reducing the process variance as per the formula f calculated equals s1 square divided by s2 square where s1 and s2 are the standard deviations of the two samples if the f calculated is one it implies there is no difference in the variance if s1 is greater than s2 then the numerator must be greater than the denominator the degrees of freedom are df1 equals n1 minus 1 and df2 equals n2 minus 1.
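neither statistic needs more than a line or two of python the sketch below uses assumed observed and expected frequencies and assumed sample standard deviations since the course does not work a numerical example at this point

# illustrative sketch of the chi-square and f statistics described above
# all of the frequencies and standard deviations here are assumed numbers

observed = [18, 22, 25, 35]                  # f of o, assumed observed frequencies per category
expected = [25, 25, 25, 25]                  # f of e, assumed expected frequencies per category

chi_square = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
print(round(chi_square, 2))                  # the chi-square index for these assumed counts

s1, s2 = 2.8, 2.1                            # assumed sample standard deviations of two processes
f_calculated = s1 ** 2 / s2 ** 2             # f equals s1 squared divided by s2 squared
print(round(f_calculated, 2))                # greater than one here because s1 exceeds s2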
from the f distribution table you can easily find out the critical f distribution at alpha and the degrees of freedom of the samples of two different processes df1 and df2 let us proceed to the next topic of this lesson in the following screen in this topic we will discuss collecting and summarizing data in detail let us learn about types of data in the following screen data is objective information which everyone can agree on it is a collection of facts from which conclusions may be drawn the two types of data are attribute data and variable data click each type to know more discrete data is data that can be counted and only includes numbers such as two forty or thousand fifty attribute data is commonly called pass fail or good bad data attribute or discrete data cannot be broken down into a smaller unit meaningfully it answers questions such as how many how often or what type some examples of attribute data are number of defective products percentage of defective products frequency at which a machine is repaired or the type of award received any data that can be measured on a continuous scale is continuous or variable data this type of data answers questions such as how long what volume or how far examples of continuous data include height weight time taken to complete a task temperature and so on let us understand the importance of selecting the data type in this screen deciding the data type facilitates analysis and interpretation therefore the first step in the measure phase is to determine what type of data should be collected this can be done by considering the following the first consideration is to identify what is already known for this the values already identified for the process are listed these include critical to quality parameters or ctq's key process output variables or kpovs and the key process input variable or kpivs next to understand how to proceed with the data gathered it is necessary to determine the data type that fits the metrics for the key variables identified the question now arises why should the data type be identified this is important as it enables the right set of data to be collected analyzed and used to draw inferences it is not advisable to convert one type of data into another converting attribute data to variable data is difficult and requires assumptions to be made about the process it may also require additional data gathering including retesting units let us look at measurement scales in the following screen there are four measurement scales arranged in the table in increasing order of their statistical desirability in the nominal scale the data consists of only names or categories and there is no possibility of ordering an example of this type of measurement can be a bag of colored balls which contains ten green balls five black balls eight yellow balls and nine white balls this is the least informative of all scales the most appropriate measure of central tendency for this scale is mode in the ordinal or ranking scale data is arranged in order and values can be compared with each other an example of this scale can be the ratings given to different restaurants three for a five for b two for c and four for d the central tendency for this scale is median or mode the interval scale is used for ranking items in step order along a scale of equidistant points for example the temperatures of three metal rods are 100 degrees 200 degrees and 600 degrees fahrenheit respectively note that three times 200 degrees is not the same as 600 degrees as a 
temperature measurement the central tendency here is mean median or mode mean is used if the data does not have any outliers the ratio scale represents variable data and is measured against a known standard or increment however this scale also has an absolute zero that is no numbers exist below zero an example of the ratio scale are physical measures where height weight and electric charge represent ratio scale data note that negative length is not possible again here you would use mean median or mode as the central tendency measure in the next screen we will learn about assuring data accuracy to ensure data is accurate sampling techniques are used sampling is the process act or technique of selecting an appropriate test group or sample from a larger population it is preferable to survey 100 people to surveying 10 000 people sampling saves the time money and effort involved in collecting data the three types of sampling techniques described here are random sampling sequential sampling and stratified sampling click each type to know more random sampling is the technique where a group of subjects or a sample for study is selected from a larger group or population at random sequential sampling is similar to multiple sampling plans except that it can in theory continue indefinitely in other words it is a non-probability sampling technique wherein the researcher picks a single subject or a group of subjects in a given time interval conducts the study analyzes the results and then picks another group of subjects if needed and so on in stratified sampling the idea is to take samples from subgroups of a population this technique gives an accurate estimate of the population parameter in this screen we will compare simple random sampling with stratified sampling simple random sampling is easy to do while stratified sampling takes a lot of time the possibility of simple random sampling giving erroneous results is very high while stratified sampling minimizes the chances of error simple random sampling doesn't have the power to show possible causes of variation while stratified sampling if done correctly will show assignable causes in the next screen we will look at the check sheet method of collecting data the process of collecting data is expensive wrongly collected data leading to wrong analysis and inferences results in resources being wasted a check sheet is a structured form prepared to collect and analyze data it is a generic tool that is relatively simple to use and can be adopted for a variety of purposes check sheets are used when the data can be observed and collected repeatedly by the same person or at the same location they are also used while collecting data from a production process a common example is calculating the number of absentees in a company the table shows absentee data collected for a week we will discuss data coding and its advantages in the following screen data coding is a process of converting and condensing raw data into categories and sets so that the data can be used for further analysis the benefits of data coding are listed here data coding simplifies the large quantity of data that is collected from sources the large amount of data makes analysis and drawing conclusions difficult it leads to chaos and ambiguity data coding simplifies the data by coding it into variables and then categorizing these variables raw data cannot be easily entered into computers for analysis data coding is used to convert raw data into process data that can be easily fed into computing 
systems for calculation and analysis coding of data makes it easy to analyze the data converted data can either be analyzed directly or fed into computers the analyst can easily draw conclusions when all the data is categorized and computerized data coding also enables organized representation of data division of data into categories helps organize large chunks of information thus making analysis and interpretation easier data coding also ensures that data repetition does not occur and duplicate entries are eliminated so that the final result is not affected in the following screen we will discuss measures of central tendency of the descriptive statistics in detail a measure of central tendency is a single value that indicates the central point in a set of data and helps in identifying data trends the three most commonly used measures of the central tendency are mean median and mode click each measure to know more mean is the most common measure of central tendency it is the sum of all the data values divided by the number of data points also called arithmetic mean or average it is the most widely used measure of central tendency also known as positional mean median is the number present in the middle of the data set when the numbers are arranged in ascending or descending order if the data set has an even number of entries then the median is the mean of the two middle numbers median can also be calculated by the formula n plus one divided by two where n is the number of entries mode also known as frequency mean is the value that occurs most frequently in a set of data data sets that have more than one mode are known as bimodal data let us look at an example for determining mean median and mode in this screen the data set has the numbers one two three four five five six seven and eight click the button to know the answer as previously defined mean is the sum of all the data items divided by the number of items therefore the mean is equal to 41 divided by 9 which is equal to four point five six the number in the middle of the data set is five therefore the median is five mode is the most frequently occurring number which is again five in this screen we will understand the effect of outliers on the data set let us consider a minor change to the data set a new number 100 is added to the data set on using the same formula to calculate mean the new mean is 15.11 ideally 50 percent of values should lie on either side of the mean value however in this example it can be seen that almost 90 percent of values lie below the mean value of 15.11 and only one value above the mean the data point 100 is called an outlier an outlier is an extreme value in the data set that skews the mean value to one side of the data set note that the median remains unchanged at five therefore mean is not an appropriate measure of central tendency if the data has outliers median is preferred in this case in the next screen we will look at measures of dispersion of the descriptive statistics apart from central tendency another important parameter to describe a data set is spread or dispersion contrary to the measures of central tendency such as mean median and mode measures of dispersion express the spread of values higher the variation of data points higher the spread of the data the three main measures of dispersion are range variance and standard deviation we will discuss each of these in the upcoming screens let us start with the first measure of dispersion range the range of a particular set of data is defined as the 
difference between the largest and smallest values of the data in the example the largest value of the data is nine and the smallest value is one therefore the range is nine minus one eight in calculating range all the data points are not needed and only the maximum and minimum values are required let us understand the next measure of dispersion variance in the following screen the variance denoted as sigma square or s square is defined as the average of squared mean differences and shows the variation in a data set to calculate the variance for a sample data set of 10 numbers type the numbers in an excel sheet calculate the variance using the formula equals v-a-r-p or v-a-r-s the v-a-r-p formula gives the population variance which is 7.24 for this example the v a r s formula gives the sample variance 8.04 population variance is calculated when the data set is for the entire population and sample variance is calculated when data is available only for a sample of the population population variance is preferred over sample variance as the latter is only an estimate sample variance allows for a broader range of possible answers for the true mean of the population that is the confidence levels are higher in sample variance note that variance is a measure of variation and cannot be considered as the variation in a data set in the following screen we will understand the next measure of dispersion standard deviation standard deviation denoted by sigma or s is given by the square root of variance the statistical notation of this is given on screen standard deviation is the most important measure of dispersion standard deviation is always relative to the mean for the same data set the population standard deviation is 2.69 and sample standard deviation is 2.83 as in variance calculation if the data set is measured for every unit in a population the population standard deviation and sample standard deviation can be calculated in excel using the formula given on the screen the steps to manually calculate the standard deviation are first calculate the mean then calculate the difference between each data point and the mean and square that answer next calculate the sum of the squares next divide the sum of the squares by n or n minus 1 to find the variance lastly find the square root of variance which gives the standard deviation in the next screen we will look at frequency distribution of the descriptive statistics frequency distribution is a method of grouping data into mutually exclusive categories showing the number of observations in each class an example is presented to demonstrate frequency distribution a survey was conducted among the residents of a particular area to collect data on cars owned by each home a total of 20 homes were surveyed to create a frequency table for the results collected in the survey the first step is to divide the results into intervals and count the number of results in each interval for instance in this example the intervals would be the number of households with no car one car two cars and so on next a table is created with separate columns for the intervals the tallied results for each interval and the number of occurrences or frequency of results in each interval each result for a given interval is recorded with a tally mark in the second column the tally marks for each interval are added and the sum is entered in the frequency column the frequency table allows viewing distribution of data across a set of values at a glance in the following screen we will look at 
cumulative frequency distribution a cumulative frequency distribution table is similar to the frequency distribution table only more detailed there are additional columns for cumulative frequency percentage and cumulative percentage in the cumulative frequency column the cumulative frequency or the previous row or rows is added to the current row the percentage is calculated by dividing the frequency by the total number of results and multiplying by one hundred the cumulative percentage is calculated similar to the cumulative frequency let us look at an example for cumulative frequency distribution the ages of all the participants in a chess tournament are recorded the lowest age is 37 and the highest is 91. keeping intervals of 10 the lowest interval starts with the lower limit as 35 and the upper limit as 44. similar intervals are created until an upper limit of ninety four in the frequency column the number of times a result appears in a particular interval is recorded in the cumulative frequency column the cumulative frequency of the previous row is added to the frequency of the current row for the first row the cumulative frequency is the same as the frequency in the second row the cumulative frequency is one plus two which is three and so on in the percentage column the percentage of the frequency is listed by dividing the frequency by the total number of results which is 10 and multiplying the value by 100 for instance in the first row the frequency is one and the number of results is ten therefore the percentage is ten the final column is the cumulative percentage column in this column the cumulative frequency is divided by the total number of results which is ten and the value is multiplied by hundred note that the last number in this column should be equal to one hundred in this example the cumulative frequency is one and the total number of results is ten therefore the cumulative percentage of the first row is ten let us look at the stem and leaf plots which is one of the graphical methods of understanding distribution graphical methods are extremely useful tools to understand how data is distributed sometimes merely by looking at the data distribution errors in a process can be identified the stem and leaf method is a convenient method of manually plotting data sets it is used for presenting data in a graphical format to assist visualizing the shape of a given distribution in the example on the screen the temperatures in fahrenheit for the month of may are given to collate this information in a stem and leaf plot all the tens digits are entered in the stem column and all the units digits against each tens digit are entered in the leaf column to start with the lowest value is considered in this case the lowest temperature is 51. in the first row five is entered in the stem column and zero in the leaf column the next lowest temperature is 58. 
eight is entered in the leaf column corresponding to five in the stem the next number is fifty-nine all the temperatures falling in the fifties are similarly entered in the next row the same process is repeated for temperatures in the sixties this is continued till all the temperature values are entered in the table let us understand another graphical method in the next screen box and whisker plots a box and whisker graph based on medians or quartiles is used to display a data set in a way that allows viewing the distribution of the data points easily consider the following example the lengths of 13 fish caught in a lake were measured and recorded the data set is given on the screen the first step to draw a box and whisker plot is therefore to arrange the numbers in increasing order next find the median as there is an odd number of data entries the median is the number in the middle of the data set which in this case is 12. the next step is to find the lower median or quartile this is the median of the lower six numbers the middle of these numbers is halfway between 8 and 9 which would be 8.5 similarly the upper median or quartile is located for the upper six numbers to the right of the median the upper median is halfway between the two values fourteen and fourteen therefore the upper median is fourteen let us now understand how the box and whisker chart is drawn using the values of the median and upper and lower quartiles the next step is a numbered line is drawn extending far enough to include all the data points then a vertical line is drawn from the median point 12. the lower and upper quartiles 8.5 and 14 respectively are marked with vertical lines and these are joined with the median line to form two boxes as shown on the screen next two whiskers are extended from either ends of the boxes as shown to the smallest and largest numbers in the data set 5 and 20 respectively the box and whiskers graph is now complete the following inferences can be drawn from the box and whisker plot the lengths of the fish range from five to twenty the range is therefore fifteen the quartiles split the data into four equal parts in other words one quarter of the data numbers is less than 8.5 one quarter between 8.5 and 12 next quarter of the data numbers are between 12 and 14 and another quarter has data numbers greater than 14. 
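A short Python sketch of the descriptive statistics worked through above may help tie the ideas together. The first data set is the one used in the mean, median and mode example; the outlier illustration is reproduced with a nine-value set ending in 100, which is consistent with the 15.11 mean quoted above; and the 13 fish lengths are illustrative values chosen only to match the quoted five-number summary (median 12, quartiles 8.5 and 14, whiskers at 5 and 20), since the raw measurements are not listed in this text.

# a minimal sketch of the central tendency, outlier and box-and-whisker examples above
from statistics import mean, median, mode

data = [1, 2, 3, 4, 5, 5, 6, 7, 8]
print(round(mean(data), 2), median(data), mode(data))        # 4.56, 5, 5

# illustrative nine-value set with an outlier, matching the 15.11 mean quoted above
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(round(mean(with_outlier), 2), median(with_outlier))    # mean is skewed to 15.11, median stays at 5

def quartile_median(sorted_values):
    # median of an already sorted list, as used in the box-and-whisker method described above
    n = len(sorted_values)
    mid = n // 2
    if n % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def five_number_summary(values):
    # minimum, lower quartile, median, upper quartile, maximum using the median-of-halves approach
    v = sorted(values)
    n = len(v)
    return (min(v), quartile_median(v[: n // 2]), quartile_median(v),
            quartile_median(v[(n + 1) // 2:]), max(v))

fish_lengths = [5, 7, 8, 9, 10, 11, 12, 13, 13, 14, 14, 16, 20]   # illustrative values only
print(five_number_summary(fish_lengths))                           # (5, 8.5, 12, 14, 20)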
in this screen we will learn about another graphical method scatter diagrams a scatter diagram or scatter plot is a tool used to analyze the relationship or correlation between two sets of variables x and y with x as the independent variable and y as the dependent variable a scatter diagram is also useful when cause effect relationships have to be examined or root causes have to be identified there are five different types of correlation that can be used in a scatter diagram let us learn about them in the next screen the five types of correlation are perfect positive correlation moderate positive correlation no relation or no correlation moderate negative correlation and perfect negative correlation click each type to learn more in perfect positive correlation the value of the dependent variable y increases proportionally with any increase in the value of the independent variable x this is said to be one is to one that is any change in one variable results in an equal amount of change in the other the following example is presented to demonstrate perfect positive correlation the consumption of milk is found to increase proportionally with an increase in the consumption of coffee the data is presented in the table on the screen the scatter diagram for the data is also shown it can be observed from the graph that as x increases y also increases proportionally hence the points are linear in moderate positive correlation as the value of the x variable increases the value of y also increases but not in the same proportion to demonstrate this the following example is presented the increase in savings for an increase in salary is shown in the table as you can notice in the scatter diagram the points are not linear although the value of y increases with an increase in the value of x the increase is not proportional when a change in one variable has no impact on the other there is no relation or correlation between them let us consider the following example to study the relation between the number of fresh graduates in the city and the job openings available data for both was collected over a few months and tabulated as shown the scatter diagram for the same is also displayed it can be observed that the data points are scattered and there is no trend emerging from the graph therefore there is no correlation between the number of fresh graduates and the number of job openings in the city in moderate negative correlation an increase in one variable results in a decrease in the other variable however this change is not proportional to the change in the first variable to demonstrate moderate negative correlation the prices of different products are listed along with the number of units sold for each product the data is shown in the table from the scatter diagram shown it can be observed that the higher the price of a product the lesser the number of units of that product sold however the decrease in the number of units with increasing price is not proportional in perfect negative correlation an increase in one variable results in a proportional decrease of the other variable this is also an example of one is to one correlation as an example the effect of an increase in the project time extension on the success of the project is considered the data is shown in the table the scatter diagram for the data shows a proportional decrease in the probability of the project's success with each extension of the project time hence the points are linear perfect correlations are rare in the real world when encountered they should be investigated and verified
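As a rough companion to the scatter diagram discussion, the sketch below computes Pearson's correlation coefficient, which puts a number on the patterns described above: close to plus one for perfect positive correlation, near zero for no correlation, and close to minus one for negative correlation. The data values are illustrative, since the milk and coffee and price and units tables are not reproduced in this text.

# a minimal sketch of a correlation coefficient calculation, with illustrative data
from math import sqrt

def pearson_r(x, y):
    # covariance of x and y divided by the product of their standard deviations
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

coffee = [1, 2, 3, 4, 5]
milk = [2, 4, 6, 8, 10]              # increases in exact proportion to coffee
units_sold = [95, 80, 82, 60, 55]    # tends to fall as the price index rises

print(round(pearson_r(coffee, milk), 2))         # 1.0, perfect positive correlation
print(round(pearson_r(coffee, units_sold), 2))   # about -0.95, strong negative correlation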
in this screen we will look at another graphical method histograms histograms are similar to bar graphs except that the data in histograms is grouped into intervals they are used to represent category wise data graphically a histogram is best suited for continuous data the following example illustrates how a histogram is used to represent data data on the number of hours spent by a group of 15 people on a special project in one week is collected this data is then divided into intervals of two and the frequency table for the data is created the histogram for the same data is also displayed looking at the histogram it can be observed at a glance that most of the team members spent between two to four hours on the project in the following screen we will look at the next graphical method normal probability plots normal probability plots are used to identify if a sample has been taken from a normally distributed population when sample data from a normally distributed population is represented as a normal probability plot it forms a straight line the following example is presented to illustrate normal probability plots a sampling of diameters from a drilling operation is done and the data is recorded the data set is given to create a normal probability plot the first step is to construct a cumulative frequency distribution table this is followed by calculating the mean rank probability by dividing the cumulative frequency by the number of samples plus one and multiplying the answer by one hundred the fully populated table for mean rank probability estimation is shown on the screen please take a look at the same in the next step a graph is plotted on log paper or with minitab using this data minitab is a statistical software used in six sigma minitab normal probability plot instructions are also given on the screen the completed graph is shown on the screen from the graph it can be seen that the random sample forms a straight line and therefore the data is taken from a normally distributed population let us proceed to the next topic of this lesson in this topic we will discuss measurement system analysis or msa in detail let us understand what msa is in the following screen throughout the dmaic process the output of the measurement system ms is used for metrics analysis and control efforts an error prone measurement system will only lead to incorrect data and incorrect data leads to incorrect conclusions it is important to set right the ms before collecting the data measurement system analysis or msa is a technique that identifies measurement error or variation and the sources of that error in order to reduce the variation it evaluates the measuring system to ensure the integrity of data used for analysis msa is therefore one of the first activities in the measure phase the measurement system's capability is calculated analyzed and interpreted using gauge repeatability and reproducibility to determine measurement correlation bias linearity percent agreement and precision to tolerance let us discuss the objectives of msa in the next screen a primary objective of msa is to obtain information about the type of measurement variation associated with the measurement system it is also used to establish criteria to accept and release new measuring equipment msa also compares one measuring method against another and helps to form a basis for evaluating a method which is suspected of being deficient the measurement system variations should be resolved to arrive at the correct baselines for the
project objectives as baselines contain crucial data based on which decisions are taken it is extremely important that the measurement system be free of error as far as possible let us look at measurement analysis in detail in the next screen in measurement analysis the observed value is equal to the sum of the true value and the measurement error the measurement error can be a negative or a positive value measurement error refers to the net effect of all sources of measurement variability that cause an observed value to deviate from the true value true variability is the sum of the process variability and the measurement variability process variability and measurement variability must be evaluated and improved together measurement variability should be addressed before looking at process variability if process variability is corrected before resolving measurement variability then any improvements to the process can not be trusted to have taken place owing to a faulty measurement system in the following screen we will identify the types of measurement errors the two types of measurement errors are measurement system bias and measurement system variation click each type to know more measurement system bias involves calibration study in the calibration study the total mean is given by the sum of the process mean and the measurement mean the statistical notation is shown on the screen measurement system variation involves gauge repeatability and reproducibility or grrr study in the grr study the total variance is calculated by adding the process variance with the measurement variation the statistical notation is shown on the screen in this screen we will discuss the sources of variation the chart on the screen lists the different sources of variation observed process variation is divided into two actual process variation and measurement variation actual process variation can be divided into long-term and short-term process variations in a gauge rr study process variation is often called part variation measurement variation can be divided into variations caused by operators and variations due to the gauge the variation due to operators is owing to reproducibility variation due to gauge is owing to repeatability both actual process variation and measurement variation have a common factor that is variation within a sample let us understand gauge repeatability and reproducibility or grr in the next screen gauge repeatability and reproducibility or grr is a statistical technique to assess if a gauge or gauging system will obtain the same reading each time a particular characteristic or parameter is measured gauge repeatability is the variation in measurement when one operator uses the same gauge to measure identical characteristics of the same part repeatedly gauge reproducibility is the variation in the average of measurements when different operators use the same gauge to measure identical characteristics of the same part the figures on the screen illustrate gauge repeatability and reproducibility in the next screen we will discuss the components of grr study the figure on the screen illustrates the difference between gauge repeatability and reproducibility the figure shows the repeatability and reproducibility for six different parts represented by the numbers one to six for two different trial readings by three different operators as can be observed a difference in reading for part one indicated by the color green by three different operators is known as reproducibility error a difference in 
reading of part 4 indicated by red by the same operator in two different trials is known as the repeatability error in the following screen we will look at some guidelines for gauge repeatability and reproducibility studies the following should be kept in mind while carrying out gauge repeatability and reproducibility or grr studies grr studies should be performed over the range of expected observations care should be taken to use actual equipment for gr studies written procedures and approved practices should be followed as would have been in actual operations the measurement variability should be represented as is not the way it was designed to be after grr the measurement variability is separated into casual components sorted according to priority and then targeted for action in the following screen let us look at some more concepts associated with grr bias is the distance between the sample mean value and the sample true value it is also called accuracy bias is equal to mean minus reference value process variation is equal to six times the standard deviation the bias percentage is calculated as bias divided by the process variation the next term is linearity linearity refers to the consistency of bias over the range of the gauge linearity is given by the product of slope and process variation precision is the degree of repeatability or closeness of data smaller the dispersion in the data set better the precision the variation in the gauge is the sum of variation due to repeatability and the variation due to reproducibility in the following screen we will understand measurement resolution measurement resolution is the smallest detectable increment that an instrument will measure or display the number of increments in the measurement system should extend over the full range for a given parameter some examples of wrong gauges or incorrect measurement resolution are given here a truck weighing scale is used for measuring the weight of a t-pack a caliper capable of measuring differences of 0.1 millimeters is used to show compliance when the tolerance limits are plus or minus 0.07 millimeters thus the measurement system that matches the range of the data should only be used an important prerequisite for grr studies is that the gauge has an acceptable resolution in the next screen we will look at examples for repeatability and reproducibility repeatability is also called equipment variation or ev it occurs when the same technician or operator repeatedly measures the same part or process under identical conditions with the same measurement system the following example illustrates this concept a 36 kilometer per hour pace mechanism is timed by a single operator over a distance of 100 meters on a stopwatch and three readings are taken trial one takes nine seconds trial two takes ten seconds and trial 3 takes 11 seconds the process is measured with the same equipment in identical conditions by the same operator assuming no operator error the variation in the three readings is known as repeatability or equipment variation reproducibility is also called appraiser variation or av it occurs when different technicians or operators measure the same part or process under identical conditions using the same measurement system let us extend the example for repeatability to include data measured by two operators the readings are displayed on the slide the difference in the readings of both operators is called reproducibility or appraiser variation it is important to resolve equipment variation before 
appraiser variation if appraiser variation is resolved first the results will still not be identical due to variation in the equipment itself in this screen we will learn about data collection in grr there are some important considerations for data collection in grr studies there are usually three operators and around ten units to measure general sampling techniques must be used to represent the population and each unit must be measured two to three times by each operator it is important that the gauge be calibrated accurately it should also be ensured that the gauge has an acceptable resolution another practice is that the first operator measures all the units in random order then this order is maintained by all other operators all the trials must be repeated in the next screen we will discuss the anova method of analyzing grr studies the anova method is considered to be the best method for analyzing grr studies this is because of two reasons the first being anova not only separates equipment and operator variation but also provides insight on the combined effect of the two second anova uses standard deviation instead of range as a measure of variation and therefore gives a better estimate of the measurement system variation the one drawback of using anova is the considerations of time resources and cost in the next screen we will understand how msa can be interpreted two results are possible for an msa in the first case the reproducibility error is larger than the repeatability error this occurs when the operators are not trained and calibrations on the gauge dial are not clear the other possibility is that the repeatability error is larger than the reproducibility error this is clearly a maintenance issue and can be resolved by calibrating the equipment or performing maintenance on the equipment this indicates that the gauge needs redesign to be more rigid and the location needs to be improved it also occurs when there is ambiguity in sops msa is an experiment which seeks to identify the components of variation in the measurement in the following screen we will look at a template used for grr studies a sample gauge rr sheet is given on this screen the operators here are andrew murphy and lucy wang who are the appraisers in this study they have measured and rated the performance of three employees ibrahim glasoff brianna scott and jason schmidt this is a sample template for a gauge r r study the parts are shown across the top of the sheet in this case the measurement system is being evaluated using three parts the employees abraham glasoff brianna scott and jason schmidt the operators measure each part repeatedly from this data the average x and ranges are for each inspector and for each part are calculated the grand average for each inspector and each part is also calculated in this example a control limit ucl in the sheet was compared with the difference in averages of the two inspectors to identify if there is a significant difference in their measurements their difference is 0.111 which is outside the ucl of 0.108 given the r average of 0.042 in the next screen we will look at the results page for this grr study the sheet on the screen displays the results for the data entered in the template in the previous screen please spend some time to go through the data for a better understanding of the concept in the following screen we will look at the interpretation to this results page the percentage grrr value is highlighted in the center right of the table in the previous screen there 
are three important observations to be made here about the gauge rr study first this study also shows the interaction between operators and parts if the percentage grr value is less than 30 then the gauge is acceptable and the measurement system does not require any change if the value is greater than 30 then the gauge needs correction the equipment variation is checked and resolved first followed by the appraiser variation second if ev equals zero it means the ms is reliable the equipment is perfect and the variation in the gauge is contributed by different operators if the av is equal to zero the ms is precise third if ev is equal to zero and there is a v the operators have to be trained to ensure all operators follow identical steps during measurement and the av is minimal the interaction between operators and parts can also be studied under grr using part variation the trueness and precision cannot be determined in a grrr if only one gauge or measurement method is evaluated as it may have an inherent bias that would go undetected merely by varying operators and parts let us proceed to the next topic of this lesson in the following screen in this topic we will discuss process and performance capability in detail in the following screen we will look at the differences between natural process limits and specification limits natural process limits or control limits are derived from the process data and are the voice of the process the data consists of real-time values from past process performance therefore these values represent the actual process limits and indicate variation in the process the two control limits are upper control limit ucl and lower control limit lcl specification limits are provided by customers based on their requirements or the voice of the customer and cannot be changed by the organization these limits act as targets for the organization and processes are designed around the requirements the product or service has to meet customer requirements and has to be well within the specification limits if the product or service does not meet customer requirements it is considered as a defect therefore specification limits are the intended results or requirements from the product or service that are defined by the customer the two specification limits are upper specification limit or usl and lower specification limit or lsl the difference between the two is called tolerance an important point to note is that for a process if the control limits lie within the specification limits the process is said to be under control conversely if specification limits lie within the control limits the process will not meet customer requirements in the following screen we will look at process performance metrics and how they are calculated the two major metrics used to measure process performance are defects per unit or dpu and defects per million opportunities or dpmo dpu is calculated by dividing the number of defects by the total number of units dpmo is calculated by multiplying the defects per opportunity with 1 million in the following screen we will look at an example for calculating process performance in this example the quality control department checks the quality of finished goods by sampling a batch of 10 items from the produced lot every hour the data is collected over 24 hours the table displays the data for the number of defectives for the sampling period if items are consistently found to be outside the control limits on any given day the production process is stopped for the 
next day let us now interpret the results of the sampling in this example as the sample size is constant dpu or p-bar is used to calculate the process capability the total number of defectives is 34 and the subgroup size is 10. the total number of units is 10 multiplied by 24 which is 240. the defects per unit is therefore 34 divided by 240 which is approximately 0.1417 the defects per million opportunities is obtained by multiplying the defects per unit by 1 million which is 141 666.66 therefore by looking at the dpmo table it can be said that the process is currently working at 2.6 sigma or 86.4 percent yield we will learn about process stability studies in this screen the activities carried out in the measure phase are msa collection of data statistical calculations and checking for accuracy and validity this is followed by a test for stability as changes cannot be made to an unstable process with a set of data believed to be accurate the process is checked for stability this is important because if a process is unstable no changes can be implemented why does a process become unstable a process can become unstable due to special causes of variation multiple special causes of variation lead to instability while a single special cause leads to an out of control condition run charts in minitab can be used to check for process stability let us look at the steps to plot a run chart in minitab in the following screen to plot a run chart in minitab first enter the sample data collected to check for stability next click stat on the minitab window followed by quality tools next click run charts select the column and choose the subgroup size as two then click ok the graph shown on the screen is interpreted by looking at the last four values if any of the p values is less than 0.05 the presence of special causes of variation can be validated this means there is a good chance that the process will become unstable in the following screen we will look at process stability studies causes of variation variation can be due to two types of causes common causes of variation and special causes of variation click each type to learn more common causes of variation are the many sources of variation within a process which have a stable and repeatable distribution over a period they contribute to a state of statistical control where the output is predictable some other factors which do not always act on the process can also cause variation these are special causes of variation they are external to the process and are irregular in nature when present the process distribution changes and the process output is not stable over a period special causes may result in defects and need to be eliminated to bring the process under control run charts indicate the presence of special causes of variation in the process if special causes are detected the process has to be brought to a stop and a root cause analysis has to be carried out if the root cause analysis reveals the special cause to be undesirable corrective actions are taken to remove the special cause we will learn about verifying process stability and normality in this screen based on the type of variation a process exhibits it can be verified if the process is in control if there are special causes of variation the process output is not stable over time and the process cannot be said to be in control conversely if there are only common causes of variation in a process the output forms a distribution that is stable and predictable over time a process being in control means the process does not have any special causes of variation
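Returning to the defectives sampling example earlier in this section, the sketch below recomputes dpu and dpmo and then estimates the sigma level with the inverse normal plus the conventional 1.5 sigma shift, which is the assumption behind the dpmo lookup tables mentioned above; it lands at roughly the 2.6 sigma figure quoted. statistics.NormalDist needs Python 3.8 or later.

# a minimal sketch of the dpu, dpmo and sigma level calculation from the sampling example above
from statistics import NormalDist

defectives = 34
units = 10 * 24                          # 10 items sampled every hour for 24 hours

dpu = defectives / units                 # defects per unit, about 0.1417
dpmo = dpu * 1_000_000                   # about 141,666.7
process_yield = 1 - dpu                  # about 85.8 percent defect-free

# conventional 1.5 sigma shift added, as the dpmo tables assume
sigma_level = NormalDist().inv_cdf(process_yield) + 1.5   # about 2.57, i.e. roughly 2.6 sigma

print(round(dpu, 4), round(dpmo, 1), round(sigma_level, 2))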
once a process is understood to be stable the control chart data can be used to calculate the process capability indices in the following screen we will discuss process capability studies process capability compares the actual variation in the process with the specification limits to carry out a process capability study first plan for data collection next collect the data finally plot and analyze the results obtaining the appropriate sampling plan for the process capability study depends on the purpose and on whether there are any customer or standard requirements for the study for new processes or a project proposal the process capability can be estimated by a pilot run let us look at the objectives of process capability studies in the next screen the objectives of a process capability study are to establish a state of control over a manufacturing process and then maintain that state of control over a period of time on comparing the natural process limits or control limits with the specification limits any of the following outcomes is possible first the process limits are found to fall between the specification limits this shows the process is running well and no action is required the second possibility is that the process spread and the specification spread are approximately the same in this case the process is centered by making an adjustment to the centering of the process this would bring the batch of products within the specifications the third possibility is that the process limits fall outside the specification limits in this case reduce the variability by partitioning the pieces of batches to locate and target the largest offender a designed experiment can be used to identify the primary source of variation in the following screen we will learn about identifying characteristics in process capability process capability deals with the ability of the process to meet customer requirements therefore it is crucial that the characteristic selected for a process capability study indicates a key factor in the quality of the product or process also it should be possible to influence the value of the characteristic by adjusting the process the operating conditions that affect the characteristic should also be defined and controlled apart from these requirements other factors determining the characteristics to be measured are customer purchase order requirements or industry standards in the following screen we will look at identifying specifications or tolerances in process capability the process specifications or tolerances are defined either by industry standards based on customer requirements or by the organization's engineering department in consultation with the customer a stability study followed by a comprehensive capability study also helps in identifying if the process mean meets the target or the customer mean the process capability study indicates whether the process is capable it is used to determine if the output consistently meets specifications and the probability of a defect or defective this information is used to evaluate and improve the process to meet the tolerance requirements in the following screen we will learn about process performance indices process performance is defined as a statistical measurement of the outcome of a process characteristic which may or may not have been demonstrated to be in a state of statistical control in other words it is an estimate of the process capability of a process during its initial setup before it has been brought into a state of statistical
control it differs from the process capability in that for process performance a state of statistical control is not required the three basic process performance indices are process performance or pp process performance index or ppk and process capability index denoted as ppm or cpm click each index to know more pp stands for process performance it is computed by subtracting the lower specification limit from the upper specification limit the whole divided by natural process variation or six sigma ppk is the process performance index and a minimum of the values of the upper and lower process capability indices the upper and lower process capability indices are calculated as shown on the screen ppu or upper process capability index is given by the formula usl minus x divided by 3s ppl or lower process capability index is given by x minus lsl divided by 3s here x is process average better known as x bar and s is sample standard deviation cpm denotes the process capability index mean which accounts for the location of the process average relative to a target value it can be calculated as shown on the screen here myu stands for process average sigma symbol denotes the process standard deviation usl is the upper specification limit and lsl is the lower specification limit t is the target value which is typically the center of the tolerance x i is the sample reading and n is the number of sample readings we will look at the key terms in process capability in this screen zst or short term capability is the potential performance of the process in control at any given point of time it is based on the sample collected in the short term the long term performance is denoted by zlt it is the actual performance of the process over a given period of time subgroups are several small samples collected consecutively each sample forms a subgroup the subgroups are chosen so that the data points are likely to be identical within the subgroup but different between two subgroups the process shift is calculated by subtracting the long-term capability from the short-term capability the process shift also reflects how well a process is controlled it is usually a factor of 1.5 let us look at short-term and long-term process capability in the next screen the concept of short-term and long-term process shift is explained graphically on this screen there are three different samples taken at time 1 time 2 and time 3. 
the smaller waveforms represent the short-term capability and they are joined with their means to show the shift in long-term performance the long-term performance curve is shown below with the target value marked in the center it is important to note that over a period of time or subgroups a typical process will shift by approximately 1.5 times the standard deviation also long-term variation is more than short-term variation this difference is known as the sigma shift and is an indicator of the process control the reasons for a process shift include changes in operators raw material used wear and tear and time periods we will discuss the assumptions and conventions of process variations in the following screen long-term variation is always longer than short-term variation click each term to know more short-term variations are due to the common causes the variance is inherent in the process and known as the natural variation short-term variations show variation within subgroup and are therefore called within subgroup variation they are usually a small number of samples collected at short intervals in short-term variation the variation due to common causes are captured however common causes are difficult to identify and correct the process may have to be redesigned to remove common causes of variation long-term variations are due to common as well as special causes the added variation or abnormal variation is due to factors external to the usual process long-term variation is also known as the overall variation and is a sample standard deviation for all the samples put together long-term variation shows variations within the subgroup and between subgroups special causes increasing variation include changes in operators raw material and wear and tear the special causes need to be identified and corrected for process improvement this screen explains how the factors of stability capability spread and defect summary are used to interpret the process condition this table gives the process condition for different levels or types of variation with reference to common causes and special causes the table is read as follows in the first scenario the process has lesser common causes of variation or ccv and no special causes of variation or scv in this case the variability is less the capability is high the possibility of defects is less and the process is said to be capable and in control next if the process has lesser ccv and some scb are present then it has high variability low capability and a high possibility of defects the process is said to be out of control and incapable the third possibility is that the process has high ccv and no scv in this case the variability is moderate to high the capability is very low and possibility of defects is very high although the process is in control it is incapable finally at the other extreme is the situation where the process has high ccv and scv is also present here the process has high variability low capability high possibility of defects and is out of control and incapable this table is a quick reference to understand process conditions in the next screen we will compare the cpk and cp values when cpk and cp values are compared three outcomes are possible when cpk is lesser than cp it can be inferred that the mean is not centered when cpk is equal to cp the inference is that the process is accurate the process is considered capable if cpk is greater than one this will happen only if variations are in control cpk can never be greater than cp if this 
situation occurs the calculations have to be rechecked we will look at an example problem for calculating process variation in the following screen the table on this screen shows data for customer complaint resolution time over a period of three weeks each week's data forms a subgroup for example the resolution time is 48 hours for a particular case in week 1. in week 2 the case takes 50 hours and in week 3 it takes about 49 hours the subgroup size is 10. let us understand how the long-term and short-term standard deviations are calculated for this data the average for each week is calculated by dividing the total resolution time for the week by the subgroup size a grand average is also calculated for all the three weeks the variations within subgroups and between subgroups for each week are calculated this is followed by calculating the total variations within and between subgroups overall variation is given by the sum of total variation within subgroups and total variation between subgroups finally the standard deviations for the short term and the long term are calculated using the formula given on the screen the results for the process variation calculations are as follows the grand average for all three weeks is 47.5 the total variation within subgroups is 1023.8 the total variation between subgroups is 161.67 both these variations are added to give the overall variation of 1185.5 the short-term standard deviation is 6.2 and the long-term standard deviation is 6.4 note that the overall variation can also be calculated with the usual sample variance formula let us discuss the effect of mean shift on the process capability in this screen the table given here shows the defect level at different sigma multiple values and different mean shifts from the table it can be seen that when the mean is centered within the specification limits and the process capability is 1 that is plus or minus 3s fits within the specification limits the dpmo is 2700 that is 0.27 percent and the probability of a good result is 99.73 percent if the mean shifts by 1.5 sigma then a tail moves outside the specification limit to a greater extent now the dpmo increases to over sixty six thousand this is almost a twenty five hundred percent increase in defects if the process has a process capability of two that is plus or minus six s fits within the specification limits and the mean shifts by 1.5 sigma then the probability of a good result is 99.99966 percent this is the same as a process with a capability of 1.5 that is plus or minus 4.5 s fitting within the specification limits and no shift in the mean the long-term and short-term capability table shows the variations in capabilities for the purposes of six sigma the assumption is that the long-term variability will have a 1.5 s difference from the short-term variability as seen in statistical process control this assumption can be challenged if control charts are used and these kinds of shifts are detected quickly in the chart it can be seen that the mean shift is negligible as the process capability increases therefore for a six sigma process the long-term variation does not have much effect in the next screen we will look at key concepts in process capability for attribute data the customary procedure for defining process capability for attribute data is to define the mean rate of non-conformity defects and defectives are examples of non-conformity defects per million opportunities or dpmo is the measure of process capability for attribute data for this the mean
and the standard deviation for attribute data have to be defined for defectives p bar is used for checking process capability for both constant and variable sample sizes for defects c bar and u bar are used for constant and variable sample sizes respectively the p bar c bar and u bar are the equivalent of the standard deviation denoted by sigma for continuous data in this topic we will learn about the patterns of variation in detail let us start with the classes of distributions in the following screen when data obtained from the measurement phase is plotted on a chart it is observed that it exhibits a variety of distributions depending on the data type and its source these distribution patterns will help you understand the data better probability statistics and inferential statistics are the methods used to describe the parameters for the classes of distributions click each method to know more probability is based on the assumed model of distribution and it is used to find the chances of a certain outcome or event to occur statistics uses the measured data to determine a model to describe the data used inferential statistics describe the population parameters based on the sample data using a particular model in this screen we will discuss the types of distributions there are two types of distributions discrete distribution and continuous distribution discrete distribution includes binomial distribution and poisson distribution continuous distribution includes normal distribution chi square distribution t distribution and f distribution let us learn about discrete probability distribution in the following screen discrete probability distribution is characterized by the probability mass function it is important to be familiar with discrete distributions while dealing with discrete data some of the examples of discrete probability distribution are binomial distribution poisson distribution negative binomial distribution geometric distribution and hyper geometric distribution we will focus only on the two most useful discrete distributions binomial distribution and poisson distribution like most probability distributions these distributions also help in predicting the sample behavior that has been observed in a population let us learn about binomial distribution in the following screen binomial distribution is a probability distribution for discrete data named after the swiss mathematician jacob bernoulli it is an application of popular knowledge to predict the sample behavior binomial distribution also describes the discrete data as a result of a particular process like the tossing of a coin for a fixed number of times and the success or failure in an interview a process is known as bernoulli's process when the process output has only two possible values like defective or okay pass or fail and yes or no binomial distribution is used to deal with defective items defect is any non-compliance with a specification defective is a product or service with at least one defect binomial distribution is most suitable when the sample size is less than 30 and less than 10 percent of the population it is the percentage of non-defective items provided the probability of creating a defective item remains the same over a period let us look at the equation the probability of exactly r successes out of a sample size of n is denoted by p of r which is equal to ncr whole multiplied by p to the power of r and one minus p whole to the power of n minus r in the equation b is the probability of success r is the 
number of successes desired and n is the sample size to continue discussing the binomial distribution let us look at some of its key calculations in the following screen the mean of a binomial distribution is denoted by mu and is given by n multiplied by p the standard deviation of a binomial distribution is denoted by sigma and is equal to the square root of n multiplied by p multiplied by one minus p as for the method of calculating factorials the factorial of five is the product of five four three two and one which is equal to one hundred twenty similarly the factorial of four is the product of four three two and one which is equal to twenty four let us look at an example of calculating binomial distribution in the next screen suppose you wish to know the probability of getting heads five times in eight coin tosses you can use the binomial equation for the same click the answer button to see how this is done the tossing of a coin has only two outcomes heads and tails it means that the probability of each outcome is 0.5 and it remains fixed over a period of time additionally the outcomes are statistically independent in this case the probability of success denoted by p is 0.5 the number of successes desired is denoted by r which is five and the sample size is denoted by n which is eight therefore the probability of five heads is equal to the factorial of eight divided by the product of the factorial of five and the factorial of eight minus five the whole multiplied by zero point five to the power of five multiplied by one minus zero point five whole to the power of eight minus five this calculation gives a result of zero point two one eight seven which is equal to twenty one point eight seven percent
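A minimal sketch of the binomial calculation above, reproducing the coin-toss result of roughly 21.9 percent along with the mean and standard deviation formulas; math.comb supplies the nCr term.

# a minimal sketch of the binomial example above: exactly 5 heads in 8 fair coin tosses
from math import comb, sqrt

def binomial_pmf(r, n, p):
    # P(r) = nCr * p^r * (1 - p)^(n - r)
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 8, 0.5
print(round(binomial_pmf(5, n, p), 4))        # 0.2188, about 21.9 percent
print(n * p)                                  # mean = n * p = 4.0
print(round(sqrt(n * p * (1 - p)), 3))        # standard deviation = sqrt(n * p * (1 - p)) ~ 1.414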
the standard deviation of a poisson distribution is given by sigma which is the square root of lambda let us look at an example to calculate poisson distribution in the next screen the past records of a road junction which is accident prone show a mean number of five accidents per week at this junction assume that the number of accidents follows a poisson distribution and calculate the probability of any number of accidents happening in a week click the button to know the answer given the situation you know that the value of lambda or mean is five so p of zero that is the probability of zero accidents per week is calculated as five to the power of 0 multiplied by e to the power of minus 5 whole divided by factorial of zero the answer is 0.006 applying the same formula the probability of one accident per week is 0.03 the probability of more than two accidents per week is one minus the sum of probabilities of zero one and two accidents which is 0.884 in other words the probability is 88.4 percent let us learn about continuous probability distribution in this screen continuous probability distribution is characterized by the probability density function a variable is said to be continuous if the range of possible values falls along a continuum for example loudness of cheering at a ball game weight of cookies in a package length of a pen or the time required to assemble a car continuous probability distributions help in predicting the sample behavior observed in a population let us learn about normal distribution in this screen the normal or gaussian distribution is a continuous probability distribution the normal distribution is represented as n and depends on two factors me u which stands for mean and sigma which gives the standard deviation of the data points normal distribution normally has a higher frequency of values around the mean and lesser occurrences away from it it is often used as a first approximation to describe real valued random variables that tend to cluster around a single mean value the distribution is bell shaped and symmetrical the total area under the normal curve is 1 which is p of x various types of data such as body weight height the output of a manufacturing device etc follow the normal distribution additionally normal distribution is continuous and symmetrical with the tails asymptotic to the x-axis which means they touch the x-axis at infinity let us continue to discuss normal distribution in the following screen in a normal distribution to standardize comparisons of dispersion or the different measurement units like inches meters grams etc a standard z variable is used the uses of z value are as follows while the value of z or the number of standard deviations is unique for each probability within the normal distribution it helps in finding probabilities of data points anywhere within the distribution it is dimensionless as well that is it has no units such as millimeters liters coulombs etc there are different formulas to arrive at the normal distribution we will focus on one commonly used formula for calculating normal distribution which is z equals y minus mu whole divided by sigma here z is the number of standard deviations between y and the mean denoted by miu why is the value of the data point in concern me u is mean of the population or data points and sigma is the standard deviation of the population or data points let us look at an example for calculating normal distribution in the following screen suppose the time taken to resolve customer problems 
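The same accident-prone junction example can be checked with a few lines of Python; with unrounded intermediate probabilities the chance of more than two accidents comes out at about 0.875, which is where the roughly 87.5 percent figure above comes from.

# a minimal sketch of the poisson example above: mean of 5 accidents per week
from math import exp, factorial

def poisson_pmf(x, lam):
    # P(x) = lambda^x * e^(-lambda) / x!
    return lam**x * exp(-lam) / factorial(x)

lam = 5
p0 = poisson_pmf(0, lam)                        # about 0.0067
p1 = poisson_pmf(1, lam)                        # about 0.0337
p2 = poisson_pmf(2, lam)                        # about 0.0842
print(round(p0, 4), round(p1, 4), round(p2, 4))
print(round(1 - (p0 + p1 + p2), 3))             # P(more than 2 accidents) ~ 0.875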
follows a normal distribution with a mean of 250 hours and standard deviation of twenty three hours find the probability of a problem resolution taking more than three hundred hours click the button to know the answer in this case y is equal to three hundred mu equals 250 and sigma equals 23. applying the normal distribution formula z is equal to 300 minus 250 whole divided by 23. the result is 2.17 when you look at the normal distribution table the z value of 2.17 covers an area of 0.98499 under itself this means the probability of a problem taking 0 to 300 hours to be resolved is 98.5 percent and therefore the chances of a problem resolution taking more than 300 hours is 1.5 percent let us understand the usage of z table in this screen the graphical representation of z table usage is given here the total probability or area under the curve is one for an actual value one can identify the z score by using the z table as shown this probability is the area under the curve between zero and the point plus a using the actual data when you calculate mean and standard deviation and the values are 25 and 5 respectively it is the normal distribution if the same data is standardized to a mean value of 0 and standard deviation value of 1 it is the standard normal distribution in the next screen we will take a look at the z table the z table gives the probability that z is between zero and a positive number there are different forms of normal distribution z tables followed globally the most common form of z table with positive z scores is shown here the value of a called the percentage point is given along the borders of the table in bold and is to two decimal places the values in the main table are the probabilities that z is between zero and plus a note that the values running down the table are to one decimal place the numbers along the columns change only for the second decimal place let us look at some examples on how to use a z table in the following screen let us find the value of p of z less than zero the table is not needed to find the answer once we know that the variable z takes a value less than or equal to zero first the area under the curve is one and second the curve is symmetrical about z equals zero hence there is a zero point five or fifty percent chance of z being above zero and a zero point five or fifty percent chance of z being below zero let us find the value of p of z greater than one point one two in this case we want the chance of z being greater than a number in this case 1.12 you can find this by using the following fact the opposite or complement of an event a is the event not a that is the opposite or complement of event a occurring is the event a not occurring its probability is given by p of not a equals one minus p of a in other words p of z greater than one point one two is one minus the opposite which is p of z lesser than 1.12 using the table p of z less than 1.12 equals 0.5 plus p of 0 less than z less than 1.12 equals 0.5 plus 0.3686 which is 0.8686 hence the answer is p of z greater than 1.12 equals 1 minus 0.8686 which is 0.1314 note the answer is less than 0.5 let us find the value of p of z lies between 0 and 1.12 in this case where z falls within an interval the probability can be read straight off the table p of z lies between 0 and 1.12 equals 0.3686
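These probabilities can also be checked programmatically instead of reading a z table. A minimal sketch, assuming scipy is available; norm.cdf plays the role of the table lookup.

```python
from scipy.stats import norm

# problem resolution example from the text: mean 250 hours, standard deviation 23 hours
mu, sigma = 250, 23
z = (300 - mu) / sigma                   # about 2.17 standard deviations above the mean
print(round(z, 2))                       # 2.17
print(round(norm.cdf(z), 4))             # about 0.985 -> P(resolution takes 0 to 300 hours)
print(round(norm.sf(z), 4))              # about 0.015 -> P(resolution takes more than 300 hours)

# z table style lookups on the standard normal distribution
print(norm.cdf(0))                       # 0.5    -> P(Z < 0)
print(round(norm.cdf(1.12) - 0.5, 4))    # 0.3686 -> P(0 < Z < 1.12)
print(round(norm.sf(1.12), 4))           # 0.1314 -> P(Z > 1.12)
```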
we will learn about chi-square distribution in this screen the chi-square distribution is also known as chi squared the sum of the squares of k independent standard normal random variables follows a chi-square distribution with k degrees of freedom the chi-squared distribution is one of the most widely used probability distributions in inferential statistics it is widely used in hypothesis testing and when used in hypothesis tests it only needs one sample for the test to be conducted when the statistic is computed from a sample the degrees of freedom are k minus 1 where k is the sample size click the button to view the chi-square distribution formula chi-square calculated or the chi-square index equals the sum or sigma of f of o minus f of e the whole square divided by f of e here f of o stands for an observed frequency and f of e stands for an expected frequency determined through a contingency table we will learn about the chi-square distribution in detail in the later part of this lesson let us proceed to the next screen to discuss t distribution the t distribution method is the most appropriate method to be used in the following situations when you have a sample size of less than 30 when the population standard deviation is not known when the population is approximately normal unlike the normal distribution a t distribution is lower at the mean and higher at the tails as seen in the image t distribution is used for hypothesis testing also as seen in the image the t distribution is symmetrical in shape but flatter than the normal distribution as the sample size increases the t distribution approaches normality for every possible sample size or degrees of freedom there is a different t distribution let us learn about f distribution in the following screen the f distribution is a ratio of two chi-square distributions a specific f distribution is denoted by the degrees of freedom for the numerator chi-square and the degrees of freedom for the denominator chi-square the f-test is performed to calculate and observe if the standard deviations or variances of two processes are significantly different the project teams are usually concerned about reducing the process variance as per the formula f calculated equals s1 square divided by s2 square where s1 and s2 are the standard deviations of the two samples if the f calculated is 1 it implies there is no difference in the variance if s1 is greater than s2 then the numerator must be greater than the denominator the degrees of freedom are df1 equals n1 minus 1 and df2 equals n2 minus 1.
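As a quick illustration of the F calculation, here is a sketch assuming scipy is available; the numbers anticipate the earnings example discussed later in this lesson (sample standard deviations of 4.40 and 3.90 from samples of 31 and 41 years).

```python
from scipy.stats import f

s1, s2 = 4.40, 3.90          # larger standard deviation goes in the numerator
n1, n2 = 31, 41

f_calculated = s1**2 / s2**2
df1, df2 = n1 - 1, n2 - 1
f_critical = f.ppf(1 - 0.05, df1, df2)   # critical value at alpha = 0.05

print(round(f_calculated, 3))            # about 1.273
print(round(f_critical, 2))              # about 1.74
print(f_calculated > f_critical)         # False -> the null hypothesis cannot be rejected
```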
from the f distribution table you can easily find out the critical f distribution at alpha and the degrees of freedom of the samples of two different processes df1 and df2 let us proceed to the next topic of this lesson in the following screen in this topic we will discuss exploratory data analysis in detail let us learn about multivariate studies in the following screen multivariate studies or multi-variable studies are used to analyze variation in a process analyzing variation helps in investigating the stability of a process more stable the process less is the variation multivariate studies also help in identifying areas to be investigated finally they help in breaking down the variation into components to make the required improvements multivariate studies classify variation sources into three major types positional cyclical and temporal click each type to learn more positional variation occurs within a single piece or a product variation in pieces of a batch is also an example of positional variation in positional variation measurements at different locations of a piece would produce different values suppose a company is manufacturing a metal plate of thickness one inch and the plate thickness is different at many points it is an example of positional variation some of the other examples can be pallet stacking in a truck temperature gradient in an oven variation observed from cavity to cavity within a mold region of a country and line on invoice cyclical variation occurs when measurement differs from piece to piece or product to product but over a short period of time time is important because certain measurements may change if a product such as a hot metal sheet is measured after a long interval if the measurement at the same location in a piece varies with different pieces it is an example of cyclical variation other examples of cyclical variations are match to batch variation lot to lot variation and account activity week to week temporal variation occurs over a longer period of time such as machine wear and tear and changes in efficiency of an operator before and after lunch temporal variations may also be seasonal if the range of positional variation in a piece is more in winter than in summer it is an example of temporal variation the variation may occur because of unfavorable working conditions in winter process drift performance before and after breaks seasonal and shift based differences month-to-month closings and quarterly returns can be examples of temporal variation we will learn about creating a multivariate chart in this screen the outcome of multivariate studies is the multivariate chart it depicts the type of variation in the product and helps in identifying the root cause there are five major steps involved in creating a multivariate chart they are select process and characteristics decide sample size create a tabulation sheet plot the chart and link the observed values click each step to learn more the first step is to select the process and the relevant characteristics to be investigated for example selecting the process where the plate of one inch thickness is being manufactured in this process four equipment numbered one to four produce the one-inch plates the characteristic to be measured is the thickness of the plate ranging from 0.95 inch to 1.05 inches any plate thickness outside this range is a defect the second step is to decide the sample size and the frequency of data collection in this example the sample size is five pieces per equipment and the 
frequency of collecting data is every two hours starting from eight in the morning to two in the afternoon then the tabulation sheet is created where the data will be recorded so one should measure the thickness of the plate being produced by the four equipment at a data collection frequency of two hours the third step is to create a tabulation sheet in this example the tabulation sheet with data records contains the columns with time equipment number and thickness as headers the fourth step is to plot the chart in this example a chart can be plotted with time on the x-axis and plate thickness on the y-axis the last step is to link the observed values in this example the observed values can be linked by appropriate lines we will continue to learn about creating a multivariate chart in this screen the path to create a multivariate chart in minitab is by selecting stat then quality tools followed by multivariate chart the multivariate chart created from the data recorded is shown here the upper specification limit of 1.05 inches and the lower specification limit of 0.95 inches has been marked by green lines data outside these lines are defects the blue dots show the positional variation the dots are the measurements of pieces in a batch of any single equipment the black lines join the mean of the data recorded from the equipment the mean of the data recorded from the products of equipment number three is much below the similar mean of other equipment this shows that equipment number three is producing more defects than the other equipment the red line is the mean of the data recorded at a particular time the red line rises toward the right which means the data points shift up after 12 pm this may be because of the change in operator efficiency after a lunch break at 12 pm multivariate chart helps us visually depict the variations and establish the root causes of variations in the next screen we will learn about simple linear correlation correlation means association between variables simple linear regression and multiple regression techniques are very important as they help in validating the association of x with y the coefficient correlation shows the strength of the relationship between y and x to associate y with a single x and statistically validate the relationship correlation is used in excel use equals corel open bracket close bracket function to calculate correlation coefficient the dependent variable y may depend on many independent variables x but correlation is used to find the behavior of y as one of the x's changes correlation helps us to predict the direction of movement in values in y when x changes statistical significance of this movement is denoted by correlation coefficient r it is also known as pearson's coefficient of correlation in any correlation the value of the correlation coefficient is always between minus one and plus one positive value of r denotes the direction of movement in both variables is the same as x increases y also increases and vice versa negative value of r denotes that the direction of movement in both variables is in inverse fashion as x increases y decreases and as x decreases y increases when the value of r is zero it means that there is no correlation between the two variables higher the absolute value of r stronger the correlation between y and x absolute value of a number is its value without the sign plus four has an absolute value of four and minus four again has an absolute value of four an r value of greater than plus 0.85 or lesser than 
minus 0.85 indicates a strong correlation hence r value of minus 0.95 shows a stronger correlation than the r value of minus 0.74 the next screen will elaborate on correlation with the help of an example and illustrations through scatter plots the four graphs on the screen are scatter plots displaying four different levels of correlation correlation measures the linear association between the dependent variable or output variable y and one independent or input variable x as can be deduced from the graphs a definite pattern emerges as the absolute value of correlation coefficient r increases it is easy to see a pattern in r value of 0.9 and above than to see a pattern in r value of 0.07 it is difficult to find a pattern below correlation coefficient of 0.5 click the example button to know more to understand how correlation helps let us consider an example a correlation test was performed on the scores of a set of students from their first grade high school and under graduation the undergraduation score was the dependent variable and first grade score the independent variable the value of correlation coefficient r was calculated to be 0.29 and the correlation between undergraduation scores and high school scores was 0.45 this means the high school scores have higher correlation compared to the first grade scores this states the performance of students in high school is a better indicator of their performance in undergraduation than their performance in the first grade although the correlation exists as both the values of r are less than 0.85 it will be difficult to draw a straight line on the scanner plot in this screen we will learn about regression although correlation gives the direction of movement of the dependent variable y as independent variable x changes it does not provide the extent of the movement of y as x changes this degree of movement can be calculated using regression if a high percentage of variability in y is explained by changes in x one can use the model to write a transfer equation y is equal to f x and use the same equation to predict future values of y given x and x given y the output of regression on y and x is a transfer function equation that can predict values of y for any other value of x transfer function is generally denoted by f and the equation is written as y equals to f of x y can be regressed on one or more x's simultaneously simple linear regression is for 1x and multiple linear regression is for more than one x the next screen will focus on key concepts of regression there are two key concepts of regression transfer function to control y and vital x click each concept to learn more the output of regression is a transfer function f of x although the transfer function f of x gives the degree of movement in y as x changes it is not the correct transfer function to control y as there may be a low level of correlation between the two the main thrust of regression is to discover whether a significant statistical relationship exists between y and a particular x that is by looking at p values based on regression one can infer the vital x and eliminate the unimportant x's the analyze phase helps in understanding if there is statistical relevance between y and x if the relevance is established using metrics from regression analysis one can move forward with the tests the simple linear regression or slr should be used as a statistical validation tool in the beginning of this phase in this screen we will understand the concept of simple linear regression a simple 
linear regression equation is a fitted linear equation and is represented by the equation shown here in this equation y is the dependent variable and x is the independent variable a is the intercept of the fitted line on the y axis which is equal to the value of y at x equal to zero b is the regression coefficient or the slope of the line and c is the error in the regression model which has a mean of zero the next screen will focus on the least squares method in simple linear regression with reference to the error mentioned earlier if correlation coefficient of y and x is not equal to one meaning the relation is not perfectly linear there could be several lines that could fit in the scatter plot notice the two graphs displayed for the same set of five points two different types of lines are drawn and both of them have errors error refers to the points on the scatter plot that do not fall on the straight line drawn the second graph shows the y-intercept statistical software like minitab fits the line which has the least value of error squared and added as is clear from the graph error is the distance of the point from the fitted line typically the data lies off the line in perfect linear relation all points would lie on the line and error would be zero the distance from the point to the line is the error distance used in the sse calculations let's understand slr with the help of an example in the next screen consider the following example suppose a farmer wishes to predict the relationship between the amount spent on fertilizers and the annual sales of his crops he collects the data shown here for the last few years and determines his expected revenue if he spends eight dollars annually on fertilizers he has targeted sales of thirty one dollars this year the steps to perform simple linear regression in ms excel are as follows copy the data table on an excel worksheet select all the data from b1 to c6 this is assuming the years table appears in cells a1 to a6 click insert and choose the plain scatter chart it is titled scatter with only markers the basic scatter chart will appear as shown on the screen right click on the data points in the scatter chart and choose the option add trendline then choose the option linear and select the boxes titled display r squared value and display equation a linear line will appear which is called the best fit line or the least squares line to use the data for regression analysis the interpretation of the scatter chart is as follows the r-square value or the coefficient of determination conveys if the model is good and can be used the r-square value here is 0.3797 it means 38 percent of variability in y is explained by x the remaining 62 variation is unexplained or due to residual factors other factors like rain amount and variability sunshine temperatures seed type and seed quality could be tested the low value of r square statistically validates poor relationship between y and x thus the equation presented cannot be used for further analysis in a similar situation one should refer to the cause and effect matrix and study the relationship between y and a different x variable we will discuss multiple linear regression in this screen if a new variable x2 is added to the r square model the impact of x1 and x2 on y gets tested this is known as multiple linear regression the value of r square changes due to the introduction of the new variable the resulting value of r square which can be used in cases of multiple regression is known as r square adjusted the 
model can be used if r square adjusted value is greater than seventy percent we will look at the key concepts in the next screen the key concepts in multiple linear regression are as follows the residuals or the differences between the actual value and the predicted value give an indication of how good the model is if the errors or residuals are small and predictions use x's that are within the range of the collected data the predictions should be fine the sum of squares total can be calculated as follows sum of squares total or sst equals the sum of squares of regression or ssr plus sum of squares of error or sse to arrive at sum of squares of regression ssr use the formula ssr equals sum of squares total or sst minus sum of squares of error or sse since ssr is sse subtracted from sst value of sse should be less than sst r squared is sum of squares of regression or ssr divided by sum of squares total or sst calculating sst and sse helps in determining ssr and r square to get a sense of the error in the fitted model calculate the value of y for a given data using the fitted line equation to check for error take two observations of y at the same x the most important thing to remember in regression analysis is that the obtained fitted line equation cannot be used to predict y for values of x outside the data for example it would not be possible to predict the amount spent on fertilizers for a forecasted sales of 15 or sixty dollars both data points lie outside the data set on which regression analysis is performed if y is dependent on many x's then simple linear regression analysis can be used to prioritize x but it requires running separate regressions on y with each x if an x does not explain variation in y then it should not be explored any further these were the interpretations of the simple linear regression equation in the next screen we will learn that despite a relationship being established between two variables the change in one may not cause a change in the other let us discuss the difference between correlation and causation in the following screen a regression equation denotes only a relationship between the y and the x this does not mean that a change in one variable will cause a change in the other if number of schools and incidents of crime in a city rise together there may be a relationship but no causation the increase in both the factors could be due to a third factor that is population in other words both of them may be dependent variables to an independent variable consider the graphs shown on the screen the graphs on the left show the relations between number of sneezes and incidence of death with respect to population both have a positive correlation finding a positive correlation between incidents of deaths and number of sneezes does not mean we assume sneezing is the cause of somebody's death despite the correlation being very strong as depicted in the graph on the right let us proceed to the next topic of this lesson in the following screen in this topic we will discuss hypothesis testing in detail let us learn about statistical and practical significance of hypothesis test in the following screen the differences between a variable and its hypothesized value may be statistically significant but may not be practical or economically meaningful for example based on a hypothesis test nutra worldwide inc wants to implement a trading strategy which is proven to provide statistically significant returns however it does not guarantee trading on this strategy would result in 
economically meaningful positive returns when the logical reasons are examined before implementation the returns are economically significant the returns may not be significant when statistically proven strategy is implemented directly the returns may not be economically significant after accounting for taxes transaction costs and risks inherent in the strategy thus there should be a practical or economic significant study before implementing any statistically significant data the next screen will briefly focus on the conceptual differences between a null and an alternate hypothesis the conceptual differences between a null and an alternate hypothesis are as follows assume the specification of the current process is itself the null hypothesis null hypothesis denoted as the basic assumption for any activity or experiment is represented as h o no hypothesis cannot be proved it can only be rejected or disproved it is important to note that if null hypothesis is rejected alternative hypothesis must be right for example assuming that a movie is good one plans to watch it therefore the null hypothesis in this scenario will be movie is good alternative hypothesis or h a challenges the null hypothesis or is the converse of the null hypothesis in this scenario alternate hypothesis will be movie is not good in the following screen we will discuss type 1 and type 2 error rejecting a null hypothesis when it is true is called type 1 error it is also known as producer's risk for example the rejection of a product by the qa team when it is not defective will cause loss to the producer suppose when a movie is good it is reviewed to be not good this reflects type 1 error in this case the null hypothesis is rejected when it is actually true the two important points to be noted are significance level or alpha is the chance of committing a type 1 error the value of alpha is 0.05 or 5 percent accepting a null hypothesis when it is false is called type 2 error it is also known as consumer's risk for example the acceptance of a defective product by the quality analyst of an organization will cause loss to the consumer who buys it minimizing type 2 error requires acceptance criteria to be very strict suppose when a movie is not good it is reviewed to be good this reflects type two error in this case the alternate hypothesis is rejected when it was actually true the two important points to be noted are beta is the chance of committing a type 2 error the value of beta is 0.2 or 20 percent any experiment should have as less beta value as possible the next screen will cover the key points to remember about type 1 and type 2 error as you start dealing with the two types of errors keep the following points in mind the probability of making one type of error can be reduced when one is willing to accept a higher probability of making the other type of error suppose the management of a company producing pacemakers wants to ensure no defective pacemaker reaches the consumer so the quality assurance team makes stringent guidelines to inspect the pacemakers this would invariably decrease the beta error or type 2 error but this will also increase the chance that a non-defective pacemaker is declared defective by the quality assurance team thus alpha error or type 1 error increases if all null hypotheses are accepted to avoid rejecting true null hypothesis it will lead to type 2 error typically alpha is set at 0.05 which means that the risk of committing a type 1 error is 1 out of 20 experiments in case of any product the 
teams must decide what type of error should be less and set the value of alpha and beta accordingly in the next screen we will discuss the power of test the power of a hypothesis test or the power of test is the probability of correctly rejecting the null hypothesis when it is false power of a test is represented by 1 minus beta which is also the type 2 error the probability of not committing a type 2 error is called the power of a hypothesis test the power of a test helps in improving the advantage of hypothesis testing the higher the power of a test the better it is for purposes of hypothesis testing given a choice of tests the one with the highest power should be preferred the only way to decrease the probability of a type 2 error given the significance level or probability of type 1 error is to increase the sample size it is important to note that quality inspection is done on sample pieces and not on all the products so beta error is a function of the sample size if the sample size is not appropriate the defects in a product line could easily be missed out giving a wrong perception of the quality of the product this will increase the type ii error to decrease this error the quality assurance team has to increase the sample size in hypothesis testing alpha is called the significance level and one minus alpha is called the confidence level of the test in the next screen we will focus on the determinants of sample size for continuous data the sample size can be calculated by answering three simple questions how much variation is present in the population at what interval does the true population mean need to be estimated and how much representation error is allowed in the sample continuous data is data which can be measured the sample size for continuous data can be determined by the formula shown on the screen we will learn about the standard sample size formula for continuous data in the next screen representation error or alpha error is generally assumed to be 5 or 0.05 hence the expression of 1 minus alpha divided by 2 amounts to 0.975 or 97.5 percent looking up the value of z 97.5 from the z table gives the value 1.96 the expression reduces to the one shown on screen when alpha is 5 z is 1.96 to detect a change that is half the standard deviation one needs to get at least 16 data points for the sample click the example tab to view an example of continuous data calculation using standard sample size formula the population standard deviation for the time to resolve customer problems is 30 hours what should be the size of a sample that can estimate the average problem resolution time within plus or minus 5 hours tolerance with 99 percent confidence to know with 99 confidence that the time to resolve a customer problem ranges between 25 and 35 hours the value of z for 99.5 must be two point five seven five a good result should fall outside the range of zero point five percent which is one in two hundred trials it is expected that one hundred ninety nine out of two hundred trials will confirm a proper conclusion the calculation gives a result of 238.70 one cannot have point seven zero of a sample so one needs to round up to the nearest integer if there are 239 samples the significance level is greater than 0.01 which indicates the confidence is less than 99 using 239 reduces alpha and increases the confidence level the rounded up value 239 means the expectations are being met for the confidence level of the test we will learn about the standard sample size formula for discrete data in 
this screen like continuous data one can find out the sample size required while dealing with discrete population if the average population proportion non-defective is p then population standard deviation can be calculated by using the expression shown on the screen the expression for sample size is present it is important to note that in this expression the interval or tolerance is in percentage click the example tab to view an example of discrete data calculation using standard sample size formula the non-defective population proportion for pen manufacturing is eighty percent what should be the sample size to draw a sample that can estimate the proportion of compliant pans within plus or minus five percent with an alpha of five percent consider calculating the sample size for discrete data for which the population proportion non-defective is eighty percent and the tolerance limit is within plus or minus five percent substituting the values it is found the sample size should be 246. in this example to know if the population proportion for good pens is still within 75 to 85 percent and to have 95 percent confidence that the sample will allow a good conclusion one needs to inspect 245.86 pounds 0.86 of a pen cannot be inspected so the value is rounded up to maintain the confidence level inspecting 245 or fewer pens reduces the confidence level this means the z value would be lower than 1.96 and alpha would be greater than 0.05 suppose one is willing to accept a greater range in the estimate the proportion is within 20 percent of the past results and approximately within one standard deviation of the proportion delta changes to 0.20 and the number of needed samples is 15.4 is approximately equal to 16. this screen will focus on the hypothesis testing roadmap though the basic determinants of accepting or rejecting a hypothesis remain the same various tests are used depending on the type of data from the figure shown on the screen you can conclude the type of test to be performed based on the kind of data and values available for discrete data if mean and standard deviation are both known the z-test is used and if mean is known but standard deviation is unknown the t-test is used if the standard deviation is unknown and if the sample size is less than 30 it is preferable to use the t-test if variance is known one should go for chi-squared test if mean and standard deviation are known for a set of continuous data it is recommended to go for the z-test for mean comparison of two with standard deviation unknown go for t-test and for mean comparison of many with standard deviation unknown go for f-test also if the variance is known for continuous data go for f-test the next few screens will discuss in detail the tests for mean variance and proportions let us understand hypothesis test for means theoretical through an example in the next screen the examples of hypothesis testing based on the types of data and values available are discussed here the value of alpha can be assumed to be 5 or 0.05 suppose you want to check for the average height of a population north american males are selected as the population here from the population 117 men are gathered as the sample and the readings of their height are taken the null hypothesis is that the average height of north american males is 165 centimeters and the alternate hypothesis is that the height is lesser or greater than 165 centimeters consider the sample size n as 117 for z test and sample size n is 25 for t test sample average or x bar is 164.5 
centimeters using the data given let us calculate the z calc value and t calc value the population height is 165 centimeters with a standard deviation of 5.2 centimeters and the average height of the sample group is 164.5 centimeters the test for significant difference should be conducted first let us compute z calc value using the formula given on the screen hence the z calc is 1.04 which is less than 1.96 or t critical therefore the null hypothesis cannot be rejected since z 0.05 equals 1.96 the null hypothesis is not rejected at five percent level of significance the statistical notation is shown on the screen thus a conclusion based on the sample collected is that the average height of north american males is 165 centimeters if the population standard deviation is not known a t-test is used it is similar to the z-test instead of using the population parameter or sigma the sample statistic standard deviation or s is used in this example the s value is 5.0 let us now compute t value using the formula given on the screen the statistical notation to reject null hypothesis is shown on the screen the t critical value is 2.064 and we know the t calc value is 0.5 which is less than 2.064 therefore the null hypothesis cannot be rejected at five percent level of significance thus a conclusion based on the sample collected is that the average height of north american males is 165 centimeters the conclusion of not rejecting the null hypothesis is based on the assumption that the 25 males are randomly selected from all males in north america null and alternative hypotheses are same for both z-test and t-test in both the examples the null hypothesis is not rejected in the next screen we will understand the hypothesis test for variance with an example in hypothesis test for variance chi-square test is used in the case of a chi-square test the null and alternate hypotheses are defined and the values of chi-square critical and chi-square are calculated to understand this concept with an example click the button given on the screen the null hypothesis is that the proportion of wins in australia or abroad is independent of the country played against the alternate hypothesis is that the proportion of wins in australia or abroad is dependent on the country played against chi-square critical is 6.251 and chi-square calculated is 1.36 since the calculated value is less than the critical value the proportion of wins of the australia hockey team is independent of the country played or place in this screen we will discuss hypothesis test for proportions with an example the hypothesis test on population proportion can be performed to understand this with an example click the button given on the screen let us perform hypothesis test on population proportion the null hypothesis is that the proportion of smokers among males in a place named r is 0.10 represented as p 0. 
the alternative hypothesis is that the proportion is different than 0.10 in notation it is represented as null hypothesis is p equals p zero against alternative hypothesis is p different than p zero a sample of 150 adult males is interviewed and it is found that 23 of them are smokers thus the sample proportion is 23 divided by 150 which is 0.153 substituting this value in the expression of z given on the screen gives the result of 1.80 you can reject the null hypothesis at level of significance alpha if z is greater than z alpha for a five percent level of significance against the two-sided alternative the critical z value is 1.96 and the calculated z value of 1.80 falls just short of it however against the one-sided alternative that the proportion is greater than 0.10 the critical z value is 1.645 and the calculated value exceeds it so the null hypothesis is rejected hence it can be concluded that the proportion of smokers in r is greater than 0.10 in this screen we will focus on comparison of means of two processes means of two processes are compared to understand whether the outcomes of the two processes are significantly different this test is helpful in understanding whether a new process is better than an old process this test can also determine whether the two samples belong to the same population or different populations it is especially required for benchmarking to compare an existing process with another benchmarked process let us proceed to the next screen to learn about the two-sample comparison hypothesis test for means the example of a two mean t-test with unequal variances is discussed here null and alternate hypotheses are defined the average heights of men in two different sets of people are compared to see if the means are significantly different for this test the sample sizes means and variances are required to calculate the value of t two samples of sizes n one of one hundred twenty five and n two of one hundred ten are taken from the two populations the mean value of sample one is 167.3 and of sample two is 165.8 the standard deviations for samples 1 and 2 are 4.2 and 5.0 respectively using the formula given on the screen the t value is derived as 2.47 the null hypothesis is rejected if the calculated value of t is more than the required value of t in other words reject the null hypothesis at level of significance a if the computed t value is greater than t of df at a divided by two with a t-test we're comparing two means and the population parameter sigma is unknown therefore we're pooling the sample standard deviations in order to calculate t the variances are weighted by the number of data points in each sample group since the critical t value at 0.025 with over two hundred degrees of freedom is approximately 1.96 and the calculated t of 2.47 is larger the null hypothesis is rejected at the 5 percent level of significance the test used here is a two-sample t-test for comparing two means and is considered a very powerful test
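Both of the tests just described can be reproduced in a few lines. A sketch assuming scipy is available; note that the z value for the proportion test depends on whether the sample proportion or the hypothesized proportion is used in the denominator.

```python
from math import sqrt
from scipy.stats import norm, ttest_ind_from_stats

# one-sample proportion test from the text: 23 smokers out of 150, p0 = 0.10
p_hat, p0, n = 23 / 150, 0.10, 150
z = (p_hat - p0) / sqrt(p_hat * (1 - p_hat) / n)
print(round(z, 2))               # about 1.81 (about 2.18 if p0 is used in the denominator)
print(round(norm.sf(z), 3))      # one-sided p value

# two-sample t-test for the height example: means 167.3 and 165.8,
# standard deviations 4.2 and 5.0, sample sizes 125 and 110
t_stat, p_value = ttest_ind_from_stats(167.3, 4.2, 125, 165.8, 5.0, 110, equal_var=False)
print(round(t_stat, 2))          # about 2.47
print(round(p_value, 4))         # two-tailed p value, comfortably below 0.05
```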
in the next screen we will look into an example of the hypothesis test for variance or the f-test it is important to understand the different types of tests through an example susan is trying to compare the standard deviations of two companies according to her the earnings of company a are more volatile than those of company b she has been obtaining earnings data for the past 31 years for company a and for the past 41 years for company b she finds that the sample standard deviation of company a's earnings is four dollars forty cents and of company b's earnings is three dollars ninety cents determine whether the earnings of company a have a greater standard deviation than those of company b at five percent level of significance click the button given on the screen to know the answer susan has the data of the earnings of the companies distributions rarely have the same spread or variance when processes are improved one of the strategies is to reduce the variation it is important to be able to compare variances with each other a null hypothesis would indicate no change has occurred if it can be rejected and the variance is lower one can claim success the statistical notation for this example is given on the screen suppose one has to compare two sets of company data susan has looked at the earnings of two companies she has been studying the effects of strategy management styles and leadership profiles on the earnings of these companies there are significant differences in these kpiv's she wants to know if they have an effect on the variance in the earnings she has sample data over several decades for each company the question is whether from the given data it can be concluded that the earnings of company a have a greater standard deviation than those of company b in calculating the f-test statistic always put the greater variance in the numerator let us look at the f test example of hypothesis test for equality of variance in this screen the degrees of freedom for company a and company b are 30 and 40 respectively the critical value from the f table equals 1.74 the null hypothesis is rejected if the f test statistic is greater than 1.74 the calculated value of the f test statistic is 1.273 and therefore at the 5 percent significance level the null hypothesis cannot be rejected the next screen will focus on hypothesis tests f-test for independent groups a restaurant which wants to explore the recent overuse of avocados suspects there is a difference between two chefs in the number of avocados used to prepare the salads the data shown in the table is the measure of avocados in ounces the weight of avocado slices used in salads prepared by two different chefs is to determine if one chef is using more avocados than the other perhaps the restaurant's expenditure on avocados is greater this month than the average of the past 12 months this is assuming there is no change in avocado prices or the amount of avocados being used click the tab to learn to conduct an f-test in ms excel the f-test is conducted in ms excel through the following steps open ms excel click data click data analysis please follow the facilitator instruction on how to install add-ins select f-test two-sample for variances in variable 1 range select the data set for group a and select the data set for group b in variable 2 range click ok the screenshot of the f test window is also shown here in this screen we will discuss the f test assumptions before interpreting the f test the assumptions to be considered are null hypothesis there is no significant statistical difference between the variances of the two groups thus concluding any variation could be because of chance this is common cause of variation alternate hypothesis there is a significant statistical difference between the variances of the two groups thus concluding that variations could be because of assignable causes too this is special cause of variation the following screen will focus on f-test interpretations the interpretations for the conducted f-test are from the excel result sheet the p-value is 0.03 if the p-value is low or below 0.05 the null must be rejected thus the null hypothesis is rejected with 97 percent confidence also the assumption that variation could only be due to common cause of variation is rejected it is inferred from the test that there could be assignable causes of variation or special causes of variation excel
provides the descriptive statistics for each variable it also gives the degrees of freedom for each f is the calculated f statistic f critical is a reference number found in a statistics book table p of f less than or equal to f is the probability that f really is less than f critical or that the null hypothesis would be falsely rejected since the p-value is less than the alpha the null hypothesis can be confidently rejected alongside conducting a hypothesis test a meaningful conclusion from the test has been drawn the following screen will focus on hypothesis test t-test for independent groups as discussed earlier the table shows the measure of avocados in ounces and the significant difference in their means needs to be inspected if a significant amount of difference is found it can be concluded that there is a possibility of special cause of variation the next screen will demonstrate how to conduct the two-sample t-test the two-sample independent t-test inspects two groups of data for significant difference in their means the idea is to conclude if there is a significant amount of difference if there is a statistical evidence of variation one can conclude a possibility of special cause of variation the steps for conducting a two-sample t-test are open ms excel click data and click data analysis select two-sample independent t-test assuming unequal variances in variable 1 range select the data set for group a and select the data set for group b in variable 2 range keep the hypothesized mean difference as 0. click ok in the following screen we will focus on two sample independent t test assumptions the assumptions for a two sample independent t test are null hypothesis there is no significant statistical difference between the means of the two groups thus concluding any variation could be because of chance this is common cause of variation alternate hypothesis there is a significant statistical difference between the means of the two groups thus concluding that variations could be because of assignable causes too this is special cause of variation the null hypothesis states the mean of group a is equal to the mean of group b the alternate hypothesis states that the mean of group a is not equal to the mean of group b note that alternate hypothesis tests two conditions mean of a is less than mean of b and mean of a is greater than mean of b thus a two-tailed probability needs to be used before we interpret the t-test results let us compare the two-tailed and one-tailed probability in the next screen two-tailed probability and one-tailed probability are used depending on the direction of the alternate hypothesis if the alternate hypothesis tests more than one direction either less or more use a two-tailed probability value from the test example if mean of a is not equal to mean of b then it is two-tailed probability if the alternate hypothesis tests only one direction use a one-tailed probability value from the test example if mean of a is greater than mean of b then it is one-tailed probability in the next screen let us look at the two-sample independent t-test results and interpretations the results are shown in the table on the screen the inference is as the two-tailed probability is being tested the p-value of two-tailed probability testing is 0.24 which is greater than 0.05 if p-value is greater than 0.05 the null hypothesis is not rejected this means one cannot reject the fact that there is no significant statistical difference between the two means similar to the f-test excel provides 
the descriptive statistics for each group or variable the t stat is shown excel also shows one-tailed or two-tailed data for the one-tailed test the alpha is 0.05 the error is expected to be in one direction for the two-tailed test the error is alpha slash two or zero point zero two five in this example t stat or t calculated is less than either t criticals therefore the null hypothesis cannot be rejected thus it can be inferred that both the groups are statistically same we will discuss the paired t-test in the next screen paired t-test is another hypothesis test from the family of t-tests the following points will help in understanding the paired t-test in detail the paired t-test is one of the most powerful tests from the t-test family the paired t-test is conducted before and after the process to be measured for example a group of students score x in cssgb before taking the training program post the training program the scores are taken again one needs to find out if there is a statistical difference between the two sets of scores if there is a significant difference the inference could be that the training was effective it is important to note that the paired t-test interpretation shows the effectiveness of the improvement measures this is the main reason why paired t-tests are often used in the improve stage let us learn about sample variance in the following screen sample variance is defined as the average of the squared differences from the mean the sample variance that is s square can be used to calculate and understand the degree of variation of a sample it can also be used in statistics however it cannot be used or explained directly because its value does not provide any information to use the value you have to first convert it into standard deviation and then combine it with the mean click the button to know the steps for calculating the sample variance step one calculate the mean or average of the sample step two subtract each of the values from the mean step three calculate the square value of the results step four take the average of the squared differences let us understand how to calculate the sample variance with the help of an example in this screen consider the sample of weights the mean value is one hundred forty when you subtract the individual values from the mean take the square value of the results and then take the average of the squared differences you will get 1936 as the sample variance this number is not useful as it is in order to get the standard deviation take the square root of the sample variance square root of 1936 equals forty four the standard deviation in combination with the mean will tell you how much the majority of the people weigh in this example if your mean is one hundred forty and your variance is forty four you can conclude that the majority of people weigh between 96 pounds mean minus 44 and 184 pounds mean plus 44. 
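The sample variance arithmetic just described is easy to reproduce in code. A minimal sketch using python's statistics module; the two weights below are a tiny made-up sample chosen only so that the summary numbers match the text (mean 140, variance 1936, standard deviation 44).

```python
from math import sqrt
from statistics import mean, pvariance, pstdev

weights = [96, 184]               # hypothetical data, for illustration only

m = mean(weights)                 # 140
var = pvariance(weights)          # 1936; the "average of the squared differences from the
                                  # mean" as defined in the text (dividing by n), whereas
                                  # statistics.variance would divide by n - 1
sd = sqrt(var)                    # 44.0, the same as pstdev(weights)

print(m, var, sd)
print(m - sd, m + sd)             # 96.0 and 184.0, the range quoted in the example
```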
let us proceed to the next screen that focuses on the analysis of variance or anova which is the comparison of more than two means a t-test is used for one sample two sample tests are used for comparing two means to compare the means of more than two samples use the anova method anova stands for analysis of variance anova does not tell which mean is better it helps in understanding that all the sample means are not equal the shortlisted samples based on anova output can further be tested one important aspect of anova is that it generalizes the t test to include more than two samples performing multiple two-sample t-tests would increase the chance of committing a type one error hence anova is useful in comparing two or more means the next screen will help in understanding this concept through an example consider the takeaway food delivery time of three different outlets is there any evidence that the averages for the three outlets are not equal in other words can the delivery time be benchmarked on the outlet the null hypothesis will assume that the three means are equal if the null hypothesis is rejected it would mean that there are at least two outlets that are different in their average delivery time anova can be performed in minitab or other statistics packages ensure that the data of the table is stacked in two columns in the main menu go to stat anova and then one way the left column of the table will have the outlets and the right column will have the time in minutes this is similar to the table shown on the screen in the one-way analysis of variance window select the response as delivery time and factor as outlet and click ok the output of this process is shown here notice the p value which is much higher than 0.05 the steps to perform anova in excel are as follows after entering the data into a spreadsheet select the anova single factor test from the data analysis toolpak select the array for analysis designate that the data is in columns and select an output range excel shows the descriptive statistics for each column in the top table in the second table the anova analysis shows whether the variation is greater between the groups or within the groups it shows the sum of squares or ss degrees of freedom df mean squares ms or sum of squares divided by the degrees of freedom or variance the f statistic ms between divided by ms within the p-value and f-critical from a reference table the f and p are calculated for the variation that occurs within each of the groups and between the groups if the differences between the groups are significant it would be expected to see the between-groups ss much higher and the ms higher as well let us now interpret the minitab anova results since the p-value is more than 0.05 the null hypothesis is not rejected this means there is no significant difference between the means of delivery time for the three outlets based on the confidence intervals it is found that the intervals overlap which means there is little that separates the means of the three samples this was one-way anova where there was only one factor to be benchmarked that is the outlet of delivery if there are two factors you may use the two-way anova
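As an alternative to the Minitab and Excel steps above, the same one-way ANOVA can be sketched in a few lines. The delivery times below are made up for illustration, since the lesson's table is not reproduced in the transcript; scipy is assumed to be available.

```python
from scipy.stats import f_oneway

# hypothetical delivery times in minutes for three outlets
outlet_a = [28, 31, 27, 30, 29]
outlet_b = [30, 29, 32, 28, 31]
outlet_c = [29, 33, 30, 28, 30]

f_stat, p_value = f_oneway(outlet_a, outlet_b, outlet_c)
print(round(f_stat, 2), round(p_value, 3))

# a p value above 0.05 means the null hypothesis that all three mean delivery
# times are equal is not rejected, matching the interpretation in the lesson
print(p_value > 0.05)                     # True for this made-up data
```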
in this screen we will learn in detail about chi-square distribution the chi-square distribution is one of the most widely used probability distributions in inferential statistics the distribution is widely used in hypothesis testing and when used in hypothesis tests it needs only one sample for the test to be conducted the chi-squared distribution is also known as chi squared the sum of the squares of k independent standard normal random variables follows a chi-square distribution with k degrees of freedom suppose in a field for nine players player one comes in and can choose amongst all nine positions available player two can choose only amongst eight and so on after all the eight players have chosen their positions the last player gets to choose the last position left since eight players are free to choose in a playing field of nine eight is the degree of freedom for this example conventionally the degrees of freedom are n minus 1 where n is the sample size for example if w x y and z are four random variables with standard normal distributions the random variable f which is the sum of w square x square y square and z square has a chi-square distribution the degrees of freedom of the distribution or the df equals the number of normally distributed variables used in this case df equals four the formula for chi-squared distribution is shown on the screen it is important to note that f of o stands for an observed frequency and f of e stands for an expected frequency the next screen will explain the chi-square test through an example suppose the australian hockey team wishes to analyze its wins at home and abroad against four different countries the data has two classifications and the table is also known as a two by four contingency table with two rows and four columns the expected frequencies can be calculated assuming there is no relationship between the two classifications thus the expected frequency for each of the observed frequencies is equal to the product of the row total and the column total divided by the overall total one has to find out how to calculate the expected frequency if the observed frequency is three wins against south africa in australia then the expected frequency is the total wins at home which is 21 divided by the total number of wins or 31 with the result multiplied by the column total of five which gives 3.39 similarly the expected population parameters for all cases are found in this step all the information of the previous screen is combined and the table is populated the estimated population parameters are calculated and added the formula compares the observed and expected frequencies to calculate the final chi-square index which in this case is 1.36 it is important to note that there is a different chi-squared distribution for each of the different numbers of degrees of freedom for the chi-square distribution the degrees of freedom are calculated as per the number of rows and columns in the contingency table the equation for degrees of freedom should be noticed the number of degrees of freedom is equal to three assuming an alpha of ten percent the chi-squared distribution in the chi-square table is consulted and a critical chi-square index of 6.251 is arrived at the chi-square calculated value is 1.36 both the values of the chi-square index should be plotted the critical chi-square value divides the whole region into acceptance and rejection while the calculated chi-square value is based on data and conveys whether the data falls into the acceptance or rejection region therefore as the calculated value is less than the critical value and falls in the acceptance region the proportion of wins of the aussie team at home or abroad has nothing to do with the opponent being played against
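The same contingency-table calculation can be sketched with scipy. Only the first cell (3 home wins against South Africa) and the totals (21 home wins out of 31) come from the lesson; the remaining counts below are invented, so the calculated statistic will not exactly match the 1.36 quoted in the text.

```python
from scipy.stats import chi2, chi2_contingency

observed = [[3, 6, 7, 5],    # wins in australia, row total 21
            [2, 2, 3, 3]]    # wins abroad, row total 10

chi2_calc, p_value, dof, expected = chi2_contingency(observed)
print(round(expected[0][0], 2))               # about 3.39, the expected frequency from the lesson
print(dof)                                    # 3 degrees of freedom for a 2 x 4 table
print(round(chi2.ppf(1 - 0.10, dof), 3))      # critical value 6.251 at alpha = 0.10
print(round(chi2_calc, 2), round(p_value, 3)) # calculated index and p value for this table
```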
let us proceed to the next topic of this lesson in the next screen in this topic we will learn in detail about hypothesis testing with non-normal data let us begin with the mann-whitney test in the next screen the mann-whitney test also known as the wilcoxon rank sum test is a non-parametric test which is used to compare two unpaired groups in this test the value of alpha is by default set at 0.05 and the rejection and acceptance condition remains the same for different cases that is if p is less than alpha reject the null hypothesis if p is greater than alpha fail to reject the null hypothesis the aim of this test is to rank the entire data available for each condition and then compare the total outcome of the two ranks click the button to know the steps to perform the mann-whitney test to perform the mann-whitney test first rank all the values from low to high without paying any attention to the group to which each value belongs the smallest number gets a rank of one the largest number gets a rank of n where n is the total number of values in the two groups if there are ties continue to rank the values anyway pretending they are slightly different then find the average of the ranks for all the identical values and assign that rank to all these values continue this till all the whole number ranks have been used next sort the values into two groups these can now be used for the mann-whitney u test sum the ranks for the observations from sample one and then sum the ranks for the observations in sample two let us look at an example of the mann-whitney test in this screen suppose you have two sets of data g1 and g2 the g1 values are 14 2 5 16 and 9 and the g2 values are 4 2 18 14 and 8. now combine the g1 and g2 values sort them in ascending order and mention the group name against each value next rank the values from 1 to 10 and check if any values are identical take an average of the ranks of the identical values and place it against the identical values in the final rank column hence the average final rank is 1.5 for ranks 1 and 2. similarly the average final rank is 7.5 for ranks 7 and 8. next calculate r1 and r2 by adding the ranks of groups 1 and 2 respectively in this example the r1 value is 28 and the r2 value is 27 from the given data 5 is the value of both n1 and n2 the formula for the mann-whitney u test for the n1 and n2 values is u1 equals n1 multiplied by n2 plus n1 multiplied by n1 plus 1 whole divided by 2 minus r1 similarly u2 equals n1 multiplied by n2 plus n2 multiplied by n2 plus 1 whole divided by 2 minus r2 in this example the value of u1 is 12 and u2 is 13. now the u value can be calculated by taking the minimum value among 12 and 13 which is 12.
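A small sketch reproducing the Mann-Whitney ranking and U calculation for the G1 and G2 data above, done with numpy and scipy rather than by hand.

```python
# Mann-Whitney U calculation for the worked example (G1 and G2).
import numpy as np
from scipy import stats

g1 = np.array([14, 2, 5, 16, 9])
g2 = np.array([4, 2, 18, 14, 8])

# rank the pooled data, giving tied values the average of their ranks
pooled = np.concatenate([g1, g2])
ranks = stats.rankdata(pooled)                 # the two 2s share 1.5, the two 14s share 7.5
r1, r2 = ranks[:5].sum(), ranks[5:].sum()      # 28.0 and 27.0

n1, n2 = len(g1), len(g2)
u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1          # 12.0
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2          # 13.0
u = min(u1, u2)                                # 12.0, compared against the critical value of 2
print(r1, r2, u1, u2, u)

# the same comparison via scipy's built-in test (two-sided)
res = stats.mannwhitneyu(g1, g2, alternative="two-sided")
print(res.statistic, res.pvalue)
```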
look up the mann-whitney u test table for n1 equals five and n2 equals five you will get the critical value of u as two to be statistically significant the obtained u has to be equal to or less than this critical value our calculated value of u is 12 which is not less than two that means there is no statistically significant difference between the two groups in this screen we will learn about the kruskal-wallis test the kruskal-wallis test is named after william kruskal and w allen wallis it is also a non-parametric test used for testing the source of origin of samples for example whether the samples originate from the same distribution or not the characteristics of the kruskal-wallis test are as follows it is a one-way analysis of variance by ranks this test compares the medians of two or more samples to find out if the samples are from different populations since this test is a non-parametric method it does not assume the normal distribution of the residuals unlike the analogous one-way analysis of variance for this test the null hypothesis is that the medians of all groups are equal and the alternate hypothesis is that at least one population median of one group is different from the population median of at least one other group let us learn about the mood's median test in the next screen the mood's median test is also a non-parametric test that is used to test the equality of medians from two or more different populations this test works when the output y variable is continuous discrete ordinal or discrete count while the input x variable is discrete with two or more attributes click the button to view the steps involved in the mood's median test following are the steps in the mood's median test first find the median of the combined data set next find the number of values in each sample that are greater than the median and form a contingency table then find the expected value for each cell and finally find the chi-square value we will learn about the friedman test in this screen the friedman test is another form of a non-parametric test it does not make any assumptions about the specific shape of the population from which the sample is drawn and therefore allows smaller sample data sets to be analyzed unlike anova the friedman test does not require the data set to be randomly sampled from normally distributed populations with equal variances this test uses a two-tailed hypothesis test where the null hypothesis is that the population medians of each treatment are statistically identical to the rest of the group in the next screen we will learn about the one sample sign test the one sample sign test is the simplest of all the non-parametric tests and can be used instead of a one-sample t-test it is similar to the concept of testing if a coin is fair in showing heads or tails here the null hypothesis represented as h0 is that the median of the sample or of the population it belongs to equals the hypothesized or assumed median click the button to view the steps involved in the one sample sign test following are the steps in a one sample sign test first count the number of positive values these are the values that are larger than the hypothesized median next count the number of negative values these are the values that are smaller than the hypothesized median finally test the values to check if there are significantly more positive values or negative values than expected
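A hedged sketch of how the tests just described can be run with scipy. The three sample lists and the hypothesized median of 18 are made up purely for illustration, and scipy has no dedicated one-sample sign test, so it is built here from a binomial test on the counts above and below the median.

```python
# Illustrative scipy calls for the non-parametric tests above.
import numpy as np
from scipy import stats

a = [12, 15, 14, 11, 39]
b = [14, 17, 21, 19, 22]
c = [20, 22, 19, 24, 25]

# kruskal-wallis: do the samples come from populations with equal medians?
print(stats.kruskal(a, b, c))

# mood's median test: counts above/below the grand median form a contingency
# table, which is then tested with a chi-square statistic
stat, p, grand_median, table = stats.median_test(a, b, c)
print(stat, p, grand_median)
print(table)

# friedman test: a, b, c treated as three treatments measured on the same
# five subjects (repeated measures)
print(stats.friedmanchisquare(a, b, c))

# one sample sign test on sample b against a hypothesized median of 18:
# count values above and below the median and apply a binomial test
x = np.array(b)
hypothesized_median = 18
n_pos = int((x > hypothesized_median).sum())
n_neg = int((x < hypothesized_median).sum())
print(stats.binomtest(n_pos, n_pos + n_neg, p=0.5))
```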
this screen will focus on the one sample wilcoxon test the one sample wilcoxon test also known as the wilcoxon signed rank test is another form of a non-parametric test this test is equivalent to the parametric one-sample t-test and more powerful than the non-parametric one-sample sign test let us discuss the characteristics of this test in the following screen some of the characteristics of this test are as follows this test assumes that the sample is randomly taken from a population with a symmetric frequency distribution around the median also in this test the symmetry can be observed with the histogram or by checking if the median and mean are approximately equal the conclusion in this test is that if the sample median is consistent with the hypothesized value you can continue and accept the null hypothesis if not you reject the null hypothesis in favor of the alternate click the button given on the screen to view an example let us consider an example the median customer satisfaction score of an organization has always been 3.7 and the management wants to see if this has changed they conduct a survey and get the results grouped by the customer type the conclusion will be as follows if the median value is 3.7 the null hypothesis h0 can be accepted if not the null hypothesis is rejected the alpha value will be 0.05 choose from over 300 in-demand skills and get access to 1 000 plus hours of video content for free visit scale up by simply learn click on the link in the description to know more this lesson will focus on the improve phase of the dmaic process the improve phase comes after the analyze phase in the analyze phase the data was analyzed and some patterns were found to identify where the problem lies design of experiments or doe consists of a series of planned and scientific experiments that test various input variables and their eventual impact on the output variable design of experiments can be used as a one-stop alternative for analyzing all influencing factors to arrive at a successful model doe is applicable where multiple input variables known as factors affect a single response variable an output variable is the variable which may get affected due to multiple input variables doe is preferred over one factor at a time or ofat experiments because it does not miss interactions with techniques like blocking experimental error can be reduced the trials should be randomized to avoid concluding that a factor is significant when the time at which it is measured or the sequence followed may have influenced the response an example of blocking is highlighted in the table given on the screen with techniques like replication many experiments can be conducted to ensure a robust model we will understand the concept of design of experiments through an example in the upcoming screen to understand doe and the main effects consider the following example suppose the objective of the experiment is to achieve uniform part dimensions at a particular target value to reduce variations the inputs x or factors that affect the output are cycle time mold temperature holding pressure holding time and material type the process is the molding process and the output or the response of the experiment is the part hardness the components of the doe in this example will be described in the next screen output response factors levels and interactions are the components of the doe in the given example click each component to learn more the response variable is the part hardness and is measured as a result of the experiment and is used to judge the effects of factors factors of this experimental setup are cycle time mold temperature holding pressure holding time and
material type the settings at which factors can be varied are called levels the molding temperature can be set at 600 degrees fahrenheit or 700 degrees fahrenheit plastic type can be fillers or no fillers and the material type has two levels of nylon or acetal interactions refer to the degree to which factors depend on one another some experiments evaluate the effect of interactions in the molding example the interaction between cycle time and molding temperature is critical the best level for time depends on what temperature is set if the temperature level is higher the cycle time may have to be decreased to achieve the same response from the experiment let us understand full factorial experiments through an example a full factorial experimental design contains all combinations of all levels of all factors this experimental design ensures no possible treatment combinations get omitted hence full factorial designs are often preferred over other designs the table shown here is for a two-way heat treatment experiment there are two factors oven time x2 and the temperature x1 at which the material is drawn out of the oven the output y of the experiment is the hardness of the material each of the factors has two levels this example illustrates the concepts of main factor and interaction effects from the table it is clear that without repetition the experiment will have four different outcomes based on the changes in levels of factors each experimental trial here is repeated to give a total of eight values let us now analyze the main effect an analysis of the means helps in understanding how a change in the temperature at which the material is drawn creates a difference in the average part hardness this affects the output and is called the main effect analysis of means also tells how a change in oven time creates a difference in the average part hardness this is also a main effect analysis of means also explains how the interaction between temperature and time affects the average part hardness this is known as the interaction effect let us next understand the concept of the main effect for calculating the main effect the means have to be calculated hence to calculate the main effect of draw temperature the mean of the hardness values has to be calculated the values are populated in the corresponding columns of draw temperatures the columns have been labeled a1 and a2 the value of the mean of a1 is 91 and of a2 is 82. plotting the data on a graph shows that changing draw temperatures changes the average hardness similarly we calculate the mean of the hardness values in b1 and b2 the values are 87 and 86 which are plotted on a graph it can be seen that changing the oven time does not affect the average hardness now let us understand how the interaction between temperature and time affects the average part hardness to check how draw temperature and oven time interact the mean values are calculated by averaging the repeated responses in each cell hence the cell a1 b1 has the mean of the values 90 and 87. the cell a2 b1 has the mean of the values 84 and 87.
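A sketch of the main-effect and interaction calculation for the heat-treatment example. The transcript only lists the two B1 cells (90 and 87, 84 and 87); the B2 cell values below are assumed, chosen so that the stated means of A1 = 91, A2 = 82, B1 = 87 and B2 = 86 are reproduced.

```python
# Main effects and cell means for a 2x2 full factorial with two replicates per cell.
import numpy as np

# hardness responses: A = draw temperature, B = oven time
data = {
    ("A1", "B1"): [90, 87],
    ("A1", "B2"): [92, 95],   # assumed values
    ("A2", "B1"): [84, 87],
    ("A2", "B2"): [77, 80],   # assumed values
}

def level_mean(factor_index, level):
    vals = [v for key, cell in data.items() if key[factor_index] == level for v in cell]
    return np.mean(vals)

# main effect of draw temperature (A) and oven time (B)
print("A1 mean:", level_mean(0, "A1"), "A2 mean:", level_mean(0, "A2"))   # 91.0 vs 82.0
print("B1 mean:", level_mean(1, "B1"), "B2 mean:", level_mean(1, "B2"))   # 87.0 vs 86.0

# interaction: cell means used for the interaction plot
for key, cell in data.items():
    print(key, np.mean(cell))   # A1B1 = 88.5, A2B1 = 85.5, and so on
```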
after the mean values are calculated they are plotted on a graph the graph shows that to reduce the effect of interactions low draw temperature and high oven time should be selected to have the desired output of high hardness if low hardness is the desired output the experimental setup should have high draw temperature and high oven time the ideal case is represented by the parallel lines which give the desired output based on the main effect without being affected by the interaction between the factors the parallel lines are shown as dotted lines the means of the factors are also calculated and shown in the small table in this screen we will introduce the concept of runs in design of experiments the number of experiments in a doe setting is known as runs a full factorial experiment without replication on five factors and two levels is two raised to the power of five which equals 32 runs a full factorial experiment with one replication on five factors and two levels is 32 plus 32 which equals 64 runs a half fractional factorial experiment without replication on five factors and two levels is two raised to the power of five minus one that is two to the power of four which equals sixteen runs a half fractional factorial experiment with one replication on five factors and two levels is 16 plus 16 which equals 32 runs the number of combinations can be determined using the formula l to the power of f where l is the number of levels and f is the number of factors the half fractional factorial is calculated using the formula l to the power of f minus one where the exponent is reduced by one at three levels and five factors a full factorial experiment would amount to 243 trials and a half fractional factorial experiment would require 81 trials the difference between full factorial and half fractional factorial experiments can be seen from the number of runs let us proceed to the next topic of this lesson in the following screen in this topic we will discuss root cause analysis in detail we will learn about residuals analysis in the following screen while performing the regression analysis of a linear or non-linear model you will get a model with the predicted values some of the data might fit within that model whereas others may be scattered around it the modeled equation predicts one value for y at each level of x however the actual value of y observed at that level of x is different from the predicted value this difference between the observed value of the dependent variable y and the predicted value is called the residual the formula to calculate a residual is observed value minus predicted value residuals are considered to be errors and each data point has one residual you can validate the assumptions on the random errors namely that they are independent exhibit a normal distribution have a constant variance sigma squared for all the settings of the independent variables and finally have a mean of zero in the next slide we will continue to discuss residuals analysis as discussed in the previous screen while performing any regression analysis you will observe that not all the data fits into the linear model as the linear regression model is not always appropriate for the data therefore you should assess the appropriateness of the model by defining residuals and examining the residual plots if all assumptions are satisfied the residuals should randomly vary around zero and the spread of the residuals should be the same throughout the plot that is no systematic patterns are observed remember in residuals analysis both the sum and the mean of the residuals are zero
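A minimal residuals-analysis sketch, assuming made-up x and y data: fit a straight line, take observed minus predicted, and check that the residuals scatter randomly around a mean of roughly zero.

```python
# Residuals = observed value minus predicted value for a simple linear fit.
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8, 12.1, 14.2, 15.8])

slope, intercept = np.polyfit(x, y, 1)       # simple linear regression
predicted = slope * x + intercept
residuals = y - predicted                    # one residual per data point

print("mean of residuals:", residuals.mean())    # close to zero if the fit is adequate
print("residuals:", np.round(residuals, 3))
# a plot of residuals versus x (or versus the predicted values) should show no
# systematic pattern and roughly constant spread
```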
residuals and diagnostic statistics allow you to identify points that fit the model poorly have a strong influence on the estimated parameters or have high leverage it is helpful to interpret these diagnostics together to understand any potential problems with the model in the next screen we will learn about data transformation using the box-cox method the available data must be transformed when it does not exhibit the normal distribution box and cox in the year 1964 developed a procedure for estimating the best transformation to normality within the family of power transformations it works by taking the current y data and raising it to a power known as lambda the formula for the transformation of y is represented as y asterisk equals y to the power lambda minus one the whole divided by lambda this formula is used where the value of lambda is not zero if the value of lambda is zero you can use the natural logarithm to transform y the family of power transformations can be used for the following for converting a data set so that parametric statistics can be used here lambda is a parameter to be estimated from the data for any continuous data greater than zero this will not work when the values are less than or equal to zero and for transforming specification limits along with the data note that the use of the transformation does not guarantee normality in the next screen we will continue the discussion on data transformation using box-cox the table on the screen shows how the data can be transformed using lambda the first column lists the values of lambda and the second column shows the transformed value if the value of lambda is negative 2 it becomes y to the power negative two after the transformation which is one divided by y squared similarly if the value of lambda is negative one after transformation it becomes y to the power of negative one which is one divided by y and so on note that you will use a different formula when the value of lambda is zero wherein you will take the natural log of the value y similarly the other transformed values are also shown on the screen click the example button to know more let us look at an example of how data transformation is done using box-cox the difference between the original data and the data transformed using box-cox is shown on the screen figure one shows the original data plotted on a histogram here you can see that this data is not normally distributed in figure two the box-cox procedure is applied to the original data and it is transformed you can see that the data in the second figure is more normal than in figure one
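A short Box-Cox sketch using scipy; the right-skewed sample is generated for illustration, and the hand-written transform simply re-applies the y to the power lambda minus one over lambda formula with the estimated lambda.

```python
# Box-Cox transformation of positive, right-skewed data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
y = rng.lognormal(mean=0.0, sigma=0.8, size=200)    # positive, right-skewed sample

y_transformed, best_lambda = stats.boxcox(y)        # lambda estimated from the data
print("estimated lambda:", round(best_lambda, 3))

# the same transform written out: (y**lambda - 1) / lambda, or log(y) when lambda = 0
if abs(best_lambda) > 1e-12:
    manual = (y ** best_lambda - 1) / best_lambda
else:
    manual = np.log(y)
print(np.allclose(manual, y_transformed))           # True

# normality before and after, checked here with a Shapiro-Wilk test
print(stats.shapiro(y).pvalue, stats.shapiro(y_transformed).pvalue)
```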
let us learn about process input and output variables in the following screen process improvement has a few prerequisites before a process can be improved it must first be measured to assess the level of improvement required the first step is to know the input variables and output variables and check for any relationship the sipoc map and the cause and effect matrix are very helpful here there are many ways to measure the key process variables metrics such as the percent defective operation costs elapsed time backlog quantity and documentation errors can be used critical variables are best identified by the process owners once they are identified cause and effect tools are used to establish the relationship between variables a cause and effect matrix is shown on the screen the key process input variables have been listed vertically and the key process output variables horizontally for each of the output variables a prioritization number is assigned numbers which reflect the effect of each input variable on the output variable are entered in the matrix the process output priority is multiplied with the input variable ratings to arrive at the results for each input variable the values are added to determine the result for each input variable for process input variable 1 the ratings against the output variables are 3 4 and 7 with prioritization values of 4 7 and 11 respectively therefore multiplying the ratings with their corresponding prioritization numbers and adding them gives one hundred seventeen which is around thirty three percent of the total effect the process input variable results are compared to each other to determine which input variable has the greatest effect on the output variables click the cause and effect matrix template button to view another template a sample of the cause and effect matrix or ce matrix is shown here the ce matrix gives the correlation between input and output variables in this screen we will discuss the steps to update the ce matrix the steps for updating the cause and effect matrix are list the input variables vertically under the column process inputs list the output variables horizontally under the numbers 1 to 15. these output variables are important from the customer's perspective one can refer to either the qfd or the ctq tree to know the key output variables rank the output variables based on customer priority these numbers can also be taken from the qfd the input variables with the highest score become the point of focus in the project
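A small sketch of the cause and effect matrix scoring described above. Only the row for process input variable 1 (ratings 3, 4 and 7 against priorities 4, 7 and 11, giving 117) comes from the narration; the other two rows are assumed and were chosen so that input 1 lands near the quoted thirty three percent of the total.

```python
# Cause and effect matrix scoring: sum of (rating * output priority) per input.
import numpy as np

output_priority = np.array([4, 7, 11])        # customer priority of each output variable

ratings = {                                   # effect of each input on each output
    "input 1": np.array([3, 4, 7]),           # from the example above -> 117
    "input 2": np.array([6, 5, 6]),           # assumed
    "input 3": np.array([4, 5, 5]),           # assumed
}

scores = {name: int(r @ output_priority) for name, r in ratings.items()}
total = sum(scores.values())
for name, score in scores.items():
    print(name, score, f"{100 * score / total:.0f}% of total")
# input 1 scores 117; the inputs are then ranked to find where to focus the project
```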
another method to establish the cause effect relation is the cause and effect diagram this is explained in detail in the following screen the cause and effect diagram is used to find the root cause and the potential solutions to a problem a cause and effect diagram breaks down a problem into bite-sized pieces and also displays the possible causes in a graphic manner it is also known as the fishbone the 4m or the ishikawa diagram it is commonly used to examine effects or problems to find out the possible causes and to indicate the possible areas to collect data the steps involved in the cause and effect diagram are all the possible causes of the problem or effects selected for analysis are brainstormed the major causes are classified under the headings of materials methods machinery and manpower the cause and effect diagram is drawn with the problem at the point of the central axis line and the causes on the diagram are written under the classifications chosen the next screen illustrates the cause and effect diagram with the help of an example the diagram shows the cause and effect diagram for the possible causes of solder defects on a reflow soldering line this diagram helps in collecting data and discovering the root cause during brainstorming the group looked at all the major causes and then grouped them under the main headings under materials causes like the types of solder paste components and the component packaging used are considered the major causes under methods are technology and preventive maintenance similarly operator and schedule are placed under manpower while tools and oven are grouped under machinery causes the next screen will discuss another root cause analysis tool in detail the five whys is one of the tools used to analyze the root cause of a problem the responsibility of the root cause analysis lies with the five whys analysis team the technical experts have a great responsibility as the conclusion will be drawn from the way the drill down of the symptoms is carried out the five whys is a very simple tool as it poses the why question to every problem till the root cause is obtained it is important to know that the five whys tool does not restrict the interrogation to five questions why can be asked as many times as required till the root cause of the problem is found it can be used along with the cause and effect diagram the following screen will explain the process of the five whys technique the process for the five whys technique is identify the problem and emphasize the problem statement arrange for a brainstorming session with the team including subject matter experts process owners and team members explain the purpose and the problem statement analyze scenarios working backwards from the problem ask why for the answers obtained until the root cause is found normally reasons like insufficient resources and time become the root causes if the drill down in brainstorming is carried out in the right direction it is often found that the root cause is related to the process therefore the occurrence of a problem is often due to the process and not an individual or a team in this topic we will discuss lean tools in detail let us learn about lean techniques in the following screen the eight lean techniques are kaizen poka-yoke 5s just in time kanban jidoka takt time and heijunka click each technique to know more kaizen or continuous improvement is the building block of all lean production methods the kaizen philosophy implies that small incremental changes routinely applied and sustained over a long period of time result in significant improvements the second technique is poka-yoke it is also known as mistake proofing it is good to do it right the first time and even better to make it impossible to do it wrong the first time the prompt received to save a word document before closing it without saving is an example of poka-yoke 5s is a set of five japanese words which translate to sort set in order shine standardize and sustain this is a simple and yet powerful tool of lean the sort principle refers to sorting items according to a rule the rule could be frequency of use or time of use after sorting the objects are set in order the place for everything is defined and everything is placed accordingly cleaning of the area refers to the shine principle the fourth step requires the formation and circulation of a set of written standards the last step refers to sustaining the process by following the standards set earlier 5s is useful as a framework to create and maintain the workplace just in time or jit is another lean technique this technique philosophizes about producing the necessary units in the necessary quantity at the necessary time with the required quality as an item is removed from a shelf of a supermarket the system confirms it
and automatically sends a note for replenishment this kind of technique can be used in an organization to prevent the accumulation of inventory the fifth technique is known as kanban which means signboard in japanese kanban utilizes visual display cards to signal the movement of material between the steps of a product process this is one of the examples of visual control in lean the next technique is jidoka it means automation with a human touch and is sometimes known as autonomation jidoka implements a supervisory function in the production line and stops the process as soon as a defect is encountered the process does not start again till the root cause of the defect is eliminated takt time is the maximum time in which the customer demands need to be met for example a customer needs 100 products and the company has 420 minutes of available production time takt time equals time available divided by demand in this case the company has a maximum of 4.2 minutes per product this will be the target for the production line the final technique is heijunka which means production leveling and smoothing it is a technique to reduce waste occurring due to fluctuating customer demand let us understand the concept of cycle time reduction in this screen cycle time reduction refers to the reduction in the time taken for a complete process implementing lean techniques reduces cycle time and releases resources faster than any other method low cycle time increases productivity and throughput lean techniques release resources early achieving more production with the same machinery internal and external waste is reduced and the operational process is simplified with a decrease in product damage all these factors help in satisfying the customer and staying ahead of the competition the following screen describes the concept of cycle time reduction through an example the changes brought by implementing lean techniques on an existing process are illustrated in the given diagram things to be noticed are the number of operators used the work allocation to the operators the path or the movement in the process and the flow of the process notice the changes brought about by implementing lean techniques on the old process first the path followed by the material in between the process steps is considerably reduced this decreases the cycle time for the entire process second the number of operators is reduced to 3 when compared to 5 in the old process operator 1 can now work on process 1 and process 4. similarly operator 2 can work on process 2 and process 3. there is an increased productivity of the operators and the remaining skilled operators can be used in some other process or system
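A tiny sketch of the takt time calculation from the lean techniques above, using the 420 minutes and 100 units quoted in the example.

```python
# takt time = available production time / customer demand
available_minutes = 420        # minutes of production time per day
demand = 100                   # units the customer needs per day

takt_time = available_minutes / demand
print(f"takt time: {takt_time} minutes per unit")   # 4.2 minutes per product
```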
the next screen will introduce the concept of kaizen and kaizen blitz kaizen means good change in japanese kaizen is a continuous improvement method to improve the functions of an organization the improvements could be in process productivity quality technology and safety it brings in small incremental changes to the process kaizen blitz is also known as a kaizen event or kaizen workshop here the event is tightly defined and the scope is evident for implementation so processes can be easily changed and improved teams could improve processes through creative problem solving methods in structured workshops over a short time scale the next screen will provide the differences between kaizen and kaizen blitz the differences between kaizen and kaizen blitz are kaizen is a method that brings continuous improvement in the organization while kaizen blitz is a workshop or an event that brings in change kaizen brings in small incremental changes in the organization there are no major changes made within the processes kaizen blitz is applied when a rapid solution is required the kaizen method follows a step-by-step process it standardizes measures and compares the process with the requirement before improving it kaizen blitz plans for the event executes it arrives at a solution and follows it through all the people of the organization are involved in kaizen whereas kaizen blitz is led by the top management and others are invited to participate the decision making lies with the upper management in kaizen the process is standardized and measurements are regularly collected and compared before a decision is taken this relatively delays the process of decision making in kaizen blitz decisions are taken soon and the process change is wrapped up in three to five days kaizen is a continuous improvement method whereas kaizen blitz is part of the improvement process kaizen follows pdca in essence plan do check and act for the improvement process kaizen blitz uses pdca for execution where the events are planned conducted decided implemented and followed up the following screen will elaborate on the concepts of kaizen and kaizen blitz through examples kaizen and kaizen blitz are practiced in many organizations across the world the examples of the kaizen and kaizen blitz methods are shown here click each tab to know more the toyota production system is known for kaizen practices in toyota if any issue arises in the production line the line personnel cease all production until the issue is resolved once the solution is implemented the team resumes the production cycle a wood window company in the state of iowa us uses the kaizen blitz method to redesign their shop floor and replace expensive non-flexible automation with low-cost highly flexible cellular applications eliminating scrap reorganizing work areas and reducing inventory are some of the examples of quick implementation through kaizen blitz the term lean refers to creating more value for customers with fewer resources it means reducing unwanted activities or processes that do not add value to the product or service for the customer the lean philosophy is to provide perfect value to the customer through a perfect value creation process that has zero waste while the ultimate goal is to achieve zero waste you may not always get that in the first couple of tries however you will achieve minimum waste and continue to move towards zero waste
eventually hence lean is the path towards perfection lean is about optimizing the process from beginning to end eliminating non-value-adding activities or nvas and increasing flow to ensure that parts and services are provided to customers more quickly if quality is the word to describe six sigma then speed is the word to describe lean let's understand the importance of lean there are many benefits of lean and some of them are reduced cost reduced cycle time more throughput and increased productivity despite all of these benefits lean is not implemented by most organizations because of the misconception that it is only suited to manufacturing areas the reason for this misconception is the origin of lean it began and grew in popularity in manufacturing areas starting with toyota in recent years one can notice more applications of lean in other areas such as healthcare and the transactional space however the truth is that lean concepts can be applied in any business and in any process on the next screen let's discuss how lean and six sigma combine lean and six sigma are two different principles or methodologies that combine to create one powerful continuous improvement methodology they have various overlapping goals toward improvement with the aim of creating the most efficient system though the approaches are different the methods complement each other lean six sigma takes the power and rigor of the six sigma methodology and combines it with lean concepts leading to faster results better quality and improved customer satisfaction let's look at the differences between lean and six sigma lean focuses on efficiency by identifying value from the customer's point of view removing unnecessary steps in the process and improving process speed or velocity on the other hand six sigma focuses on effectiveness with the help of breakthrough processes identifying root causes and reducing variation therefore when six sigma is combined with lean it is possible to achieve business transformation so remember lean is about speed with a focus on efficiency six sigma is about quality with a focus on effectiveness and lean six sigma brings the best of both together to yield a better result first implement lean to streamline the process this helps to understand the chronic problems and the ways to handle them quickly once the problem is identified use the six sigma methodology to analyze the issues and provide business improvement transformation in other words lean is used to reduce the waste and six sigma is used to reduce the variation thanks to our training experts it was a real pleasure having you with us and with that we have come to the end of this full course video on six sigma i hope it was informative and interesting if you have any questions related to the topics that we covered in this video please ask away in the comment section below our team of experts will be delighted to hear from you thanks for watching stay safe and keep learning hi there if you like this video subscribe to the simply learn youtube channel and click here to watch similar videos turn it up and get certified click here
Info
Channel: Simplilearn
Views: 286,420
Rating: 4.9632792 out of 5
Keywords: simplilearn, six sigma full course, six sigma green belt training, six sigma explained, six sigma certification, six sigma course, learn six sigma, learn six sigma online free, learn six sigma course, six sigma training videos, six sigma training, six sigma training material, six sigma course for beginners, six sigma fundamentals, six sigma course free, six sigma, lean six sigma green belt training and certification, simplilearn six sigma
Id: KfFez57ay6E
Length: 408min 34sec (24514 seconds)
Published: Sat Feb 06 2021