Friday, March 9, 2012

Design Quality Measures

Industrial experiments have 2 fundamental goals:

1. Fundamental understanding

2. Predictive ability

Several design quality measures can help you judge how well a design can help you achieve these goals -- before you collect any data!

Design Measures Graphic

First let's look at quality measures that help us judge the ability of an experiment design to extract information about effects -- fundamental understanding.

VIFs: VIFs are Variance Inflation Factors. They tell you how "clean" the estimate of an effect is. A VIF of 1 indicates that the effect is free from any contamination from other effects. A design with all VIFs equal to 1 is ideal. VIFs larger than 1 indicate some level of contamination from other effects. VIFs larger than 5 generally indicate such a high level of contamination that the design will not separate effects well enough to learn anything about them. The model may still predict well -- you just won't be able to rank the effects.

Correlation Matrix: The correlation matrix also measures how "clean" effect estimates will be, but it is more detailed: it identifies the source of the contamination. A perfect correlation matrix has all ones on the main diagonal and zeroes everywhere else. This means each effect is correlated perfectly with itself and not at all correlated with anything else. Any off-diagonal elements that are not zero indicate some level of correlation between 2 effects -- contamination. Most experts advise keeping off-diagonal elements between -0.95 and 0.95. Once again, a design with a poor correlation matrix may allow you to fit a model that predicts well, but you won't be able to rank the effects.
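Here is a minimal sketch of how these two measures can be checked before any data are collected. It uses only the Python standard library and relies on the standard result that the VIF of each effect is the matching diagonal element of the inverse of the correlation matrix. The two small designs (`orthogonal` and `skewed`) and all function names are made up for illustration; your DOE software will report the same numbers for you.

```python
from math import sqrt

def correlation_matrix(columns):
    """Pearson correlation between every pair of effect columns.
    Columns are the coded effect settings (no intercept column)."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [sqrt(sum(v * v for v in c)) for c in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("effects are perfectly confounded")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vifs(columns):
    """VIF of each effect: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[i][i] for i in range(len(inv))]

# Hypothetical design 1: full 2x2 factorial with columns A, B, and A*B.
orthogonal = [[-1, 1, -1, 1], [-1, -1, 1, 1], [1, -1, -1, 1]]

# Hypothetical design 2: the same A and B columns with one extra run
# at (+1, +1), which makes A and B slightly correlated.
skewed = [[-1, 1, -1, 1, 1], [-1, -1, 1, 1, 1]]

print("orthogonal VIFs:", vifs(orthogonal))
print("skewed correlation A-B:", correlation_matrix(skewed)[0][1])
print("skewed VIFs:", vifs(skewed))
```

The orthogonal design gives VIFs of 1 and zero off-diagonal correlations. The extra run in the second design pushes both numbers slightly away from ideal.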

Relative Variance of Coefficients: The power for identifying the various coefficients should be as high as possible so you can rank the effects accurately. Try to keep the power over 0.8. As before, a model built from a design with low power for the relative variance of the coefficients could still predict well.

Next time let's look at design quality measures that tell us about predictive ability.

Wednesday, February 1, 2012

What Does R^2 Really Tell Us?

R^2 (R-squared) is the "Coefficient of Determination." Many of you have heard of this and even rely on it.

But what does R^2 really tell us?

Image: models (derivee.cours-de-math.eu)

R^2 is a number that varies from 0 to 1. Zero means there is no correlation at all between your factors and your response -- it's all noise. One means you have a perfect correlation between your factors and your response -- no noise at all. Anything in between indicates a combination of correlation and noise.

So how much of the variation can be explained by your model? R^2 gives us that proportion. If R^2 is 0.9, for example, then 90% of the variation in the data can be explained by the model.

PLEASE NOTE: 90% of the variation in the data collected so far can be explained by the model. R^2 tells us nothing about the model's ability to make predictions.
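For reference, this proportion is usually written as below. The notation is chosen here for illustration: y_i are the measured responses, \hat{y}_i are the values the fitted model gives for those same runs, and \bar{y} is the mean of the measured responses.

```latex
R^2 \;=\; 1 - \frac{SS_{\text{residual}}}{SS_{\text{total}}}
    \;=\; 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}
                   {\sum_i \left(y_i - \bar{y}\right)^2}
```

Every term in the formula comes from runs you have already made, which is why R^2 describes only the data collected so far.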

A consequence of this is that a high R^2 does not necessarily mean you have a model that predicts well. Suppose you collect data for a line at both the high and low ends. The data may fit beautifully at the ends, providing an excellent R^2 value, but there is curvature your line cannot predict. Your predictions in the center will be very poor despite the high R^2 value.
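The sketch below illustrates the situation just described. Everything in it is made up for illustration: the curved "true" response, the six measurements at the two ends, and the function names. It fits a straight line by ordinary least squares, then compares the line's prediction at the center with the assumed true value.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(actual, predicted):
    """1 - SS_residual / SS_total for the data in hand."""
    mean = sum(actual) / len(actual)
    ss_tot = sum((y - mean) ** 2 for y in actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1 - ss_res / ss_tot

def true_response(x):
    """Hypothetical curved process, used only to judge the line."""
    return 2 + x + 0.2 * x * (10 - x)

# Hypothetical data: three replicates at the low end (x = 0) and three
# at the high end (x = 10), each with a little measurement noise.
xs = [0, 0, 0, 10, 10, 10]
ys = [1.9, 2.1, 2.0, 11.9, 12.1, 12.0]

slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [intercept + slope * x for x in xs])
line_at_center = intercept + slope * 5
truth_at_center = true_response(5)

print("R^2 of the line:", round(r2, 4))
print("line predicts at x = 5:", round(line_at_center, 2))
print("assumed true value at x = 5:", round(truth_at_center, 2))
```

In this made-up case the R^2 is above 0.99, yet the line is off by about 5 units at the center because the design gave it no way to see the curvature.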

So if 90% of the variation is explained by the model, is the remaining 10% noise? Not necessarily. It could be that the model would not fit your data perfectly even if there were no noise at all (as a matter of fact, this is very likely!). This lack of fit accounts for some of the 10%.

A consequence of this is that if you have a lot of noise in your data, you can never get a very large R^2 even with a perfect model. If you really have 20% noise in your data, the very best R^2 possible is only 0.8. Many people have a "magic" number for R^2 that must be met or exceeded, like 0.999, or 0.95. Clearly this would require an excellent data set with very little noise.
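Here is a small, deliberately artificial example of that ceiling. The signal and noise values are invented so that the noise is exactly 20% of the total variation, and the "perfect model" is simply the assumed noise-free signal.

```python
def r_squared(actual, predicted):
    """1 - SS_residual / SS_total for the data in hand."""
    mean = sum(actual) / len(actual)
    ss_tot = sum((y - mean) ** 2 for y in actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1 - ss_res / ss_tot

# Hypothetical process: the true (noise-free) response is 2 * x.
x = [-1, -1, 1, 1]
signal = [2 * xi for xi in x]          # [-2, -2, 2, 2]

# Hypothetical noise, chosen to average zero at each setting of x.
noise = [1, -1, 1, -1]

# What you would actually measure.
measured = [s + e for s, e in zip(signal, noise)]

# Share of the total variation that is pure noise.
mean = sum(measured) / len(measured)
noise_share = sum(e * e for e in noise) / sum((y - mean) ** 2 for y in measured)

# R^2 when the model is exactly the true signal.
best_r2 = r_squared(measured, signal)

print("noise share of variation:", noise_share)
print("R^2 of the perfect model:", best_r2)
```

Even though the model here is the true signal, 20% of the variation is noise, so R^2 comes out at about 0.8.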

R^2 is an interesting number, but it is limited in what it can actually tell you.

Next time, let's look at some measures of design quality and how they can help you pick the right experiment designs.

Friday, January 27, 2012

A Reward for "Effective Innovation" Readers

To thank you for reading the "Effective Innovation" blog you can have access to all ObDOE eCourses for one full year for just $450 -- half the normal price! This offer is only good through January 31, 2012, so don't wait.

You can claim your reward here.

What's the Catch?

There is no catch, but you must pay by Jan 31.

Is this exactly the same as the full price eCourses?

Yes

Is coaching included?

Yes

Why are you doing this?

I put a lot of thought into the "Effective Innovation" blog. I hope this offer tells you how much I appreciate your time spent reading it.

Can I tell my friends about this?

Yes -- then they can become readers as well!

Thank You! Bill Kappele

Monday, January 9, 2012

Parity Plots

Chemical Engineers have a method for looking at how well theoretical values match up with measured values -- the parity plot. This plot is useful to anyone who wants to compare theoretical values with actual measured values.

Parity Plot

Plot Courtesy of JMP, www.JMP.com

Here's the idea -- you plot the predictions from a simulation, model, mass balance, etc. on the y-axis and the corresponding actual measurements on the x-axis. Points that fall along a line through the origin with a slope of 1, with a high correlation, indicate good agreement between theory and reality. Anything else indicates that something is missing from your theory.

For example, if you create a Response Surface model to fit your data and the plot of actual values vs. predicted values has a slope of 1 and an R^2 of 0.9, you can feel pretty comfortable that your model is describing the data well. (Warning: this does not mean that your model will predict well in regions where you haven't collected data.)
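If you don't have a plotting package handy, you can still run the same check numerically. The sketch below fits a straight line to predicted vs. actual values and reports the slope, intercept, and R^2. The five actual/predicted pairs are invented for illustration, and the function name is hypothetical.

```python
def parity_summary(actual, predicted):
    """Slope, intercept, and R^2 of the least-squares line through
    (actual, predicted) points: x = actual, y = predicted."""
    n = len(actual)
    mx = sum(actual) / n
    my = sum(predicted) / n
    sxx = sum((x - mx) ** 2 for x in actual)
    syy = sum((y - my) ** 2 for y in predicted)
    sxy = sum((x - mx) * (y - my) for x, y in zip(actual, predicted))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = (sxy * sxy) / (sxx * syy)   # squared correlation coefficient
    return slope, intercept, r2

# Hypothetical measurements and the matching model predictions.
actual = [10, 20, 30, 40, 50]
predicted = [11, 19, 31, 39, 50]

slope, intercept, r2 = parity_summary(actual, predicted)
print("slope:", round(slope, 3))
print("intercept:", round(intercept, 3))
print("R^2:", round(r2, 4))
```

A slope near 1, an intercept near 0, and a high R^2 are the numerical version of points hugging the parity line.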

JMP automatically creates this plot for you when you use the "fit model" package. You can create this plot for yourself easily in other packages.

You can get a good idea of the quality of your theory (or model) using the simple parity plot.

Next time let's look at what R^2 really tells us.

Friday, December 30, 2011

Happy New Year!

We often forget that we can start fresh each day. The beginning of a new year reminds us, though, and so we set resolutions to help us re-invent ourselves.

Happy New Year

Image: vichie81 / FreeDigitalPhotos.net


Here are a few suggestions for changes you can make in your life to make yourself a more successful innovator:

  1. Get up early and read for one hour every day. This one habit will help you read up to 50 books a year! Focus on books that encourage creativity and the developing of good habits. Try to use what you learn each day, making new innovation skills a part of your life. Here are some recommendations to get you started:
    1. The Leader's Guide to Lateral Thinking Skills: Unlocking the Creativity and Innovation in You and Your Team, by Paul Sloane. This book will open your eyes to a more complete way of thinking that will inspire your creative, innovative spirit.
    2. Lateral Thinking, by Edward de Bono. Lateral Thinking is the necessary companion of logical thinking. The two together are the engine of innovation.
    3. Bill & Dave: How Hewlett and Packard Built the World's Greatest Company, by Michael S. Malone. Bill Hewlett and Dave Packard are two of the world's greatest innovators. You can learn how they innovated virtually everything that is good in the workplace today and gain inspiration for your own innovation.
  2. Listen to educational CDs and books on tape in your car. You can use your commute time to learn about anything you like. Your attention is focused on driving, not everything that is said, so you can listen over and over. Each time through the CD you will pick up different pieces of important information. Here are some CDs to get you started:

    1. Smart Thinking, by Edward de Bono. Learn how to improve your thinking skills from the world's expert on thinking.
    2. Eat That Frog!: 21 Great Ways to Stop Procrastinating and Get More Done in Less Time, by Brian Tracy. Procrastination is an innovation killer! Learn to stop procrastinating and innovate.
    3. Getting Things Done: The Art Of Stress-Free Productivity, by David Allen. Being overwhelmed is counterproductive to innovation. Get your life under control so you can free your mind for creative thought.

  3. Attend workshops and short courses on innovation skills. Workshops will teach you new skills and give you a chance to practice them on exercises before applying them in your work. In-person and online courses are available. Here are some suggestions:

    1. Guided Tour of ObDOE eCourses: For the modest sum of $75 per month you have access to a variety of online courses. You are assigned a Coach, a real human being, to answer your questions and help you make progress.
    2. Practical Measurement System Analysis: This is an in-house, one-day workshop to help you learn to evaluate your measurement systems to see if you can trust the data they provide.
    3. Performing Objective Experiments: This in-house, 3-day workshop will teach you the practical fundamentals of the most powerful experimental strategy for innovation -- Design of Experiments. If you attend this workshop and apply the skills you learn in your work, you will become a top performer.


Happy New Year! Good luck re-inventing yourself. May you find happiness.

Next time, let's look at Parity Plots and how to make them.

Thursday, December 22, 2011

How Probability Can Help us Out in Tricky Situations

Have you ever been confident you are doing the right thing, but you keep getting poor results? How can you explain this?

Everything we do has some probability of succeeding and some probability of failing. For example, if you play Blackjack perfectly, you will win just slightly more than half the hands and lose just slightly less than half the hands. Yes, you will win in the long run, but you will lose a lot along the way.

If you are very careful and thorough in all of your experimental work you will often draw correct conclusions -- but not always. Everything that comes from data is uncertain, so you will be misled from time to time. How often will you be misled? If you use 95% confidence limits to draw your conclusions, you can expect to be right about 95% of the time. While this is a lot better than playing Blackjack, it isn't perfect. And you can't tell when you will be misled. Sometimes you can be misled several times in a row. (This is called "rotten luck!") However, if you consistently use 95% confidence limits to draw conclusions, you will be right far more often than you are wrong over the long run.
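A little arithmetic shows how this plays out. The sketch below assumes each conclusion is independent of the others and is wrong with probability 5%. That assumption is a simplification made for illustration, and the function names are hypothetical.

```python
def misses_in_a_row(k, confidence=0.95):
    """Probability that k specific conclusions in a row are all wrong,
    assuming the conclusions are independent."""
    return (1 - confidence) ** k

def at_least_one_miss(n, confidence=0.95):
    """Probability of being misled at least once in n independent
    conclusions."""
    return 1 - confidence ** n

for k in (1, 2, 3):
    print(k, "wrong in a row:", misses_in_a_row(k))

for n in (1, 10, 20):
    print("at least one wrong in", n, "conclusions:",
          round(at_least_one_miss(n), 3))
```

Under these assumptions, being wrong twice in a row is rare (about 1 chance in 400), but being wrong at least once somewhere in 20 conclusions is more likely than not. Both facts fit the advice here: expect an occasional miss, and don't abandon a sound method because of one.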

It is very easy to get frustrated when you draw a conclusion that turns out to be wrong, but don't give up. Make sure you know the right way to draw conclusions from your data, then stay the course. You will get through the rough spots -- and there won't be that many of them. If you give up and start "winging it" instead of using good strategies, you will be frustrated far more often.

Next time, let's look at a few ideas for New Year's resolutions.

Friday, December 9, 2011

Probability: The Good News, and the Bad News

The world is uncertain. Data are uncertain, and everything that comes from data is uncertain.

Last week's blog discussed how Statistics was invented to help us deal with this uncertainty by making decisions with a high probability of being correct. But probability is a tricky subject.

The mathematical definition of probability is very precise, and very abstract: "Probability is a measure of sets in an abstract space of events" (The Lady Tasting Tea, p.301). What does this mean in real life?



When the weatherman says, "There is a 70% chance of rain," what does this mean? Does it mean that 70% of people will get wet? Does it mean that 70% of the time it will be raining? Does it mean that 70% of the area outside will receive rain? Clearly the answer to all of these questions is "no." But what does it really mean?

The most practical explanation of the meaning is that, based on past data, 70% of days with conditions like we have today have received rain. The take away message is: it may rain, so bring your umbrella.

In Statistics, a 95% probability generally means that if we repeat an experiment 100 times, about 95 of those times, on average, will turn out as we expect. So a 95% probability of rain means that of the next 100 days with conditions like we see today, we expect about 95 to receive rain.

So the good news is we can use probability to make better decisions.

Here's the bad news -- people really don't understand probability. Most people believe that if a fair coin is tossed 10 times in a row and comes up heads every time, then the next toss has a very high probability of being tails. It does not -- it is still 50%. The probability of an independent event is not affected by past history. Similarly, most people think that the probability of a 100-year flood is 1 in 100 in the year immediately after a flood and then increases with the time since the last flood. It doesn't change from year to year. And the probability of a 100-year flood occurring somewhere in a state is generally much higher than 1 in 100 -- there are many 100-year flood plains in a state, and any of them could flood in any year. (Cliff Mass discusses this for Washington State in his blog.)
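The flood arithmetic is easy to check. The sketch below assumes each year and each flood plain is independent, which real floods are not (one storm can flood several rivers at once), so treat the results as rough illustrations only. The counts of 30 years and 50 flood plains are made-up numbers.

```python
def at_least_one(probability, trials):
    """Probability that an event with the given per-trial probability
    happens at least once in a number of independent trials."""
    return 1 - (1 - probability) ** trials

annual_chance = 1 / 100   # definition of a "100-year" flood

# One flood plain, any single year: always 1 in 100, no matter how
# long it has been since the last flood.
one_year = at_least_one(annual_chance, 1)

# One flood plain over a hypothetical 30-year stretch.
thirty_years = at_least_one(annual_chance, 30)

# A hypothetical state with 50 independent flood plains, in one year.
fifty_plains = at_least_one(annual_chance, 50)

print("one plain, one year:", round(one_year, 4))
print("one plain, 30 years:", round(thirty_years, 4))
print("50 plains, one year:", round(fifty_plains, 4))
```

With these assumptions, the chance that some flood plain in the state sees a 100-year flood in a given year is nowhere near 1 in 100.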

Here's the bottom line: probability is your friend when used correctly. Just be careful to use it correctly. Consulting a Statistician is always a good idea.

Next time, let's look at how probability can help us out of tricky situations.