01-02-2017 12:32 PM
Happy 2017! I saw the #365Papers hashtag on Twitter, and I'm aiming to accomplish it. I may not read a paper a day (schedules and such) but I'll catch up on weekends or when I have time. I am aiming to read a variety of papers (not just SAS), and I'm planning to post my notes here on any relevant ones. I'll post future articles as replies to this thread.
For the first paper, I read LTC Douglas McAllaster's paper Basic Usage of SAS/ETS Software to Forecast a Time Series. Notes below are provided as written in the article, to avoid any misinterpretation on my part. I'll comment on the paper at the end.
Exploratory Data Analysis
The first and necessary step for proper time series analysis is an exploration of your data. Look at a time series plot of the data to visually identify the presence of three possible components: Level, Trend, and Seasonality. First, the plot may be flat, showing only random fluctuation about some constant LEVEL. Second, the plot may show an upward (or downward) TREND over time. Finally, the plot may show some obvious SEASONALITY.
Exponential Smoothing Ideas
The basic idea of this method is to project a series based mostly on recent data, i.e., without much regard to older data, which is why it is called a "short memory" method. The forecast for the next period should be based on the accuracy of the most recent forecast. We can describe smoothing methods as self-correcting, adaptive, and responsive.
Furthermore, the short memory of exponential smoothing makes it a particularly appropriate method when the time series components (level, trend, and seasonality) are changing over time.
There are three main exponential smoothing methods: single, Holt's, and Winters'. Single exponential smoothing is used for a series which has only the level component, Holt's exponential smoothing method for a series which also has a trend component, and Winters' exponential smoothing method for a series which also has a seasonal component.
Single Exponential Smoothing
SES is the simplest of the three main exponential smoothing methods.
SES relies on a single smoothing factor, which controls how much relative weight the procedure gives to newer versus older data. A factor near 1 gives the vast majority of the weight to the most recent observations, whereas a factor nearer 0 allows the older observations to influence the forecast. Typical values are 1/10 and 1/3.
To tell a "good" smoothing factor from a "bad" one, most texts recommend that analysts use a trial and error approach to minimise some measure of inaccuracy, such as absolute or squared error.
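As a rough illustration (my own sketch, not from the paper), here is single exponential smoothing with a trial and error search over the smoothing factor in plain Python. The function names, the made-up series, the candidate grid, and the choice to start the first forecast at the first observation are all my assumptions; SAS/ETS may initialise and optimise differently.

```python
def ses_forecasts(series, alpha):
    """One-step-ahead forecasts from single exponential smoothing.

    forecasts[t] is the forecast of series[t] made before observing it.
    Assumption: the first forecast is set equal to the first observation.
    """
    forecasts = [series[0]]
    for y in series[:-1]:
        previous = forecasts[-1]
        # Self-correcting form: adjust the last forecast by a share of its error.
        forecasts.append(previous + alpha * (y - previous))
    return forecasts


def sum_squared_error(series, alpha):
    """Squared-error measure of inaccuracy for a given smoothing factor."""
    forecasts = ses_forecasts(series, alpha)
    return sum((y - f) ** 2 for y, f in zip(series, forecasts))


def best_alpha(series, candidates):
    """Trial and error: keep the candidate factor with the smallest error."""
    return min(candidates, key=lambda a: sum_squared_error(series, a))


# Made-up data, purely for illustration.
series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 13.0, 15.0]
candidates = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9
alpha = best_alpha(series, candidates)
print("chosen smoothing factor:", alpha)
print("sum of squared errors:", round(sum_squared_error(series, alpha), 3))
```

Swapping the squared error for an absolute error only needs a change inside sum_squared_error, which is the "some measure of inaccuracy" choice mentioned above.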
Holt's idea is to smooth the trend component as well as the level component. That is, Holt's method is the proper one when the series has both the level and trend components.
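To make the "smooth the trend as well as the level" idea concrete, here is a minimal sketch of Holt's method in Python. It is my own illustration, not code from the paper: the names alpha and beta for the two smoothing factors and the simple initialisation (level from the first observation, trend from the first difference) are assumptions.

```python
def holt_forecast(series, alpha, beta, horizon=1):
    """Holt's linear-trend exponential smoothing (illustrative sketch).

    alpha smooths the level, beta smooths the trend.
    Assumption: initial level = first observation,
    initial trend = second observation minus the first.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        previous_level = level
        # Blend the new observation with the previous trend-adjusted level.
        level = alpha * y + (1 - alpha) * (previous_level + trend)
        # Blend the latest change in level with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    # Project the final level along the final trend.
    return [level + h * trend for h in range(1, horizon + 1)]


# Made-up series with a steady upward trend of 2 per period.
print(holt_forecast([2.0, 4.0, 6.0, 8.0, 10.0], alpha=0.3, beta=0.1, horizon=2))
```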
Winters extended Holt's method to smooth the third and final component, seasonality. Once an analyst understands the smoothing concept in Holt's method, using Winters' method should be an easy transition.
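For completeness, here is a hedged sketch of the additive form of Winters' method, again my own illustration rather than anything from the paper. The parameter name gamma for the seasonal smoothing factor and the crude initialisation from the first two seasons are assumptions, and a multiplicative variant also exists that this sketch does not cover.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Additive Holt-Winters smoothing (illustrative sketch).

    alpha, beta, gamma smooth the level, trend, and seasonal components.
    Assumption: the first two full seasons are used for a rough initialisation.
    """
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons")
    first_mean = sum(series[:period]) / period
    second_mean = sum(series[period:2 * period]) / period
    level = first_mean
    trend = (second_mean - first_mean) / period
    seasonals = [y - first_mean for y in series[:period]]

    for t in range(period, len(series)):
        y = series[t]
        season = seasonals[t % period]
        previous_level = level
        # Level: deseasonalised observation blended with the trend-adjusted level.
        level = alpha * (y - season) + (1 - alpha) * (previous_level + trend)
        # Trend: same update as in Holt's method.
        trend = beta * (level - previous_level) + (1 - beta) * trend
        # Seasonal: deviation from the new level blended with the old seasonal.
        seasonals[t % period] = gamma * (y - level) + (1 - gamma) * season

    n = len(series)
    return [level + h * trend + seasonals[(n + h - 1) % period]
            for h in range(1, horizon + 1)]


# Made-up series: constant level of 10 with a repeating +1 / -1 pattern.
print(holt_winters_additive([11.0, 9.0, 11.0, 9.0, 11.0, 9.0],
                            period=2, alpha=0.3, beta=0.1, gamma=0.2, horizon=2))
```

The level and trend updates are the same as in Holt's method, which is why the transition is meant to be easy; the only new piece is the seasonal update.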
All of these exponential smoothing procedures require initialization. Typically, statistical software packages perform a simple linear regression against the first few observations to produce an initial forecast of the level and trend components. Thereafter, the exponential smoothing procedure adjusts future forecasts based on the accuracy of the recent forecasts. Fortunately, forecasting is not sensitive to the initialisation method as long as there are ten or twenty observations, since exponential smoothing has short memory.
Time series data often exhibit something called auto-correlation, which means that observations are not independent of one another. If the observations are not independent, then neither will the residuals from the regression be independent.
There are two types of auto-correlation: positive and negative. Positive auto-correlation is more common in real data. This means that errors tend to stay positive for a while, then switch negative and stay negative for a while, and so on back and forth. Negative auto-correlation is not so easy to spot on a plot and occurs when errors bounce back and forth between positive and negative too often.
The Durbin-Watson statistic is limited in that it detects auto-correlation at lag 1 only. Nevertheless, lag 1 correlation is the most common pattern in real data, so the DW stat is a very useful diagnostic.
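As a quick illustration of the statistic (my own sketch, with made-up residuals), the Durbin-Watson value is the sum of squared successive differences of the residuals divided by the sum of squared residuals. Values near 2 suggest no lag 1 auto-correlation, values well below 2 suggest positive auto-correlation, and values well above 2 suggest negative auto-correlation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of regression residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Made-up residuals that stay positive, then stay negative (positive auto-correlation).
runs = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
# Made-up residuals that flip sign every period (negative auto-correlation).
flips = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print("runs: ", round(durbin_watson(runs), 3))
print("flips:", round(durbin_watson(flips), 3))
```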
The statistically correct method for handling auto-correlated data is to use the auto-regressive method. This method accounts for the dependency with an additional parameter to model the auto-correlation. SAS/ETS software includes PROC AUTOREG to fit an auto-regressive model.
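To show what "an additional parameter to model the auto-correlation" can look like, here is a sketch of a single Cochrane-Orcutt step in plain Python. To be clear, this is a textbook technique I have swapped in for illustration; it is not a reproduction of PROC AUTOREG, and the function names and made-up data are my assumptions.

```python
def ols(x, y):
    """Simple linear regression of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def lag1_autocorrelation(residuals):
    """Estimate of the lag 1 auto-correlation parameter from residuals."""
    numerator = sum(residuals[t] * residuals[t - 1]
                    for t in range(1, len(residuals)))
    return numerator / sum(e ** 2 for e in residuals)


def cochrane_orcutt_step(x, y):
    """One Cochrane-Orcutt step: returns (rho, intercept, slope)."""
    intercept, slope = ols(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rho = lag1_autocorrelation(residuals)
    # Quasi-difference both variables to strip out the lag 1 dependence.
    x_star = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
    y_star = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
    intercept_star, slope_star = ols(x_star, y_star)
    # The transformed intercept equals the original intercept times (1 - rho).
    return rho, intercept_star / (1 - rho), slope_star


# Made-up data: y = 2 + 3x plus errors that come in runs of the same sign.
x = [float(t) for t in range(12)]
errors = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0] * 2
y = [2.0 + 3.0 * xi + e for xi, e in zip(x, errors)]

rho, intercept, slope = cochrane_orcutt_step(x, y)
print("rho:", round(rho, 3), "intercept:", round(intercept, 3), "slope:", round(slope, 3))
```

In practice the step is usually repeated until rho settles down; one pass is enough to show the idea.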
I've been reading Time Series articles for a while, and I was very excited to read this article. However, I found the article doesn't cover enough of the topics to give novices a starting point, and doesn't present enough new information for experts to get anything new from it. I did like the short descriptions for some of the methods, but any Introduction-level article should have the same information.
Have a great day!
01-08-2017 05:46 PM
So it's now January 8th and I've been pretty decent at staying on task. I had to spend 2 days each on a couple of articles that were more involved than anticipated, but as of today I've read 8 articles.
The one I read today (Ten Simple Rules for Effective Statistical Practice, available here) is a great article for students and experienced analysts. As with the previous article, I will provide my notes strictly right from the article and then my thoughts. The article is from a biology journal, so the examples etc. used have a biological focus, but the principles still apply to all analysts.
Rule 1 - Statistical Methods Should Enable Data to Answer Scientific Questions
Inexperienced users of statistics tend to take for granted the link between data and scientific issues and, as a result, may jump directly to a technique based on data structure rather than scientific goal.
The questions asked should shift from "Which test should I use?" to "Where are the differentiated genes?"
Rule 2 - Signals Always Come With Noise
Variability can be good in some cases, because we need variability in predictors to explain variability in outcomes. Other times variability may be annoying, such as when we get three different numbers when measuring the same thing three times. This latter variability is usually called "noise", in the sense that it is either not understood or thought to be irrelevant. Statistical analyses aim to assess the signal provided by the data, the interesting variability, in the presence of noise, or irrelevant variability.
Rule 3 - Plan Ahead, Really Ahead
Asking questions at the design stage can save headaches at the analysis stage; careful data collection can greatly simplify analysis and make it more rigorous. As Sir Ronald Fisher put it: "To consult the statistician after an experiment is finished is merely to ask him [/her] to conduct a post mortem examination. He [/she] can perhaps say what the experiment died from."
Rule 4 – Worry about Data Quality
“Garbage in produces garbage out” – the complexity of modern data collection requires many assumptions about the function of technology, often including data pre-processing technology. It is highly advisable to approach pre-processing with care, as it can have profound effects that easily go unnoticed.
Try to understand as much as you can how these data arrived at your desk or disk. Why are some data missing or incomplete? Did they get lost through some substantively relevant mechanism? Understanding such mechanisms can help to avoid some seriously misleading results.
Tinker around with the data – exploratory data analysis is often the most informative part of the analysis. This is when data quality issues and outliers can be revealed.
Rule 5 – Statistical Analysis Is More Than a Set of Computations
Statistical software provides tools to assist analyses, not define them. The scientific context is critical, and the key to principled statistical analysis is to bring analytic methods into close correspondence with scientific questions.
A reader will likely want to consider the fundamental issue of whether the analytic technique is appropriately linked to the substantive questions being answered. Don’t make the reader puzzle over this: spell it out clearly.
Rule 6 – Keep it Simple
All else being equal, simplicity beats complexity. Start with simple approaches and only add complexity as needed, and then only add as little as seems essential. Keep in mind that good design, implemented well, can often allow simple methods of analysis to produce strong results.
Rule 7 – Provide Assessments of Variability
A basic purpose of statistical analysis is to help assess uncertainty, often in the form of a standard error or a confidence interval, and one of the great successes of statistical modeling and inference is that it can provide estimates of standard errors from the same data that produce estimates of the quantity of interest.
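A small sketch of that last point (mine, not the paper's): the same sample that gives the estimate of a mean also gives its standard error and an approximate 95% confidence interval. The function name and data are made up, and the 1.96 multiplier assumes a normal approximation; for a sample this small a t quantile would give a somewhat wider interval.

```python
import math
import statistics


def mean_with_interval(values, z=1.96):
    """Return (mean, standard error, approximate 95% confidence interval).

    Assumption: normal approximation with multiplier z = 1.96.
    """
    mean = statistics.fmean(values)
    # Standard error of the mean: sample standard deviation over sqrt(n).
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    interval = (mean - z * standard_error, mean + z * standard_error)
    return mean, standard_error, interval


# Made-up measurements.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, standard_error, interval = mean_with_interval(values)
print("estimate:", mean)
print("standard error:", round(standard_error, 4))
print("approx. 95% CI:", tuple(round(b, 3) for b in interval))
```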
Rule 8 – Check Your Assumptions
Every statistical inference involves assumptions, which are based on substantive knowledge and some probabilistic representation of data variation. It is therefore important to understand the assumptions embodied in the methods you are using and to do whatever you can to understand and assess these assumptions. At a minimum, you will want to check how well your statistical model fits the data.
Rule 9 – When Possible, Replicate!
Every good analyst examines the data at great length, looking for patterns of many types and searching for predicted and unpredicted results. This process often involves dozens of procedures, including many alternative visualizations.
Rule 10 – Make Your Analysis Reproducible
Given the same set of data, together with a complete description of the analysis, it should be possible to reproduce the tables, figures and statistical inferences. However, there are multiple barriers to this, such as different computing architectures, software versions, and settings.
Improve the ability to reproduce findings by being very systematic about the steps in the analysis and by sharing the data and code used to produce the results.
I think this is a very important paper as it covers some of the topics I always try and cover when I’m talking with students, and explains other issues that I have not thought of. Obviously a lot more can be said about each of the rules (for example, documentation of your code), but for a starting point, I think this does a great job. My mother taught me that the three pillars of a good relationship are Honesty, Communication and Respect, and I think these 10 Rules perfectly integrate with that philosophy.
My final thought is to review your analyses and your data with a friend or co-worker; I find it very helpful to walk through my work with someone completely unfamiliar with the topic, as they will ask questions that force me to examine my assumptions.
Until next time
01-15-2017 09:11 PM
The next article I read in the #365Papers was a fairly long one, and I’m sure a contentious one in the field of statistics. I’d be interested in hearing others’ thoughts. The paper is by Greenland et al., and is called Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations (direct link to the paper). As with the other articles, I’m providing the text more or less verbatim from the article, and I’ll provide some thoughts at the end.
A key problem is that there are no interpretations of statistical tests, confidence intervals, and statistical power that are at once simple, intuitive, correct, and foolproof.
There is a serious problem of defining the scope of a model, in that it should allow not only for a good representation of the observed data but also of hypothetical alternative data that might have been observed. The reference frame for data that “might have been observed” is often unclear.
Much statistical teaching and practice has developed a strong (and unhealthy) focus on the idea that the main aim of a study should be to test null hypotheses. This exclusive focus on null hypotheses contributes to misunderstanding of tests. Adding to the misunderstanding is that many authors (including RA Fisher) use “null hypothesis” to refer to any test hypothesis, even though this usage is at odds with other authors and with ordinary English definitions of “null” – as are statistical usages of “significance” and “confidence”.
Moving from tests to estimates
Neyman proposed the construction of confidence intervals in the way we are used to because they have the following property: If one calculates, say, 95% confidence intervals repeatedly in valid applications, 95% of them, on average, will contain the true effect size. Hence, the specified confidence level is called the coverage probability. As Neyman stressed repeatedly, this coverage probability is a property of a long sequence of confidence intervals computed from valid models, rather than a property of any single confidence interval.
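The "long sequence" idea is easy to check by simulation. The sketch below is my own, not from the paper: it repeatedly draws normal samples with a known standard deviation, builds a 95% interval for the mean each time, and counts how often the interval contains the true mean. The function name, parameter values, and fixed seed are assumptions chosen for illustration.

```python
import math
import random
import statistics


def coverage(true_mean, sigma, n, trials, seed=0, z=1.96):
    """Share of simulated 95% intervals that contain the true mean.

    Assumption: sigma is known, so mean +/- z * sigma / sqrt(n) is a valid interval.
    """
    rng = random.Random(seed)
    standard_error = sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        if mean - z * standard_error <= true_mean <= mean + z * standard_error:
            hits += 1
    return hits / trials


share = coverage(true_mean=5.0, sigma=2.0, n=30, trials=2000)
print("share of intervals containing the true mean:", share)
```

Any single interval either contains the true mean or it does not; the 95% only shows up as the share across many repetitions.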
Common Misinterpretations of single P values
The P value is the probability that the test hypothesis is true.
The P value for the null hypothesis is the probability that chance alone produced the observed association.
A significant test result (P<0.05) means that the test hypothesis is false or should be rejected.
A non-significant test result (P >0.05) means that the test hypothesis is true or should be accepted.
A large P value is evidence in favour of the test hypothesis.
A null hypothesis P value greater than 0.05 means that no effect was observed, or that absence of an effect was shown or demonstrated.
Statistical significance indicates a scientifically or substantively important relation has been detected.
Lack of statistical significance indicates that the effect size is small.
The P value is the chance of our data occurring if the test hypothesis is true.
If you reject the test hypothesis because P<0.05, the chance you are in error (the chance your “significant finding” is a false positive) is 5%.
P=0.05 and P<0.05 mean the same thing.
P values are properly reported as inequalities (e.g., report “P<0.02” when P=0.015 or report “P>0.05” when P=0.06 or P=0.70).
Statistical significance is a property of the phenomenon being studied, and thus statistical tests detect significance.
One should always use two-sided P values.
Common misinterpretations of P value comparisons and predictions
When the same hypothesis is tested in different studies and none or a minority of the tests are statistically significant (all P>0.05), the overall evidence supports the hypothesis.
When the same hypothesis is tested in two different populations and the resulting P values are on opposite sides of 0.05, the results are conflicting.
When the same hypothesis is tested in two different populations and the same P values are obtained, the results are in agreement.
If one observes a small P value, there is a good chance that the next study will produce a P value at least as small for the same hypothesis.
Common misinterpretations of confidence intervals
The specific 95% confidence interval presented by a study has a 95% chance of containing the true effect size.
An effect size outside the 95% confidence interval has been refuted (or excluded) by the data.
If two confidence intervals overlap, the difference between two estimates or studies is not significant.
An observed 95% confidence interval predicts that 95% of the estimates from future studies will fall inside the observed interval.
If one confidence interval includes the null value and another excludes that value, the interval excluding the null is the more precise one.
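To see why the overlap claim in the list above is a misinterpretation, here is a small worked example of my own (made-up estimates and standard errors, assuming the two estimates are independent and approximately normal). The two 95% intervals overlap, yet the test of the difference gives a P value below 0.05.

```python
import math


def interval(estimate, standard_error, z=1.96):
    """Approximate 95% confidence interval under a normal approximation."""
    return estimate - z * standard_error, estimate + z * standard_error


def two_sided_p(z):
    """Two-sided P value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Made-up estimates from two independent studies.
a_estimate, a_se = 0.0, 1.0
b_estimate, b_se = 3.5, 1.0

ci_a = interval(a_estimate, a_se)
ci_b = interval(b_estimate, b_se)
overlap = ci_a[1] >= ci_b[0] and ci_b[1] >= ci_a[0]

# Standard error of the difference between two independent estimates.
difference_se = math.sqrt(a_se ** 2 + b_se ** 2)
z_statistic = (b_estimate - a_estimate) / difference_se
p_value = two_sided_p(z_statistic)

print("intervals:", ci_a, ci_b, "overlap:", overlap)
print("P value for the difference:", round(p_value, 4))
```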
Common misinterpretations of power
If you accept the null hypothesis because the null P value exceeds 0.05 and the power of your test is 90%, the chance you are in error (the chance that your finding is a false negative) is 10%.
If the null P value exceeds 0.05 and the power of this test is 90% at an alternative, the results support the null over the alternative.
Suggested Guidelines for users and readers of statistics
I provided more detail here than I intended because I think this paper (from 2016) has the potential to trigger a major paradigm shift in research and statistics. I would not normally put so much weight on a paper, but given the senior author was Dr Douglas Altman, I put more confidence in the information (puns intended).
I remember learning stats in university and getting very confused by the concepts versus the language – had this paper been available, I think I would have had a more comfortable understanding. I found this paper to be very easy to understand, although fairly dense. The concepts were presented in a very readable way, but the amount of information I wanted to absorb required a couple of readings.