Optimists would predict that corn yields tend to increase over time. While that’s true, the increase was not steady: some epochs saw much more rapid yield gains than others. Other pundits have grouped the years into epochs, which I’ll call:
1866-1936 “The hardscrabble age”
1937-1955 “The hybrid age”
1956-1995 “The age of genetics and pesticides”
1996-2019 “The biotech age”
We want not only to see whether there is a relationship between year and yield, but also to break the years into logical clusters and run a regression on each of them. So, we will use proc transreg as follows:
ods graphics on;
proc transreg data=import1 ss2 plots=fit(nocli noclm);
ods output coef=coef;
model identity('US Corn Yield (Bu/A)'n) = class(Group / zero=none) | identity(year_);
run;
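To get the regression and ANOVA results, we use the ss2 option on the proc line.
To break the years into groups of interest, we included a column named ‘Group’ in the dataset to classify each year, and then modeled the response variable with ‘class(Group / zero=none)’. The zero=none option keeps a coefficient for every group instead of dropping one as a reference level, and crossing the class term with identity(year_) gives each group its own intercept and its own slope.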
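For readers who want to check the logic outside SAS, here is a minimal Python sketch of the same idea: label each year with its epoch, then fit a separate straight line within each epoch. With a full set of group intercepts and group slopes, the coefficients should match what separate per-group least-squares fits would give. The function names (epoch_label, ols_line, fit_by_epoch) are hypothetical, the toy yields are invented purely for illustration, and the sketch assumes the calendar year is the predictor, which may not be how year_ is scaled in the actual dataset.

```python
# Epoch boundaries taken from the list in this post; the labels mirror
# the values we assume are stored in the 'Group' column.
EPOCHS = [
    (1866, 1936, "1866-1936"),
    (1937, 1955, "1937-1955"),
    (1956, 1995, "1956-1995"),
    (1996, 2019, "1996-2019"),
]


def epoch_label(year):
    """Return the epoch label for a calendar year."""
    for start, end, label in EPOCHS:
        if start <= year <= end:
            return label
    raise ValueError(f"year {year} is outside 1866-2019")


def ols_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_by_epoch(rows):
    """Fit a separate line per epoch.

    rows: iterable of (year, yield) pairs.
    Returns {epoch label: (intercept, slope)}.
    """
    grouped = {}
    for year, yld in rows:
        grouped.setdefault(epoch_label(year), []).append((year, yld))
    fits = {}
    for label, pairs in grouped.items():
        years, yields = zip(*pairs)
        fits[label] = ols_line(years, yields)
    return fits


# Invented toy data, NOT the real USDA series: flat early, rising later.
toy_rows = [(1900, 26.0), (1910, 26.0), (1940, 30.0), (1950, 40.0)]
for label, (intercept, slope) in fit_by_epoch(toy_rows).items():
    print(f"{label}: intercept={intercept:.1f}, slope={slope:.2f}")
```

One difference worth keeping in mind: proc transreg fits all groups in a single model, so the tests in its output share one pooled error term, whereas separate per-group fits would each have their own.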
After running the model, we see the overall regression is significant, with an adjusted R² of 0.9792 (not shown). One of the largest outliers was 2012, a major drought year throughout the Midwest US. The regression coefficients also bear out that corn yields were essentially flat until the hybrid age dawned around 1937: the 1866-1936 slope is not significantly different from zero (p = 0.6495 in the regression table), while the slopes for every later epoch are positive and significant.
The regression table is:
Variable | DF | Coefficient | Type II Sum of Squares | Mean Square | F Value | Pr > F | Label |
Class.Group1866-1936 | 1 | 24.7429090 | 4679.7 | 4679.7 | 97.22 | <.0001 | Group 1866-1936 |
Class.Group1937-1955 | 1 | 45.7953938 | 5290.0 | 5290.0 | 109.90 | <.0001 | Group 1937-1955 |
Class.Group1956-1995 | 1 | 60.4673001 | 52172.5 | 52172.5 | 1083.86 | <.0001 | Group 1956-1995 |
Class.Group1996-2019 | 1 | 57.7120584 | 1662.3 | 1662.3 | 34.53 | <.0001 | Group 1996-2019 |
Identity(Group1866-1936Year_) | 1 | -0.0000501 | 10.0 | 10.0 | 0.21 | 0.6495 | Group 1866-1936 * Year_ |
Identity(Group1937-1955Year_) | 1 | 0.0020928 | 333.0 | 333.0 | 6.92 | 0.0094 | Group 1937-1955 * Year_ |
Identity(Group1956-1995Year_) | 1 | 0.0051000 | 18495.1 | 18495.1 | 384.23 | <.0001 | Group 1956-1995 * Year_ |
Identity(Group1996-2019Year_) | 1 | 0.0053416 | 4377.5 | 4377.5 | 90.94 | <.0001 | Group 1996-2019 * Year_ |
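As a quick sanity check on how to read the table, each F value is the row's mean square divided by the model's error mean square, so we can back out that error mean square from any row. The short Python sketch below does the arithmetic using the Type II sums of squares and F values copied from the table; the function name implied_mse is hypothetical, and the results are only as precise as the rounded figures in the printed output.

```python
# (Type II SS, F value) pairs copied from the regression table.
# Every row has DF = 1, so the mean square equals the sum of squares.
TABLE_ROWS = {
    "Class.Group1866-1936": (4679.7, 97.22),
    "Class.Group1937-1955": (5290.0, 109.90),
    "Class.Group1956-1995": (52172.5, 1083.86),
    "Class.Group1996-2019": (1662.3, 34.53),
    "Identity(Group1866-1936Year_)": (10.0, 0.21),
    "Identity(Group1937-1955Year_)": (333.0, 6.92),
    "Identity(Group1956-1995Year_)": (18495.1, 384.23),
    "Identity(Group1996-2019Year_)": (4377.5, 90.94),
}


def implied_mse(sum_of_squares, f_value, df=1):
    """Back out the error mean square from F = (SS / df) / MSE."""
    return (sum_of_squares / df) / f_value


mse_by_row = {
    name: implied_mse(ss, f) for name, (ss, f) in TABLE_ROWS.items()
}
for name, mse in mse_by_row.items():
    print(f"{name:32s} implied MSE = {mse:.2f}")
```

Every row should land at roughly 48, which is what we would expect if all eight tests share one pooled error term. The 1866-1936 slope row comes out slightly lower only because its F value (0.21) is rounded to two digits.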