I'm having a problem with missing group values not showing up in proc report. I have data on electricity production from various generating plants that I want to sum by county. Some of the counties do not have a generating plant, so there is missing data for those counties. I have used the missing option in the proc report statement, but the missing counties do not show up in the report output. If I use PROC TABULATE, they do show up, so it's something I'm doing wrong in Report. Here's my code (data attached).
proc report data=test missing;
columns ("StateFIPS" StateFIPS) ("StateCode" statecode) ("CountyFIPS" CountyFIPS) ("CountyName" countyname) date,netgen;
define StateFIPS / group '';
define StateCode / group '';
define CountyFIPS / group '';
define CountyName / group '';
define date / Across '';
define NetGen / analysis sum '';
compute statefips;
if statefips ^='' then hold1=Statefips;
else statefips=hold1;
endcomp;
compute statecode;
if statecode ^='' then hold2=Statecode;
else statecode=hold2;
endcomp;
format
statefips $CHAR2.
countyfips $CHAR3.
date year4.;
where year(date)>=2000;
run;
Thanks for your help!
There aren't any missing values for COUNTY variables in your dataset.
204 proc freq data="&path\test"; 205 where year(date) >= 2000; 206 tables CountyFIPS*CountyName / list missing; 207 run; NOTE: There were 157 observations read from the data set C:\Downloads\test. WHERE YEAR(date)>=2000;
The FREQ Procedure Cumulative Cumulative CountyFIPS CountyName Frequency Percent Frequency Percent ---------------------------------------------------------------------------------- 013 Bell County 1 0.64 1 0.64 049 Clark County 35 22.29 36 22.93 065 Estill County 5 3.18 41 26.11 089 Greenup County 14 8.92 55 35.03 125 Laurel County 31 19.75 86 54.78 127 Lawrence County 37 23.57 123 78.34 199 Pulaski County 17 10.83 140 89.17 207 Russell County 17 10.83 157 100.00
There aren't any missing values for COUNTY variables in your dataset.
204 proc freq data="&path\test"; 205 where year(date) >= 2000; 206 tables CountyFIPS*CountyName / list missing; 207 run; NOTE: There were 157 observations read from the data set C:\Downloads\test. WHERE YEAR(date)>=2000;
The FREQ Procedure Cumulative Cumulative CountyFIPS CountyName Frequency Percent Frequency Percent ---------------------------------------------------------------------------------- 013 Bell County 1 0.64 1 0.64 049 Clark County 35 22.29 36 22.93 065 Estill County 5 3.18 41 26.11 089 Greenup County 14 8.92 55 35.03 125 Laurel County 31 19.75 86 54.78 127 Lawrence County 37 23.57 123 78.34 199 Pulaski County 17 10.83 140 89.17 207 Russell County 17 10.83 157 100.00
Aha, now it makes sense. Those observations without generation data also have missing values for the date. So when I subset for those dates greater than 2000, it eliminates the missing counties. <Head slap>
Note that if you reverse the order of the analysis and across variables in your COLUMNS statement you don't have use so many tricks in the PROC REPORT code to get a single line header.
proc report data="&path\test" missing;
where year(date) between 2000 and 2004;
columns StateFIPS statecode CountyFIPS countyname netgen,date ;
define StateFIPS / group ;
define StateCode / group ;
define CountyFIPS / group ;
define CountyName / group ;
define date / Across ' ';
define NetGen / analysis sum ' ';
compute statefips;
if statefips ^='' then hold1=Statefips;
else statefips=hold1;
endcomp;
compute statecode;
if statecode ^='' then hold2=Statecode;
else statecode=hold2;
endcomp;
format
statefips $CHAR2.
countyfips $CHAR3.
date year4.
;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.