BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ebowen
Quartz | Level 8

I'm having a problem with missing group values not showing up in proc report. I have data on electricity production from various generating plants that I want to sum by county. Some of the counties do not have a generating plant, so there is missing data for those counties. I have used the missing option in the proc report statement, but the missing counties do not show up in the report output. If I use PROC TABULATE, they do show up, so it's something I'm doing wrong in Report. Here's my code (data attached).

 

proc report data=test missing;
columns ("StateFIPS" StateFIPS) ("StateCode" statecode) ("CountyFIPS" CountyFIPS) ("CountyName" countyname) date,netgen;
define StateFIPS / group '';
define StateCode / group '';
define CountyFIPS / group '';
define CountyName / group '';
define date / Across '';
define NetGen / analysis sum '';

compute statefips;
	if statefips ^='' then hold1=Statefips;
	else statefips=hold1;
endcomp;
compute statecode;
	if statecode ^='' then hold2=Statecode;
	else statecode=hold2;
endcomp;

format 
	statefips $CHAR2.
	countyfips $CHAR3.
	date year4.;
where year(date)>=2000;
run;

Thanks for your help!

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

There aren't any missing values for COUNTY variables in your dataset.

204   proc freq data="&path\test";
205     where year(date) >= 2000;
206     tables CountyFIPS*CountyName / list missing;
207   run;

NOTE: There were 157 observations read from the data set C:\Downloads\test.
      WHERE YEAR(date)>=2000;
The FREQ Procedure

                                                          Cumulative    Cumulative
CountyFIPS    CountyName         Frequency     Percent     Frequency      Percent
----------------------------------------------------------------------------------
013           Bell County               1        0.64             1         0.64
049           Clark County             35       22.29            36        22.93
065           Estill County             5        3.18            41        26.11
089           Greenup County           14        8.92            55        35.03
125           Laurel County            31       19.75            86        54.78
127           Lawrence County          37       23.57           123        78.34
199           Pulaski County           17       10.83           140        89.17
207           Russell County           17       10.83           157       100.00

View solution in original post

3 REPLIES 3
Tom
Super User Tom
Super User

There aren't any missing values for COUNTY variables in your dataset.

204   proc freq data="&path\test";
205     where year(date) >= 2000;
206     tables CountyFIPS*CountyName / list missing;
207   run;

NOTE: There were 157 observations read from the data set C:\Downloads\test.
      WHERE YEAR(date)>=2000;
The FREQ Procedure

                                                          Cumulative    Cumulative
CountyFIPS    CountyName         Frequency     Percent     Frequency      Percent
----------------------------------------------------------------------------------
013           Bell County               1        0.64             1         0.64
049           Clark County             35       22.29            36        22.93
065           Estill County             5        3.18            41        26.11
089           Greenup County           14        8.92            55        35.03
125           Laurel County            31       19.75            86        54.78
127           Lawrence County          37       23.57           123        78.34
199           Pulaski County           17       10.83           140        89.17
207           Russell County           17       10.83           157       100.00
ebowen
Quartz | Level 8

Aha, now it makes sense. Those observations without generation data also have missing values for the date. So when I subset for those dates greater than 2000, it eliminates the missing counties. <Head slap>

Tom
Super User Tom
Super User

Note that if you reverse the order of the analysis and across variables in your COLUMNS statement you don't have use so many tricks in the PROC REPORT code to get a single line header.

proc report data="&path\test" missing;
where year(date) between 2000 and 2004;

columns StateFIPS statecode CountyFIPS countyname netgen,date ;

define StateFIPS / group ;
define StateCode / group ;
define CountyFIPS / group ;
define CountyName / group ;
define date / Across ' ';
define NetGen / analysis sum ' ';

compute statefips;
  if statefips ^='' then hold1=Statefips;
  else statefips=hold1;
endcomp;
compute statecode;
  if statecode ^='' then hold2=Statecode;
  else statecode=hold2;
endcomp;

format
  statefips $CHAR2.
  countyfips $CHAR3.
  date year4.
;
run;

image.png

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1958 views
  • 0 likes
  • 2 in conversation