BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ebowen
Quartz | Level 8

I'm having a problem with missing group values not showing up in proc report. I have data on electricity production from various generating plants that I want to sum by county. Some of the counties do not have a generating plant, so there is missing data for those counties. I have used the missing option in the proc report statement, but the missing counties do not show up in the report output. If I use PROC TABULATE, they do show up, so it's something I'm doing wrong in Report. Here's my code (data attached).

 

proc report data=test missing;
columns ("StateFIPS" StateFIPS) ("StateCode" statecode) ("CountyFIPS" CountyFIPS) ("CountyName" countyname) date,netgen;
define StateFIPS / group '';
define StateCode / group '';
define CountyFIPS / group '';
define CountyName / group '';
define date / Across '';
define NetGen / analysis sum '';

compute statefips;
	if statefips ^='' then hold1=Statefips;
	else statefips=hold1;
endcomp;
compute statecode;
	if statecode ^='' then hold2=Statecode;
	else statecode=hold2;
endcomp;

format 
	statefips $CHAR2.
	countyfips $CHAR3.
	date year4.;
where year(date)>=2000;
run;

Thanks for your help!

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

There aren't any missing values for COUNTY variables in your dataset.

204   proc freq data="&path\test";
205     where year(date) >= 2000;
206     tables CountyFIPS*CountyName / list missing;
207   run;

NOTE: There were 157 observations read from the data set C:\Downloads\test.
      WHERE YEAR(date)>=2000;
The FREQ Procedure

                                                          Cumulative    Cumulative
CountyFIPS    CountyName         Frequency     Percent     Frequency      Percent
----------------------------------------------------------------------------------
013           Bell County               1        0.64             1         0.64
049           Clark County             35       22.29            36        22.93
065           Estill County             5        3.18            41        26.11
089           Greenup County           14        8.92            55        35.03
125           Laurel County            31       19.75            86        54.78
127           Lawrence County          37       23.57           123        78.34
199           Pulaski County           17       10.83           140        89.17
207           Russell County           17       10.83           157       100.00

View solution in original post

3 REPLIES 3
Tom
Super User Tom
Super User

There aren't any missing values for COUNTY variables in your dataset.

204   proc freq data="&path\test";
205     where year(date) >= 2000;
206     tables CountyFIPS*CountyName / list missing;
207   run;

NOTE: There were 157 observations read from the data set C:\Downloads\test.
      WHERE YEAR(date)>=2000;
The FREQ Procedure

                                                          Cumulative    Cumulative
CountyFIPS    CountyName         Frequency     Percent     Frequency      Percent
----------------------------------------------------------------------------------
013           Bell County               1        0.64             1         0.64
049           Clark County             35       22.29            36        22.93
065           Estill County             5        3.18            41        26.11
089           Greenup County           14        8.92            55        35.03
125           Laurel County            31       19.75            86        54.78
127           Lawrence County          37       23.57           123        78.34
199           Pulaski County           17       10.83           140        89.17
207           Russell County           17       10.83           157       100.00
ebowen
Quartz | Level 8

Aha, now it makes sense. Those observations without generation data also have missing values for the date. So when I subset for those dates greater than 2000, it eliminates the missing counties. <Head slap>

Tom
Super User Tom
Super User

Note that if you reverse the order of the analysis and across variables in your COLUMNS statement you don't have use so many tricks in the PROC REPORT code to get a single line header.

proc report data="&path\test" missing;
where year(date) between 2000 and 2004;

columns StateFIPS statecode CountyFIPS countyname netgen,date ;

define StateFIPS / group ;
define StateCode / group ;
define CountyFIPS / group ;
define CountyName / group ;
define date / Across ' ';
define NetGen / analysis sum ' ';

compute statefips;
  if statefips ^='' then hold1=Statefips;
  else statefips=hold1;
endcomp;
compute statecode;
  if statecode ^='' then hold2=Statecode;
  else statecode=hold2;
endcomp;

format
  statefips $CHAR2.
  countyfips $CHAR3.
  date year4.
;
run;

image.png

SAS INNOVATE 2024

Innovate_SAS_Blue.png

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Get the $99 certification deal.jpg

 

 

Back in the Classroom!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 1269 views
  • 0 likes
  • 2 in conversation