Rjk
Fluorite | Level 6
Hi, I'm facing a problem where using the CLASS statement in a PROC MEANS step produces what looks like duplicated data, including a set of rows with blank values in the class variables.
Accepted Solutions
Reeza
Super User

When you use a CLASS statement, PROC MEANS summarizes at every level: the overall total and then each level of the class variable.

With two class variables, it produces the overall summary, a summary for each level of the first variable, one for each level of the second variable, and one for each combination of both. The NWAY option restricts the output to the 'highest' level (the full combination of all class variables) and drops the lower levels. You can also filter on the _TYPE_ variable in the output data set to get the same effect.

 

 

Check the results from the following:

proc means data=sashelp.class stackods;
class sex age;
var height;
ods output summary = want1;
run;

proc means data=sashelp.class stackods NWAY;
class sex age;
var height;
ods output summary=want2;
run;

 

 

EDIT: Apparently the output from STACKODS and ODS OUTPUT is formatted differently.

 

Here's a better example - check the 

proc means data=sashelp.class noprint ;
class sex age;
var height;
output out=want1 mean= sum= /autoname;
run;

proc means data=sashelp.class nway noprint ;
class sex age;
var height;
output out=want2 mean= sum= /autoname;
run;


Title 'example of no NWAY option';
proc print data=want1;
run;

Title 'example of NWAY option';
proc print data=want2;
run;
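
The _TYPE_ filter mentioned above can be sketched as follows. This is only an illustration: the data set name want3 is hypothetical, and the value 3 assumes exactly two class variables, where _TYPE_=3 marks the rows for the full combination of both.

```sas
/* Sketch: keep only the full sex*age combination by filtering on _TYPE_. */
/* want3 is a hypothetical name; _TYPE_=3 assumes two CLASS variables.    */
proc means data=sashelp.class noprint;
    class sex age;
    var height;
    output out=want3 (where=(_type_ = 3)) mean= sum= / autoname;
run;

title 'example of filtering on _TYPE_';
proc print data=want3;
run;
title;
```

With two class variables this should give the same rows as the NWAY step; filtering on _TYPE_ is more useful when you want to keep some of the lower levels as well.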


 


8 Replies
Rjk
Fluorite | Level 6
Thanks for the NWAY, that's what I was missing. It makes more sense now 🙂
Rick_SAS
SAS Super FREQ

Please show the SAS syntax you are running and the results.

Rjk
Fluorite | Level 6

Hi,

Please see below for the sample data set, code, and outputs.

 


/* sample data set */
data MyData;
input VarA$ 1 VarB$ 3 Bal;

datalines;
A X 100
A X 110
A X 120
A Y 130
B X 140
B Y 150
B Y 160
B Z 170
C Y 180
C Z 190
C Z 200
;

Run;

 

 

proc means data=work.MyData sum nonobs;
/* by ;*/
class VarA VarB;
var Bal;
output out=work.RST1 sum=SumBal;
run;


proc means data=work.MyData sum nonobs;
by VarA VarB;
/* class ;*/
var Bal;
output out=work.RST3 sum=SumBal;
run;

 

 

 

Results: RST1

VarA  VarB  _TYPE_  _FREQ_  SumBal
            0       11      1650
      X     1       4       470
      Y     1       4       620
      Z     1       3       560
A           2       4       460
B           2       4       620
C           2       3       570
A     X     3       3       330
A     Y     3       1       130
B     X     3       1       140
B     Y     3       2       310
B     Z     3       1       170
C     Y     3       1       180
C     Z     3       2       390
     

 

 

 

Results: RST3

VarA  VarB  _TYPE_  _FREQ_  SumBal
A     X     0       3       330
A     Y     0       1       130
B     X     0       1       140
B     Y     0       2       310
B     Z     0       1       170
C     Y     0       1       180
C     Z     0       2       390

 

 

 

RST3 is the required result; I was just wondering why RST1 has the extra rows that look like duplicated data.

Looks like @Reeza and @Astounding have answered my question 🙂

ballardw
Super User

If ALL of your class variables are "blank", then I would bet a small stack of $$ that _TYPE_=0. That row is the overall summary for the entire data set, the same as you would get without any class variables.
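
As a rough check of this point, the step below runs PROC MEANS on the sample data from earlier in the thread with no CLASS statement at all. The data set name RST0 is hypothetical and used only for this sketch.

```sas
/* Sketch: no CLASS statement, so only the overall summary is produced. */
/* Assumes work.MyData from the sample data step earlier in the thread. */
/* RST0 is a hypothetical data set name.                                */
proc means data=work.MyData noprint;
    var Bal;
    output out=work.RST0 sum=SumBal;
run;

proc print data=work.RST0;
run;
```

The result should be a single row with _TYPE_=0, matching the first row of RST1 posted earlier (_FREQ_=11, SumBal=1650).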

 

 

Rjk
Fluorite | Level 6
Thanks, I was wondering what _TYPE_ was for, makes sense now 🙂
Astounding
Opal | Level 21

While you already have the right answer, you've received so much information that it may be difficult to sift out what you need here. @Reeza's post mentions the NWAY option. You should add it to your PROC MEANS statement:

 

proc means data=have nway;

 

That will remove the extra levels of summarization.

 

Sometimes those extra levels are helpful, but that's another story for another day. 
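
Applied to the sample data posted earlier in the thread, the NWAY suggestion might look like the sketch below. The data set name RST2 is hypothetical; everything else is taken from the original CLASS step.

```sas
/* Sketch: the original CLASS step with NWAY added.                     */
/* Assumes work.MyData from the sample data step earlier in the thread. */
/* RST2 is a hypothetical data set name.                                */
proc means data=work.MyData sum nonobs nway;
    class VarA VarB;
    var Bal;
    output out=work.RST2 sum=SumBal;
run;

proc print data=work.RST2;
run;
```

This should leave only the seven VarA/VarB combinations, as in RST3, except that _TYPE_ is 3 in every row instead of 0. Unlike the BY version, the CLASS version does not require the input to be sorted by VarA and VarB.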

Rjk
Fluorite | Level 6
Thanks, that's what I was missing. It makes a lot more sense now 🙂
