capam
Pyrite | Level 9

Hi,

 

I have the following simple proc means with class. 

 

There are a few classes that should produce a variety of results, however, this code takes only the max class and populates it everywhere. What would cause this to happen?

 

proc means data=tabs_dist_stats mean max n;
	class SORT_ORDER;
	var TabsEff_NOTABS TabsEff_TABS;
run;  

An excerpt of the output is attached.

 

 

PaigeMiller
Diamond | Level 26

Can you show us a representative sample of your data?

 

I can't see your .pdf file in my browser.

--
Paige Miller
capam
Pyrite | Level 9

The input data is below.

Astounding
PROC Star

One possibility:  Does the variable SORT_ORDER have a format permanently associated with it?  PROC CONTENTS will reveal that.

 

If that's the case, the format can group actual values into formatted levels. You can temporarily remove the format by adding this statement to the PROC MEANS:

 

format SORT_ORDER;
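
Putting the two suggestions together, a minimal sketch might look like the following. It reuses the data set and variable names from the original post and assumes nothing else about the data.

```sas
/* List variable attributes; check the Format column for SORT_ORDER */
proc contents data=tabs_dist_stats;
run;

/* Same PROC MEANS as in the question, with the format removed for this step only */
proc means data=tabs_dist_stats mean max n;
   class SORT_ORDER;
   var TabsEff_NOTABS TabsEff_TABS;
   format SORT_ORDER;   /* no format name = remove any associated format */
run;
```

If the class levels and statistics are unchanged with the FORMAT statement in place, a format is probably not the cause.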

capam
Pyrite | Level 9
I inserted the line, but got no change in output.
ballardw
Super User

@capam wrote:

Hi,

 

I have the following simple proc means with class. 

 

There are a few classes that should produce a variety of results, however, this code takes only the max class and populates it everywhere. What would cause this to happen?

 


Please describe what "takes only the max class and populates it everywhere" means. Your output example has sort_order, your class variable, with multiple values and so "only the max class" doesn't make sense at all to me.

capam
Pyrite | Level 9
Thanks for the question. In the output, the values in the SORT_ORDER column run from 1113 upward. The Mean column contains duplicate values. For example, 1121:1129 all have 506044, which should apply only to 1129. The Mean value for 1120 is the max Mean, which somehow is also placed on the values for 1121:1126. The same phenomenon applies to Maximum. Each SORT_ORDER value should have its own unique Mean/Maximum. The same applies to 1113:1118. The respective Mean/Maximum should be unique for each.

I hope this is clearer now.
capam
Pyrite | Level 9
Should be 'The Mean value for 1121'. Also, when I stated 'The same applies to 1113:1118' I meant that 417767 and 315591 are repeated in 1113:1118. Those numbers for Mean are also repeated in 1130:1136.
ballardw
Super User

 

If you run this code, I think you will find that you do not have unique values for the variable TabsEff_NOTABS.

 

proc freq data=tabs_dist_stats;
   tables sort_order * tabsEff_NoTabs/ list;
run;

That will produce a table of each value of sort_order paired with each value of TabsEff_NOTABS as they exist in your data, with one row per combination and a count.

 

 

When I see repeated values for the mean of an analysis variable across levels of a class variable, as you show, it usually means that the exact same pattern (number and values) of the analysis variables is repeated across those levels of the class variable. Note that in your result table the N values also repeat for those groups: class values 1113 through 1118 each have 7 non-missing values for TabsEff_NOTABS and 322 for TabsEff_TABS, and exactly the same N values show up again for class values 1130 through 1136. So you have lots of duplicated data in your data set.
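
One way to list the class levels that share identical statistics is to write the PROC MEANS results to a data set and group on them. This is only a sketch: the output data set work.class_stats and the names n_notabs, n_tabs, mean_notabs, mean_tabs and n_classes are made up for illustration, and the input names come from the original post.

```sas
/* Save N and Mean per SORT_ORDER level instead of printing them */
proc means data=tabs_dist_stats noprint nway;
   class SORT_ORDER;
   var TabsEff_NOTABS TabsEff_TABS;
   output out=work.class_stats
          n=n_notabs n_tabs
          mean=mean_notabs mean_tabs;
run;

/* Show each combination of statistics shared by more than one class level */
proc sql;
   select n_notabs, n_tabs, mean_notabs, mean_tabs,
          count(*) as n_classes
   from work.class_stats
   group by n_notabs, n_tabs, mean_notabs, mean_tabs
   having count(*) > 1;
quit;
```

Any row returned points to a set of SORT_ORDER levels with the same counts and means, which is a strong hint that the underlying records were duplicated across those levels.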

capam
Pyrite | Level 9
Yes. I've somehow corrupted the data. Thanks.
ballardw
Super User

@capam wrote:
Yes. I've somehow corrupted the data. Thanks.

Without seeing your code I can only guess, but a common cause of an issue like this is a data step that looks like:

 

data want;
   set want;
   /* <stuff> */
run;

or especially

data want;
   set want        /* or MERGE */
       dataset2
   ;
run;

 

If you add records, or do any of a number of other things, you can replace your original data set with one containing more (or fewer!) records. And if you rerun the code while testing, for example to add additional variables, you can do this multiple times.

 

So check whether any step in your previous code uses the same name for its input and output data sets.

And be very careful when using that construct.
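
A safer variant, sketched here with made-up names (want_combined, n_want and n_combined are illustrative, and WANT and DATASET2 stand in for your own data sets), is to write the result to a new data set and compare row counts:

```sas
/* Write to a new name so the original WANT is never overwritten */
data want_combined;
   set want dataset2;
run;

/* Compare row counts to catch unexpected growth after a rerun */
proc sql;
   select count(*) as n_want     from want;
   select count(*) as n_combined from want_combined;
quit;
```

Because WANT is only read and never rewritten, rerunning the step gives the same result every time instead of stacking records again.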
