BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
geneshackman
Pyrite | Level 9

Hi, I have a data set that looks like the set below.

 

There are a set of variables (v1, v2, v3). Each variable covers the same set of counties (counties a1, a2, a3). I did a kurtosis analysis and got kurtosis for each variable (third column). The variables have the same kurtosis values for each county, but each variable has different values. So for example, V1 has a kurtosis of 44 for all the counties, V2 has a kurtosis of 24 for all counties, and so on.

 

I don't know beforehand what the value of the kurtosis will be. Also, I will be sorting the data set by the kurtosis values, so I won't know, beforehand, the order of the variables. When I run the data, v1 could have the largest kurtosis, or v3 could have the largest kurtosis."

 

I want to select the variable and all it's counties, with the largest kurtosis, save it as a subset, then select the variable and all it's counties, with the second largest kurtosis, save that as a subset, and so on.

 

My initial idea was to make another column (VarOrder) that would assign 1, 2, 3, to show which variable was largest, second largest, third largest. I'm not sure how to do that, and would also welcome suggestions on any other way to do this.

 

variable county Kurtosis v_other VarOrder
v1 a1 44 3  
v1 a2 44 4  
v1 a3 44 2  
v2 a1 24 5  
v2 a2 24 2  
v2 a3 24 4  
v3 a1 36 1  
v3 a2 36 5  
v3 a3 36 2  

 

Thanks

 

Gene

 

1 ACCEPTED SOLUTION

Accepted Solutions
RichardDeVen
Barite | Level 11

You really don't need to separate the original data set into subsets.  Instead, compute the kurtosis and sort the data by descending kurtosis and variable.  After that work, assign a 'sequence' number to the sorted rows and then use WHERE and BY statements to restrict and independently process groups of rows that you were planning to place in different data sets.

 

Example:

You don't show how the original kurtosis analysis is done.

proc sort data=kurtoses;
  by descending kurtosis variable county;
run;

data kurtoses;
  set kurtoses;
  by descending kurtosis variable county;
  if first.variable then kurtorder+1;       * group kurtorder=1 means variable has highest kurtosis;
run;

proc print data=kurtoses;
  by kurtorder;
  where kurtorder <= 3;
  var kurtosis variable county;
run;

View solution in original post

3 REPLIES 3
PaigeMiller
Diamond | Level 26

All of this was discussed in your other thread. There, you were asked to provide an example of the output you want. Please provide an example of the output. If you want to move forward on this, we need to see the desired output.

 

Also, in your other thread we explained how there's no need to separate the data for the largest value, do an analysis; then pick the second largest value, separate that out, and do the same analysis. In your other thread, you said PROC SUMMARY and then sort won't work, but using you have explained, it will work, and it will be a lot simpler than the method you are proposing.

 

 

--
Paige Miller
RichardDeVen
Barite | Level 11

You really don't need to separate the original data set into subsets.  Instead, compute the kurtosis and sort the data by descending kurtosis and variable.  After that work, assign a 'sequence' number to the sorted rows and then use WHERE and BY statements to restrict and independently process groups of rows that you were planning to place in different data sets.

 

Example:

You don't show how the original kurtosis analysis is done.

proc sort data=kurtoses;
  by descending kurtosis variable county;
run;

data kurtoses;
  set kurtoses;
  by descending kurtosis variable county;
  if first.variable then kurtorder+1;       * group kurtorder=1 means variable has highest kurtosis;
run;

proc print data=kurtoses;
  by kurtorder;
  where kurtorder <= 3;
  var kurtosis variable county;
run;
geneshackman
Pyrite | Level 9

Richard, 

 

Thanks very much for your suggestion. That was exactly what I was looking for.

 

Thanks

 

Gene

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 552 views
  • 1 like
  • 3 in conversation