geneshackman
Pyrite | Level 9

Hi, I have a data set that looks like the set below.

 

There is a set of variables (v1, v2, v3). Each variable covers the same set of counties (a1, a2, a3). I ran a kurtosis analysis and got one kurtosis value for each variable (third column). Within a variable, the kurtosis is the same for every county, but it differs between variables. For example, v1 has a kurtosis of 44 for all counties, v2 has a kurtosis of 24 for all counties, and so on.

 

I don't know beforehand what the kurtosis values will be. Also, I will be sorting the data set by the kurtosis values, so I won't know beforehand the order of the variables. When I run the data, v1 could have the largest kurtosis, or v3 could have the largest kurtosis.

 

I want to select the variable with the largest kurtosis, along with all its counties, and save that as a subset. Then I want to select the variable with the second largest kurtosis, along with all its counties, save that as a subset, and so on.

 

My initial idea was to make another column (VarOrder) that would assign 1, 2, 3, to show which variable was largest, second largest, third largest. I'm not sure how to do that, and would also welcome suggestions on any other way to do this.

 

variable  county  Kurtosis  v_other  VarOrder
v1        a1      44        3
v1        a2      44        4
v1        a3      44        2
v2        a1      24        5
v2        a2      24        2
v2        a3      24        4
v3        a1      36        1
v3        a2      36        5
v3        a3      36        2

 

Thanks

 

Gene

 

1 ACCEPTED SOLUTION

RichardDeVen
Barite | Level 11

You really don't need to separate the original data set into subsets. Instead, compute the kurtosis and sort the data by descending kurtosis and variable. Then assign a sequence number to each variable's group of sorted rows, and use WHERE and BY statements to restrict and independently process the groups of rows that you were planning to place in different data sets.

 

Example:

You don't show how the original kurtosis analysis is done, so this example assumes the results are already in a data set named kurtoses.

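/* Put the variable with the highest kurtosis first */
proc sort data=kurtoses;
  by descending kurtosis variable county;
run;

/* Number each variable group in sorted order */
data kurtoses;
  set kurtoses;
  by descending kurtosis variable county;
  if first.variable then kurtorder+1;       * group kurtorder=1 means variable has highest kurtosis;
run;

/* Report each group separately, limited to the top three variables */
proc print data=kurtoses;
  by kurtorder;
  where kurtorder <= 3;
  var kurtosis variable county;
run;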
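
To try this on the sample data from the question, something like the following should work. It is only a minimal sketch: the output data set name ranked and the PROC MEANS step on v_other are assumptions for illustration, since the question does not say what analysis will be run on each group.

```sas
/* Sample rows copied from the question */
data kurtoses;
  input variable $ county $ kurtosis v_other;
  datalines;
v1 a1 44 3
v1 a2 44 4
v1 a3 44 2
v2 a1 24 5
v2 a2 24 2
v2 a3 24 4
v3 a1 36 1
v3 a2 36 5
v3 a3 36 2
;
run;

proc sort data=kurtoses;
  by descending kurtosis variable county;
run;

/* ranked is a hypothetical name; writing to a new data set
   leaves kurtoses unchanged if the step is rerun */
data ranked;
  set kurtoses;
  by descending kurtosis variable county;
  if first.variable then kurtorder + 1;
run;

/* Placeholder analysis: one set of statistics per group,
   without splitting the data into separate data sets */
proc means data=ranked mean;
  by kurtorder;
  var v_other;
run;

/* To work with a single group, restrict with WHERE */
proc print data=ranked;
  where kurtorder = 1;
  var kurtorder variable county kurtosis v_other;
run;
```

With these rows, v1 (kurtosis 44) gets kurtorder 1, v3 (36) gets 2, and v2 (24) gets 3, which is the VarOrder column described in the question.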


3 REPLIES 3
PaigeMiller
Diamond | Level 26

All of this was discussed in your other thread. There, you were asked to provide an example of the output you want. Please provide an example of the output. If you want to move forward on this, we need to see the desired output.

 

Also, in your other thread we explained that there's no need to separate out the data for the largest value, do an analysis, then pick the second largest value, separate that out, and do the same analysis. In your other thread, you said PROC SUMMARY followed by a sort won't work, but using what you have explained here, it will work, and it will be a lot simpler than the method you are proposing.

 

 

--
Paige Miller
geneshackman
Pyrite | Level 9

Richard, 

 

Thanks very much for your suggestion. That was exactly what I was looking for.

 

Thanks

 

Gene

 
