geneshackman
Pyrite | Level 9

Hi, I have a data set that looks like the set below.

 

There is a set of variables (v1, v2, v3). Each variable has values for the same set of counties (a1, a2, a3). Within a variable, the value is the same for every county, but the value differs from one variable to the next. I don't know beforehand what the values will be. Also, I will be sorting the data set by the values, so I won't know beforehand the order of the variables. When I run the data, v1 could have the largest value, or v3 could have the largest value.

 

I want to select the variable with the largest value, and do some statistics on that variable. Then select the variable with the second largest value and do the same statistics on it, and so on.

 

Is there any simple way to do this?

 

variable county value
v1 a1 44
v1 a2 44
v1 a3 44
v2 a1 24
v2 a2 24
v2 a3 24
v3 a1 36
v3 a2 36
v3 a3 36

 

6 REPLIES
PaigeMiller
Diamond | Level 26

So, every v1 has the same value? And every v2 has the same value? And so on?

 

What statistics do you want to compute? What meaningful statistics could you compute anyway on data that has a constant value?

 

Anyway, couldn't you do the statistics all at once in PROC SUMMARY, and then sort them? 

 

To tell you the truth, I find the problem statement to be confusing. It would help greatly if you showed us the desired output from your example. 

--
Paige Miller
PGStats
Opal | Level 21

"I want to select the variable with the largest value, and do some statistics on that variable."

 

The way your question reads, you want to do statistics on a single value. Please clarify.

PG
geneshackman
Pyrite | Level 9

Well, the data set has some other variables, I'm just presenting the variables useful for this part of my problem. 

 

Let's say this instead. I want to select the variable with the largest value and create a subset of the data: that variable (variable X) and all the counties. Then select the variable with the second largest value and create a subset of the data for that variable (variable Y) and all the counties.

 

Does that clarify?

 

Thanks

 

geneshackman
Pyrite | Level 9

Let's say the data set looks like this. Valueout is what I'm using to sort the data set so I can select v1, v2, or v3, whichever is largest, and v_other is what I'm going to analyze. But, as mentioned above, the task I am looking to do is to select the variable with the largest valueout and make that a subset that I can work with further. Sorry I wasn't clearer on this at the start.

 

variable county valueout v_other
v1 a1 44 3
v1 a2 44 4
v1 a3 44 2
v2 a1 24 5
v2 a2 24 2
v2 a3 24 4
v3 a1 36 1
v3 a2 36 5
v3 a3 36 2
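
One possible way to pull out the subset for the variable with the k-th largest valueout is sketched below. The data set names HAVE, RANKED and SUBSET, the column names max_valueout and rank_no, and the macro variable k are all assumptions made for illustration; only the column names variable, county, valueout and v_other come from the post. Ties in valueout would be ranked in arbitrary order.

```sas
/* Sample data from the post; the data set name HAVE is an assumption. */
data have;
  input variable $ county $ valueout v_other;
  datalines;
v1 a1 44 3
v1 a2 44 4
v1 a3 44 2
v2 a1 24 5
v2 a2 24 2
v2 a3 24 4
v3 a1 36 1
v3 a2 36 5
v3 a3 36 2
;
run;

/* One row per variable, ordered from largest to smallest valueout. */
proc sql;
  create table ranked as
  select variable, max(valueout) as max_valueout
  from have
  group by variable
  order by max_valueout desc;
quit;

/* Number the rows: rank_no = 1 is the variable with the largest valueout. */
data ranked;
  set ranked;
  rank_no = _n_;
run;

/* Which rank to extract (hypothetical macro variable): 1 = largest. */
%let k = 1;

/* Keep all counties for the variable at rank &k. */
proc sql;
  create table subset as
  select h.*
  from have as h inner join ranked as r
    on h.variable = r.variable
  where r.rank_no = &k
  order by h.county;
quit;
```

Changing k to 2 and rerunning the last step would give the subset for the variable with the second largest valueout, and so on, without knowing in advance which of v1, v2 or v3 that will be.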

 

 

PaigeMiller
Diamond | Level 26

It would be extremely helpful if you provide us with the desired output from this analysis, as I requested. It would be extremely helpful if you told us the statistics you want to compute, as I requested.

 

Also, you can compute all the desired statistics in one execution of PROC SUMMARY and then sort them by largest, then next largest and so on. There's no need to split out the largest, compute statistics; split out the 2nd largest, etc.
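
A minimal sketch of that one-pass approach follows, assuming the data set is named HAVE. The thread never says which statistics are wanted, so N, MEAN and STD of v_other are placeholders, and the output names (STATS, n_v_other, mean_v_other, std_v_other) are hypothetical.

```sas
/* Sample data from the post; the data set name HAVE is an assumption. */
data have;
  input variable $ county $ valueout v_other;
  datalines;
v1 a1 44 3
v1 a2 44 4
v1 a3 44 2
v2 a1 24 5
v2 a2 24 2
v2 a3 24 4
v3 a1 36 1
v3 a2 36 5
v3 a3 36 2
;
run;

/* One pass: statistics of v_other for every variable at once.      */
/* valueout is constant within a variable, so MAX just carries it.  */
proc summary data=have nway;
  class variable;
  var valueout v_other;
  output out=stats (drop=_type_ _freq_)
    max(valueout)=valueout
    n(v_other)=n_v_other
    mean(v_other)=mean_v_other
    std(v_other)=std_v_other;
run;

/* Largest valueout first, then second largest, and so on. */
proc sort data=stats;
  by descending valueout;
run;

proc print data=stats noobs;
run;
```

With the sample data this lists the rows in the order v1, v3, v2.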

--
Paige Miller
geneshackman
Pyrite | Level 9

Thanks for your response, but actually, I'm pretty sure I can't use proc summary and then sort. The easiest thing is if I can select a subset of data.

 

However, if you want to know, valueout is an indicator of kurtosis, or how spread out the data are with regard to v_other. I am selecting only variables that have kurtosis greater than 3, which means that they have outliers and are not "normally" distributed. I will take each subset and sequentially remove the most extreme values of v_other until I get a kurtosis of less than 3, so there are no more outliers. I don't know exactly how I will do this part yet, other than manually.
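
The sequential removal could be automated along the lines of the rough sketch below. Everything in it is an assumption rather than the poster's method: the macro name trim_outliers, its parameters, the work data sets _stats and _dev, and the rule that "most extreme" means the single value farthest from the current mean. The sample values are made up for illustration and are not from the thread. Note also that the KURTOSIS statistic in PROC MEANS is excess kurtosis (a normal distribution gives roughly 0, not 3), so a cutoff of 3 on the raw scale corresponds to about 0 here; the cutoff is left as a parameter.

```sas
/* Made-up illustrative values for one subset; not from the thread. */
data work_subset;
  input county $ v_other;
  datalines;
a1 3
a2 4
a3 2
a4 5
a5 3
a6 4
a7 40
a8 2
;
run;

/* Hypothetical macro: drop the value farthest from the mean, one   */
/* at a time, until kurtosis <= cutoff. Overwrites &data and        */
/* changes its row order.                                           */
%macro trim_outliers(data=, var=, cutoff=0, maxiter=20);
  %local i n kurt stop;
  %let stop = 0;
  %let i = 0;
  %do %until (&stop = 1 or &i >= &maxiter);
    %let i = %eval(&i + 1);

    /* Current row count, mean and (excess) kurtosis. */
    proc means data=&data noprint;
      var &var;
      output out=_stats n=_n mean=_mean kurtosis=_kurt;
    run;

    data _null_;
      set _stats;
      call symputx('n', _n);
      call symputx('kurt', _kurt);
    run;

    /* Kurtosis needs at least 4 values, so stop before going below that. */
    %if &n <= 4 %then %let stop = 1;
    %else %if %sysevalf(&kurt <= &cutoff) %then %let stop = 1;
    %else %do;
      /* Absolute deviation of each value from the current mean. */
      data _dev;
        if _n_ = 1 then set _stats (keep=_mean);
        set &data;
        _absdev = abs(&var - _mean);
      run;

      proc sort data=_dev;
        by descending _absdev;
      run;

      /* Drop the first row, which is the most extreme value. */
      data &data;
        set _dev (firstobs=2);
        drop _mean _absdev;
      run;
    %end;
  %end;
  %put NOTE: trim_outliers stopped after &i pass(es) with n=&n and kurtosis=&kurt;
%mend trim_outliers;

%trim_outliers(data=work_subset, var=v_other, cutoff=0);
```

Because the macro overwrites its input, it would be safer to run it on a copy of each subset. The maxiter parameter is only a guard against removing more rows than intended.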

 

Thanks

 

Gene

 

