SAS Programming

Satori · Posted 09-12-2023 07:10 AM

I have multiple entries with the same ID variable. I'm using proc summary to keep the max of each of them. The code example is below. For the variables in the macro variable fvar specified in the var statement, the max is applied, but what happens to the variables in the id statement (macro variable &other. The variables in other are character variables with qualitative data. If there is a missing value and a non missing I would like to keep the non missing, but what happens when both are non missing, which one is kept?

proc summary data=bva1 nway; class ID; var &fvar.; id &other; output out=bva2 max=;

Astounding · Posted 09-12-2023 09:56 AM

@Satori ,

I'm not sure you understand what you are asking for here. Let's say you have three ID variables:

id idvar1 idvar2 idvar3;

The values in the output data set are NOT the maximum values of each. (Yes, that would give you the non-missing, if there is both a missing and a non-missing.) But the values selected for ID variables are computed sequentially. https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/proc/n1niocuk07q2wnn16mz96fboupgc.htm

This means you get the maximum of idvar1. But you don't necessarily get the maximum of idvar2 or idvar3. Instead, SAS selects in this manner. For all observations that have the maximum value of idvar1, what is the maximum value of idvar2. There may be observations that have a lower value for idvar1, and a higher value for idvar2. SAS doesn't consider those. It only looks at observations having the maximum value for idvar1, when selecting the maximum value for idvar2. So it's the last observation per ID (your CLASS variable) if the data were sorted by ID idvar1 idvar2 idvar3.

View solution in original post

PaigeMiller · Posted 09-12-2023 07:46 AM

The documentation explains:

https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/proc/n1niocuk07q2wnn16mz96fboupgc.htm

When you specify only one variable in the ID statement, the value of the ID variable for a given observation is the maximum (minimum) value found in the corresponding group of observations in the input data set.

Use IDMIN in the PROC statement to include the minimum value of the ID variables in the output data set.

--
Paige Miller

Satori · Posted 09-12-2023 08:38 AM

What if the variables are character variables with text? What is the max/min of that?

PaigeMiller · Posted 09-12-2023 08:48 AM

@Satori wrote:
What if the variables are character variables with text? What is the max/min of that?

Some things you can figure out for yourself. What is the minimum/maximum of:

Ant

Bear

Cougar

Dog

--
Paige Miller

ballardw · Posted 09-12-2023 10:52 AM

@Satori wrote:
What if the variables are character variables with text? What is the max/min of that?

Provide actual data.

Provide the actual code instead of hiding variable(s) behind macro variables.

If you do not understand the results, or they don't match what you need, describe what you expect/need/want.

ID variables in Proc Means provide additional output variables. What can depend on options in several places. By default (from the documentation)

When you specify only one variable in the ID statement, the value of the ID variable for a given observation is the maximum (minimum) value found in the corresponding group of observations in the input data set.

IDMIN on the proc statement gets the minimum value associated. But don't expect character values to behave nicely. Since "missing" is generally smaller than anything if your ID group as missing values they likely wont appear in the ID variables in the output unless IDMIN is set or all the Id values are missing fro the group.

Multiple ID variables and SAS goes into use of the multiple if there are "ties" in the first ID variable in the list.

Satori · Posted 09-12-2023 09:46 AM

What if there is more than one variable?

PaigeMiller · Posted 09-12-2023 09:53 AM

It's all explained in the documentation I linked to. Its all very sensible, it does exactly what you would think it should do.

And again, this is simple enough for you to figure out yourself. So, why does the two PROC SUMMARY below give different results? Can you explain why the results are what they are?

proc summary nway data=sashelp.cars;
class origin;
var cylinders;
id make model;
output out=maxx max=;
run;

proc summary nway idmin data=sashelp.cars;
class origin;
var cylinders;
id make model;
output out=maxx2 max=;
run;

--
Paige Miller

Astounding · Posted 09-12-2023 09:56 AM

@Satori ,

I'm not sure you understand what you are asking for here. Let's say you have three ID variables:

id idvar1 idvar2 idvar3;

The values in the output data set are NOT the maximum values of each. (Yes, that would give you the non-missing, if there is both a missing and a non-missing.) But the values selected for ID variables are computed sequentially. https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/proc/n1niocuk07q2wnn16mz96fboupgc.htm

This means you get the maximum of idvar1. But you don't necessarily get the maximum of idvar2 or idvar3. Instead, SAS selects in this manner. For all observations that have the maximum value of idvar1, what is the maximum value of idvar2. There may be observations that have a lower value for idvar1, and a higher value for idvar2. SAS doesn't consider those. It only looks at observations having the maximum value for idvar1, when selecting the maximum value for idvar2. So it's the last observation per ID (your CLASS variable) if the data were sorted by ID idvar1 idvar2 idvar3.

Satori · Posted 09-12-2023 11:38 AM

Thanks! Finally an answer to what is hapenning.

SAS Programming

general question about proc summary

Re: general question about proc summary

Re: general question about proc summary

Re: general question about proc summary

Re: general question about proc summary

Re: general question about proc summary

Re: general question about proc summary

Re: general question about proc summary

Re: general question about proc summary

Re: general question about proc summary

Follow Us

What is...

SAS Programming

Our biggest data and AI event of the year.

SAS Training: Just a Click Away

Follow Us

What is...