BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
NewUsrStat
Pyrite | Level 9

Hi guys, 

suppose to have the following: 

 

     ID             my_value                  

      1                      

      1                     1

      1                      

      1                     1 

      2                    1

      2                    0 

      2                    0

      2             

      3                    

      3                    0

      3                                                        

      4

      4

      4

Is there a way to subset according to the following rules:

 

1) if in my_value column for each ID there is at least one "1" then retain that row and delete the other rows for that ID (regardless if they contain a missing or 0). If there is more than one row with "1" it doesn't matter. Give me the ID one time with "1" in column my_value. (case of ID = 1)

2) if there's no 1 but only 0 and missing then retain 0 (case of ID = 3)

3) if only missing then give me one time that row with missing corresponding to my_value (case of ID = 4) 

 

Desired output: 

 

     ID             my_value                  

      1                     1

      2                    1

      3                   0

      4 

 

Thank you in advance

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @NewUsrStat,

 

You can use PROC SUMMARY to create an output dataset with the maximum of MY_VALUE per ID:

proc summary data=have;
by id;
var my_value;
output out=want(drop=_:) max=;
run;

If your real dataset is only grouped, but not sorted by ID, add the NOTSORTED option to the BY statement:

by id notsorted;

If it is not grouped either, use the NWAY option and the CLASS statement instead of the BY statement:

proc summary data=have nway;
class id;
...

View solution in original post

4 REPLIES 4
FreelanceReinh
Jade | Level 19

Hello @NewUsrStat,

 

You can use PROC SUMMARY to create an output dataset with the maximum of MY_VALUE per ID:

proc summary data=have;
by id;
var my_value;
output out=want(drop=_:) max=;
run;

If your real dataset is only grouped, but not sorted by ID, add the NOTSORTED option to the BY statement:

by id notsorted;

If it is not grouped either, use the NWAY option and the CLASS statement instead of the BY statement:

proc summary data=have nway;
class id;
...
NewUsrStat
Pyrite | Level 9
Thank you very much FreelanceReinh!
Kurt_Bremser
Super User

There is an important lesson to be learned here: you looked at the problem as one of removal, while @FreelanceReinh sees it as one of keeping. It is always good to consider an issue from the diametrically opposed POV, as it may (and often does) lead to a much simpler solution.

Whenever you deal with conditional code, ask yourself if it isn't better to use the negated condition and perform the opposite action.

NewUsrStat
Pyrite | Level 9
Thank you Kurt_Bremser. It is a good idea the look at the problem with the opposite perspective. Unusual to think in this way but sometimes decisive.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 681 views
  • 4 likes
  • 3 in conversation