BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
NewUsrStat
Quartz | Level 8

Hi guys, 

suppose to have the following: 

 

     ID             my_value                  

      1                      

      1                     1

      1                      

      1                     1 

      2                    1

      2                    0 

      2                    0

      2             

      3                    

      3                    0

      3                                                        

      4

      4

      4

Is there a way to subset according to the following rules:

 

1) if in my_value column for each ID there is at least one "1" then retain that row and delete the other rows for that ID (regardless if they contain a missing or 0). If there is more than one row with "1" it doesn't matter. Give me the ID one time with "1" in column my_value. (case of ID = 1)

2) if there's no 1 but only 0 and missing then retain 0 (case of ID = 3)

3) if only missing then give me one time that row with missing corresponding to my_value (case of ID = 4) 

 

Desired output: 

 

     ID             my_value                  

      1                     1

      2                    1

      3                   0

      4 

 

Thank you in advance

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @NewUsrStat,

 

You can use PROC SUMMARY to create an output dataset with the maximum of MY_VALUE per ID:

proc summary data=have;
by id;
var my_value;
output out=want(drop=_:) max=;
run;

If your real dataset is only grouped, but not sorted by ID, add the NOTSORTED option to the BY statement:

by id notsorted;

If it is not grouped either, use the NWAY option and the CLASS statement instead of the BY statement:

proc summary data=have nway;
class id;
...

View solution in original post

4 REPLIES 4
FreelanceReinh
Jade | Level 19

Hello @NewUsrStat,

 

You can use PROC SUMMARY to create an output dataset with the maximum of MY_VALUE per ID:

proc summary data=have;
by id;
var my_value;
output out=want(drop=_:) max=;
run;

If your real dataset is only grouped, but not sorted by ID, add the NOTSORTED option to the BY statement:

by id notsorted;

If it is not grouped either, use the NWAY option and the CLASS statement instead of the BY statement:

proc summary data=have nway;
class id;
...
NewUsrStat
Quartz | Level 8
Thank you very much FreelanceReinh!
Kurt_Bremser
Super User

There is an important lesson to be learned here: you looked at the problem as one of removal, while @FreelanceReinh sees it as one of keeping. It is always good to consider an issue from the diametrically opposed POV, as it may (and often does) lead to a much simpler solution.

Whenever you deal with conditional code, ask yourself if it isn't better to use the negated condition and perform the opposite action.

NewUsrStat
Quartz | Level 8
Thank you Kurt_Bremser. It is a good idea the look at the problem with the opposite perspective. Unusual to think in this way but sometimes decisive.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

5 Steps to Your First Analytics Project Using SAS

For SAS newbies, this video is a great way to get started. James Harroun walks through the process using SAS Studio for SAS OnDemand for Academics, but the same steps apply to any analytics project.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 366 views
  • 4 likes
  • 3 in conversation