I have some data structured as follows (reasons are 0/1):
ID reason1 reason2 reason3 reason4 reason5
1 1 0 0 0 0
2 1 1 0 0 0
3 0 0 1 0 1
4 0 0 1 1 0
5 0 0 0 0 1
I would like to filter through the reason1-reason5 and keep the first "1" across, so my final dataset looks like this:
ID reason
1 1
2 1
3 3
4 3
5 5
So that even if the ID has more than 1 reason, we prioritize the earlier 1. How can I do this with SQL?
@Melk wrote:
That was simpler than I thought! So lets say I want to prioritize 5 over 4 now, so priority will be 1 then 2 then 3 then 5 then 4 - can I make a simple edit to the code for this?
NOW you have to provide what your output would look like to demonstrate mean by "prioritize" in this fashion.
One way might be to change the order of the variables instead of using whichn(1, of reasons(*))
whichn(1,reason5, reason4, reason1,reason2,reason3); would return the number in this list with a result of 1 means Reason5 had a 1; 2 means reason4 had a 1 but reason5 did not and so forth.
May i ask why not datastep using whichn function?
Why do you want to do this in SQL and not simply... ?
data have;
input ID $ reason1-reason5;
datalines;
1 1 0 0 0 0
2 1 1 0 0 0
3 0 0 1 0 1
4 0 0 1 1 0
5 0 0 0 0 1
;
data want;
set have;
array reasons{*} reason1-reason5;
reason=whichn(1, of reasons[*]);
keep ID reason;
run;
Thought it would be easier than a data step - is it not?
Note that function whichn will also work within SQL, but you will have to list the variables:
proc sql;
select
id,
whichn(1, reason1, reason2, reason3, reason4, reason5) as reason
from have;
quit;
When you say "prioritize 5 over 4" do you mean Column 5 and 4?
So if both Column 4 and 5 are 1 and the rest are 0, you want reason=5, correct?
I mean reason5 over reason4 (based on var name).
If you want to reverse the order, do like this (or its equivalent with a datastep)
proc sql;
select
id,
6 - whichn(1, reason5, reason4, reason3, reason2, reason1) as reason
from have;
quit;
For an arbitrary priority order, I suggest you use two arrays :
data want;
set have;
/* In order of priority */
array reasons{*} reason1 reason2 reason3 reason5 reason4;
/* Reason number in reasons list, padded with special value 0 */
array order{5} (1,2,3,5,4,0);
reason = order{whichn(1, of reasons[*], 1)};
keep ID reason;
run;
@Melk wrote:
That was simpler than I thought! So lets say I want to prioritize 5 over 4 now, so priority will be 1 then 2 then 3 then 5 then 4 - can I make a simple edit to the code for this?
Try reordering your list in the array based on priorities, not sure if that will work or not.
@Melk wrote:
That was simpler than I thought! So lets say I want to prioritize 5 over 4 now, so priority will be 1 then 2 then 3 then 5 then 4 - can I make a simple edit to the code for this?
NOW you have to provide what your output would look like to demonstrate mean by "prioritize" in this fashion.
One way might be to change the order of the variables instead of using whichn(1, of reasons(*))
whichn(1,reason5, reason4, reason1,reason2,reason3); would return the number in this list with a result of 1 means Reason5 had a 1; 2 means reason4 had a 1 but reason5 did not and so forth.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.