Hi
i have a sample dataset where i need to apply logic to unique_id as per code it has in waterfall manner.
For instance if for unique ID 1, Code A is found which gets Yes in preference then other instance of unique ID 1 will get "no" in Preference column
for unique ID 2 Code B is found which gets Yes in preference then other instance of unique ID 2 will get "no" in Preference column
for unique ID 3 Code E is found which gets Yes in preference then other instance of unique ID 3 will get "no" in Preference column
so waterfall for Code follows the order as A,B,E,D,C.
So A gets first prefernce if found for unique id,
B gets second and so on
How can i apply this waterfall for unique id. So in nutshell if i find a code for unique id as per Waterfall it should get Yes in preference and rest all other instance of same unique id get No in preference.
UNIQUE_ID CODE Preference
1 A Yes
1 B no
2 B Yes
2 C no
3 C no
3 D no
3 E Yes
Read all the 'Yes' values, determining the prime yes (i.e. A=yes supersedes B=yes >> E=yes >> C=yes >> D=yes). Then read all the cases turning everything to No except the prime yes.
In other words if an ID starts out with 3 Yes's and 2 No's, the 3 Yes's are read first [see the "where=(preference='Yes')" parameter] to determine the prime yes. Then all 5 Yes's and No's are read, assigning No to all but the prime yes:
data have;
input UNIQUE_ID CODE :$1. Preference :$3.;
datalines;
1 A Yes
1 B no
2 B Yes
2 C no
3 C no
3 D no
3 E Yes
run;
data want (drop=prime_rank);
set have (in=inyes where=(preference='Yes'))
have (in=inkeep);
by unique_id;
if first.unique_id then prime_rank=.;
retain prime_rank;
if inyes then do;
prime_rank=min(prime_rank,findc('ABEDC',code));
delete; /* Delete this case, it will be read again, and output below */
end;
if NOT(code=CHAR('ABEDC',prime_rank) and preference='Yes') then preference='No';
run;
Create a custom format that maps the order.
ABEDC
Then A->1
B->2
E->3
D->4
C->5
Sort by the new variable and use BY groups to identify the first.
Read all the 'Yes' values, determining the prime yes (i.e. A=yes supersedes B=yes >> E=yes >> C=yes >> D=yes). Then read all the cases turning everything to No except the prime yes.
In other words if an ID starts out with 3 Yes's and 2 No's, the 3 Yes's are read first [see the "where=(preference='Yes')" parameter] to determine the prime yes. Then all 5 Yes's and No's are read, assigning No to all but the prime yes:
data have;
input UNIQUE_ID CODE :$1. Preference :$3.;
datalines;
1 A Yes
1 B no
2 B Yes
2 C no
3 C no
3 D no
3 E Yes
run;
data want (drop=prime_rank);
set have (in=inyes where=(preference='Yes'))
have (in=inkeep);
by unique_id;
if first.unique_id then prime_rank=.;
retain prime_rank;
if inyes then do;
prime_rank=min(prime_rank,findc('ABEDC',code));
delete; /* Delete this case, it will be read again, and output below */
end;
if NOT(code=CHAR('ABEDC',prime_rank) and preference='Yes') then preference='No';
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.