My data looks like this:
LoanId Mod_Flag
12345
12345
12345 Y
34521
34521
68749 Y
68749 Y
68749
78423
78423
Is there any way that I could select a) all records corresponding to those LoanIds that have a blank (ie. non-Y) mod_flag throughout the LoanId group, in addition to b) the records from those LoanId groups whose mod_flag may either be blank or mod_flag='Y', corresponding to those LoanIds that ONLY have a mod_flag = 'Y'? To be clearer by final data would look as follows:
LoanId Mod_Flag
12345 Y
34521
34521
68749 Y
68749 Y
78423
78423
I think a "self merge" will work nicely here.
data loan;
infile cards firstobs=2 missover;
input loanid :$5. mod_flag:$1.;
cards;
LoanId Mod_Flag
12345
12345
12345 Y
34521
34521
68749 Y
68749 Y
68749
78423
78423
;;;;
run;
proc print;
run;
data loan2;
merge loan loan(in=in2 rename=(mod_flag=Y) where=(Y='Y'));
by loanid;
if in2 and mod_flag eq 'Y' then output;
else if not in2 then output;
drop y;
run;
proc print;
run;
For the a) the first solution that come to my mind is to select those who have "Y"
proc sort data=have;
by loanld mod_flag;
run;
data want;
set have;
by loanld mod_flag;
where mod_flag='Y';
if last.mod_flag; /*avoid duplicated loanld if multiple 'Y'*/
rename mod_flag=group_flag;
run;
data have;
merge have
want;
if group_flag ne 'Y';
run;
For the b), with the original dataset have you could do something like
data have;
merge have
want;
if group_flag eq Mod_flag;
run;
proc sort data=have;
by LoanId descending Mod_Flag;
run;
data want(where=(Mod_Flag='Y' or new='N'));
set have;
by LoanId descending Mod_Flag;
retain new;
if first.LoanId then new='';
if Mod_Flag='Y' then new='Y';
if new='' then new='N';
run;
I think a "self merge" will work nicely here.
data loan;
infile cards firstobs=2 missover;
input loanid :$5. mod_flag:$1.;
cards;
LoanId Mod_Flag
12345
12345
12345 Y
34521
34521
68749 Y
68749 Y
68749
78423
78423
;;;;
run;
proc print;
run;
data loan2;
merge loan loan(in=in2 rename=(mod_flag=Y) where=(Y='Y'));
by loanid;
if in2 and mod_flag eq 'Y' then output;
else if not in2 then output;
drop y;
run;
proc print;
run;
data loan; infile cards firstobs=2 missover; input loanid :$5. mod_flag:$1.; cards; LoanId Mod_Flag 12345 12345 12345 Y 34521 34521 68749 Y 68749 Y 68749 78423 78423 ;;;; run; proc sql; create table want as select * from loan group by loanid having sum(Mod_Flag='Y')=0 union all select * from loan where Mod_Flag='Y' order by loanid; quit;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.