My data looks like this:
LoanId Mod_Flag
12345
12345
12345 Y
34521
34521
68749 Y
68749 Y
68749
78423
78423
Is there any way that I could select a) all records corresponding to those LoanIds that have a blank (ie. non-Y) mod_flag throughout the LoanId group, in addition to b) the records from those LoanId groups whose mod_flag may either be blank or mod_flag='Y', corresponding to those LoanIds that ONLY have a mod_flag = 'Y'? To be clearer by final data would look as follows:
LoanId Mod_Flag
12345 Y
34521
34521
68749 Y
68749 Y
78423
78423
I think a "self merge" will work nicely here.
data loan;
infile cards firstobs=2 missover;
input loanid :$5. mod_flag:$1.;
cards;
LoanId Mod_Flag
12345
12345
12345 Y
34521
34521
68749 Y
68749 Y
68749
78423
78423
;;;;
run;
proc print;
run;
data loan2;
merge loan loan(in=in2 rename=(mod_flag=Y) where=(Y='Y'));
by loanid;
if in2 and mod_flag eq 'Y' then output;
else if not in2 then output;
drop y;
run;
proc print;
run;
For the a) the first solution that come to my mind is to select those who have "Y"
proc sort data=have;
by loanld mod_flag;
run;
data want;
set have;
by loanld mod_flag;
where mod_flag='Y';
if last.mod_flag; /*avoid duplicated loanld if multiple 'Y'*/
rename mod_flag=group_flag;
run;
data have;
merge have
want;
if group_flag ne 'Y';
run;
For the b), with the original dataset have you could do something like
data have;
merge have
want;
if group_flag eq Mod_flag;
run;
proc sort data=have;
by LoanId descending Mod_Flag;
run;
data want(where=(Mod_Flag='Y' or new='N'));
set have;
by LoanId descending Mod_Flag;
retain new;
if first.LoanId then new='';
if Mod_Flag='Y' then new='Y';
if new='' then new='N';
run;
I think a "self merge" will work nicely here.
data loan;
infile cards firstobs=2 missover;
input loanid :$5. mod_flag:$1.;
cards;
LoanId Mod_Flag
12345
12345
12345 Y
34521
34521
68749 Y
68749 Y
68749
78423
78423
;;;;
run;
proc print;
run;
data loan2;
merge loan loan(in=in2 rename=(mod_flag=Y) where=(Y='Y'));
by loanid;
if in2 and mod_flag eq 'Y' then output;
else if not in2 then output;
drop y;
run;
proc print;
run;
data loan; infile cards firstobs=2 missover; input loanid :$5. mod_flag:$1.; cards; LoanId Mod_Flag 12345 12345 12345 Y 34521 34521 68749 Y 68749 Y 68749 78423 78423 ;;;; run; proc sql; create table want as select * from loan group by loanid having sum(Mod_Flag='Y')=0 union all select * from loan where Mod_Flag='Y' order by loanid; quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.