Hi all,
I want to perform a hierarchical if condition within a group.
My data structure looks like the following:
Facilityid | Lender | LenderRole | LeadArrangerCredit | AgentCredit |
1 | Deutsche Bank | Arranger | Yes | No |
1 | Goldman | Admin Agent | No | No |
1 | JP Morgan | Participant | No | No |
1 | Citi | Administrative Agent | No | No |
2 | Morgan Stanley | Manadated Lead arranger | No | No |
2 | Wells Fargo | Bookrunner | No | No |
2 | UBS | Bookrunner | No | No |
2 | Credit Suisse | Bookrunner | Yes | No |
3 | Deutsche Bank | Administrative Agent | No | Yes |
4 | JP Morgan | Bookrunner | No | No |
4 | Goldman | Bookrunner | No | No |
4 | Citi | Bookrunner | No | No |
4 | UBS | Bookrunner | No | Yes |
I want to code a dummy variable (lead_arranger) per Facilityid group. Thereby, the following restrictions apply:
- If any Lender has the role of "Administrative Agent" it is deemed as the lead arranger (i.e. lead_arranger = 1)
- If there is no "Administrative Agent", the Lender which is the LeadArrangerCredit is the lead arranger
- If there is no "Administrative Agent" and no LeadArrangerCredit, the Lender with the AgentCredit is the lead arranger
- For loans with only one lender (i.e. only one observation per facilityid group), this lender is automatically the lead arranger
- If none of the above is true, there is no lead arranger
Thank you so much!
Except for "Administrative Agent" the LenderRole terms in your spreadsheet don't match the terms in your narrative.
To make below code work you need to amend the terms in the informat so that they match your data (terms in the format must be uppercase, data can be mixed case).
data have;
infile datalines dsd truncover;
input Facilityid Lender:$40. LenderRole:$20. (LeadArrangerCredit AgentCredit) (:$3.);
datalines;
1,Deutsche Bank,Arranger,Yes,No
1,Goldman,Admin Agent,No,No
1,JP Morgan,Participant,No,No
1,Citi,Administrative Agent,No,No
2,Morgan Stanley,Manadated Lead arranger,No,No
2,Wells Fargo,Bookrunner,No,No
2,UBS,Bookrunner,No,No
2,Credit Suisse,Bookrunner,Yes,No
3,Deutsche Bank,Administrative Agent,No,Yes
4,JP Morgan,Bookrunner,No,No
4,Goldman,Bookrunner,No,No
4,Citi,Bookrunner,No,No
4,UBS,Bookrunner,No,Yes
;
proc format;
invalue lead_arr_hier (upcase)
'ADMINISTRATIVE AGENT' = 1
'LEADARRANGERCREDIT' = 2
'AGENTCREDIT' = 3
other = 999
;
quit;
data inter;
set have;
lead_order=input(LenderRole,lead_arr_hier.);
run;
proc sort data=inter;
by Facilityid lead_order;
run;
data want;
set inter;
by Facilityid lead_order;
if first.Facilityid then
do;
if last.Facilityid or lead_order ne 999 then lead_arranger = 1;
end;
run;
I had a similar approach as @Patrick with ranking but ended up using a nested SQL instead:
proc sql;
select have.*,
case
when lenderscore is not null then 'Yes'
when No = 1 then 'Yes'
else 'No'
end as Lead_Arranger
from have
left join
(select facilityid,
lender,
count(*) as No,
case
when lenderRole = "Administrative Agent" then 3
when LeadArrangerCredit = 'Yes' then 2
when AgentCredit = 'Yes' then 1
else 0 end as lenderScore
from have
group by facilityid
having calculated lenderScore = max(calculated lenderScore)) as score
on have.facilityid = score.facilityid and
have.lender = score.lender
;
quit;
Edit: this code works if a lender only can appear once per FacilityId
@edeede wrote:
Thanks, Patrick. I think the spreadsheet fits the narrative. For facilityid 2 Credit Suisse would be lead_arranger, and for facilityid 4 it would be UBS, that is not refelcted in the code.
True. I've misread your narrative. Below a data step option that should work
data inter;
set have;
if LenderRole='Administrative Agent' then lead_order = 1;
else if LeadArrangerCredit = 'Yes' then lead_order = 2;
else if AgentCredit then lead_order= 3;
else lead_order = 999;
run;
proc sort data=inter;
by Facilityid lead_order;
run;
data want;
set inter;
by Facilityid lead_order;
if first.Facilityid then
do;
if last.Facilityid or lead_order ne 999 then lead_arranger = 1;
end;
run;
Assign a RANK to each observation. Sort by the RANK. And set the flag true only on the first observation.
data have;
infile cards dsd dlm='|' truncover;
input Facilityid Lender :$20. LenderRole :$20. LeadArrangerCredit :$3. AgentCredit :$3.;
cards;
1|Deutsche Bank|Arranger|Yes|No
1|Goldman|Admin Agent|No|No
1|JP Morgan|Participant|No|No
1|Citi|Administrative Agent|No|No
2|Morgan Stanley|Manadated Lead arranger|No|No
2|Wells Fargo|Bookrunner|No|No
2|UBS|Bookrunner|No|No
2|Credit Suisse|Bookrunner|Yes|No
3|Deutsche Bank|Administrative Agent|No|Yes
4|JP Morgan|Bookrunner|No|No
4|Goldman|Bookrunner|No|No
4|Citi|Bookrunner|No|No
4|UBS|Bookrunner|No|Yes
;
data rank;
set have;
if lenderrole='Administrative Agent' then rank=1;
else if LeadArrangerCredit='Yes' then rank=2;
else if AgentCredit='Yes' then rank=3;
else rank=4;
run;
proc sort;
by facilityid rank;
run;
data want;
set rank;
by facilityid rank;
FLAG = first.facilityid and (rank<4) ;
run;
Results:
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.