Hi guys, I just picked up SAS, and have trouble doing the following. There are around 100k of observations, hence its impossible for me to do it manually. As you can see, there are duplicates for one company (identified by the ISIN) due to the different statements that were produced by bankscope. There are 6 types of possible statements for a company in a certain year. And the priority should be C1/C2, C*, U1/U2,U* whenever possible as shown in table 2. In another words, the priority should be followed and duplicates to be deleted. Thanks in advance guys.
Table 1
year | ISIN | Consolidation code | total assets |
2005 | AU123 | C* | 20 |
2005 | AU123 | C2 | 18 |
2006 | UK345 | C1 | 25 |
2006 | UK345 | C* | 40 |
2006 | UK345 | U1 | 28 |
Table 2
year | ISIN | Consolidation code | total assets |
2005 | AU123 | C2 | 18 |
2006 | UK345 | C1 | 25 |
Alternatively
data have;
input year$5. ISIN$6. Consolidation_code$3. total_assets$3.;
Consolidation_code=tranwrd(Consolidation_code,'*','99');
cards;
2005 AU123 C* 20
2005 AU123 C2 18
2006 UK345 C1 25
2006 UK345 C* 40
2006 UK345 U1 28
;
proc sort data=have;
by year isin Consolidation_code;
run;
data want;
set have;
by year isin Consolidation_code;
if first.isin;
run;
Thanks,
Jag
Hi ,
You can use the below approach .
data ranked ;
set have ;
if Consolidation_code = "C1" then rank = 1 ;
else if Consolidation_code = "C2" then rank = 2 ;
else if Consolidation_code = "C*" then rank =3 ;
else if Consolidation_code = "U1" then rank = 4 ;
else if Consolidation_code = "U2" then rank = 5 ;
else if Consolidation_code = "U*" then rank = 6 ;
else rank = 9 ;
run;
proc sort data = ranked ;
by Consolidation_code rank ;
run;
data final ;
set ranked;
by Consolidation_code rank ;
if first.Consolidation_code ;
run;
thanks
Alternatively
data have;
input year$5. ISIN$6. Consolidation_code$3. total_assets$3.;
Consolidation_code=tranwrd(Consolidation_code,'*','99');
cards;
2005 AU123 C* 20
2005 AU123 C2 18
2006 UK345 C1 25
2006 UK345 C* 40
2006 UK345 U1 28
;
proc sort data=have;
by year isin Consolidation_code;
run;
data want;
set have;
by year isin Consolidation_code;
if first.isin;
run;
Thanks,
Jag
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.