Hi,
I have a dataset that needs to be mapped based on the granularity of the available data. Some records have higher granularity than others. Currently I use excel to solve my problem using nested IF function but it is both tedious and time consuming.
The mapping file looks like the below table ( Sample mapping). The original mapping file has more records.
This is what I did in excel, first I get a mapping key by concatenating the different columns. This key is then used to lookup and get the BSmap column. Example, for records in the dataset that has Product L5=SF,GL_Product=400 then the mapping picks BSMap=Reverse but if Product L5=SF without values for other columns It maps BSMap=OtherStrucFin. Now I want to repeat this mapping in SAS but do not want to use IF-else statement. Can someone please help me figure out a different approach to get this done in SAS? I use SAS 9.3
Product L5 | Product L6 | Segment | Account L6 | GL_Product | GL_Acc | Mapping Key | BSMap |
Banca | OtherWM | ||||||
CMP | CashManagement | ||||||
LSS CF | CFLSS CF | LeveragedFinance | |||||
PEF | CFPEF | Project&ExportFinance | |||||
SF | CFSF | OtherStrucFin | |||||
SF | Aviation | CFSFAviation | AviationFinance | ||||
Syndications | Syndications | FI | CFSyndicationsSyndicationsFI | SyndicationsNon-CFFI | |||
Syndications | Syndications | FI | 8888845 | CFSyndicationsSyndicationsFI8888845 | Repo | ||
SF | 400 | CFSF400 | Reverse | ||||
PEF | Loans | PEFLoans | Loan |
I cannot figure out what you mean by "map" or "granularity of the available data".
Can you explain using different words?
Example input data and what output you expect would help a lot.
Hi Tom,
Let me try to use the following example, below are two lines of record from my dataset (DATA1)
Business Unit ID | Currency | PCS Product L04 DESC | PCS Product L05 DESC | PCS Product L06 DESC | PCS Customer Segment L05 DESC | PCS Account L06 DESC | Prod No | Account_Number |
8 | NGN | Treasury Markets | Treasury Mkts - Inp | Treasury Mkts - Inp | Other Banking | BS Treasury Bills | 738 | 115120 |
8 | NGN | Treasury Markets | Treasury Mkts - Inp | Treasury Mkts - Inp | Other Banking | BS Due from Other Banks | 761 | 122001 |
Below is another dataset (mapping file).
PCS Product L04 DESC | PCS Product L05 DESC | PCS Product L06 DESC | PCS Customer Segment L05 DESC | PCS Account L06 DESC | Product_Number | Account_Number | Mapping Key | BSMap |
Treasury Markets | Treasury Markets | TMNostros | ||||||
Treasury Markets | Treasury Mkts - Inp | 624 | Treasury MarketsTreasury Mkts - Inp624 | TMLevel1Securities | ||||
Treasury Markets | Treasury Mkts - Inp | 738 | Treasury MarketsTreasury Mkts - Inp738 | TMLevel1Securities | ||||
Treasury Markets | Treasury Mkts - Inp | 761X | Treasury MarketsTreasury Mkts - Inp761X | TMNostros | ||||
Treasury Markets | Treasury Mkts - Inp | 122051 | Treasury MarketsTreasury Mkts - Inp122051 | TMNostros | ||||
Treasury Markets | Treasury Mkts - Inp | BS Due from Other Banks | Treasury MarketsTreasury Mkts - InpBS Due from Other Banks | TMNostros | ||||
Treasury Markets | Treasury Mkts - Inp | BS Inv Sec - Debt Sec NGrp | Treasury MarketsTreasury Mkts - InpBS Inv Sec - Debt Sec NGrp | TMLevel1Securities | ||||
Treasury Markets | Treasury Mkts - Inp | BS Treasury Bills | Treasury MarketsTreasury Mkts - InpBS Treasury Bills | TMLevel1Securities |
What I am trying to do is combine the two datasets such that my output looks like the following table. I need the last column from mapping table to be part of my DATA1.
Business Unit ID | Currency | PCS Product L04 DESC | PCS Product L05 DESC | PCS Product L06 DESC | PCS Customer Segment L05 DESC | PCS Account L06 DESC | Prod No | Account_Number | RRR NGA Map |
8 | NGN | Treasury Markets | Treasury Mkts - Inp | Treasury Mkts - Inp | Other Banking | BS Treasury Bills | 738 | 115120 | TMLevel1Securities |
8 | NGN | Treasury Markets | Treasury Mkts - Inp | Treasury Mkts - Inp | Other Banking | BS Due from Other Banks | 761 | 122001 | TMNostros |
There are records in the dataset that have data only for column "PCS PRODUCT L04 DESC", so the expectation is for them to be tagged as "TM Nostros" , but for records that have more information i.e data for other columns, they need to be tagged according to the different levels shown in mapping table.
Not sure if I have explained the problem properly, please let me know if it is unclear.
Do the records in the "mapping" table have an order of precedence?
Assuming that they are ordered from least specific to most specific then something like this might work.
First convert your printouts into datasets.
I will add a ROW variable to data table to have a unique identifier.
I will add an LROW variable to the mapping table as a unique identifier and also to use to find the "best" match.
data have ;
infile cards dsd dlm='|' truncover;
length row BU 8 Currency $5 Prod_No Account_Number PL4-PL6 PC5-PC6 $40 ;
input BU Currency PL4-PL6 PC5-PC6 Prod_no Account_Number;
row+1;
cards;
8|NGN|Treasury Markets|Treasury Mkts - Inp|Treasury Mkts - Inp|Other Banking|BS Treasury Bills|738|115120
8|NGN|Treasury Markets|Treasury Mkts - Inp|Treasury Mkts - Inp|Other Banking|BS Due from Other Banks|761|122001
;
data lookup ;
infile cards dsd dlm='|' truncover;
length lrow 8 Mapping_Key $100 BSMap $40 Prod_No Account_Number PL4-PL6 PC5-PC6 $40 ;
input PL4-PL6 PC5-PC6 Prod_no Account_Number Mapping_Key BSMap;
lrow+1;
cards;
Treasury Markets| | | | | | |Treasury Markets|TMNostros
Treasury Markets|Treasury Mkts - Inp| | | |624| |Treasury MarketsTreasury Mkts - Inp624|TMLevel1Securities
Treasury Markets|Treasury Mkts - Inp| | | |738| |Treasury MarketsTreasury Mkts - Inp738|TMLevel1Securities
Treasury Markets|Treasury Mkts - Inp| | | |761X| |Treasury MarketsTreasury Mkts - Inp761X|TMNostros
Treasury Markets|Treasury Mkts - Inp| | | | |122051|Treasury MarketsTreasury Mkts - Inp122051|TMNostros
Treasury Markets|Treasury Mkts - Inp| | |BS Due from Other Banks| | |Treasury MarketsTreasury Mkts - InpBS Due from Other Banks|TMNostros
Treasury Markets|Treasury Mkts - Inp| | |BS Inv Sec - Debt Sec NGrp| | |Treasury MarketsTreasury Mkts - InpBS Inv Sec - Debt Sec NGrp|TMLevel1Securities
Treasury Markets|Treasury Mkts - Inp| | |BS Treasury Bills| | |Treasury MarketsTreasury Mkts - InpBS Treasury Bills|TMLevel1Securities
;
Now do a left join between the two and take the "best" result.
Here is a trick using COALESCE() to let you match the missing values in the mapping dataset.
proc sql ;
create table want as
select a.row,b.lrow,b.bsmap,b.mapping_key,* from have a
left join lookup b
on (a.pl4 = coalesce(b.pl4,a.pl4))
and (a.pl5 = coalesce(b.pl5,a.pl5))
and (a.pl6 = coalesce(b.pl6,a.pl6))
and (a.pc5 = coalesce(b.pc5,a.pc5))
and (a.pc6 = coalesce(b.pc6,a.pc6))
and (a.prod_no = coalesce(b.prod_no,a.prod_no))
and (a.account_number = coalesce(b.account_number,a.account_number))
group by a.row
having b.lrow = max(b.lrow)
;
quit;
Results:
Obs row lrow BSMap Mapping_Key 1 1 8 TMLevel1Securities Treasury MarketsTreasury Mkts - InpBS Treasury Bills 2 2 6 TMNostros Treasury MarketsTreasury Mkts - InpBS Due from Other Banks Account_ Obs BU Currency Prod_No Number PL4 PL5 PL6 1 8 NGN 738 115120 Treasury Markets Treasury Mkts - Inp Treasury Mkts - Inp 2 8 NGN 761 122001 Treasury Markets Treasury Mkts - Inp Treasury Mkts - Inp Obs PC5 PC6 1 Other Banking BS Treasury Bills 2 Other Banking BS Due from Other Banks
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
I'm not even going to try to create any data set from the posted text.
first I get a mapping key by concatenating the different columns
That sounds like a poorly defined operation to begin with but to duplicate that step you could use in a data step:
MappingKey = catt( var1, var2, var3 <list your variable names).
Please not that most of your "mapping key" values show a beginning C that does not have a column shown that would contribute such. So there seems to be something missing either in the data or your problem description.
You don't really indicate where the BSMap value comes from so I don't know if you are asking for help with that.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.