Hello everyone,
I am currently a beginner level in SAS Programming. I am now assigned to write a hash object code, which is mainly for doing lookup(in this case).
Here are the codes:
data ncp_dim_ba (drop=excp_code)
excp.excp_ncp_dim_ba;
length excp_code $50;
attrib region length=$30 label='Region';
attrib zone length=$2 label='LF and LD Zone';
attrib state_code length=$5 label='State Code';
attrib state length=$30 label='State Name';
attrib business_area_code length=$30 label='Business Area Code';
attrib business_area length=$50 label='Business Area';
attrib area_code length=$30 label='Area Code';
attrib area_text length=$50 label='Area Code';
attrib cost_center_code length=$30 label='Cost Center Code';
attrib cost_center_desc length=$50 label='Cost Center Description';
attrib gis_ba length=$50 label='GIS Business Area';
attrib data_dttm length=8 format=datetime. label='Data Date and Time';
attrib forecast_cost_center length=$50 label='forecast_cost_center';
attrib gui_list length=$1 label='GUI List';
if _n_=1 then do;
declare hash chkst(dataset:"ncpdm.ncp_dim_state");
chkst.definekey('forecast_cost_center');
chkst.definedata('cost_center_code','business_area_code');
chkst.definedone();
call missing(cost_center_code,business_area_code);
end;
set man_dim_ba (firstobs=2);
data_dttm=datetime();
rc=chkst.find();
if (rc=0) then output ncp_dim_ba;
else do;
excp_code='Exception: State Code is invalid';
output excp.excp_ncp_dim_ba;
end;
keep excp_code region zone state_code state business_area_code business_area area_code area_text
cost_center_code cost_center_desc gis_ba data_dttm forecast_cost_center gui_list;
run;
As you can see, forecast_cost_center is the hash key while cost_center_code and business_area_code are the hash data.
The data looks something like this:
Region State Business Area Area Code Area Text Cost Center Cost Center Description GIS - Business Area Asset BA Forecast Cost Center GUI List Central WP_Kuala_Lumpur C611900 KWSN_KLB KL BARAT C611900 PC KL-Utara 6119 - Kuala Lumpur Utara PC KLU PK KL Barat Y Central WP_Kuala_Lumpur C611900 KWSN_KLB KL BARAT C611920 PC KL-Utara 6119 - Kuala Lumpur Utara PC KLU PK KL Barat Y Central WP_Kuala_Lumpur C611900 KWSN_KLB KL BARAT C611930 PC KL-Utara 6119 - Kuala Lumpur Utara PC KLU PK KL Barat Y Central WP_Kuala_Lumpur C611900 KWSN_KLB KL BARAT C611940 PC KL-Utara 6119 - Kuala Lumpur Utara PC KLU PK KL Barat Y Central WP_Kuala_Lumpur C611900 KWSN_KLB KL BARAT C611950 PC KL-Utara 6119 - Kuala Lumpur Utara PC KLU PK KL Barat Y Central WP_Kuala_Lumpur C611900 KWSN_KLB KL BARAT C611960 PC KL-Utara 6119 - Kuala Lumpur Utara PC KLU PK KL Barat Y Central WP_Kuala_Lumpur C611900 KWSN_KLB KL BARAT C611961 PC KL-Utara 6119 - Kuala Lumpur Utara PC KLU PK KL Barat Y
From there, I want to do lookup with the hash key and hash data defined above. Here is the thing, Forecast_cost_center is actually the column that I want to keep and retain its value(Key). When we look at the differences between Forecast_cost_center and Cost_Center_Desc, we can see that the last word is having different value. "Barat" is the new data and this is supposed to replace "Timur".
Problem is, my senior do not want me to use the method of :
If Cost_center_desc ne Forecast_cost_center then do:
cost_center_desc = forecast_cost_center;
end;
This is because the source data is constantly being updated and such function/method will not be sufficient for all data that is not necessarily have the matching pair.
I need to lookup properly so that the cost_center_code and Business_area_code will be changed according to Forecast_center_code instead of Cost_center_Desc.
Cost_center_code(this is replaced by Forecast_cost_center) should be tied to forecast_cost_center instead of Cost_center_Desc.
Need advise on the accuracy of my Hash Data, if they are correctly assigned and if they will work fine as I am currently out of the premise and will not be able to test the code until tomorrow but i really want to know the accuracy of it.
Please enlighten me/advise me.
Your input is highly appreciated.
The code you've posted looks clean which is a good sign. I'm struggling though with the business logic as you describe it so giving you a rather high level answer here.
What you need is a common column in both tables (can be different column names but needs to have the same data) over which you can look-up values.
I'm not sure if I got this right but isn't cost_center_code this column? If so then that's the one you want to use as key column in the hash - and you then have forecast_cost_center in the data part as this is the column you want returned in case there is a match.
The hash find() method will return all columns defined as data in case of a match over the key column(s). The hash will not return values (variables) only listed as key columns.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.