Hi,
I'm trying to macro-ise the following code, I have a lot of bank name cleansing to do beyond these 2 names
proc sql;
create table Bank_cleansed as select *,
(case when (address_name contains "Bank of FR")
then "Bank of France"
when (address_name contains "Bank of DM")
then "Bank of Germany"
else address_name
end) as new_address_name
from Bankfile;
run;
I'm not quite there with the macro:
%macro addname(string,fullname)
proc sql;
create table Bank_cleansed
as select *,
(case when (address_name contains &string.)
then &fullname.
else address_name
end)
as new_address_name
from Bankfile;
run;
%mend;
%addname("Bank of FR","Bank of France");
%addname("Bank of DM","Bank of Germany");
I'm thinking the code needs to be somehow embedded in the macro? Any help would be great.
Thanks! Liz
Your macro looks fine, but your logic is flawed. Each time you call it you are reading the same input and writing to the same output. So the second run overwrite the results of the first run.
Why do you feel the need for making a macro? Why not just write the code instead.
proc sql;
create table Bank_cleansed as
select *
, case when (address_name contains "Bank of FR") then "Bank of France"
when (address_name contains "Bank of DM") then "Bank of Germany"
else address_name
end as new_address_name
from Bankfile
;
quit;
What is the source of the rename lists? Do you have it in a dataset somewhere? So if you had a dataset named FIXED_ADDRESS with variables ADDRESS_NAME and FIXED_NAME then your code might look like:
proc sql;
create table Bank_cleansed as
select a.*
, coalesce(b.fixed_name,a.address_name) as new_address_name
from Bankfile
left join Fixed_address b
on a.address_name contains b.address_name
;
quit;
@Liz_P wrote:
I'm not quite there with the macro:
What is the problem you are having?
Your macro looks fine, but your logic is flawed. Each time you call it you are reading the same input and writing to the same output. So the second run overwrite the results of the first run.
Why do you feel the need for making a macro? Why not just write the code instead.
proc sql;
create table Bank_cleansed as
select *
, case when (address_name contains "Bank of FR") then "Bank of France"
when (address_name contains "Bank of DM") then "Bank of Germany"
else address_name
end as new_address_name
from Bankfile
;
quit;
What is the source of the rename lists? Do you have it in a dataset somewhere? So if you had a dataset named FIXED_ADDRESS with variables ADDRESS_NAME and FIXED_NAME then your code might look like:
proc sql;
create table Bank_cleansed as
select a.*
, coalesce(b.fixed_name,a.address_name) as new_address_name
from Bankfile
left join Fixed_address b
on a.address_name contains b.address_name
;
quit;
Hi,
The reason I didn't want to put so much code as >100 bank names to clean. Storing as a dataset and merging in works a treat. Thankyou 🙂
Hello,
By calling a macro for each renaming, you will have to reprocess the entire dataset each time.
Here is a solution using a data step rather than proc sql.
data clean_names;
infile cards dlm=',' dsd;
length name clean_name $50;
input name clean_name;
cards;
Bank of FR, Bank of France
Bank of DM, Bank of Germany
;
run;
data have;
input address_name $50.;
cards;
foo Bank of FR bar
baz Bank of DM qux
;
run;
proc sql noprint;
SELECT nobs
INTO :nbanks
FROM dictionary.tables
WHERE LIBNAME="WORK" AND MEMNAME="HAVE";
quit;
data want;
array names(&nbanks.) $50. _TEMPORARY_;
array clean(&nbanks.) $50. _TEMPORARY_;
if _N_=1 then do i=1 by 1 until(endcn);
set clean_names end=endcn;
names(i)=name;
clean(i)=clean_name;
end;
set have;
new_address_name=address_name;
do i=1 to dim(names);
if index(address_name, strip(names(i))) then do;
new_address_name=clean(i);
leave;
end;
end;
keep address_name new_address_name;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.