Hi,
I have a dataset with a unique postal code and flags for ~80 campaigns that looks like:
FSALDU Cmp_201301 Cmp_201302 Cmp_201304 ....
A0A1B0 1 . 1
A0A1C0 . . 1
A0A1G0 1 . .
...
Data notes:
What I am trying to do is compare the flags on each postal code to create a 'Change' variable (Same, Added, Removed) comparing to the previous campaign.
For example my output dataset would look like:
FSALDU Cmp_chg_201302 Cmp_chg_201304 ...
A0A1B0 Removed Added
A0A1C0 Same Added
A0A1G0 Removed Same
...
To do this I create a macro with the comparison logic as:
%Macro Change(Prv_wk, Cur_wk) ;
IF Cmp_&Cur_wk=Cmp_&Prv_wk then Cmp_chg_&Cur_wk='Same' ;
IF Cmp_&Cur_wk=1 and Cmp_&Prv_wk=. then Cmp_chg_&Cur_wk='Added' ;
IF Cmp_&Cur_wk=. and Cmp_&Prv_wk=1 then Cmp_chg_&Cur_wk='Removed' ;
%mend ;
Within my Data Step I have:
Data Change_table ; Set Master_Data_tbl ;
%Change(201301,201302) ;
%Change(201302,201304) ;
Keep FSALDU Cmp_chg: ;
RUN ;
I could write the macro out 79 times to get the table i'm looking for, but I was wondering how I can optimize this code for automation. I imagine the solution would involve arrays and loops..
Thanks for your help!
OK. Let's assume you store a SAS data set with just one variable: WEEK
It should be character, with a length of 6, taking on values like 201301, 201302, etc. There will be one observation for each week.
Your program could look like this, to adjust automatically to the weeks in your data set:
proc sql noprint;
select distinct ('Cmp_' || week) into : campaign_list separated by ' ' from my.list_of_weeks;
select distinct ('Chg_' || week) into : change_list separated by ' ' from my.list_of_weeks;
quit;
%put _user_;
data want;
set have;
array camps {*} &campaign_list;
array changes {*} $ 7 &change_list;
do _i_=2 to dim(camps);
if cmp{_i_} = cmp{_i_-1} then changes{_i_}='Same';
else if cmp{_i_} > cmp{_i_-1} then changes{_i_}='Added';
else changes{_i_}='Removed';
end;
run;
The %put statement is only there to give you a feel for what you are getting with macro language. And you will have an extra "change" variable in your data set, since there is no way to assign the change that takes place in the first week. (It could be dropped, but that part would be more complex.) The program is easily maintained, since there is no maintenance. Just change the SAS data set holding the weeks and the program adjusts automatically.
Something like this?
Well, you don't actually describe what it means to optimize code for automation. Does it mean the fastest-running code, regardless of how complex it gets? Does it mean the simplest code that is still reasonably fast, so that it is easy to modify and maintain?
At any rate, data_null_ has a good approach. If you are confident about your incoming values always being 1 or missing, you can speed it up slightly:
if cmp=cmp[i-1] then chg='S';
else if cmp > cmp[i-1] then chg='A';
else chg='R';
Good luck.
Astounding:
Good point. I mean optimized in terms of automation and not necessarily performance.
Data_Null:
Thanks for the reply.
I like the method, but every 2 weeks (sometimes 1) I will add a campaign, so the array for the Chg would have to be dynamic..
In that case, it would be wise to keep the code constant.
Instead of changing the code, maintain a separate file holding a list of the campaigns. The file could be a text file, a SAS data set, a table ... doesn't matter as long as your program can read it. Then the program should transform the list of campaigns into the variable names that the program will use.
What types of file(s) will your production system permit you to use, to hold and maintain a list of campaigns?
Some macro language will be involved but it will be minimal if you can store the list of campaigns as a SAS data set.
Yes, I can store store a separate SAS dataset with the list of campaigns.
How would I use that to create the arrays in the restructuring program?
OK. Let's assume you store a SAS data set with just one variable: WEEK
It should be character, with a length of 6, taking on values like 201301, 201302, etc. There will be one observation for each week.
Your program could look like this, to adjust automatically to the weeks in your data set:
proc sql noprint;
select distinct ('Cmp_' || week) into : campaign_list separated by ' ' from my.list_of_weeks;
select distinct ('Chg_' || week) into : change_list separated by ' ' from my.list_of_weeks;
quit;
%put _user_;
data want;
set have;
array camps {*} &campaign_list;
array changes {*} $ 7 &change_list;
do _i_=2 to dim(camps);
if cmp{_i_} = cmp{_i_-1} then changes{_i_}='Same';
else if cmp{_i_} > cmp{_i_-1} then changes{_i_}='Added';
else changes{_i_}='Removed';
end;
run;
The %put statement is only there to give you a feel for what you are getting with macro language. And you will have an extra "change" variable in your data set, since there is no way to assign the change that takes place in the first week. (It could be dropped, but that part would be more complex.) The program is easily maintained, since there is no maintenance. Just change the SAS data set holding the weeks and the program adjusts automatically.
Thank you for your help, this worked perfectly.
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.