Folks,
I've written the following piece of code;
data _null_;
start_date=put(intnx('month','01AUG18'd,0,'B'),date9.);
end_date=put(intnx('month','31AUG18'd,0,'E'),date9.);
end_date1=put(intnx('month','31AUG18'd,-1,'E'),date9.);
call symputx('start_date',start_date);
call symputx('end_date',end_date);
call symputx('end_date1',end_date1);
run;
%put &=start_date &=end_date &=end_date1;
As you can see for my analysis my start_date is the first day of August my end_date of August and my other end_date1 is the last day of July.
What I need to do now is put this into a macro so I loop back through previous months to this effect.
So basically the macro runs and carries out various datasetps and outputs results for start_date=1st of August end_date=last day of August and end_date1=last day of July.
Then it loops through again for when the start_date=1 of July end_date=last day of July and end_date1=last day of June.
I need this process to continue until January 2010.
Would anyone know how I could include something like this in a macro, please?
First of all, when using date values for anything but display in a macro variable, it is best to not have them formatted. See Maxim 28.
data _null_;
start_date=intnx('month','01AUG18'd,0,'B');
end_date=intnx('month','31AUG18'd,0,'E');
end_date1=intnx('month','31AUG18'd,-1,'E');
call symputx('start_date',start_date);
call symputx('end_date',end_date);
call symputx('end_date1',end_date1);
run;
symputx will take care of the numeric-to-character conversion in a graceful way (as opposed to symput).
With the raw values, it is quite easy to create a macro loop:
data _null_;
call symputx('start_date','01aug2018'd);
call symputx('end_date','01jan2010'd);
run;
%macro mymac;
%let curr_date=&start_date;
%do %until (&curr_date < &end_date);
/* your code in here */
%let curr_date = %sysfunc(intnx(month,&curr_date,-1,b));
%end;
%mend;
%mymac
Hi Kurt,
I'm running into some error messages here. Does it relate to the piece of code here?
NOTE: Line generated by the macro variable "START_DATE".
379 "21397
______
77
ERROR: Invalid date/time/datetime constant "21397"d.
ERROR 77-185: Invalid number conversion on "21397"d.
NOTE: Line generated by the macro variable "END_DATE".
379 "18263
______
/*Subset for your lookup month*/
data august;
set testa;
if "&start_date"d <= dt_filed <="&end_date"d then result="match";
if result='match' then output august;
run;
@Sean_OConnor wrote:
Hi Kurt,
I'm running into some error messages here. Does it relate to the piece of code here?
NOTE: Line generated by the macro variable "START_DATE". 379 "21397 ______ 77 ERROR: Invalid date/time/datetime constant "21397"d. ERROR 77-185: Invalid number conversion on "21397"d. NOTE: Line generated by the macro variable "END_DATE". 379 "18263 ______
/*Subset for your lookup month*/ data august; set testa; if "&start_date"d <= dt_filed <="&end_date"d then result="match"; if result='match' then output august; run;
Since you have raw numbers in the macro variables, you can use those directly:
data august;
set testa;
if &start_date <= dt_filed <= &end_date then result="match";
if result='match' then output august;
run;
Hi Kurt,
Many thanks for your input. Unfortunately I'm still having some issues.
Part of may data steps is that we subset the dataset based on months.
data _null_;
start_date=intnx('month','01AUG18'd,0,'B');
end_date=intnx('month','31AUG18'd,0,'E');
end_date1=intnx('month','31AUG18'd,-1,'E');
call symputx('start_date',start_date);
call symputx('end_date',end_date);
call symputx('end_date1',end_date1);
run;
%put &=start_date &=end_date &=end_date1;
data _null_;
call symputx('start_date','01aug2018'd);
call symputx('end_date','01jul2018'd);
run;
%macro mymac;
%let curr_date=&start_date;
%do %until (&curr_date < &end_date);
/* your code in here */
data testa;
set a.analysis_dataset;
run;
/*Subset for your lookup month*/
data august;
set testa;
if &start_date <= dt_filed <=&end_date then result="match";
if result='match' then output august;
run;
%let curr_date = %sysfunc(intnx(month,&curr_date,-1,b));
%end;
%mend;
%mymac
When it gets to the step to create the august dataset we're getting an issue
NOTE: Format $CRS_CYN was not found or could not be loaded.
NOTE: There were 529429 observations read from the data set A.ANALYSIS_DATASET.
NOTE: The data set WORK.TESTA has 529429 observations and 167 variables.
NOTE: DATA statement used (Total process time):
real time 15.86 seconds
cpu time 4.25 seconds
NOTE: Format $CRS_CYN was not found or could not be loaded.
NOTE: There were 529429 observations read from the data set WORK.TESTA.
NOTE: The data set WORK.AUGUST has 0 observations and 168 variables.
NOTE: DATA statement used (Total process time):
real time 1.42 seconds
cpu time 1.40 seconds
The variable dt_filed is a date variable with the following format - 04/01/2010
Have you any advise, please?
Remove that step:
data _null_;
call symputx('start_date','01aug2018'd);
call symputx('end_date','01jul2018'd);
run;
as it overwrites the values set in the previous step with hardcoded ones that won't work in
if &start_date <= dt_filed <=&end_date then result="match";
as the hardcoded end_date is smaller than the hardcoded start_date, so the condition MUST be false.
Hi Kurt,
Thank you for your help so far.
To be honest I'm still a bit lost about all of this. I've put up some of my code to give a better idea what I'm currently doing and what I'm trying to do.
/*Manually adding your Rules based categories*/
%let start_date='01nov2017'd; *First day of analysis month;
%let end_date='31dec2018'd; * Last day of analysis month;
%let end_date1='30nov2017'd; *Last day of lookback month month previous to above;
/*Copy of your dataset*/
data testa;
set a.analysis_dataset;
run;
/*Subset for your lookup month*/
data subset;
set testa;
if &start_date <= dt_filed <=&end_date then result="match";
if result='match' then output subset;
run;
data month;
set subset ;
if substr(id,1,1) in ('X') then
delete;
if dc_type_participant='Purchaser' then
output month;
run;
data month_t;
length type_of_buyer classification $50.;
set month;
if dc_res_purch_info='First Time Buyer Owner-Occupier' then
type_of_buyer ='First Time Buyer';
else if dc_res_purch_info ne 'First Time Buyer Owner-Occupier' then type_of_buyer='Previous Residential Home Owner';
run;
proc sort data=month_t; by id;
*Create your previous months data which omits the latest month again only looking at
those who purchased a house previously and a res one;
data previous_t;
set a.full_period;
where dt_filed between '01JAN2010'd and &end_date1;
if substr(id,1,1) in ('X') then
delete;
if dc_type_participant='Purchaser' then
output previous_t;
run;
proc sort data=previous_t(keep=id);
by id;
*Compute the number of times someone has appeared previously;
proc summary data=previous_t sum;
by id;
output out=appearances_buyer;
run;
data appearances_buyer(drop=_type_);
set appearances_buyer;
rename _freq_ = estamp_buyer;run;
proc sort data=appearances_buyer;
by id;
run;
data previous_estamp;
merge month_t(in=a) appearances_buyer(in=b);
by id;
if a;
array change _numeric_;
do over change;
if change=. then
change=0;
end;
run;
data sole_&start_date;
set previous_estamp;
if estamp_buyer>=1 then
first_time=1;
else first_time=0;
if first_time=1 then classification='Previous Residential Home Owner';
if first_time=0 then classification='First Time Buyer of Residential Property';
run;
So currently, I'm changing the months at the top of the program but as discussed I would like to write a macro which would do this automatically and create a dataset at the very end.
Hopefully this makes things a little clearer to understand.
This creates your dates from one set of year/month dynamically:
%let year=2017;
%let month=11;
data _null_;
start_date = mdy(&month,1,&year);
end_date = intnx('month',start_date,13,'e');
end_date1 = intnx('month',start_date,0,'e');
call symputx('start_date',start_date,'g');
call symputx('end_date',end_date,'g');
call symputx('end_date1',end_date1,'g');
run;
/* this is just for testing/verifying */
data test;
start_date = &start_date;
end_date = &end_date;
end_date1 = &end_date1;
format _numeric_ yymmdd10.;
run;
Look at the dataset, and you'll see the same date values (although in a different format for human consumption) you created with
%let start_date='01nov2017'd; *First day of analysis month;
%let end_date='31dec2018'd; * Last day of analysis month;
%let end_date1='30nov2017'd; *Last day of lookback month month previous to above;
Let's build on what you have already.
%macro loop_mon;
%local n_mon k;
%let n_mon = %sysfunc(intck(month, "15Jan2010"d, "15Aug2010"d);
%do k = 0 %to n_mon;
** your code plus analysis code goes here;
%end;
%mend loop_mon;
%loop_mon
So what does it mean when I say "** your code plus analysis code goes here? Here's what gets inserted at that point:
data _null_;
start_date=put(intnx('month','01AUG18'd,0-&k,'B'),date9.);
end_date=put(intnx('month','31AUG18'd,0-&k,'E'),date9.);
end_date1=put(intnx('month','31AUG18'd,-1-&k,'E'),date9.);
call symputx('start_date',start_date);
call symputx('end_date',end_date);
call symputx('end_date1',end_date1);
run;
%put &=start_date &=end_date &=end_date1;
*** plus the analysis code (that you haven't shown yet) for this set of dates;
Please clarify your rules for the relationship between the dates?
It looks like you want to treat START_DATE as the input and derive the other two from that?
Then you want to loop changing START_DATE from starting at August 2018 by stepping back one month until start_date is January 2010?
So set a range of start dates. Then use an interative %DO loop to generate an offset in months over that range. Use that offset and the range to calculate new START_DATE. Use the START_DATE to calculate the other two dates.
%let range_high='01AUG2018'd;
%let range_low='01JAN2010'd ;
%do offset=0 %to %sysfunc(intck(month,&range_high,&range_low)) %by -1;
%let start_date=%sysfunc(intnx(month,&range_high,&offset,b));
%let end_date=%sysfunc(intnx(month,&start_date,0,e));
%let end_date1=%sysfunc(intnx(month,&start_date,-1,e));
... your code using START_DATE END_DATE and END_DATE1 ...
%end;
Note that these %LET statements will just set START_DATE etc to the actual number of days that SAS uses to store a date value. You can use that in your calculations. But if you want to display if for yourself in a way that is human readable then you will need to apply a date style format to it.
So for example here is a trivial macro that implements this loop and then just prints the generate macro variables values so you can see.
677 %macro test(range_high,range_low); 678 %local offset start_date end_date end_date1 ; 679 %do offset=0 %to %sysfunc(intck(month,&range_high,&range_low)) %by -1; 680 %let start_date=%sysfunc(intnx(month,&range_high,&offset,b)); 681 %let end_date=%sysfunc(intnx(month,&start_date,0,e)); 682 %let end_date1=%sysfunc(intnx(month,&start_date,-1,e)); 683 684 %put %sysfunc(putn(&start_date,date9.)) &=offset &=start_date &=end_date &=end_date1 ; 685 686 %end; 687 %mend test; 688 689 %test('01AUG2018'd,'01JUN2018'd) ; 01AUG2018 OFFSET=0 START_DATE=21397 END_DATE=21427 END_DATE1=21396 01JUL2018 OFFSET=-1 START_DATE=21366 END_DATE=21396 END_DATE1=21365 01JUN2018 OFFSET=-2 START_DATE=21336 END_DATE=21365 END_DATE1=21335
Hi Tom,
This appears to be doing the trick.
I just have one question.
As this macro will be creating a dataset for each given month is it possible to include a piece of code that would create a distinct dataset rather than writing over each additional one?
So like
i= 1......n;
data output_&i;
set month;
run;
Using @Tom's code:
%let range_high='01AUG2018'd;
%let range_low='01JAN2010'd ;
%do offset=0 %to %sysfunc(intck(month,&range_high,&range_low)) %by -1;
%let start_date=%sysfunc(intnx(month,&range_high,&offset,b));
%let end_date=%sysfunc(intnx(month,&start_date,0,e));
%let end_date1=%sysfunc(intnx(month,&start_date,-1,e));
%let dataset_suffix = %eval(&offset * -1);
... your code using START_DATE END_DATE and END_DATE1 ...
data MyTable&dataset_suffix;
.....
%end;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.