BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tahos
Fluorite | Level 6

Hi,

I am looking to be able to set a custom interval and then create month dummies for observations with a macro. I am running into problems with the comparison of dates. In addition to what is below, I have also tried using quotes around the macro date variables and using the %sysevalf-function with %if-%then-%else -structure (and both at the same time). In the %sysevalf-case the date comparison is just always true or false. Below is my code:

 

 

%macro date_loop(start,end);
	data WORK.x7;
			set work.x99;
		   %let start=%sysfunc(inputn(&start,anydtdte9.));
		   %let end=%sysfunc(inputn(&end,anydtdte9.));
		   %let dif=%sysfunc(intck(month,&start,&end));
		     %do i=0 %to &dif;
					%let date=%sysfunc(intnx(month,&start,&i,b),date9.);
					%let date2=%sysfunc(intnx(month,&start,&i,e),date9.);
					if mydate1 <= &date2. AND mydate2 >= &date. then 
						%do;
							if employed = 'true' then p&i. = 1;
							if dummy = 'true' then p&i. = 1;
						%end;
					
					else %do;
							p&i. = 0;
						%end;
		     %end;
	 run;
%mend date_loop;


%date_loop(01jan2019,01jan2020);

which results in following errors

 

ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *, **, +, -, /, <, <=, <>, =, >, ><, >=, AND, EQ, GE, GT, IN, 
              LE, LT, MAX, MIN, NE, NG, NL, NOTIN, OR, ^=, |, ||, ~=.  

ERROR 388-185: Expecting an arithmetic operator.

ERROR 76-322: Syntax error, statement will be ignored.

I would be grateful for any help!

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Before you try to make code dynamic by applying macro processing, you must have code that works for a given single instance.

Next, see Maxim 28. Formatting macro variables (especially those containing dates) makes using them in code harder than necessary. For comparisons and calculations, the raw values are much better to handle.

 

What you want can be done with no macro coding, using a do loop in a data step and a transpose:

data have;
input id $ employed $ dummy $ mydate1 :date9. mydate2 :date9.;
format mydate1 mydate2 e8601da10.;
datalines;
1 true true 01JAN2019 03MAR2019
2 false false 01JAN2019 01JAN2020
3 true true 02FEB2019 04JUN2019
;

%let start=%sysfunc(inputn(20190101,yymmdd8.));
%let num_period=3;

data long;
set have;
do p = 1 to &num_period;
  if
    mydate1 le intnx('month',&start,p-1,'e') and mydate2 ge intnx('month',&start,p-1,'b')
    and (employed = "true" or dummy = "true")
  then value = 1;
  else value = 0;
  output;
end;
run;
    
proc transpose data=long out=want (drop=_name_) prefix=p;
by id employed dummy mydate1 mydate2;
var value;
id p;
run;

Result:

1	true	true	2019-01-01	2019-03-03	1	1	1
2	false	false	2019-01-01	2020-01-01	0	0	0
3	true	true	2019-02-02	2019-06-04	0	1	1

 

View solution in original post

11 REPLIES 11
Patrick
Opal | Level 21

Looks like you're mixing SAS macro and data step code in the wrong way.

      if mydate1 <= &date2. AND mydate2 >= &date. then
        %do;

I doubt that you need SAS macro code at all. Can you please explain to us what you have and what you're trying to achieve? Ideally post sample data in the form of a working SAS data step and then show us the desired result using this sample data.

tahos
Fluorite | Level 6

I currently have data like this: 

data have;
   input id $ employed $ dummy $ mydate2 $ mydate1$;
   datalines;
1 true true 01JAN2019 03MAR2019
2 false false 01JAN2019 01JAN2020
3 true true 02FEB2019 04JUN2019
;

and would like to create dummies like this:  

data want;
   input id $ employed $ dummy $ mydate2 $ mydate1$ p1 $ p2$ p3$;
   datalines;
1 true true 01JAN2019 03MAR2019 1 1 1
2 false false 01JAN2019 01JAN2020 0 0 0
3 true true 02FEB2019 04JUN2019 0 1 1
;

As mentioned, I have also tried rows like this:

%if %sysevalf(mydate1 <= &date2) AND %sysevalf(mydate2>= &date) %then 

and there the problem is that the comparison is always false (even with quotes around macro variables).

Patrick
Opal | Level 21

Thanks for posting the have and want data.

I can't figure out the logic used to create these p variables in your Want data. Can you explain?

tahos
Fluorite | Level 6

The logic is simply that if observation is from a certain period (eg. period 1) and has employed = true or dummy = true, then the monthly dummy p1 = 1. Otherwise, it gets value 0.

Patrick
Opal | Level 21

Where do you get the information about the period from? It's not in the have data.

You only mention p1. How do you get to p2 and p3?

 

Can you provide an example using your Have data and explain the logic in detail how this transforms to your Want data?

 

tahos
Fluorite | Level 6

The periods are months. The information is from the dates in the observation (end and start date). So if the person was employed during month x, then px = 1.

Kurt_Bremser
Super User

Before you try to make code dynamic by applying macro processing, you must have code that works for a given single instance.

Next, see Maxim 28. Formatting macro variables (especially those containing dates) makes using them in code harder than necessary. For comparisons and calculations, the raw values are much better to handle.

 

What you want can be done with no macro coding, using a do loop in a data step and a transpose:

data have;
input id $ employed $ dummy $ mydate1 :date9. mydate2 :date9.;
format mydate1 mydate2 e8601da10.;
datalines;
1 true true 01JAN2019 03MAR2019
2 false false 01JAN2019 01JAN2020
3 true true 02FEB2019 04JUN2019
;

%let start=%sysfunc(inputn(20190101,yymmdd8.));
%let num_period=3;

data long;
set have;
do p = 1 to &num_period;
  if
    mydate1 le intnx('month',&start,p-1,'e') and mydate2 ge intnx('month',&start,p-1,'b')
    and (employed = "true" or dummy = "true")
  then value = 1;
  else value = 0;
  output;
end;
run;
    
proc transpose data=long out=want (drop=_name_) prefix=p;
by id employed dummy mydate1 mydate2;
var value;
id p;
run;

Result:

1	true	true	2019-01-01	2019-03-03	1	1	1
2	false	false	2019-01-01	2020-01-01	0	0	0
3	true	true	2019-02-02	2019-06-04	0	1	1

 

tahos
Fluorite | Level 6

Thank you for your solution as it works! But with lots of data and many months the data set long becomes huge which is somewhat problematic. Is there no way to get the comparison with %sysevalf working?

 

Before going into macros I hard-coded the procedure as multiple data-steps and it worked. It seems weird that it would be not possible to do this with macros, which would be more practical as I could then use the same logic for multiple other operations which require me to get monthly information then I am only given the beginning and the end date. 

Tom
Super User Tom
Super User

@tahos wrote:

Thank you for your solution as it works! But with lots of data and many months the data set long becomes huge which is somewhat problematic. Is there no way to get the comparison with %sysevalf working?

 

Before going into macros I hard-coded the procedure as multiple data-steps and it worked. It seems weird that it would be not possible to do this with macros, which would be more practical as I could then use the same logic for multiple other operations which require me to get monthly information then I am only given the beginning and the end date. 


The problem with your original code has nothing to do with dates. The problem is you are comparing the value of your macro variable to a string of letters.  You are confused and think that you are giving the macro code the name of a variable.  The macro processor just operates on the text of the code and then passes the generated text to SAS to actually execute.  So when the macro code is running that string of letters is not yet a variable name. It only has meaning as a variable name once the generated SAS code starts running.

 

Not sure what your point about performance is.  I am not sure that building the tall dataset and then running it through PROC TRANSPOSE is really a problem.  You can make the first data step a view so you do not ever have to save a work dataset with all of the observations.  Or you could use an ARRAY to generate the wide dataset in the first step and eliminate the need for PROC TRANSPOSE.

Kurt_Bremser
Super User

You can keep the dataset "long" small by only keeping id for identification. After the transpose (which is also done only by id), merge the result back to "have". I only kept all the other variables to avoid this join, but in real life, i would do it.

In essence, the "long" dataset can't be much larger than the end result. As my code proves, a macro is not needed, and in that case, a macro is also not the solution. It only leads to convoluted and hard to understand/maintain/debug code. If it was the proper solution, you would not have had to come here for help.

Reeza
Super User

@tahos wrote:

 

Before going into macros I hard-coded the procedure as multiple data-steps and it worked. It seems weird that it would be not possible to do this with macros, which would be more practical as I could then use the same logic for multiple other operations which require me to get monthly information then I am only given the beginning and the end date. 


It is possible to do this with macros, but not more efficient at all. 

 

1. Determine the maximum number of months

2. Use an array to fill in your 0/1

 

First figure out how many possible p values you can have. If you know this ahead of time, its much easier. 

data have;
informat mydate1 mydate2 $10.;
   input id $ employed $ dummy $ mydate2  mydate1 ;
   datalines;
1 true true 01JAN2019 03MAR2019
2 false false 01JAN2019 01JAN2020
3 true true 02FEB2019 04JUN2019
4 true true 01JAN2019 01JAN2020
;

%macro date_loop(start,end);

          %*Calculate number of months needed from macro parameters;
		   %let start=%sysfunc(inputn(&start,date9.));
		   %let end=%sysfunc(inputn(&end,date9.));
		   %let dif=%eval(%sysfunc(intck(month,&start,&end)) + 1) ;
		   
		   
	data want;
			set have;
			
			*convert variables to SAS dates for date calculations;
			date_start = input(mydate2, date9.);
			date_end = input(mydate1, date9.);
			format date_start date_end date9.;

             *Declare an array of the size needed;
             array p(*) p1-p&dif.;
             
             *Initialize array values to 0;
             do i=1 to dim(p);
                        p(i) = 0;
             end;
             
             *calculate number of months for that record;
             nMonths = intck('month', date_start, date_end, 'C');
             *calculate starting index (number of months from Start);
             nMonths2Start = intck('month',&start,  date_start,  'C');
                  
             *set to 1 when conditions are met;     
		     do i = nMonths2Start+1 to (nMonths2Start+nMonths+1);				
							if employed = 'true' or dummy = 'true' then p(i) = 1;
		     end;
	 run;
%mend date_loop;

*test macro;
options mprint symbolgen;
%date_loop(01jan2019, 01jan2020);

Can't test it right now but another option you can consider. 

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 11 replies
  • 1668 views
  • 5 likes
  • 5 in conversation