Remember that your macro code essentially resolves at compile time. Basically, at compile time, the value of the data set variable is unknown -- the data set variable is only available at compile time.
Why are you replacing the data set do loop with a %DO loop?
I'd probably take a different approach:
do varYear = &vystrt to &vyend;
if year(dtps) LE varYear and year(dtps) GE varYear then do;
... Some Code ...
where &vystrt was the start date for the loop and &vyend was the end date for the loop. This leaves your existing data step logic intact and for every iteration through the loop, the execution time value of dtps will be compared to the current value of varYear -- which is starting and ending based on your macro variable values.
Or, do you mean that you want to do EVERYTHING that the program is doing -- only do it inside a Macro program??? You may have to get into doing a CALL EXECUTE in order to do some process for every observation in your data set.
No, I don't think it's stupid. I see what you're trying to do. But the problem is this: the macro %do loop will generate code at program code COMPILE time and the value of DTPS is not available until program code EXECUTION time. You really can't mix the 2 different references the way you do here:
%if year(dtps) LE &varYear and year(dtps) GE &varYear &then &do;
%do varMonth = month(dtps) %to 12 ;
because DTPS will never be available when the macro code is resolved. I really think you need to look at a CALL EXECUTE on a second macro, where the %do loop is inside the CALL EXECUTE -- because THEN, the value of DTPS will be available when you need it -- at program EXECUTION time. There are some examples of CALL EXECUTE on the Tech Support web site, or I'm sure you can work with Tech Support for more help on doing this with CALL EXECUTE instead of a %DO in the DATA step.
The thing to watch out for with "call execute", which I tripped over, is the timing of when macro logic is applied.
If "call execute" is used to call a macro directly, then the pure macro pieces are resolved/executed within the data step. All the non-pure-macro code generated by the macro is executed after the orginating data step has completed. That is why I ended up using "call execute" to call/issue an %include'd file. That file has the macro call in it. In that way, all the macro pieces don't execute until after the Data step, but I was still able to pass to the macro values from the datastep.
As for what is being attempted here, I'm not sure I see what the intended end result is. Is it to have specially dynamically named variables based on values in the dataset? If that is the case, why is it necessary to create these variables? Using them in later SAS code will be difficult because SAS wants to know what those variable names are for charting, reporting, running summaries. etc. So, I would want to know, what is trying to be accomplished at the top level?
Those variables are used for reporting (it contains information about contracts and their validity period on which we need to calculate a consumption profile based on the monthly values that we have: cn1-cn12 and cl1-cl12. Currently it's asked for years 2008 to 2011 but of course it will evolve every year). The harcoded code is really too long (even if it's mainly copy/paste for every year of the do loop).
The only problem that I have is that I'm not able to compare the dataset's variables with the macro variables at compilation time. I think I'll try to find a work around by using "call execute" as suggested by Cynthia.
I think you may still be heading for trouble/[a headache]/non-working-solution.
I think having explicitly named variables like what you are doing is going to cause you grief, especially with the reporting, and lots of maintenance of overly complicated code.
I think you need to reorganize how you are thinking about what you want to do.
If I were using SAS and wanted to report a value on a monthly interval, I would not create January, February, March, April, ... variables. I would have variable named "month" and a number of other variables with the summarized values "mean_sales", "peak_sales", etc. and simply have one observation per month. Then SAS would report the summarized values for each month.
I think you may be thinking too linearly, trying to put too much into a single observation/record. Make use of the two dimensions of a table structure, and grouping.
Could you be served by something like:
proc sort data=raw_data;
by year month;
by year month;
retain monthly_consumption count total;
if first.month then do;
monthly_consumption = 0;
if first.year then do;
yearly_consumption = 0;
count = 0;
total = 0;
monthly_consumption + consumed;
if last.month then do;
if last.year then do;
yearly_consumption = total / count;
I'm not saying the above code solves your problem, but I think if you make the year part of the variable's name, you would be creating problems for yourself.
Unfortunately, I have to keep these variables requirements as they come from the business. I just try to use the macro facilities to make my code shorter to update every year. Otherwise, it works perfectly with the harcoded version.