I have the following script which has a logic for _n_ within a macro. I wish to reset _n_ each time the macro is called.
Can you show me how _n_ is reset according to the new value of num?
Thanks.
%macro dostuff(num);
data want;
if _n_=1 then
do j=1 to 3;
set have;
total + var;
if j=1 then
firstvar = var;
if j=3 then
varavg = total / 3;
end;
retain varavg firstvar;
set have;
vardif = var - firstvar;
vardif2 = var - varavg;
drop j;
run;
more stuff
%mend dostuff;
Your meaning isn't clear. _n_ is a counter for the number of times the DATA step loop has iterated. You say you want to reset it according the value of num, but there is no variable num.
Maybe you should describe the big picture of what you are trying to do. It looks like you're using something like a DoW loop to compute the average of a variable and the value of the variable in the first record?
@capam wrote:
'var' is modified within 'more stuff'.
But we don't know what this means.
What 'more stuff'???
If you don't show us the code, how can we help?
And please explain to us the overall end goal of this macro(s), because there may be much simpler ways to achieve the end goal.
Also, I agree with @Quentin, you can't really modify _n_ anyway, it is a counter, controlled by the data step.
%macro dostuff(num);
data want;
if _n_=1 then
do j=1 to 3;
set have;
total + var;
if j=1 then
firstvar = var;
if j=3 then
varavg = total / 3;
end;
retain varavg firstvar;
set have;
vardif = var - firstvar;
vardif2 = var - varavg;
drop j;
run;
proc sgplot data=have ;
where unit = "&num";
scatter x=Time y=vardif;
scatter x=Time y=vardif2;
run;
%mend dostuff;
The output is based on a simple plot routine. Hope this clarifies it.
The vardif and vardif2 are derived from the 'unit' that is changed each iteration.
A matrix below describes the input and desired output vs actual output. Thanks for your help.
unit | Time | var | desired vardif | desired vardif2 | actual vardif | actual vardif3 |
2800 | 03JAN2014:05:11:48 | 41.1 | 0 | -0.5 | 0 | -0.5 |
2800 | 09JAN2014:16:59:38 | 41.1 | 0 | -0.5 | 0 | -0.5 |
2800 | 11JAN2014:20:31:40 | 42.6 | 1.5 | 1 | 1.5 | 1 |
2800 | 16JAN2014:07:40:15 | 42.6 | 1.5 | 1 | 1.5 | 1 |
2800 | 20JAN2014:14:53:08 | 42.6 | 1.5 | 1 | 1.5 | 1 |
2801 | 03JAN2014:05:11:48 | 41 | 0 | 0 | -0.1 | -0.6 |
2801 | 09JAN2014:16:59:38 | 41 | 0 | 0 | -0.1 | -0.6 |
2801 | 11JAN2014:20:31:40 | 41 | 0 | 0 | -0.1 | -0.6 |
2801 | 16JAN2014:07:40:15 | 42.5 | 1.5 | 1.5 | 1.4 | 0.9 |
2801 | 20JAN2014:14:53:08 | 42.5 | 1.5 | 1.5 | 1.4 | 0.9 |
What are the rules for your groups, they seem to differ in the number of records so I'm assuming weekly or a window of some sort?
How big is the data? Otherwise, SQL may be an easy solution.
@capam wrote:
A matrix below describes the input and desired output vs actual output. Thanks for your help.
unit Time var desired vardif desired vardif2 actual vardif actual vardif3 2800 03JAN2014:05:11:48 41.1 0 -0.5 0 -0.5 2800 09JAN2014:16:59:38 41.1 0 -0.5 0 -0.5 2800 11JAN2014:20:31:40 42.6 1.5 1 1.5 1 2800 16JAN2014:07:40:15 42.6 1.5 1 1.5 1 2800 20JAN2014:14:53:08 42.6 1.5 1 1.5 1 2801 03JAN2014:05:11:48 41 0 0 -0.1 -0.6 2801 09JAN2014:16:59:38 41 0 0 -0.1 -0.6 2801 11JAN2014:20:31:40 41 0 0 -0.1 -0.6 2801 16JAN2014:07:40:15 42.5 1.5 1.5 1.4 0.9 2801 20JAN2014:14:53:08 42.5 1.5 1.5 1.4 0.9
I would approach this with a "DoW loop", that is explicitly loop through each by group once to compute the total and first record, then a second time to compute the differences and output records. This is similar to the approach you were trying, but it allows you compute the total of the first three records in each by-group.
Note that with the DOW loop approach, you don't RETAIN variables, because the DATA step loop only iterates once for each by-group. COUNT and TOTAL are automatically reset to missing before the start of each by-group.
data have ;
input unit Time $18. var ;
cards ;
2800 03JAN2014:05:11:48 41.1
2800 09JAN2014:16:59:38 41.1
2800 11JAN2014:20:31:40 42.6
2800 16JAN2014:07:40:15 42.6
2800 20JAN2014:14:53:08 42.6
2801 03JAN2014:05:11:48 41 0
2801 09JAN2014:16:59:38 41 0
2801 11JAN2014:20:31:40 41 0
2801 16JAN2014:07:40:15 42.5
2801 20JAN2014:14:53:08 42.5
;
data want ;
do until(last.unit) ;
set have ;
by unit ;
count=sum(count,1) ;
if count=1 then firstvar=var ;
if count<=3 then do ;
total=sum(total,var) ;
end ;
end ;
average=total/3 ;
do until(last.unit) ;
set have ;
by unit ;
vardif=var-firstvar ;
vardif2=var-average ;
output ;
end ;
run ;
I don't understand. When I run my code, you get different values for firstvar and average for each by-group (each unit). And the computed values of vardif and vardif2 match your desired values. Don't they? If not, can you explain what is wrong with the below printout of the WANT created by my sample code:
unit Time var count firstvar total average vardif vardif2 2800 03JAN2014:05:11:48 41.1 5 41.1 124.8 41.6 0.0 -0.5 2800 09JAN2014:16:59:38 41.1 5 41.1 124.8 41.6 0.0 -0.5 2800 11JAN2014:20:31:40 42.6 5 41.1 124.8 41.6 1.5 1.0 2800 16JAN2014:07:40:15 42.6 5 41.1 124.8 41.6 1.5 1.0 2800 20JAN2014:14:53:08 42.6 5 41.1 124.8 41.6 1.5 1.0 2801 03JAN2014:05:11:48 41.0 5 41.0 123.0 41.0 0.0 0.0 2801 09JAN2014:16:59:38 41.0 5 41.0 123.0 41.0 0.0 0.0 2801 11JAN2014:20:31:40 41.0 5 41.0 123.0 41.0 0.0 0.0 2801 16JAN2014:07:40:15 42.5 5 41.0 123.0 41.0 1.5 1.5 2801 20JAN2014:14:53:08 42.5 5 41.0 123.0 41.0 1.5 1.5
_n_ is an automatic variable, I wouldn't reset it and it automatically will reset for each macro call/data step.
However, I suspect you're really looking for BY group functionality and something along the lines of the FIRST. logic.
Are you using this approach to try and get efficiency or because someone recommended this approach? It seems like it would be easier to do two data steps.
If you want a variable that is similar to _n_ but deviates in some way, simply make it your self and forget about resetting _n_. For example, this increments a new variable by one for every observation except 'Mary'.
data subset;
set sashelp.class;
my_n_ + name ne 'Mary'; * Note the implicit retain and initialization to 0;
run;
proc print; run;
If the intent is to control which observations to use to compute TOTAL and FIRSTVAR, you wouldn't re-set _n_ to get what you need. Consider changing the first SET statement:
do j=1 to 3;
set have (firstobs=&num);
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.