Quartz | Level 8

## Calculating variable name using do loops

Hi,

I have a dataset that looks like this:

Visit_date_1 Visit_date_2 Visit_date_3 Visit_date_4 .... Visit_date_n

I would like to calculate the number of months between Visit_date_2 and Visit_date_1, and Visit_date_3 and Visit_date_2, etc...  So essentially I am calculating Visit_date_n - Visit_date_(n-1).

I have written the following SAS code but it doesn't seem to work.  It says that "WARNING: Apparent symbolic reference I not resolved." and "ERROR: A character operand was found in the %EVAL function or %IF condition where a numeric operand is required. The condition was: &i.-1"

Note, I am trying a new UNTIL loop which doesn't seem to like character values.

TIA

data testing;
set testdate ;
do i = 2 to 100 until Visit_date_&i. = "" ;
%let j=%eval(&i.-1);
diff_month_i=intck('MONTH', Visit_date&i., Visit_date&j.);
end;
run;

1 ACCEPTED SOLUTION

Accepted Solutions
Diamond | Level 26

## Re: Calculating variable name using do loops

You can't mix and match DATA step loops with MACRO loops. (Well, actually, you can, but this isn't the way to do it)

What you need is a DATA step ARRAY with DATA step DO loop. Only DATA step structures are needed here. Something like this:

``````data testing;
set testdate ;
array v visit_date_1--visit_date_100;
array d delta_date_1--delta_date_100;
do i = 2 to 100;
d(i)=intck('month',v(i),v(i-1));
end;
drop i;
run;``````

Adding: the suggestion by @Kurt_Bremser is a preferable solution. Use a long data set, not a wide data set, and your programming will be much easier, not just on this problem, but on all problems where you have a choice between long and wide.

--
Paige Miller
9 REPLIES 9
Diamond | Level 26

## Re: Calculating variable name using do loops

You can't mix and match DATA step loops with MACRO loops. (Well, actually, you can, but this isn't the way to do it)

What you need is a DATA step ARRAY with DATA step DO loop. Only DATA step structures are needed here. Something like this:

``````data testing;
set testdate ;
array v visit_date_1--visit_date_100;
array d delta_date_1--delta_date_100;
do i = 2 to 100;
d(i)=intck('month',v(i),v(i-1));
end;
drop i;
run;``````

Adding: the suggestion by @Kurt_Bremser is a preferable solution. Use a long data set, not a wide data set, and your programming will be much easier, not just on this problem, but on all problems where you have a choice between long and wide.

--
Paige Miller
Quartz | Level 8

## Re: Calculating variable name using do loops

Thanks, @PaigeMiller and @Kurt_Bremser for your responses. I will consider reformatting to a long dataset.
To answer my query, though, the delta_date_1 is not currently in my dataset. This is a new variable I want to create.
When I run the code provided by @PaigeMiller, I get this error "ERROR: Variable delta_date_1 cannot be found on the list of previously defined variables." When I remove the double hyphen (delta_date_1--delta_date_100 -> delta_date_1-delta_date_100) I get this error: "ERROR: Array subscript out of range at line 778 column 28." I looked up how to work with arrays, but I'm still confused why I am getting this error.
Can you please educate me on what I am doing wrong? Thank you.

Diamond | Level 26

## Re: Calculating variable name using do loops

I should have used a single dash. Beyond that, go for the long data set solution and avoid all these complications.

--
Paige Miller
Quartz | Level 8

## Re: Calculating variable name using do loops

Hi, @PaigeMiller, I tired the single dash but still get an error "ERROR: Array subscript out of range at line 1429 column 33."
I will transpose my variables to the long format.
Thanks.
Diamond | Level 26

## Re: Calculating variable name using do loops

@sharonlee wrote:
Hi, @PaigeMiller, I tired the single dash but still get an error "ERROR: Array subscript out of range at line 1429 column 33."
I will transpose my variables to the long format.
Thanks.

You have to show us the entire log for your DATA step. Otherwise, we can't possibly know what is line 1429 column 33.

--
Paige Miller
Quartz | Level 8

## Re: Calculating variable name using do loops

You are right, @PaigeMiller .  Thank you for pointing this out, and sorry for not including it earlier.

Here is a snap shot of my log (note I did not provide the variable list as it's quite lengthy.  Column 33 is actually one of the visit_date variables.).

The error is with my code:  d(i)=intck('month',v(i),v(i-1));

1477  data testing;

1478      set testdate;

1479      array v Visit_date__1--Visit_date__46;

1480      array d delta_date_1-delta_date_46;

1481      do i = 1 to 46 ;

1482          d(i)=intck('month',v(i),v(i-1));

1483      end;

1484      drop i;

1485  run;

ERROR: Array subscript out of range at line 1482 column 33.

<<list of variables>>

NOTE: The SAS System stopped processing this step because of errors.

NOTE: There were 1 observations read from the data set testdate.

WARNING: The data set WORK.TESTING may be incomplete.  When this step was stopped there were 0 observations and 114

variables.

WARNING: Data set WORK.TESTING was not replaced because this step was stopped.

NOTE: DATA statement used (Total process time):

real time           0.04 seconds

cpu time            0.03 seconds

Diamond | Level 26

## Re: Calculating variable name using do loops

In your DO loop, when i=1, you are trying to refer to v(i-1) which is v(0) which doesn't exist. Or in SAS terms, "Array subscript out of range..." Elements in a SAS array are indexed by positive integers, and cannot be indexed by zero or negative integers or an integer greater than the number of elements in the array.

Had the DO loop been 2 to 46, you would not have this error.

--
Paige Miller
Quartz | Level 8

## Re: Calculating variable name using do loops

It works, @PaigeMiller!  Thank you for the explanation. I was trying a lot of different combinations but did not think of that!

Have a great day and thanks again.

Super User

## Re: Calculating variable name using do loops

With a proper dataset structure (long vs. the unwieldy wide you have now), it boils down to this:

``````data want;
set have;
by id;
d = intck('month',lag(v),v);
if first.id then d = .;
run;``````

Discussion stats
• 9 replies
• 1894 views
• 4 likes
• 3 in conversation