- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I have a dataset that looks like this:
Visit_date_1 Visit_date_2 Visit_date_3 Visit_date_4 .... Visit_date_n
I would like to calculate the number of months between Visit_date_2 and Visit_date_1, and Visit_date_3 and Visit_date_2, etc... So essentially I am calculating Visit_date_n - Visit_date_(n-1).
I have written the following SAS code but it doesn't seem to work. It says that "WARNING: Apparent symbolic reference I not resolved." and "ERROR: A character operand was found in the %EVAL function or %IF condition where a numeric operand is required. The condition was: &i.-1"
Note, I am trying a new UNTIL loop which doesn't seem to like character values.
Any advice?
TIA
data testing;
set testdate ;
do i = 2 to 100 until Visit_date_&i. = "" ;
%let j=%eval(&i.-1);
diff_month_i=intck('MONTH', Visit_date&i., Visit_date&j.);
end;
run;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You can't mix and match DATA step loops with MACRO loops. (Well, actually, you can, but this isn't the way to do it)
What you need is a DATA step ARRAY with DATA step DO loop. Only DATA step structures are needed here. Something like this:
data testing;
set testdate ;
array v visit_date_1--visit_date_100;
array d delta_date_1--delta_date_100;
do i = 2 to 100;
d(i)=intck('month',v(i),v(i-1));
end;
drop i;
run;
Adding: the suggestion by @Kurt_Bremser is a preferable solution. Use a long data set, not a wide data set, and your programming will be much easier, not just on this problem, but on all problems where you have a choice between long and wide.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You can't mix and match DATA step loops with MACRO loops. (Well, actually, you can, but this isn't the way to do it)
What you need is a DATA step ARRAY with DATA step DO loop. Only DATA step structures are needed here. Something like this:
data testing;
set testdate ;
array v visit_date_1--visit_date_100;
array d delta_date_1--delta_date_100;
do i = 2 to 100;
d(i)=intck('month',v(i),v(i-1));
end;
drop i;
run;
Adding: the suggestion by @Kurt_Bremser is a preferable solution. Use a long data set, not a wide data set, and your programming will be much easier, not just on this problem, but on all problems where you have a choice between long and wide.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
To answer my query, though, the delta_date_1 is not currently in my dataset. This is a new variable I want to create.
When I run the code provided by @PaigeMiller, I get this error "ERROR: Variable delta_date_1 cannot be found on the list of previously defined variables." When I remove the double hyphen (delta_date_1--delta_date_100 -> delta_date_1-delta_date_100) I get this error: "ERROR: Array subscript out of range at line 778 column 28." I looked up how to work with arrays, but I'm still confused why I am getting this error.
Can you please educate me on what I am doing wrong? Thank you.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I should have used a single dash. Beyond that, go for the long data set solution and avoid all these complications.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I will transpose my variables to the long format.
Thanks.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@sharonlee wrote:
Hi, @PaigeMiller, I tired the single dash but still get an error "ERROR: Array subscript out of range at line 1429 column 33."
I will transpose my variables to the long format.
Thanks.
You have to show us the entire log for your DATA step. Otherwise, we can't possibly know what is line 1429 column 33.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You are right, @PaigeMiller . Thank you for pointing this out, and sorry for not including it earlier.
Here is a snap shot of my log (note I did not provide the variable list as it's quite lengthy. Column 33 is actually one of the visit_date variables.).
The error is with my code: d(i)=intck('month',v(i),v(i-1));
1477 data testing;
1478 set testdate;
1479 array v Visit_date__1--Visit_date__46;
1480 array d delta_date_1-delta_date_46;
1481 do i = 1 to 46 ;
1482 d(i)=intck('month',v(i),v(i-1));
1483 end;
1484 drop i;
1485 run;
ERROR: Array subscript out of range at line 1482 column 33.
<<list of variables>>
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 1 observations read from the data set testdate.
WARNING: The data set WORK.TESTING may be incomplete. When this step was stopped there were 0 observations and 114
variables.
WARNING: Data set WORK.TESTING was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.03 seconds
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
In your DO loop, when i=1, you are trying to refer to v(i-1) which is v(0) which doesn't exist. Or in SAS terms, "Array subscript out of range..." Elements in a SAS array are indexed by positive integers, and cannot be indexed by zero or negative integers or an integer greater than the number of elements in the array.
Had the DO loop been 2 to 46, you would not have this error.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It works, @PaigeMiller! Thank you for the explanation. I was trying a lot of different combinations but did not think of that!
Have a great day and thanks again.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
With a proper dataset structure (long vs. the unwieldy wide you have now), it boils down to this:
data want;
set have;
by id;
d = intck('month',lag(v),v);
if first.id then d = .;
run;