Hi! I am working with claims based data and I'm having difficulty selecting the cases I want based on their enrollment period.
I would like to select cases with enrollment for a full 12 months prior to their diagnosis and full enrollment afterwards until death or end of follow-up. There can be no gaps in enrollment.
The insurance enrollment file is arranged as 300 columns representing one month between 1/1991-12/2015 and a value of "1- yes" or "0- no" for enrollment during that month. Date of diagnosis is arranged as two separate columns "mm" and "yyyy," beginning in 2003. Date of death is arranged the same way as "mm" and "yyyy".
Thus far, I've converted the diagnosis date (DXDT) to columns representing a specific month/year similar to the enrollment indicator (status). I am trying to use an array as follows below to determine which patients have 12 months prior enrollment, but keep getting: "ERROR: Array subscript out of range at line 2900 column 38" even when I've tried explicitly stating the subscripts DXDT(157:300) and _status(145:300).
Thanks in advance for all insight and assistance!
ARRAY DXDT(*) DXDT157-DXDT300;
ARRAY _status(*) status145-status300;
_enroll_pre0=0;
DO i= 157 TO 300;
DO j= 0 TO 12;
IF _status(dxdt{i}-j)=1 then _enroll_pre=1;
As I understand it you want to determined whether all the STATUS variables between (diagnosis_date-12 months) and (death_date) are an unbroken string of 1's, correct?
Does every record have a diag_date between Jan 2003 and Dec 2015?, and a death_date between diag_date and Dec 2015?
If yes, than you don't need an array of DXDT columns. You just need to determine which STATUS month corresponds to the diagnosis date and which corresponds to death date, then look at the STATUS array elements between these dates to see if an element is not a 1. No need to bother with the DXDT array. You could use a program like this:
data want;
set have;
array status {145:300} status145-status300 /* All months Jan 1991 through Dec 2015*/
_beg_month=intck('month','31dec1990'd,diag_date) - 12;
_end_month=intck('month','31dec1990'd,death_date);
continuous='YES'; /* Default */
/* Now search for break in string of 1's */
do _m=_beg_month to _end_month until (continuous='NO');
if status{_m}^=1 then continuous='NO';
end;
run;
Your array actually goes from dxdt{1} to dxdt{144}, not 157 to 300.
Just subtract 156 from the begin and end of your i loop and it should work.
ARRAY DXDT(*) DXDT157-DXDT300;
ARRAY _status(*) status145-status300;
_enroll_pre0=0;
DO i= 157-156 TO 300-156;
DO j= 0 TO 12;
IF _status(dxdt{i}-j)=1 then _enroll_pre=1;
*some other stuff;
end;
end;
Or at least it will fix your subscript out of range issue for the dxdt array.
I can't really tell what you are trying to do with
_status(dxdt{i}-j)
On the first iteration, this is resolving to _status{some_value-0} but I don't think that is what you want.
You'd have to tell us more about the data in dxdt157-dxdt300 and status145-status300 for me to figure out what the rest of the code should be.
Hi Jeff,
Thanks for your help!
With the line "IF _status(dxdt{i}-j)=1 then _enroll_pre=1;" what I'd like to do is check if a patient diagnosed in month dxdt(i) was enrolled in that month and the 12 months prior (i.e. status(i-12) through status(i)=1).
All indicators have binary responses, however each subject (row) will only have 1 column where dxdt=1, while they should have multiple columns where status=1.
I'd like to create a single column indicator of whether patients diagnosed in any month between 157 and 300 were enrolled continuously for the 12 prior calendar months.
For example, if dxdt157=1 then status145 through status157 would need to be 1 as well.
Let me know if this clarifies! I don't think my IF statement is worded properly to do this yet.
If dxdt{157}=1, think about what that means for _status{dxdt{157}-j}.
That would be:
do j=0 to 12;
_status{dxdt{157}-j};
end;
which would then be:
do j=0 to 12;
_status{1-j};
end;
and that will give you a subscript out of range error.
Also note that (j=0 to 12) is 13 months. You probably either want j=1 to 12 or j=0 to 11.
I'm guessing you want something more like this:
ARRAY DXDT(157:300) DXDT157-DXDT300;
ARRAY _status(145:300) status145-status300;
_enroll_pre=0;
do i= 157 to 300;
If dxdt{i} = 1 then
do j= 0 to 11;
if _status(i-j)=1 then _enroll_pre=1;
* do some other stuff;
end;
end;
As I understand it you want to determined whether all the STATUS variables between (diagnosis_date-12 months) and (death_date) are an unbroken string of 1's, correct?
Does every record have a diag_date between Jan 2003 and Dec 2015?, and a death_date between diag_date and Dec 2015?
If yes, than you don't need an array of DXDT columns. You just need to determine which STATUS month corresponds to the diagnosis date and which corresponds to death date, then look at the STATUS array elements between these dates to see if an element is not a 1. No need to bother with the DXDT array. You could use a program like this:
data want;
set have;
array status {145:300} status145-status300 /* All months Jan 1991 through Dec 2015*/
_beg_month=intck('month','31dec1990'd,diag_date) - 12;
_end_month=intck('month','31dec1990'd,death_date);
continuous='YES'; /* Default */
/* Now search for break in string of 1's */
do _m=_beg_month to _end_month until (continuous='NO');
if status{_m}^=1 then continuous='NO';
end;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.