BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mles
Fluorite | Level 6

Hi! I am working with claims based data and I'm having difficulty selecting the cases I want based on their enrollment period.

 

I would like to select cases with enrollment for a full 12 months prior to their diagnosis and full enrollment afterwards until death or end of follow-up. There can be no gaps in enrollment.

 

The insurance enrollment file is arranged as 300 columns representing one month between 1/1991-12/2015 and a value of "1- yes" or "0- no" for enrollment during that month. Date of diagnosis is arranged as two separate columns "mm" and "yyyy," beginning in 2003. Date of death is arranged the same way as "mm" and "yyyy".

 

Thus far, I've converted the diagnosis date (DXDT) to columns representing a specific month/year similar to the enrollment indicator (status). I am trying to use an array as follows below to determine which patients have 12 months prior enrollment, but keep getting: "ERROR: Array subscript out of range at line 2900 column 38" even when I've tried explicitly stating the subscripts DXDT(157:300) and _status(145:300).

Thanks in advance for all insight and assistance!

 

ARRAY DXDT(*) DXDT157-DXDT300;

ARRAY _status(*) status145-status300;

_enroll_pre0=0;

DO i= 157 TO 300;

DO j= 0 TO 12;

IF _status(dxdt{i}-j)=1 then _enroll_pre=1;

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

As I understand it you want to determined whether all the STATUS variables between (diagnosis_date-12 months) and (death_date) are an unbroken string of 1's, correct?

 

Does every record have a diag_date between Jan 2003 and Dec 2015?,  and a death_date between diag_date and Dec 2015?

 

If yes, than you don't need an array of DXDT columns.  You just need to determine which STATUS month corresponds to the diagnosis date and which corresponds to death date, then look at the STATUS array elements between these dates to see if an element is not a 1.   No need to bother with the DXDT array.  You could use a program like this:

 

data want;
  set have;
  array status {145:300} status145-status300  /* All months Jan 1991 through Dec 2015*/

  _beg_month=intck('month','31dec1990'd,diag_date) - 12;
  _end_month=intck('month','31dec1990'd,death_date);

  continuous='YES';   /* Default */
  
  /* Now search for break in string of 1's */
  do _m=_beg_month  to _end_month until (continuous='NO');
    if status{_m}^=1 then continuous='NO';
  end;
run;

 

 

 

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

5 REPLIES 5
JeffMaggio
Obsidian | Level 7

Your array actually goes from dxdt{1} to dxdt{144}, not 157 to 300.

Just subtract 156 from the begin and end of your i loop and it should work.

 

ARRAY DXDT(*) DXDT157-DXDT300;
ARRAY _status(*) status145-status300;
_enroll_pre0=0;
DO i= 157-156 TO 300-156;
  DO j= 0 TO 12;
    IF _status(dxdt{i}-j)=1 then _enroll_pre=1;
    *some other stuff;
  end;
end;

 Or at least it will fix your subscript out of range issue for the dxdt array.

I can't really tell what you are trying to do with

_status(dxdt{i}-j)

On the first iteration, this is resolving to _status{some_value-0} but I don't think that is what you want.

 

You'd have to tell us more about the data in dxdt157-dxdt300 and status145-status300 for me to figure out what the rest of the code should be.

mles
Fluorite | Level 6

Hi Jeff,

 

Thanks for your help!

 

With the line "IF _status(dxdt{i}-j)=1 then _enroll_pre=1;" what I'd like to do is check if a patient diagnosed in month dxdt(i) was enrolled in that month and the 12 months prior (i.e. status(i-12) through status(i)=1).

All indicators have binary responses, however each subject (row) will only have 1 column where dxdt=1, while they should have multiple columns where status=1.

 

I'd like to create a single column indicator of whether patients diagnosed in any month between 157 and 300 were enrolled continuously for the 12 prior calendar months.

 

For example, if dxdt157=1 then status145 through status157 would need to be 1 as well.

 

Let me know if this clarifies! I don't think my IF statement is worded properly to do this yet.

 

 

JeffMaggio
Obsidian | Level 7

If dxdt{157}=1, think about what that means for _status{dxdt{157}-j}.

That would be:

do j=0 to 12;
  _status{dxdt{157}-j};
end;

which would then be:

do j=0 to 12;
  _status{1-j};
end;

and that will give you a subscript out of range error.

 

Also note that (j=0 to 12) is 13 months. You probably either want j=1 to 12 or j=0 to 11.

 

I'm guessing you want something more like this:

ARRAY DXDT(157:300) DXDT157-DXDT300;
ARRAY _status(145:300) status145-status300;

_enroll_pre=0;

do i= 157 to 300;
  If dxdt{i} = 1 then 
    do j= 0 to 11;
      if _status(i-j)=1 then _enroll_pre=1;
      * do some other stuff;
    end;
end;
mkeintz
PROC Star

As I understand it you want to determined whether all the STATUS variables between (diagnosis_date-12 months) and (death_date) are an unbroken string of 1's, correct?

 

Does every record have a diag_date between Jan 2003 and Dec 2015?,  and a death_date between diag_date and Dec 2015?

 

If yes, than you don't need an array of DXDT columns.  You just need to determine which STATUS month corresponds to the diagnosis date and which corresponds to death date, then look at the STATUS array elements between these dates to see if an element is not a 1.   No need to bother with the DXDT array.  You could use a program like this:

 

data want;
  set have;
  array status {145:300} status145-status300  /* All months Jan 1991 through Dec 2015*/

  _beg_month=intck('month','31dec1990'd,diag_date) - 12;
  _end_month=intck('month','31dec1990'd,death_date);

  continuous='YES';   /* Default */
  
  /* Now search for break in string of 1's */
  do _m=_beg_month  to _end_month until (continuous='NO');
    if status{_m}^=1 then continuous='NO';
  end;
run;

 

 

 

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mles
Fluorite | Level 6
Thanks for your input! This strategy resolved my issue!

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 2129 views
  • 0 likes
  • 3 in conversation