BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
pstuchli1
Calcite | Level 5

Apologies because this question has likely been answered, I'm just struggling to search for it in the correct way. I have n = approx 3,000 subjects with around 40 clinic visit dates in wide format, and then two new disease diagnosis dates (two different diseases) from a newly merged dataset.

 

ID    date1     date2     date3    ...      date40   diag_date1   diag_date2

1

2

.

.

 

What I need is to find a way to determine when diag_date1 and diag_date2 fit into the clinic visit date order, or to figure out between which two clinic visit dates the diagnosis dates occurred. Another problem is that date1 to date40 may not be chronological order. One idea I had is to transpose to long form and sort, but I don't know how to write code beyond that. Or is there a better way? I have SAS 9.4

 

Thank you so much in advance for your help!

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @pstuchli1 and welcome to the SAS Support Communities!

 

Try this:

/* Create test data */

data have;
call streaminit(27182818);
length subjid 8;
array date[40];
do subjid=1 to 3000;
  do _n_=1 to dim(date);
    date[_n_]=rand('integer',11000, 21000);
  end;
  diag_date1=rand('integer',10000, 22000);
  diag_date2=rand('integer',10000, 22000);
  output;
end;
format d: yymmdd10.;
run;

/* Determine between which visit dates diagnosis dates fall:
   diag_date1 falls between dl1=date[il1] and du1=date[iu1].
   diag_date2 falls between dl2=date[il2] and du2=date[iu2].
   The lower (upper) limiting date is missing if the diagnosis date
   is smaller (greater) than the minimum (maximum) visit date.
*/

data want;
set have;
array date[40];
array diag_date[2];
array il[2];
array iu[2];
array dl[2];
array du[2];
do _k=1 to dim(diag_date);
  do _j=1 to n(of date [*]);
    _d_s=smallest(_j, of date[*]);
    _s=whichn(_d_s, of date[*]);
    if _d_s<=diag_date[_k] then il[_k]=_s;
    if _d_s>=diag_date[_k] then do;
      iu[_k]=_s;
      leave;
    end;
  end;
  if il[_k] then dl[_k]=date[il[_k]];
  if iu[_k] then du[_k]=date[iu[_k]];
end;  
format d: yymmdd10.;
drop _:;
run;

Edit: Simplified the code. Checking for the extreme cases was redundant.

Edit 2: Just in case that you were using an older SAS 9.4 release (or even a SAS version <9.4) which doesn't accept rand('integer', a, b), you could replace these expressions by
floor((b-a+1)*rand('uniform')+a).

View solution in original post

4 REPLIES 4
VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

please post sample data in the form of a datastep with datalines that represents what you have and want you want.

include as many examples that show differences in the date problem you are trying to solve.

Thank you.

 

Shmuel
Garnet | Level 18

You did not show how should output look.

According to requirement: "...or to figure out between which two clinic visit dates the diagnosis dates occurred"

I suggest:

 

1) Define array of dates as: array dx{40} date1-date40;

2) Use function: call sortn(off dx(*)); to sort the dates in the array

3) Make a loop to locate dates:  dx(i) le diag_date le dx(i+1) where i le 39.

FreelanceReinh
Jade | Level 19

Hello @pstuchli1 and welcome to the SAS Support Communities!

 

Try this:

/* Create test data */

data have;
call streaminit(27182818);
length subjid 8;
array date[40];
do subjid=1 to 3000;
  do _n_=1 to dim(date);
    date[_n_]=rand('integer',11000, 21000);
  end;
  diag_date1=rand('integer',10000, 22000);
  diag_date2=rand('integer',10000, 22000);
  output;
end;
format d: yymmdd10.;
run;

/* Determine between which visit dates diagnosis dates fall:
   diag_date1 falls between dl1=date[il1] and du1=date[iu1].
   diag_date2 falls between dl2=date[il2] and du2=date[iu2].
   The lower (upper) limiting date is missing if the diagnosis date
   is smaller (greater) than the minimum (maximum) visit date.
*/

data want;
set have;
array date[40];
array diag_date[2];
array il[2];
array iu[2];
array dl[2];
array du[2];
do _k=1 to dim(diag_date);
  do _j=1 to n(of date [*]);
    _d_s=smallest(_j, of date[*]);
    _s=whichn(_d_s, of date[*]);
    if _d_s<=diag_date[_k] then il[_k]=_s;
    if _d_s>=diag_date[_k] then do;
      iu[_k]=_s;
      leave;
    end;
  end;
  if il[_k] then dl[_k]=date[il[_k]];
  if iu[_k] then du[_k]=date[iu[_k]];
end;  
format d: yymmdd10.;
drop _:;
run;

Edit: Simplified the code. Checking for the extreme cases was redundant.

Edit 2: Just in case that you were using an older SAS 9.4 release (or even a SAS version <9.4) which doesn't accept rand('integer', a, b), you could replace these expressions by
floor((b-a+1)*rand('uniform')+a).

pstuchli1
Calcite | Level 5

Thank you so much to @Shmuel and @FreelanceReinh, brilliant work, these are fabulous and solved my problem! Thank you so so much!

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 864 views
  • 2 likes
  • 4 in conversation