BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
kelly2016
Fluorite | Level 6

Hello SAS Community,
I'm a relatively new SAS programmer and I really need your help coding this!  I have a dataset that contains 5 columns, each representing a different disease diagnosis date in date format (date9.).  Some columns contain missing values.  I need to compare the diagnosis dates across these 5 columns when the value is not missing and flag or print out the earliest diagnosis date variable. Essentially, to find which disease diagnosis occurred first and flag that. 

 

Is there an easy way to code this?  See my sample data and expected output below. Thank you so much for your help!!

 

Data Example:

IDDiag1Diag2Diag3Diag4Diag5
126-Oct-17 28-Oct-1826-Mar-19 
2.....
322-Dec-21....
4...10-May-2113-Nov-22
515-Oct-1915-Oct-19.24-Jun-1523-Oct-17
625-Dec-1614-Oct-1824-Nov-2226-Dec-1523-Oct-23

 

I would like the output to be as follows:

IDDiag1Diag2Diag3Diag4Diag5Flag
126-Oct-17 28-Oct-1826-Mar-19 Diag1
2.....Disease Free
322-Dec-21....Diag1
4...10-May-2113-Nov-22Diag4
515-Oct-1915-Oct-19.24-Jun-1523-Oct-17Diag5
625-Dec-1614-Oct-1824-Nov-2226-Dec-1523-Oct-23Diag4

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

 

data want;
set have;

length flag $40.;

array _diag(*) diag1-diag5;

if nmiss(of _diag(*)) ne dim(_diag) then do;
    *find smallest value;
    min_date = min(of _diag(*));

    *find first location of smallest value;
    index_min_date = whichn(min_date, of _diag(*));

    *find name of variable at index;
     flag = vname(_diag(index_min_date));
end;
else flag = 'Disease Free';

run;

@kelly2016 wrote:

Hello SAS Community,
I'm a relatively new SAS programmer and I really need your help coding this!  I have a dataset that contains 5 columns, each representing a different disease diagnosis date in date format (date9.).  Some columns contain missing values.  I need to compare the diagnosis dates across these 5 columns when the value is not missing and flag or print out the earliest diagnosis date variable. Essentially, to find which disease diagnosis occurred first and flag that. 

 

Is there an easy way to code this?  See my sample data and expected output below. Thank you so much for your help!!

 

Data Example:

ID Diag1 Diag2 Diag3 Diag4 Diag5
1 26-Oct-17   28-Oct-18 26-Mar-19  
2 . . . . .
3 22-Dec-21 . . . .
4 . . . 10-May-21 13-Nov-22
5 15-Oct-19 15-Oct-19 . 24-Jun-15 23-Oct-17
6 25-Dec-16 14-Oct-18 24-Nov-22 26-Dec-15 23-Oct-23

 

I would like the output to be as follows:

ID Diag1 Diag2 Diag3 Diag4 Diag5 Flag
1 26-Oct-17   28-Oct-18 26-Mar-19   Diag1
2 . . . . . Disease Free
3 22-Dec-21 . . . . Diag1
4 . . . 10-May-21 13-Nov-22 Diag4
5 15-Oct-19 15-Oct-19 . 24-Jun-15 23-Oct-17 Diag5
6 25-Dec-16 14-Oct-18 24-Nov-22 26-Dec-15 23-Oct-23 Diag4

 

 

 

 



One way is above, there are others. 

 

 

 

View solution in original post

5 REPLIES 5
Reeza
Super User

 

data want;
set have;

length flag $40.;

array _diag(*) diag1-diag5;

if nmiss(of _diag(*)) ne dim(_diag) then do;
    *find smallest value;
    min_date = min(of _diag(*));

    *find first location of smallest value;
    index_min_date = whichn(min_date, of _diag(*));

    *find name of variable at index;
     flag = vname(_diag(index_min_date));
end;
else flag = 'Disease Free';

run;

@kelly2016 wrote:

Hello SAS Community,
I'm a relatively new SAS programmer and I really need your help coding this!  I have a dataset that contains 5 columns, each representing a different disease diagnosis date in date format (date9.).  Some columns contain missing values.  I need to compare the diagnosis dates across these 5 columns when the value is not missing and flag or print out the earliest diagnosis date variable. Essentially, to find which disease diagnosis occurred first and flag that. 

 

Is there an easy way to code this?  See my sample data and expected output below. Thank you so much for your help!!

 

Data Example:

ID Diag1 Diag2 Diag3 Diag4 Diag5
1 26-Oct-17   28-Oct-18 26-Mar-19  
2 . . . . .
3 22-Dec-21 . . . .
4 . . . 10-May-21 13-Nov-22
5 15-Oct-19 15-Oct-19 . 24-Jun-15 23-Oct-17
6 25-Dec-16 14-Oct-18 24-Nov-22 26-Dec-15 23-Oct-23

 

I would like the output to be as follows:

ID Diag1 Diag2 Diag3 Diag4 Diag5 Flag
1 26-Oct-17   28-Oct-18 26-Mar-19   Diag1
2 . . . . . Disease Free
3 22-Dec-21 . . . . Diag1
4 . . . 10-May-21 13-Nov-22 Diag4
5 15-Oct-19 15-Oct-19 . 24-Jun-15 23-Oct-17 Diag5
6 25-Dec-16 14-Oct-18 24-Nov-22 26-Dec-15 23-Oct-23 Diag4

 

 

 

 



One way is above, there are others. 

 

 

 

kelly2016
Fluorite | Level 6

Hi Reeza,

 

Thank you SO much for your super quick response! Your codes worked like magic and it is so concise! I have been struggling with this coding the whole afternoon and I am so grateful for your help! 

kelly2016
Fluorite | Level 6
Hi Reeza,

Thank you SO much for your super quick response! Your codes worked like magic and it is so concise! I have been struggling with this coding the whole afternoon and I am so grateful for your help!
PaigeMiller
Diamond | Level 26

Are these dates actual numeric SAS dates formatted as 26-Oct-17, or are they character strings?

--
Paige Miller
kelly2016
Fluorite | Level 6
Thank you, PaigeMiler. Yes, these SAS dates are numeric values in my dataset.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 737 views
  • 4 likes
  • 3 in conversation