Hello SAS Community,
I'm a relatively new SAS programmer and I really need your help coding this! I have a dataset that contains 5 columns, each representing a different disease diagnosis date in date format (date9.). Some columns contain missing values. I need to compare the diagnosis dates across these 5 columns when the value is not missing and flag or print out the earliest diagnosis date variable. Essentially, to find which disease diagnosis occurred first and flag that.
Is there an easy way to code this? See my sample data and expected output below. Thank you so much for your help!!
Data Example:
ID | Diag1 | Diag2 | Diag3 | Diag4 | Diag5 |
1 | 26-Oct-17 | 28-Oct-18 | 26-Mar-19 | ||
2 | . | . | . | . | . |
3 | 22-Dec-21 | . | . | . | . |
4 | . | . | . | 10-May-21 | 13-Nov-22 |
5 | 15-Oct-19 | 15-Oct-19 | . | 24-Jun-15 | 23-Oct-17 |
6 | 25-Dec-16 | 14-Oct-18 | 24-Nov-22 | 26-Dec-15 | 23-Oct-23 |
I would like the output to be as follows:
ID | Diag1 | Diag2 | Diag3 | Diag4 | Diag5 | Flag |
1 | 26-Oct-17 | 28-Oct-18 | 26-Mar-19 | Diag1 | ||
2 | . | . | . | . | . | Disease Free |
3 | 22-Dec-21 | . | . | . | . | Diag1 |
4 | . | . | . | 10-May-21 | 13-Nov-22 | Diag4 |
5 | 15-Oct-19 | 15-Oct-19 | . | 24-Jun-15 | 23-Oct-17 | Diag5 |
6 | 25-Dec-16 | 14-Oct-18 | 24-Nov-22 | 26-Dec-15 | 23-Oct-23 | Diag4 |
data want;
set have;
length flag $40.;
array _diag(*) diag1-diag5;
if nmiss(of _diag(*)) ne dim(_diag) then do;
*find smallest value;
min_date = min(of _diag(*));
*find first location of smallest value;
index_min_date = whichn(min_date, of _diag(*));
*find name of variable at index;
flag = vname(_diag(index_min_date));
end;
else flag = 'Disease Free';
run;
@kelly2016 wrote:
Hello SAS Community,
I'm a relatively new SAS programmer and I really need your help coding this! I have a dataset that contains 5 columns, each representing a different disease diagnosis date in date format (date9.). Some columns contain missing values. I need to compare the diagnosis dates across these 5 columns when the value is not missing and flag or print out the earliest diagnosis date variable. Essentially, to find which disease diagnosis occurred first and flag that.
Is there an easy way to code this? See my sample data and expected output below. Thank you so much for your help!!
Data Example:
ID Diag1 Diag2 Diag3 Diag4 Diag5 1 26-Oct-17 28-Oct-18 26-Mar-19 2 . . . . . 3 22-Dec-21 . . . . 4 . . . 10-May-21 13-Nov-22 5 15-Oct-19 15-Oct-19 . 24-Jun-15 23-Oct-17 6 25-Dec-16 14-Oct-18 24-Nov-22 26-Dec-15 23-Oct-23
I would like the output to be as follows:
ID Diag1 Diag2 Diag3 Diag4 Diag5 Flag 1 26-Oct-17 28-Oct-18 26-Mar-19 Diag1 2 . . . . . Disease Free 3 22-Dec-21 . . . . Diag1 4 . . . 10-May-21 13-Nov-22 Diag4 5 15-Oct-19 15-Oct-19 . 24-Jun-15 23-Oct-17 Diag5 6 25-Dec-16 14-Oct-18 24-Nov-22 26-Dec-15 23-Oct-23 Diag4
One way is above, there are others.
data want;
set have;
length flag $40.;
array _diag(*) diag1-diag5;
if nmiss(of _diag(*)) ne dim(_diag) then do;
*find smallest value;
min_date = min(of _diag(*));
*find first location of smallest value;
index_min_date = whichn(min_date, of _diag(*));
*find name of variable at index;
flag = vname(_diag(index_min_date));
end;
else flag = 'Disease Free';
run;
@kelly2016 wrote:
Hello SAS Community,
I'm a relatively new SAS programmer and I really need your help coding this! I have a dataset that contains 5 columns, each representing a different disease diagnosis date in date format (date9.). Some columns contain missing values. I need to compare the diagnosis dates across these 5 columns when the value is not missing and flag or print out the earliest diagnosis date variable. Essentially, to find which disease diagnosis occurred first and flag that.
Is there an easy way to code this? See my sample data and expected output below. Thank you so much for your help!!
Data Example:
ID Diag1 Diag2 Diag3 Diag4 Diag5 1 26-Oct-17 28-Oct-18 26-Mar-19 2 . . . . . 3 22-Dec-21 . . . . 4 . . . 10-May-21 13-Nov-22 5 15-Oct-19 15-Oct-19 . 24-Jun-15 23-Oct-17 6 25-Dec-16 14-Oct-18 24-Nov-22 26-Dec-15 23-Oct-23
I would like the output to be as follows:
ID Diag1 Diag2 Diag3 Diag4 Diag5 Flag 1 26-Oct-17 28-Oct-18 26-Mar-19 Diag1 2 . . . . . Disease Free 3 22-Dec-21 . . . . Diag1 4 . . . 10-May-21 13-Nov-22 Diag4 5 15-Oct-19 15-Oct-19 . 24-Jun-15 23-Oct-17 Diag5 6 25-Dec-16 14-Oct-18 24-Nov-22 26-Dec-15 23-Oct-23 Diag4
One way is above, there are others.
Hi Reeza,
Thank you SO much for your super quick response! Your codes worked like magic and it is so concise! I have been struggling with this coding the whole afternoon and I am so grateful for your help!
Are these dates actual numeric SAS dates formatted as 26-Oct-17, or are they character strings?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.