Hi all,
I have the below data and I would like to obtain subjects that have same AEDECOD and AEOUT="Not-recovered/Not-resolved" but end date (AEENDTC) is present.
data have;
input subject @5 aedecod $ 5-42 @43 aestdtn yymmdd10. @54 aeout $ 54-81 @80 aeendtn yymmdd10.;
format aestdtc aeendtc yymmdd10.;
datalines;
001 Fatigue 2019-06-03 Not-recovered/Not-resolved .
001 Nausea 2019-06-03 Not-recovered/Not-resolved .
001 Vomiting 2019-06-03 Not-recovered/Not-resolved .
003 Asthenia 2019-08-28 Recovered/Resolved 2019-08-28
003 Hypothension 2019-06-25 Recovered/Resolved 2019-08-08
003 Thrombocytopenia 2019-07-11 Recovered/Resolved 2019-07-23
003 Thrombocytopenia 2019-07-18 Recovered/Resolved 2019-08-04
003 Thrombocytopenia 2019-08-12 Not-recovered/Not-resolved 2019-08-12
004 Anaemia 2019-07-10 Not-recovered/Not-resolved .
004 Blood creatinine increased 2019-08-19 Recovered/Resolved 2019-09-04
004 Blood creatinine increased 2019-09-02 Recovered/Resolved 2019-11-13
004 Blood creatinine increased 2019-11-11 Recovered/Resolved 2019-12-08
004 Epistaxis 2019-01-21 Not-recovered/Not-resolved .
004 Epistaxis 2019-11-25 Recovered/Resolved 2019-11-28
004 Thrombocytopenia 2019-12-02 Not-recovered/Not-resolved .
004 Upper respiratory tract infection 2019-10-10 Recovered/Resolved 2019-10-13
005 Alanine aminotransfrerase increased 2019-08-14 Not-recovered/Not-resolved 2020-08-20
005 Alanine aminotransfrerase increased 2019-08-21 Recovered/Resolved 2020-09-06
005 Alanine aminotransfrerase increased 2019-09-07 Not-recovered/Not-resolved .
005 Aspartate aminotransfrerase increased 2019-08-14 Not-recovered/Not-resolved 2020-08-20
005 Aspartate aminotransfrerase increased 2019-08-21 Recovered/Resolved 2020-09-06
005 Aspartate aminotransfrerase increased 2019-09-07 Not-recovered/Not-resolved .
005 Blood creatinine increased 2019-05-29 Not-recovered/Not-resolved 2020-07-20
005 Cellulitis 2020-07-01 Not-recovered/Not-resolved 2020-08-19
005 Cellulitis 2020-08-20 Recovering/Resolving 2020-09-25
005 Cellulitis 2020-08-26 Not-recovered/Not-resolved 2020-10-01
005 Chills 2019-12-26 Recovered/Resolved 2019-12-27
005 Chloroma 2020-04-30 Not-recovered/Not-resolved 2020-12-30
005 Cholelithiasis 2020-08-18 Recovered/Resolved 2020-08-20
005 Epistaxis 2019-12-17 Recovering/Resolving 2019-12-26
005 Pyrexia 2020-08-16 Recovered/Resolved 2020-08-20
005 Pyrexia 2020-08-20 Not-recovered/Not-resolved .
005 Subcutaneous haematoma 2020-01-24 Not-recovered/Not-resolved .
005 Urinary Tract infection 2020-08-20 Recovering/Resolving 2020-08-25
005 Urinary Tract infection 2020-08-26 Not-recovered/Not-resolved .
;
run;
I was trying the below code:
proc sort data=have;
by subject aedecod aestdtn;
run;
data aeout1;
set have;
by subject aedecod aestdtn;
if first.subject and first.decod then ord=1;
else ord+1;
run;
proc sort data=aeout1 out=aeout2;
by subject aedecod aestdtn;
where ord ge 2;
run;
data aeout3;
set aeout2;
by subject aedecod aestdtn;
if last.subject;
run;
data aeout4;
merge aeout1(in=ot1) aeout(in=ot3);
by usubjid;
if ot1 and not ot3;
length note $100;
if aeout="Not-recovered/Not-resolved" and aeendtn ne . then note=strip(subject)||'AEENDTC is populated with AEOUT still being "Not-recovered/Not-resolved with same DECODE"';
else note='';
proc print;
run;
I am not getting what I desired for with the above code.
I want like below:
As you see above, all the different decodes for same subject are filtered with aeout not resovered/not resolved and aeendtc not missing
The following code gives the result you wanted. Please note the correction I made to your datastep code.
Your phrase "subjects that have same AEDECOD" is not comprehensible to me.
The result however is the same as you wanted as given below
data have;
input subject @5 aedecod $ 5-42 @43 aestdtn yymmdd10. @54 aeout $ 54-80 @81 aeendtn yymmdd10.;
format aestdtn aeendtn yymmdd10.;
datalines;
001 Fatigue 2019-06-03 Not-recovered/Not-resolved .
001 Nausea 2019-06-03 Not-recovered/Not-resolved .
001 Vomiting 2019-06-03 Not-recovered/Not-resolved .
003 Asthenia 2019-08-28 Recovered/Resolved 2019-08-28
003 Hypothension 2019-06-25 Recovered/Resolved 2019-08-08
003 Thrombocytopenia 2019-07-11 Recovered/Resolved 2019-07-23
003 Thrombocytopenia 2019-07-18 Recovered/Resolved 2019-08-04
003 Thrombocytopenia 2019-08-12 Not-recovered/Not-resolved 2019-08-12
004 Anaemia 2019-07-10 Not-recovered/Not-resolved .
004 Blood creatinine increased 2019-08-19 Recovered/Resolved 2019-09-04
004 Blood creatinine increased 2019-09-02 Recovered/Resolved 2019-11-13
004 Blood creatinine increased 2019-11-11 Recovered/Resolved 2019-12-08
004 Epistaxis 2019-01-21 Not-recovered/Not-resolved .
004 Epistaxis 2019-11-25 Recovered/Resolved 2019-11-28
004 Thrombocytopenia 2019-12-02 Not-recovered/Not-resolved .
004 Upper respiratory tract infection 2019-10-10 Recovered/Resolved 2019-10-13
005 Alanine aminotransfrerase increased 2019-08-14 Not-recovered/Not-resolved 2020-08-20
005 Alanine aminotransfrerase increased 2019-08-21 Recovered/Resolved 2020-09-06
005 Alanine aminotransfrerase increased 2019-09-07 Not-recovered/Not-resolved .
005 Aspartate aminotransfrerase increased 2019-08-14 Not-recovered/Not-resolved 2020-08-20
005 Aspartate aminotransfrerase increased 2019-08-21 Recovered/Resolved 2020-09-06
005 Aspartate aminotransfrerase increased 2019-09-07 Not-recovered/Not-resolved .
005 Blood creatinine increased 2019-05-29 Not-recovered/Not-resolved 2020-07-20
005 Cellulitis 2020-07-01 Not-recovered/Not-resolved 2020-08-19
005 Cellulitis 2020-08-20 Recovering/Resolving 2020-09-25
005 Cellulitis 2020-08-26 Not-recovered/Not-resolved 2020-10-01
005 Chills 2019-12-26 Recovered/Resolved 2019-12-27
005 Chloroma 2020-04-30 Not-recovered/Not-resolved 2020-12-30
005 Cholelithiasis 2020-08-18 Recovered/Resolved 2020-08-20
005 Epistaxis 2019-12-17 Recovering/Resolving 2019-12-26
005 Pyrexia 2020-08-16 Recovered/Resolved 2020-08-20
005 Pyrexia 2020-08-20 Not-recovered/Not-resolved .
005 Subcutaneous haematoma 2020-01-24 Not-recovered/Not-resolved .
005 Urinary Tract infection 2020-08-20 Recovering/Resolving 2020-08-25
005 Urinary Tract infection 2020-08-26 Not-recovered/Not-resolved .
;
run;
proc sql;
select * from have
where not missing(aeendtn)
and trim(AEOUT)="Not-recovered/Not-resolved";
quit;
The result is
Why does your sample expected output include a subject=5 and aedecod="Blood creatinene increased"?
That output record is the only input record for this subject/aedecod combination. Yet your stated criterion includes "subjects that have same AEDECOD". I take this to mean records in which a given subject/aedecod combination appears more than once (and then keep those records which also have a non-missing date and AEOUT="Not-recovered/Not-resolved"). If your sample output is not a mistake, could you make your objective more precise? I'm missing something.
And I have the same question about why your output has a subject=5 and aedecod="Chloroma". Again, "Chloroma" appears only once for this subject.
Now if those records should not be in the output, then this program does what I understand you want.
data have;
x=_n_;
input subject @5 aedecod $ 5-42 @43 aestdtn yymmdd10. @54 aeout $ 54-80 @81 aeendtn yymmdd10.;
*format aestdtc aeendtc yymmdd10.;
format aestdtn aeendtn yymmdd10.;
datalines;
001 Fatigue 2019-06-03 Not-recovered/Not-resolved .
001 Nausea 2019-06-03 Not-recovered/Not-resolved .
001 Vomiting 2019-06-03 Not-recovered/Not-resolved .
003 Asthenia 2019-08-28 Recovered/Resolved 2019-08-28
003 Hypothension 2019-06-25 Recovered/Resolved 2019-08-08
003 Thrombocytopenia 2019-07-11 Recovered/Resolved 2019-07-23
003 Thrombocytopenia 2019-07-18 Recovered/Resolved 2019-08-04
003 Thrombocytopenia 2019-08-12 Not-recovered/Not-resolved 2019-08-12
004 Anaemia 2019-07-10 Not-recovered/Not-resolved .
004 Blood creatinine increased 2019-08-19 Recovered/Resolved 2019-09-04
004 Blood creatinine increased 2019-09-02 Recovered/Resolved 2019-11-13
004 Blood creatinine increased 2019-11-11 Recovered/Resolved 2019-12-08
004 Epistaxis 2019-01-21 Not-recovered/Not-resolved .
004 Epistaxis 2019-11-25 Recovered/Resolved 2019-11-28
004 Thrombocytopenia 2019-12-02 Not-recovered/Not-resolved .
004 Upper respiratory tract infection 2019-10-10 Recovered/Resolved 2019-10-13
005 Alanine aminotransfrerase increased 2019-08-14 Not-recovered/Not-resolved 2020-08-20
005 Alanine aminotransfrerase increased 2019-08-21 Recovered/Resolved 2020-09-06
005 Alanine aminotransfrerase increased 2019-09-07 Not-recovered/Not-resolved .
005 Aspartate aminotransfrerase increased 2019-08-14 Not-recovered/Not-resolved 2020-08-20
005 Aspartate aminotransfrerase increased 2019-08-21 Recovered/Resolved 2020-09-06
005 Aspartate aminotransfrerase increased 2019-09-07 Not-recovered/Not-resolved .
005 Blood creatinine increased 2019-05-29 Not-recovered/Not-resolved 2020-07-20
005 Cellulitis 2020-07-01 Not-recovered/Not-resolved 2020-08-19
005 Cellulitis 2020-08-20 Recovering/Resolving 2020-09-25
005 Cellulitis 2020-08-26 Not-recovered/Not-resolved 2020-10-01
005 Chills 2019-12-26 Recovered/Resolved 2019-12-27
005 Chloroma 2020-04-30 Not-recovered/Not-resolved 2020-12-30
005 Cholelithiasis 2020-08-18 Recovered/Resolved 2020-08-20
005 Epistaxis 2019-12-17 Recovering/Resolving 2019-12-26
005 Pyrexia 2020-08-16 Recovered/Resolved 2020-08-20
005 Pyrexia 2020-08-20 Not-recovered/Not-resolved .
005 Subcutaneous haematoma 2020-01-24 Not-recovered/Not-resolved .
005 Urinary Tract infection 2020-08-20 Recovering/Resolving 2020-08-25
005 Urinary Tract infection 2020-08-26 Not-recovered/Not-resolved .
run;
data want;
set have (where=(aeout^='Not-recovered/Not-resolved'))
have (where=(aeout='Not-recovered/Not-resolved' and aeendtn=.))
have (where=(aeout='Not-recovered/Not-resolved' and aeendtn^=.));
by subject aedecod;
if aeout='Not-recovered/Not-resolved' and aeendtn^=.; /*all qualifying records ...*/
if not (first.aedecod=1 and last.aedecod=1) ; /* ... that are not singletons */
run;
Fortunately, your data are already sorted by subject/aedecod. So for each subject/aedecod the program above first reads all instances that do not have aedecod="Not-recovered/Not-resolved", followed by any dateless instances that do have aedecod="Not-recovered/Not-resolved, finally followed by qualifying records (non-missing date and aedecod="Not-recovered/Not-resolved"). So as long as those qualifying records are not the only ones for that subject/aedecod, the record is output.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.