Why does your sample expected output include a subject=5 and aedecod="Blood creatinene increased"?
That output record is the only input record for this subject/aedecod combination. Yet your stated criterion includes "subjects that have same AEDECOD". I take this to mean records in which a given subject/aedecod combination appears more than once (and then keep those records which also have a non-missing date and AEOUT="Not-recovered/Not-resolved"). If your sample output is not a mistake, could you make your objective more precise? I'm missing something.
And I have the same question about why your output has a subject=5 and aedecod="Chloroma". Again, "Chloroma" appears only once for this subject.
Now if those records should not be in the output, then this program does what I understand you want.
data have;
x=_n_;
input subject @5 aedecod $ 5-42 @43 aestdtn yymmdd10. @54 aeout $ 54-80 @81 aeendtn yymmdd10.;
*format aestdtc aeendtc yymmdd10.;
format aestdtn aeendtn yymmdd10.;
datalines;
001 Fatigue 2019-06-03 Not-recovered/Not-resolved .
001 Nausea 2019-06-03 Not-recovered/Not-resolved .
001 Vomiting 2019-06-03 Not-recovered/Not-resolved .
003 Asthenia 2019-08-28 Recovered/Resolved 2019-08-28
003 Hypothension 2019-06-25 Recovered/Resolved 2019-08-08
003 Thrombocytopenia 2019-07-11 Recovered/Resolved 2019-07-23
003 Thrombocytopenia 2019-07-18 Recovered/Resolved 2019-08-04
003 Thrombocytopenia 2019-08-12 Not-recovered/Not-resolved 2019-08-12
004 Anaemia 2019-07-10 Not-recovered/Not-resolved .
004 Blood creatinine increased 2019-08-19 Recovered/Resolved 2019-09-04
004 Blood creatinine increased 2019-09-02 Recovered/Resolved 2019-11-13
004 Blood creatinine increased 2019-11-11 Recovered/Resolved 2019-12-08
004 Epistaxis 2019-01-21 Not-recovered/Not-resolved .
004 Epistaxis 2019-11-25 Recovered/Resolved 2019-11-28
004 Thrombocytopenia 2019-12-02 Not-recovered/Not-resolved .
004 Upper respiratory tract infection 2019-10-10 Recovered/Resolved 2019-10-13
005 Alanine aminotransfrerase increased 2019-08-14 Not-recovered/Not-resolved 2020-08-20
005 Alanine aminotransfrerase increased 2019-08-21 Recovered/Resolved 2020-09-06
005 Alanine aminotransfrerase increased 2019-09-07 Not-recovered/Not-resolved .
005 Aspartate aminotransfrerase increased 2019-08-14 Not-recovered/Not-resolved 2020-08-20
005 Aspartate aminotransfrerase increased 2019-08-21 Recovered/Resolved 2020-09-06
005 Aspartate aminotransfrerase increased 2019-09-07 Not-recovered/Not-resolved .
005 Blood creatinine increased 2019-05-29 Not-recovered/Not-resolved 2020-07-20
005 Cellulitis 2020-07-01 Not-recovered/Not-resolved 2020-08-19
005 Cellulitis 2020-08-20 Recovering/Resolving 2020-09-25
005 Cellulitis 2020-08-26 Not-recovered/Not-resolved 2020-10-01
005 Chills 2019-12-26 Recovered/Resolved 2019-12-27
005 Chloroma 2020-04-30 Not-recovered/Not-resolved 2020-12-30
005 Cholelithiasis 2020-08-18 Recovered/Resolved 2020-08-20
005 Epistaxis 2019-12-17 Recovering/Resolving 2019-12-26
005 Pyrexia 2020-08-16 Recovered/Resolved 2020-08-20
005 Pyrexia 2020-08-20 Not-recovered/Not-resolved .
005 Subcutaneous haematoma 2020-01-24 Not-recovered/Not-resolved .
005 Urinary Tract infection 2020-08-20 Recovering/Resolving 2020-08-25
005 Urinary Tract infection 2020-08-26 Not-recovered/Not-resolved .
run;
data want;
set have (where=(aeout^='Not-recovered/Not-resolved'))
have (where=(aeout='Not-recovered/Not-resolved' and aeendtn=.))
have (where=(aeout='Not-recovered/Not-resolved' and aeendtn^=.));
by subject aedecod;
if aeout='Not-recovered/Not-resolved' and aeendtn^=.; /*all qualifying records ...*/
if not (first.aedecod=1 and last.aedecod=1) ; /* ... that are not singletons */
run;
Fortunately, your data are already sorted by subject/aedecod. So for each subject/aedecod the program above first reads all instances that do not have aedecod="Not-recovered/Not-resolved", followed by any dateless instances that do have aedecod="Not-recovered/Not-resolved, finally followed by qualifying records (non-missing date and aedecod="Not-recovered/Not-resolved"). So as long as those qualifying records are not the only ones for that subject/aedecod, the record is output.
... View more