I have a dataset that looks like this:
Patient_ID | Age | Obs_date | Obs_value | treatment |
1 | 19 | 03/04/2020 | | A |
1 | . | 12/09/2018 | | A |
1 | . | 01/05/2019 | | . |
2 | . | 12/22/2021 | | B |
2 | 35 | 06/12/2018 | | A |
2 | . | 04/12/2014 | | . |
2 | . | 09/17/2020 | | . |
2 | . | 08/28/2015 | | . |
2 | . | 09/09/2021 | | C |
2 | . | 03/31/2000 | | . |
3 | 16 | 02/04/2017 | | A |
3 | . | 09/12/2018 | | A |
3 | . | 10/10/2020 | | A |
4 | 44 | 05/22/2016 | | B |
5 | 28 | 05/08/2016 | | B |
5 | . | 09/05/2016 | | . |
5 | . | 01/25/2017 | | . |
5 | . | 01/09/2014 | | . |
5 | . | 12/12/2014 | | A |
5 | . | 07/18/2016 | | A |
5 | . | 09/12/2012 | | . |
5 | . | 06/07/2013 | | A |
5 | . | 01/30/2017 | | C |
I am trying to create a dataset with patients who:
- Aged≥18 years old
- Have Obs_date after from 03/01/2017 onwards
The new dataset should exclude patient_ID 3 (below 18 years old) and patient_IDs 4&5 (all Obs_dates are prior to 03/01/2017)
I tried the following:
data want;
set have;
if age =>18;
if '1mar2017'd <= obs_date;
run;
The output successfully deleted patient_IDs 3, 4, and 5; however, for patient_IDs 1&2, only one observation is kept for each. I’d like to keep all observations for patient_IDs 1&2 (screenshot below)

I'm looking for an output like the following:
Patient_ID | Age | Obs_date | Obs_value | treatment |
1 | 19 | 03/04/2020 | | A |
1 | . | 12/09/2018 | | A |
1 | . | 01/05/2019 | | . |
2 | . | 12/22/2021 | | B |
2 | 35 | 06/12/2018 | | A |
2 | . | 04/12/2014 | | . |
2 | . | 09/17/2020 | | . |
2 | . | 08/28/2015 | | . |
2 | . | 09/09/2021 | | C |
2 | . | 03/31/2000 | | . |