I have a dataset that looks like this:
Patient_ID | Age | Obs_date | Obs_value | treatment |
1 | 19 | 03/04/2020 |
| A |
1 | . | 12/09/2018 |
| A |
1 | . | 01/05/2019 |
| . |
2 | . | 12/22/2021 |
| B |
2 | 35 | 06/12/2018 |
| A |
2 | . | 04/12/2014 |
| . |
2 | . | 09/17/2020 |
| . |
2 | . | 08/28/2015 |
| . |
2 | . | 09/09/2021 |
| C |
2 | . | 03/31/2000 |
| . |
3 | 16 | 02/04/2017 |
| A |
3 | . | 09/12/2018 |
| A |
3 | . | 10/10/2020 |
| A |
4 | 44 | 05/22/2016 |
| B |
5 | 28 | 05/08/2016 |
| B |
5 | . | 09/05/2016 |
| . |
5 | . | 01/25/2017 |
| . |
5 | . | 01/09/2014 |
| . |
5 | . | 12/12/2014 |
| A |
5 | . | 07/18/2016 |
| A |
5 | . | 09/12/2012 |
| . |
5 | . | 06/07/2013 |
| A |
5 | . | 01/30/2017 |
| C |
I am trying to create a dataset with patients who:
The new dataset should exclude patient_ID 3 (below 18 years old) and patient_IDs 4&5 (all Obs_dates are prior to 03/01/2017)
I tried the following:
data want;
set have;
if age =>18;
if '1mar2017'd <= obs_date;
run;
The output successfully deleted patient_IDs 3, 4, and 5; however, for patient_IDs 1&2, only one observation is kept for each. I’d like to keep all observations for patient_IDs 1&2 (screenshot below)
I'm looking for an output like the following:
Patient_ID | Age | Obs_date | Obs_value | treatment |
1 | 19 | 03/04/2020 |
| A |
1 | . | 12/09/2018 |
| A |
1 | . | 01/05/2019 |
| . |
2 | . | 12/22/2021 |
| B |
2 | 35 | 06/12/2018 |
| A |
2 | . | 04/12/2014 |
| . |
2 | . | 09/17/2020 |
| . |
2 | . | 08/28/2015 |
| . |
2 | . | 09/09/2021 |
| C |
2 | . | 03/31/2000 |
| . |
@tbe7 wrote:
I have a dataset that looks like this:
Patient_ID
Age
Obs_date
Obs_value
treatment
1
19
03/04/2020
A
1
.
12/09/2018
A
1
.
01/05/2019
.
2
.
12/22/2021
B
2
35
06/12/2018
A
2
.
04/12/2014
.
2
.
09/17/2020
.
2
.
08/28/2015
.
2
.
09/09/2021
C
2
.
03/31/2000
.
3
16
02/04/2017
A
3
.
09/12/2018
A
3
.
10/10/2020
A
4
44
05/22/2016
B
5
28
05/08/2016
B
5
.
09/05/2016
.
5
.
01/25/2017
.
5
.
01/09/2014
.
5
.
12/12/2014
A
5
.
07/18/2016
A
5
.
09/12/2012
.
5
.
06/07/2013
A
5
.
01/30/2017
C
I am trying to create a dataset with patients who:
- Aged≥18 years old
- Have Obs_date after from 03/01/2017 onwards
The new dataset should exclude patient_ID 3 (below 18 years old) and patient_IDs 4&5 (all Obs_dates are prior to 03/01/2017)
I tried the following:
data want;
set have;
if age =>18;
if '1mar2017'd <= obs_date;
run;
The output successfully deleted patient_IDs 3, 4, and 5; however, for patient_IDs 1&2, only one observation is kept for each. I’d like to keep all observations for patient_IDs 1&2 (screenshot below)
I'm looking for an output like the following:
Patient_ID
Age
Obs_date
Obs_value
treatment
1
19
03/04/2020
A
1
.
12/09/2018
A
1
.
01/05/2019
.
2
.
12/22/2021
B
2
35
06/12/2018
A
2
.
04/12/2014
.
2
.
09/17/2020
.
2
.
08/28/2015
.
2
.
09/09/2021
C
2
.
03/31/2000
.
Maybe if made sure AGE was on all of the records...
Though you have dates far enough apart that if the "age" shown is as of that date the subject would be 19 or older for later observations.
delete records if none of the date observations occurred after a specific date
the value 35 for age should go for the 4th observation i.e., when patient_ID first turn to '2'. Sorry for the confusion!
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.