Hi everyone. I have a data with multiple patients, each have one ore more rows, representing one or multiple pregnancies, even if a person has only one pregnancy, she can have more than one row (if she saw the doctor more than once during one pregnancy). I try to create a flag variable showing 'which pregnancy episode is this line of record indicates.'
To simplify the question, my input data looks like:
Note: although it looks in below simplified data that there are duplicate entries, but in my real data, the 'duplicate' row are different because I eliminate other variables/columns
data have;
input person_id condition_end_date;
datalines;
1 01/01/2014
1 01/10/2014
1 03/08/2015
1 03/08/2015
2 02/01/2014
2 09/08/2014
2 09/08/2014
2 06/07/2015
2 06/07/5015
;
run;
My output data, I want it to look like:
Person_id | Condition_end_date | Condition_id |
1 | 01/01/2014 | 1 |
1 | 01/10/2014 | 2 |
1 | 03/08/2015 | 3 |
1 | 03/08/2015 | 3 |
2 | 02/01/2014 | 1 |
2 | 09/08/2014 | 2 |
2 | 09/08/2014 | 2 |
2 | 06/07/2015 | 3 |
2 | 06/07/2015 | 3 |
like this? Btw, I have corrected an error in your 'Have' data set as I'm guessing 5015 should be 2015 🙂
data have;
input person_id condition_end_date:mmddyy10.;
format condition_end_date mmddyy10.;
datalines;
1 01/01/2014
1 01/10/2014
1 03/08/2015
1 03/08/2015
2 02/01/2014
2 09/08/2014
2 09/08/2014
2 06/07/2015
2 06/07/2015
;
run;
proc sort data=have;
by person_id condition_end_date;
run;
data want;
set have;
by person_id condition_end_date;
if first.person_id then condition_id=1;
else if first.condition_end_date then condition_id+1;
retain condition_id;
run;
I've tried something like below, but didn't work, I know the first. and last. part must have been wrong, but haven't figured out yet.
data test;
set have;
by person_id condition_end_date;
if first.condition_end_date then
condition_id=1;
else condition_id= condition_id+ 1;
if last.condition_end_date then
output;
run;
Any suggestions is welcomed!
Something like:
data test;
set have;
by person_id condition_end_date;
retain condition_id;
if first.condition_end_date then condition_id=1;
else condition_id=condition_id+1;
run;
like this? Btw, I have corrected an error in your 'Have' data set as I'm guessing 5015 should be 2015 🙂
data have;
input person_id condition_end_date:mmddyy10.;
format condition_end_date mmddyy10.;
datalines;
1 01/01/2014
1 01/10/2014
1 03/08/2015
1 03/08/2015
2 02/01/2014
2 09/08/2014
2 09/08/2014
2 06/07/2015
2 06/07/2015
;
run;
proc sort data=have;
by person_id condition_end_date;
run;
data want;
set have;
by person_id condition_end_date;
if first.person_id then condition_id=1;
else if first.condition_end_date then condition_id+1;
retain condition_id;
run;
Great, thank you so much @PeterClemmensen
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.