Hi everyone. I have a data with multiple patients, each have one ore more rows, representing one or multiple pregnancies, even if a person has only one pregnancy, she can have more than one row (if she saw the doctor more than once during one pregnancy). I try to create a flag variable showing 'which pregnancy episode is this line of record indicates.'
To simplify the question, my input data looks like:
Note: although it looks in below simplified data that there are duplicate entries, but in my real data, the 'duplicate' row are different because I eliminate other variables/columns
data have;
input person_id condition_end_date;
datalines;
1 01/01/2014
1 01/10/2014
1 03/08/2015
1 03/08/2015
2 02/01/2014
2 09/08/2014
2 09/08/2014
2 06/07/2015
2 06/07/5015
;
run;
My output data, I want it to look like:
Person_id | Condition_end_date | Condition_id |
1 | 01/01/2014 | 1 |
1 | 01/10/2014 | 2 |
1 | 03/08/2015 | 3 |
1 | 03/08/2015 | 3 |
2 | 02/01/2014 | 1 |
2 | 09/08/2014 | 2 |
2 | 09/08/2014 | 2 |
2 | 06/07/2015 | 3 |
2 | 06/07/2015 | 3 |
like this? Btw, I have corrected an error in your 'Have' data set as I'm guessing 5015 should be 2015 🙂
data have;
input person_id condition_end_date:mmddyy10.;
format condition_end_date mmddyy10.;
datalines;
1 01/01/2014
1 01/10/2014
1 03/08/2015
1 03/08/2015
2 02/01/2014
2 09/08/2014
2 09/08/2014
2 06/07/2015
2 06/07/2015
;
run;
proc sort data=have;
by person_id condition_end_date;
run;
data want;
set have;
by person_id condition_end_date;
if first.person_id then condition_id=1;
else if first.condition_end_date then condition_id+1;
retain condition_id;
run;
I've tried something like below, but didn't work, I know the first. and last. part must have been wrong, but haven't figured out yet.
data test;
set have;
by person_id condition_end_date;
if first.condition_end_date then
condition_id=1;
else condition_id= condition_id+ 1;
if last.condition_end_date then
output;
run;
Any suggestions is welcomed!
Something like:
data test;
set have;
by person_id condition_end_date;
retain condition_id;
if first.condition_end_date then condition_id=1;
else condition_id=condition_id+1;
run;
like this? Btw, I have corrected an error in your 'Have' data set as I'm guessing 5015 should be 2015 🙂
data have;
input person_id condition_end_date:mmddyy10.;
format condition_end_date mmddyy10.;
datalines;
1 01/01/2014
1 01/10/2014
1 03/08/2015
1 03/08/2015
2 02/01/2014
2 09/08/2014
2 09/08/2014
2 06/07/2015
2 06/07/2015
;
run;
proc sort data=have;
by person_id condition_end_date;
run;
data want;
set have;
by person_id condition_end_date;
if first.person_id then condition_id=1;
else if first.condition_end_date then condition_id+1;
retain condition_id;
run;
Great, thank you so much @PeterClemmensen
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.