Hi everyone, I have been struggling with some code. I have clinical data where each row represents one admission. I would like to restructure the data so that each row represents one patients. However, there is a number of tasks I would like to conduct during the process: 1. Patients specific information is mostly, but now always contained in the first admission. For example gender is always on the first admission, but the date of death (deathdate) can be in the row of any admission. That's why I would like to retain the value of deathdate when not missing ! 2. I would like to keep some specific data from the first admission always such as tumor stage. 3. I would like to conduct some operations between values from different admissions for example: calculate the difference in duration between the two admission dates. How can I do this most efficiently with the lowest number of new variables created, smallest number of DATA, DROP and RENAME statements? Please find examples of the data I have and what I want further. I used to use DO loop like this data want;
do until (Last.ID);
set have;
by ID;
select (admission);
when ('1') do; GenderNew = Gender; StageNew = stage; deathdate1 = deathdate; admission_date1 = admission_date; end;
when ('2') do; deathdate2 = deathdate; admission_date2 = admission_date; end;
otherwise;
end;
end;
drop admission gender stage deathdate admission_date;
run;
data want; set want;
format deathdate ddmmyy10.;
rename GenderNew = Gender StageNew = Stage;
Duration = admission_date2 - admission_date1;
deathdate = max(deathdate2, deathdate1);
drop admission_date1 admission_date2 deathdate1 deathdate2;
run; However, my method is annoying. Specially, that I need to create many new variables from the first observation instead of retaining them somehow. I have about 100 variables that I need to keep and it does not makes sense to make 100 new variables. Is there a more efficient way? Thanks in forward. Data example: data have; input id admission gender $ stage admission_date deathdate; format deathdate ddmmyy10.; cards; 1 1 m 2 5000 . 1 2 . . 5100 6500 2 1 f 1 5600 6600 2 2 . . 5900 . 3 1 f 4 5627 . 3 2 . 3 5830 7000 3 3 . 1 6000 . ; run; data want;
input id gender stage Duration deathdate;
format deathdate ddmmyy10.;
cards;
1 m 2 100 6500
2 f 1 200 6600
3 f 4 373 7000
;
run;
... View more