- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi friend, I have a data set that some of the participants are missing some of the visit data. I want to make up those missing for later imputation. I can only come up with the method with generating a missing data set for those participants and then stack with the original data set. Could you show me a faster way?
Thanks.
data I have
Obs | randomization | visit | tac | base_tac |
1 | 618001 | 2 | 8824 | 8824 |
2 | 618001 | 3 | 9088 | 8824 |
3 | 618001 | 4 | 9784 | 8824 |
4 | 618002 | 2 | 7962 | 7962 |
5 | 618002 | 3 | 8318 | 7962 |
6 | 618002 | 4 | 8819 | 7962 |
7 | 618003 | 2 | 10630 | 10630 |
data I want
Obs | randomization | visit | tac | base_tac |
1 | 618001 | 2 | 8824 | 8824 |
2 | 618001 | 3 | 9088 | 8824 |
3 | 618001 | 4 | 9784 | 8824 |
4 | 618002 | 2 | 7962 | 7962 |
5 | 618002 | 3 | 8318 | 7962 |
6 | 618002 | 4 | 8819 | 7962 |
7 | 618003 | 2 | 10630 | 10630 |
7 | 618003 | 3 | 10630 | |
7 | 618003 | 4 | 10630 |
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Could you explain further ... why does 618003 get two extra rows, and the other IDs do not get additional rows?
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@PaigeMiller. Thank you for the interest. Each ID should have 3 visits. The visit3,4 is missing for ID 618003. Therefore, if I want to impute the missing values for ID 618003, I think I have to generate the missing cells for ID 618003. The rest of the ID have visit2,3,4, without missing. That is why only ID 618003 requires the additional two rows. Is it clear? Please let me know.
Thank you again.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Is this what you are looking for :
data have;
input Obs randomization visit tac base_tac ;
cards;
1 618001 2 22 22
2 618001 3 33 22
3 618001 4 44 33
4 618002 2 44 33
5 618002 3 33 33
6 618002 4 33 33
7 618003 2 33 55
;
run;
Proc SQl;
Create table temp as
select distinct randomization
from have;
Quit;
data visit;
do visit=2 to 4;
output;
end;
run;
data temp2;
set temp;
do i=1 to nobs_;
set visit point=i nobs=nobs_; output;
end;
run;
proc sort data=have;
by randomization visit;
run;
data want;
merge have temp2;
by randomization visit;
retain base_tac_;
if first.randomization then base_tac_=base_tac;
if missing(base_tac) then base_tac=base_tac_;
obs=_n_;
drop base_tac_;
run;