Hi,
I have a dataset like this:
ID Minutes Systolic Diastolic
3 6 123 69
3 10 110 62
3 15 111 66
.
.
.
3 185 137 113
4 8 118 79
4 11 73 46
4 16 85 50
.
.
.
4 149 141 79
I am trying to take an average each day, and as you can see, there is a problem since measures are missing for certain days for certain participants. I am trying to add lines for the missing time periods and carry the previous values down until there is another measurement. In other words:
ID Minutes Systolic Diastolic
3 6 123 69
3 7 123 69
3 8 123 69
3 9 123 69
3 10 110 62
3 11 110 62
3 12 110 62
3 13 110 62
3 14 110 62
3 15 111 66
.
.
.
3 185 137 113
4 8 118 79
4 9 118 79
4 10 118 79
4 11 73 46
.
.
.
4 149 141 79
Then I can calculate averages per day and rolling averages. There would be no line for ID 4 for 6 or 7 minutes because they start at minute 8, or after 149 minutes because that is the last measurement for that ID. I am unsure how to do this in a data step or proc SQL or if there is another proc that would be better. Any help would be much appreciated. Thank you!
You have a better chance of getting an answer if you present the data that you have (and maybe also the output you want) as a datastep, e.g.:
data have;
input ID Minutes Systolic Diastolic;
cards;
3 6 123 69
3 10 110 62
3 15 111 66
3 185 137 113
4 8 118 79
4 11 73 46
4 16 85 50
4 149 141 79
;run;
I think that what you want can be accomplished like this:
data want;
set have;
by id;
if not last.id then do;
_P_=_N_+1;
set have(keep=minutes rename=(minutes=_next_)) point=_P_;
do minutes=minutes to _next_-1;
output;
end;
end;
else output;
drop _:;
run;
- assuming that your WANT table is sorted by ID and MINUTES, as was your sample.
What it does: if this is not the last record for that ID, it reads the MINUTES variable from the next record (variable _NEXT_) and outputs for the whole period up to the next record. Else, it just outputs.
You have a better chance of getting an answer if you present the data that you have (and maybe also the output you want) as a datastep, e.g.:
data have;
input ID Minutes Systolic Diastolic;
cards;
3 6 123 69
3 10 110 62
3 15 111 66
3 185 137 113
4 8 118 79
4 11 73 46
4 16 85 50
4 149 141 79
;run;
I think that what you want can be accomplished like this:
data want;
set have;
by id;
if not last.id then do;
_P_=_N_+1;
set have(keep=minutes rename=(minutes=_next_)) point=_P_;
do minutes=minutes to _next_-1;
output;
end;
end;
else output;
drop _:;
run;
- assuming that your WANT table is sorted by ID and MINUTES, as was your sample.
What it does: if this is not the last record for that ID, it reads the MINUTES variable from the next record (variable _NEXT_) and outputs for the whole period up to the next record. Else, it just outputs.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.