Hi,
I have a dataset like this:
ID Minutes Systolic Diastolic
3 6 123 69
3 10 110 62
3 15 111 66
.
.
.
3 185 137 113
4 8 118 79
4 11 73 46
4 16 85 50
.
.
.
4 149 141 79
I am trying to take an average each day, and as you can see, there is a problem since measures are missing for certain days for certain participants. I am trying to add lines for the missing time periods and carry the previous values down until there is another measurement. In other words:
ID Minutes Systolic Diastolic
3 6 123 69
3 7 123 69
3 8 123 69
3 9 123 69
3 10 110 62
3 11 110 62
3 12 110 62
3 13 110 62
3 14 110 62
3 15 111 66
.
.
.
3 185 137 113
4 8 118 79
4 9 118 79
4 10 118 79
4 11 73 46
.
.
.
4 149 141 79
Then I can calculate averages per day and rolling averages. There would be no line for ID 4 for 6 or 7 minutes because they start at minute 8, or after 149 minutes because that is the last measurement for that ID. I am unsure how to do this in a data step or proc SQL or if there is another proc that would be better. Any help would be much appreciated. Thank you!
You have a better chance of getting an answer if you present the data that you have (and maybe also the output you want) as a datastep, e.g.:
data have;
input ID Minutes Systolic Diastolic;
cards;
3 6 123 69
3 10 110 62
3 15 111 66
3 185 137 113
4 8 118 79
4 11 73 46
4 16 85 50
4 149 141 79
;run;
I think that what you want can be accomplished like this:
data want;
set have;
by id;
if not last.id then do;
_P_=_N_+1;
set have(keep=minutes rename=(minutes=_next_)) point=_P_;
do minutes=minutes to _next_-1;
output;
end;
end;
else output;
drop _:;
run;
- assuming that your WANT table is sorted by ID and MINUTES, as was your sample.
What it does: if this is not the last record for that ID, it reads the MINUTES variable from the next record (variable _NEXT_) and outputs for the whole period up to the next record. Else, it just outputs.
You have a better chance of getting an answer if you present the data that you have (and maybe also the output you want) as a datastep, e.g.:
data have;
input ID Minutes Systolic Diastolic;
cards;
3 6 123 69
3 10 110 62
3 15 111 66
3 185 137 113
4 8 118 79
4 11 73 46
4 16 85 50
4 149 141 79
;run;
I think that what you want can be accomplished like this:
data want;
set have;
by id;
if not last.id then do;
_P_=_N_+1;
set have(keep=minutes rename=(minutes=_next_)) point=_P_;
do minutes=minutes to _next_-1;
output;
end;
end;
else output;
drop _:;
run;
- assuming that your WANT table is sorted by ID and MINUTES, as was your sample.
What it does: if this is not the last record for that ID, it reads the MINUTES variable from the next record (variable _NEXT_) and outputs for the whole period up to the next record. Else, it just outputs.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.