Hi everyone, the code below has been developped by Haikuo, it imputed the day missing with the LOCF value, except the baseline data, it wont be carried forward.
But, now I am interested to give the value 0 to these imputed days...please see the want dataset I am looking for:
proc format;
value $seq
'baseline'=1
'week2'=2
'week4'=3
'week8'=4
'week12'=5
;
value seq
1='baseline'
2='week2'
3='week4'
4='week8'
5='week12'
;
run;
data new;
length subjid $8 day $10 value 8;
input subjid day value;
datalines;
1 baseline 10
1 week2 12
1 week4 14
1 week8 16
1 week12 12
2 baseline 10
2 week2 12
2 week4 10
3 baseline 10
3 week2 3
3 week8 4
4 baseline 10
4 week8 4
;;;;
run;
data new1;
set new;
_day=put(day,$seq.);
run;
data want;
set new1;
by subjid;
set new1(firstobs=2 keep=_day rename=_day=__day) new1(obs=1 drop=_all_);
if not last.subjid then do _i=_day to __day-1;
day=put(_i,seq.);
/* another line changed */
if not missing(value) then output;
if _day eq 1 then call missing(value);
end;
else do _i=_day to 5;
day=put(_i,seq.);
output;
end;
/*drop _:;*/
run;
Result from the Hikuo code, the blodface records are imputed with the LOCF:
Obs subjid day value
1 1 baseline 10
2 1 week2 12
3 1 week4 14
4 1 week8 16
5 1 week12 12
6 2 baseline 10
7 2 week2 12
8 2 week4 10
9 2 week8 10
10 2 week12 10
11 3 baseline 10
12 3 week2 3
13 3 week4 3
14 3 week8 4
15 3 week12 4
16 4 baseline 10
17 4 week8 4
18 4 week12 4
Dataset Want I am looking for, (I want to generate this dataset using the above code)...I think I need to make little changes to get the 0 vales rather the LOCF values.
1 baseline 10
1 week2 12
1 week4 14
1 week8 16
1 week12 12
2 baseline 10
2 week2 12
2 week4 10
2 week8 0
2 week12 0
3 baseline 10
3 week2 3
3 week4 0
3 week8 4
3 week12 0
4 baseline 10
4 week8 4
4 week12 0
Thanks, V.
An easy way to get numeric missing changed to 0 or any other number is to use PROC STDIZE. Of course you also need a way to EXPAND the data if the observations are also missing.
As you no longer require duplication of values to replace missing values the process can be simplified.
Key features:
create an invalue seq for an input function rather than using put with automatic type conversion
Create a restricted Cartesian product of subjid and day (normally a bad idea!) to form a template of all possible subjid and day combinations
Use a left join to update the template with value from new
Use the sum function, with zero, to convert missing values to zero
NOTE: This code generates values for subjid = 4 and day = week2 and week4, which are not in your want table. If you use this code you may need a datastep to delete these values based on whether the last undeleted row was a baseline, if I understand your data correctly. I can add that if you require.
proc format;
invalue seq
'baseline'=1
'week2'=2
'week4'=3
'week8'=4
'week12'=5
;
quit ;
data new;
length subjid $8 day $10 value 8;
input subjid day value;
datalines;
1 baseline 10
1 week2 12
1 week4 14
1 week8 16
1 week12 12
2 baseline 10
2 week2 12
2 week4 10
3 baseline 10
3 week2 3
3 week8 4
4 baseline 10
4 week8 4
;
Proc SQL ;
Create Table template As
Select distinct
a.subjid
, b.day
, input (b.day, seq.) as seq
From new a
, new b
Order
By a.subjid
, seq
;
Create Table want (drop = seq) As
Select
tmp.*
, Sum (new.value, 0) As value
From template tmp
Left Join
new new
On new.subjid = tmp.subjid
And new.day = tmp.day
Order
By tmp.subjid
, tmp.seq
;
Select *
from want ;
Quit ;
subjid day value
------ --- -----
1 baseline 10
1 week2 12
1 week4 14
1 week8 16
1 week12 12
2 baseline 10
2 week2 12
2 week4 10
2 week8 0
2 week12 0
3 baseline 10
3 week2 3
3 week4 0
3 week8 4
3 week12 0
4 baseline 10
4 week2 0
4 week4 0
4 week8 4
4 week12 0
Regards
Richard in Oz
Thank RichardinOz for your great code, will be great, because the baseline data will not be carried forward, it means week2 and week8 , records after baseline if there are missing, they will reminin missing.
week4 , sorry
Hi,
If you read into Ksharp's code, you will find it only needs a minor tweak to meet your new requirement.
data new;
length subjid $8 day $10 value 8;
input subjid day value;
datalines;
1 baseline 10
1 week2 12
1 week4 14
1 week8 16
1 week12 12
2 baseline 10
2 week4 10
2 week8 10
3 baseline 10
3 week2 3
3 week8 4
4 baseline 10
4 week8 4
;;;;
run;
data want;
merge new new(firstobs=2 keep=day rename=( day=_day));
array a{4} $ _temporary_ ('week2' 'week4' 'week8' 'week12');
output;
if day =: 'week' then do;
end=whichc(_day,of a{*});
do i=whichc(day,of a{*})+1 to ifn(end=0,dim(a),end-1);
day=a{i}; value=0;output;
end;
end;
drop _day end i;
run;
proc print;run;
The underlined statement is the only one needed to be added.
Haikuo
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.