Hi everyone, the code below has been developped by Haikuo, it imputed the day missing with the LOCF value, except the baseline data, it wont be carried forward.
But, now I am interested to give the value 0 to these imputed days...please see the want dataset I am looking for:
proc format;
value $seq
'baseline'=1
'week2'=2
'week4'=3
'week8'=4
'week12'=5
;
value seq
1='baseline'
2='week2'
3='week4'
4='week8'
5='week12'
;
run;
data new;
length subjid $8 day $10 value 8;
input subjid day value;
datalines;
1 baseline 10
1 week2 12
1 week4 14
1 week8 16
1 week12 12
2 baseline 10
2 week2 12
2 week4 10
3 baseline 10
3 week2 3
3 week8 4
4 baseline 10
4 week8 4
;;;;
run;
data new1;
set new;
_day=put(day,$seq.);
run;
data want;
set new1;
by subjid;
set new1(firstobs=2 keep=_day rename=_day=__day) new1(obs=1 drop=_all_);
if not last.subjid then do _i=_day to __day-1;
day=put(_i,seq.);
/* another line changed */
if not missing(value) then output;
if _day eq 1 then call missing(value);
end;
else do _i=_day to 5;
day=put(_i,seq.);
output;
end;
/*drop _:;*/
run;
Result from the Hikuo code, the blodface records are imputed with the LOCF:
Obs subjid day value
1 1 baseline 10
2 1 week2 12
3 1 week4 14
4 1 week8 16
5 1 week12 12
6 2 baseline 10
7 2 week2 12
8 2 week4 10
9 2 week8 10
10 2 week12 10
11 3 baseline 10
12 3 week2 3
13 3 week4 3
14 3 week8 4
15 3 week12 4
16 4 baseline 10
17 4 week8 4
18 4 week12 4
Dataset Want I am looking for, (I want to generate this dataset using the above code)...I think I need to make little changes to get the 0 vales rather the LOCF values.
1 baseline 10
1 week2 12
1 week4 14
1 week8 16
1 week12 12
2 baseline 10
2 week2 12
2 week4 10
2 week8 0
2 week12 0
3 baseline 10
3 week2 3
3 week4 0
3 week8 4
3 week12 0
4 baseline 10
4 week8 4
4 week12 0
Thanks, V.
An easy way to get numeric missing changed to 0 or any other number is to use PROC STDIZE. Of course you also need a way to EXPAND the data if the observations are also missing.
As you no longer require duplication of values to replace missing values the process can be simplified.
Key features:
create an invalue seq for an input function rather than using put with automatic type conversion
Create a restricted Cartesian product of subjid and day (normally a bad idea!) to form a template of all possible subjid and day combinations
Use a left join to update the template with value from new
Use the sum function, with zero, to convert missing values to zero
NOTE: This code generates values for subjid = 4 and day = week2 and week4, which are not in your want table. If you use this code you may need a datastep to delete these values based on whether the last undeleted row was a baseline, if I understand your data correctly. I can add that if you require.
proc format;
invalue seq
'baseline'=1
'week2'=2
'week4'=3
'week8'=4
'week12'=5
;
quit ;
data new;
length subjid $8 day $10 value 8;
input subjid day value;
datalines;
1 baseline 10
1 week2 12
1 week4 14
1 week8 16
1 week12 12
2 baseline 10
2 week2 12
2 week4 10
3 baseline 10
3 week2 3
3 week8 4
4 baseline 10
4 week8 4
;
Proc SQL ;
Create Table template As
Select distinct
a.subjid
, b.day
, input (b.day, seq.) as seq
From new a
, new b
Order
By a.subjid
, seq
;
Create Table want (drop = seq) As
Select
tmp.*
, Sum (new.value, 0) As value
From template tmp
Left Join
new new
On new.subjid = tmp.subjid
And new.day = tmp.day
Order
By tmp.subjid
, tmp.seq
;
Select *
from want ;
Quit ;
subjid day value
------ --- -----
1 baseline 10
1 week2 12
1 week4 14
1 week8 16
1 week12 12
2 baseline 10
2 week2 12
2 week4 10
2 week8 0
2 week12 0
3 baseline 10
3 week2 3
3 week4 0
3 week8 4
3 week12 0
4 baseline 10
4 week2 0
4 week4 0
4 week8 4
4 week12 0
Regards
Richard in Oz
Thank RichardinOz for your great code, will be great, because the baseline data will not be carried forward, it means week2 and week8 , records after baseline if there are missing, they will reminin missing.
week4 , sorry
Hi,
If you read into Ksharp's code, you will find it only needs a minor tweak to meet your new requirement.
data new;
length subjid $8 day $10 value 8;
input subjid day value;
datalines;
1 baseline 10
1 week2 12
1 week4 14
1 week8 16
1 week12 12
2 baseline 10
2 week4 10
2 week8 10
3 baseline 10
3 week2 3
3 week8 4
4 baseline 10
4 week8 4
;;;;
run;
data want;
merge new new(firstobs=2 keep=day rename=( day=_day));
array a{4} $ _temporary_ ('week2' 'week4' 'week8' 'week12');
output;
if day =: 'week' then do;
end=whichc(_day,of a{*});
do i=whichc(day,of a{*})+1 to ifn(end=0,dim(a),end-1);
day=a{i}; value=0;output;
end;
end;
drop _day end i;
run;
proc print;run;
The underlined statement is the only one needed to be added.
Haikuo
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.