BookmarkSubscribeRSS Feed
michtka
Fluorite | Level 6

Hi everyone, the code below has been developped by Haikuo, it imputed the day missing with the LOCF value, except the baseline data, it wont be carried forward.

But, now I am interested to give the value 0 to these imputed days...please see the want dataset I am looking for:

proc format;

         value $seq

            'baseline'=1

              'week2'=2

              'week4'=3

              'week8'=4

              'week12'=5

              ;

              value seq

              1='baseline'

              2='week2'

              3='week4'

              4='week8'

              5='week12'

              ;

      run;

data new;
   length subjid $8 day $10 value 8;
   input subjid day value;
   datalines;
1  baseline 10
1  week2    12
1  week4    14
1  week8    16
1  week12   12
2  baseline 10
2  week2    12
2  week4    10
3  baseline 10
3  week2     3
3  week8     4
4  baseline 10
4  week8     4
;;;;
   run;

data new1;

        set new;

        _day=put(day,$seq.);

        run;

data want;

  set new1;

  by subjid;

  set new1(firstobs=2 keep=_day rename=_day=__day) new1(obs=1 drop=_all_);

     if not last.subjid  then do _i=_day to __day-1;

          day=put(_i,seq.);

/* another line changed */

          if not missing(value) then output;

          if _day eq 1 then call missing(value);
  

       end;

  else do _i=_day to 5;

          day=put(_i,seq.);

          output;

       end;

       /*drop _:;*/

run;

  Result from the Hikuo code, the blodface records are imputed with the LOCF:

Obs subjid day value

1 1 baseline 10

2 1 week2 12

3 1 week4 14

4 1 week8 16

5 1 week12 12

6 2 baseline 10

7 2 week2 12

8 2 week4 10

9 2 week8 10

10 2 week12 10

11 3 baseline 10

12 3 week2 3

13 3 week4 3

14 3 week8 4

15 3 week12 4

16 4 baseline 10

17 4 week8 4

18 4 week12 4

Dataset Want I am looking for, (I want to generate this dataset using the above code)...I think I need to make little changes to get the 0 vales rather the LOCF values.

1  baseline 10

1  week2    12

1  week4    14

1  week8    16

1  week12   12

2  baseline 10

2  week2    12

2  week4    10

2 week8      0

2 week12    0

3  baseline 10

3  week2     3

3 week4      0

3  week8     4

3 week12    0

4  baseline 10

4  week8     4

4 week12    0

Thanks, V.

5 REPLIES 5
data_null__
Jade | Level 19

An easy way to get numeric missing changed to 0 or any other number is to use PROC STDIZE.  Of course you also need a way to EXPAND the data if the observations are also missing.

data new;
   length subjid $8 day 8 value 8;
  
input subjid day value @@;
   datalines;
1  0 10  1  2 12  1  4 14  1  8 16  1  12 12
2  0 10  2  2 12  2  4 10 
3  0 10  3  2  3  3  8  4 
4  0 10  4  8  4
;;;;
   run;
data classdata;
   do day=4,8,12;
    
output;
    
end;
  
run;
proc summary nway data=new classdata=classdata order=internal;
   by subjid;
   class day;
   output out=new2(drop=_freq_ _type_) idgroup(out(value)=);
   run;
proc stdize method=sum missing=0 reponly;
  
var value;
   run;
RichardinOz
Quartz | Level 8

As you no longer require duplication of values to replace missing values the process can be simplified.

Key features:
     create an invalue seq for an input function rather than using put with automatic type conversion
     Create a restricted Cartesian product of subjid and day (normally a bad idea!) to form a template of all  possible subjid and day combinations
     Use a left join to update the template with value from new
     Use the sum function, with zero, to convert missing values to zero

NOTE:  This code generates values for subjid = 4 and day = week2 and week4, which are not in your want table.  If you use this code you may need a datastep to delete these values based on whether the last undeleted row was a baseline, if I understand your data correctly.  I can add that if you require.

proc format;
         invalue seq
            'baseline'=1
              'week2'=2
              'week4'=3
              'week8'=4
              'week12'=5
              ;
quit ;

data new;
   length subjid $8 day $10 value 8;
   input subjid day value;
   datalines;
1  baseline 10
1  week2    12
1  week4    14
1  week8    16
1  week12   12
2  baseline 10
2  week2    12
2  week4    10
3  baseline 10
3  week2     3
3  week8     4
4  baseline 10
4  week8     4
;

Proc SQL ;

Create Table template As
  Select distinct
    a.subjid
   , b.day
   ,  input (b.day, seq.) as seq
  From new a
   , new b
  Order
   By a.subjid
   , seq
     ;
    
  Create Table  want  (drop = seq) As
   Select
    tmp.*
    , Sum (new.value, 0) As value
   From template tmp
   Left Join
     new   new
    On new.subjid = tmp.subjid
    And new.day  = tmp.day
   Order
    By tmp.subjid
    , tmp.seq
     ;
    
   
  Select *
   from want ;
  
Quit ;


subjid    day         value
------      ---           -----
1          baseline  10
1          week2     12
1          week4     14
1          week8     16
1          week12   12
2          baseline  10
2          week2     12
2          week4     10
2         week8     0
2          week12   0
3          baseline  10
3          week2     3
3          week4     0
3          week8     4
3          week12   0
4          baseline  10
4          week2     0
4         week4     0
4          week8     4
4          week12   0

Regards

Richard in Oz

michtka
Fluorite | Level 6


Thank RichardinOz for your great code, will be great, because the baseline data will not be carried forward, it means week2 and week8 , records after baseline if there are missing, they will reminin missing.

michtka
Fluorite | Level 6

week4 , sorry


Haikuo
Onyx | Level 15

Hi,

If you read into Ksharp's code, you will find it only needs a minor tweak to meet your new requirement.

data new;

  length subjid $8 day $10 value 8;

  input subjid day value;

  datalines;

1 baseline 10

1 week2 12

1 week4 14

1 week8 16

1 week12 12

2 baseline 10

2 week4 10

2 week8 10

3 baseline 10

3 week2 3

3 week8 4

4 baseline 10

4 week8 4

;;;;

  run;

data want;

merge new new(firstobs=2 keep=day rename=( day=_day));

array a{4} $ _temporary_ ('week2' 'week4' 'week8' 'week12');

output;

if day =: 'week' then do;

  end=whichc(_day,of a{*});

  do i=whichc(day,of a{*})+1 to ifn(end=0,dim(a),end-1);

  day=a{i}; value=0;output;

  end;

end;

drop _day end i;

run;

proc print;run;

The underlined statement is the only one needed to be added.

Haikuo

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1356 views
  • 0 likes
  • 4 in conversation