BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
abak
Obsidian | Level 7

I have data of the following basic forma (two columns). Well call this D1 for the data set.

 

Index      Target

A             .

A             .

A             .

A            4

A            3

B            .

B            4

B           3.5

B           .

 

I want to replace the missing values in "Target" by the first non-missing value, and fill every missing value after the first non missing target with the previous non-missing. Here's the "end" result:

 

Index      Target

A             4*

A             4*

A             4*

A             4

A             3

B            4*

B            4

B           3.5

B           3.5*

 

* is for the imputed values.

 

First, I started by created the following data step:

 

/*This step creates macro variables for every index of the first non-missing value*/

 

data _null_;

   set D1;

   retain j ind;

   by index;

   if _n_  = 1 then j = 0;

   if first.index then ind = 0;

   if Target ^= '.' and ind = 0 then do;

   call symput('FirstTarget'||strip(left(j)), 'Target');

   ind = 1;

   end;

   run;

 

 

/*Now I want to apply those macro variables to those values...and where my code doesn't work*/

 

data D1new;

    set D1;

    by index;

    retain ind j LastTarget;

 

/*If it's not the first missing value then use the last value, which has already been imputed or already existed*/

if Target = '.' and ind = 1 then Target = LastTarget;

 

/* If it's the first missing value then apply the stored first non-missing value */

    if Target = '.' and ind = 0 then do;

    ind = 1;

    Target = &&FirstTarget&j;

    end;

  

 /*For next iteration, create lagging variable that will remember the previous target value to use for imputation if the next value is missing. This takes care of the 'forward' missing values.*/

      LastTarget = Target;

 

     drop j ind;

run;

 

The problem lies with j. I've tried using call symput('j', j); or %let j = j; but it doesn't recognize j as a -number- but instead sees j as literally 'j' and tries to interpret &&FirstTarget&j as &FirstTargetj which doesn't exist (and won't work).

 

I have a larger number of indexes than two, and I figure understanding this will help me develop my SAS skills to using more than just the bare skills I have.

 

At worst case, I could split the data out by each Index and do it with a macro and $do i in 1 %to &n; but that seems unnecessary/overly long/non-elegant.

 

What am I doing wrong?

   

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Ok a simple one-->

 


data int1;
set have;
by index;
retain _t;
if first.index then _t = target;
if target = .
then target = _t;
else _t = target;
order = _n_;
drop _t;
run;

proc sort data=int1;
by index descending order;
run;

data int2;
set int1;
by index;
retain _t;
if first.index then _t = target;
if target = .
then target = _t;
else _t = target;
drop _t;
run;

proc sort
  data=int2
  out=want (drop=order)
;
by order;
run;

View solution in original post

5 REPLIES 5
novinosrin
Tourmaline | Level 20

data have;
input Index  $    Target;
cards;
A             .
A             .
A             .
A            4
A            3
B            .
B            4
B           3.5
B           .
;



data want;
if _n_=1 then do;
dcl hash h(dataset:'have(where=(Target ne .))');
h.definekey('index');
h.definedata('Target');
h.definedone();
end;
do until(last.index);
update have(obs=0) have;
by index;
if missing(Target) then rc=h.find();
output;
end;
drop rc ;
run;
abak
Obsidian | Level 7

Thanks for the solution. I don't understand hash though. I suppose that's up to me to go figure out how your code works. 

novinosrin
Tourmaline | Level 20

Ok a simple one-->

 


data int1;
set have;
by index;
retain _t;
if first.index then _t = target;
if target = .
then target = _t;
else _t = target;
order = _n_;
drop _t;
run;

proc sort data=int1;
by index descending order;
run;

data int2;
set int1;
by index;
retain _t;
if first.index then _t = target;
if target = .
then target = _t;
else _t = target;
drop _t;
run;

proc sort
  data=int2
  out=want (drop=order)
;
by order;
run;
abak
Obsidian | Level 7

Thanks so much for your help. i haven't rant this through yet but I'm assuming it works. 

 

Your quick responses were very helpful.

novinosrin
Tourmaline | Level 20

Please don't ever assume. Test thoroughly and if it doesn't work, come back to us. Simple as that!

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1195 views
  • 0 likes
  • 2 in conversation