BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
serena13lee
Quartz | Level 8

I have a dataset that has blanks and I would like to fill in the spaces based on other columns. HT, WT, and BI are filled in for patient and visit by variable. I would like to copy over the values of each patient and visit into the other variables. Can this be done by writing if patient and visit match, then copy value over? Or is there another way? 

 

data have; 
input Patient $ Visit Variable $ HT WT BI; 
cards;
A 1 H 1 . .
A 2 H 3 . .
A 3 H 4 . .
A 1 W . 2 . 
A 2 W . 5 .
A 3 W . 9 .
A 1 B . . 0
A 2 B . . 8
A 3 B . . 8 
B 1 H 1 . .
B 2 H 3 . .
B 3 H 1 . .
B 1 W . 2 . 
B 2 W . 4 .
B 3 W . 1 .
B 1 B . . 0
B 2 B . . 0
B 3 B . . 8 
;

data want; input Patient $ Visit Variable $ HT WT BI; cards; A 1 H 1 2 0 A 2 H 3 5 0 A 3 H 4 9 8 A 1 W 1 2 0 A 2 W 3 5 8 A 3 W 4 9 8 A 1 B 1 2 0 A 2 B 3 5 8 A 3 B 4 9 8 B 1 H 1 2 0 B 2 H 3 4 0 B 3 H 1 1 8 B 1 W 1 2 0 B 2 W 3 4 0 B 3 W 1 1 8 B 1 B 1 2 0 B 2 B 3 4 0 B 3 B 1 1 8 ;

Initially I thought this could be fixed with an if-else statement but this may be more complex?

 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

It's harder than it looks.  Here's a solution that depends on VISIT actually taking on values of 1, 2, and 3 as indicated in your sample data:

 

data want;
   array htvals {3} _temporary_;
   array wtvals {3} _temporary_;
   array bivals {3} _temporary_;
   do until (last.patient);
      set have;
      by patient;
      if ht > . then htvals{visit} = ht;
      if wt > . then wtvals{visit} = wt;
      if bi > . then bivals{visit} = bi;
   end;
   do until (last.patient);
      set have;
      by patient;
      if ht = . then ht = htvals{visit};
      if wt = . then wt = wtvals{visit};
      if bi = . then bi = bivals{visit};
      output;
   end;
   do _n_=1 to 3;
      htvals{_n_} = .;
      wtvals{_n_} = .;
      bivals{_n_} = .;
   end;
run;

Wish I found an easier way, but this is all I came up with.

View solution in original post

8 REPLIES 8
Astounding
PROC Star

It's harder than it looks.  Here's a solution that depends on VISIT actually taking on values of 1, 2, and 3 as indicated in your sample data:

 

data want;
   array htvals {3} _temporary_;
   array wtvals {3} _temporary_;
   array bivals {3} _temporary_;
   do until (last.patient);
      set have;
      by patient;
      if ht > . then htvals{visit} = ht;
      if wt > . then wtvals{visit} = wt;
      if bi > . then bivals{visit} = bi;
   end;
   do until (last.patient);
      set have;
      by patient;
      if ht = . then ht = htvals{visit};
      if wt = . then wt = wtvals{visit};
      if bi = . then bi = bivals{visit};
      output;
   end;
   do _n_=1 to 3;
      htvals{_n_} = .;
      wtvals{_n_} = .;
      bivals{_n_} = .;
   end;
run;

Wish I found an easier way, but this is all I came up with.

serena13lee
Quartz | Level 8
Hi! Thanks so much for your solution. It's actually super robust and after making changed to my dataset it still worked. Thanks!
heffo
Pyrite | Level 9

Without actually knowing what you want to do with the data it seems like you have duplication of the data. 

Either you should remove the variable "Variable" or you should have the variables Patient,Visit, Variable and a new one called "Value" or similar. This is an example of the first one: 

 

 

proc sort data=have out=have2; 
*We need to make sure that the data is sorted on patient and visit for the next step.; by patient visit ; run; data want (rename=(HT_=HT WT_=WT BI_=BI)); set have2; by patient visit ; retain HT_ WT_ BI_; *We want to keep the values from the variables, but can only do this if they are not from the input data set.; *Put the right values in the right variable.;
if variable = "H" then HT_ = HT; else if variable = "W" then WT_ = WT; else if variable = "B" then BI_ = BI;

*Only output the last row of each Patient and visit. ; if last.visit then output;

*Get rid of some variables. ; drop variable HT WT BI; run;

 

 

This will give you the following data:

Patient Visit HT WT BI
A 1 1 2 0
A 2 3 5 8
A 3 4 9 8
B 1 1 2 0
B 2 3 4 0
B 3 1 1 8

Jagadishkatam
Amethyst | Level 16

Please try the proc sql approach

 

data have; 
input Patient $ Visit Variable $ HT WT BI; 
cards;
A 1 H 1 . .
A 2 H 3 . .
A 3 H 4 . .
A 1 W . 2 . 
A 2 W . 5 .
A 3 W . 9 .
A 1 B . . 0
A 2 B . . 8
A 3 B . . 8 
B 1 H 1 . .
B 2 H 3 . .
B 3 H 1 . .
B 1 W . 2 . 
B 2 W . 4 .
B 3 W . 1 .
B 1 B . . 0
B 2 B . . 0
B 3 B . . 8 
;

proc sql;
create table want as select a.patient,a.visit,a.variable,b.ht,b.wt,b.bi from have as a inner join (select patient,visit,max(ht) as ht, max(wt) as wt, max(bi) as bi from have group by patient, visit) as b on a.patient=b.patient and a.visit=b.visit order by a.patient,a.variable,a.visit;
quit;
Thanks,
Jag
ScottBass
Rhodochrosite | Level 12
data have; 
input Patient $ Visit Variable $ HT WT BI; 
cards;
A 1 H 1 . .
A 2 H 3 . .
A 3 H 4 . .
A 1 W . 2 . 
A 2 W . 5 .
A 3 W . 9 .
A 1 B . . 0
A 2 B . . 8
A 3 B . . 8 
B 1 H 1 . .
B 2 H 3 . .
B 3 H 1 . .
B 1 W . 2 . 
B 2 W . 4 .
B 3 W . 1 .
B 1 B . . 0
B 2 B . . 0
B 3 B . . 8 
;
run;

data want; 
input Patient $ Visit Variable $ HT WT BI; 
cards;
A 1 H 1 2 0
A 2 H 3 5 0
A 3 H 4 9 8
A 1 W 1 2 0 
A 2 W 3 5 8
A 3 W 4 9 8
A 1 B 1 2 0
A 2 B 3 5 8
A 3 B 4 9 8 
B 1 H 1 2 0
B 2 H 3 4 0
B 3 H 1 1 8
B 1 W 1 2 0 
B 2 W 3 4 0
B 3 W 1 1 8
B 1 B 1 2 0
B 2 B 3 4 0
B 3 B 1 1 8 
;
run;

proc summary data=have nway;
   class Patient Visit;
   var HT WT BI;
   output out=max max=;
run;

data test;
   * set PDV order (optional) ;
   if 0 then set have;

   * initialize &_hashnum_ to zero then declare hash objects ;
   %let _hashnum_=0;
   %hash_define(data=max, keys=Patient Visit, vars=HT WT BI);

   set have;
   %hash_lookup;

   drop _rc:;
run;

proc compare base=want comp=test;
run;

Do you have a typo in your want dataset?

 

Alternate approach:

 

data have; 
input Patient $ Visit Variable $ HT WT BI; 
cards;
A 1 H 1 . .
A 2 H 3 . .
A 3 H 4 . .
A 1 W . 2 . 
A 2 W . 5 .
A 3 W . 9 .
A 1 B . . 0
A 2 B . . 8
A 3 B . . 8 
B 1 H 1 . .
B 2 H 3 . .
B 3 H 1 . .
B 1 W . 2 . 
B 2 W . 4 .
B 3 W . 1 .
B 1 B . . 0
B 2 B . . 0
B 3 B . . 8 
;
run;

data want; 
input Patient $ Visit Variable $ HT WT BI; 
cards;
A 1 H 1 2 0
A 2 H 3 5 0
A 3 H 4 9 8
A 1 W 1 2 0 
A 2 W 3 5 8
A 3 W 4 9 8
A 1 B 1 2 0
A 2 B 3 5 8
A 3 B 4 9 8 
B 1 H 1 2 0
B 2 H 3 4 0
B 3 H 1 1 8
B 1 W 1 2 0 
B 2 W 3 4 0
B 3 W 1 1 8
B 1 B 1 2 0
B 2 B 3 4 0
B 3 B 1 1 8 
;
run;

proc summary data=have nway;
   class Patient Visit;
   var HT WT BI;
   output out=max max= / autoname;
run;

data test;
   * set PDV order (optional) ;
   if 0 then set have;

   * initialize &_hashnum_ to zero then declare hash objects ;
   %let _hashnum_=0;
   %hash_define(data=max, keys=Patient Visit, vars=HT_Max WT_Max BI_Max);

   set have;
   %hash_lookup;

   array vars{*} HT WT BI;
   array max{*} HT_Max WT_Max BI_Max;
   do i=1 to dim(vars);
      if missing(vars{i}) then vars{i}=max{i};
   end;

   drop _rc: i HT_Max WT_Max BI_Max;
run;

proc compare base=want comp=test;
run;

See 

https://github.com/scottbass/SAS/blob/master/Macro/hash_define.sas and 

https://github.com/scottbass/SAS/blob/master/Macro/hash_lookup.sas

 

If your data differs from what you posted then this approach may not work.

 

(Can you add "run;" statements to your have and want data steps as best practice?)

 


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
ScottBass
Rhodochrosite | Level 12

@Jagadishkatam 's approach is similar to mine, and I think is a clearer, cleaner approach.


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
Ksharp
Super User
data have; 
input Patient $ Visit Variable $ HT WT BI; 
cards;
A 1 H 1 . .
A 2 H 3 . .
A 3 H 4 . .
A 1 W . 2 . 
A 2 W . 5 .
A 3 W . 9 .
A 1 B . . 0
A 2 B . . 8
A 3 B . . 8 
B 1 H 1 . .
B 2 H 3 . .
B 3 H 1 . .
B 1 W . 2 . 
B 2 W . 4 .
B 3 W . 1 .
B 1 B . . 0
B 2 B . . 0
B 3 B . . 8
;

proc sort data=have(drop= Variable ) out=temp;
 by Patient  Visit;
run;
data key;
 update temp(obs=0) temp;
 by Patient  Visit;
run;
data want;
 if _n_=1 then do;
   if 0 then set key;
   declare hash h(dataset:'key');
   h.definekey( 'Patient' ,'Visit');
   h.definedata('HT', 'WT', 'BI');
   h.definedone();
 end;
set have(drop=HT WT BI);
call missing(HT ,WT ,BI);
h.find();
run;
serena13lee
Quartz | Level 8
This is a super clean solution! I will try modifying it to my dataset and see how it looks. Thanks!

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 2928 views
  • 4 likes
  • 6 in conversation