Re: Merging two rows similar to coalesce

J_C · Posted 09-29-2014 12:15 PM

Hi,

I am trying to merge two rows of data into one.

e.g. The coalesece function works between fields, if the first argument is blank, then populate field using the second argument.

I am trying to do this but between rows.

For example,

data=

Field1 Field2 Field3 Field4 Field5 Field6

1 N/A 2 3 N/A B

N/A 4 N/A 3 5 B

out=

Field1 Field2 Field3 Field4 Field5 Field6

1 4 2 3 5 B

How could this be done, given the fact that Field 6 is common amongst the two.

Thank you.

art297 · Posted 09-29-2014 12:53 PM

You could use something like:

data want (drop=_:);

set have (rename=(field1-field6=_field1-_field6));

array infield(*) $ _field1-_field6;

array field(6) $;

retain field:;

do _i=1 to dim(infield);

if mod(_n_,2) or input(infield(_i),? 12.) then field(_i)=infield(_i);

end;

if mod(_n_,2) eq 0 then output;

run;

Astounding · Posted 09-29-2014 01:18 PM

Under certain conditions, this is easy.

(1) There must be a variable (or a set of variables) that can be used to uniquely identify a row.

(2) There can be only one observation (per data set) for that unique identifier.

Then the program is simply:

data want;

update file1 file2;

by id;

run;

Both data sets need to be sorted to permit the BY statement to work.

And the trick is that the master set of values is in FILE2. The values in FILE1 replace only the missing values in FILE2.

Good luck.

Ksharp · Posted 09-30-2014 07:10 AM

Just two rows ?

data have;
input (Field1     Field2     Field3     Field4     Field5     Field6 ) ($);
cards;
1             N/A        2             3             N/A         B
N/A         4            N/A         3             5             B  
;
run;
data temp;
 set have;
 array f{*} $ field:;
 do i=1 to dim(f);
  if f{i}='N/A' then call missing(f{i})  ;
 end;
 drop i;
run;
data want;
 set temp end=last;
 array f{*} $ field1-field6;
 array _f{*} $ _field1-_field6;
 retain _: ;
 do i=1 to dim(f);
  _f{i}=coalescec(_f{i},f{i});
 end;
 if last then output;
 keep _: ;
run;

Xia Keshan

J_C · Posted 09-30-2014 08:34 AM

Thanks for all your answers, I just revised the procedures leading up to my dataset and used a mapping table in an earlier step to avoid this problem.

data_null__ · Posted 09-30-2014 08:47 AM

If you are saying that FIELD 6 could be used for ID then you could use UPDATE like this. Updating to the last non-missing value in the transaction file.

data trans;
   input Field1-Field5 field6 $;
   cards; 
1   .        2         3             .         B
.   4        .         3             5         B           
.   4        .         4             5         C           
1   .        2         4             .         C
;;;;
   run; 
data flat;
   update trans(obs=0) trans;
   by field6;
   run; 
proc print; 
   run;

Merging two rows similar to coalesce