BookmarkSubscribeRSS Feed
J_C
Calcite | Level 5 J_C
Calcite | Level 5

Hi,

I am trying to merge two rows of data into one.

e.g. The coalesece function works between fields, if the first argument is blank, then populate field using the second argument.

I am trying to do this but between rows.

For example,

data=

Field1     Field2     Field3     Field4     Field5     Field6

1             N/A        2             3             N/A         B

N/A         4            N/A         3             5             B           

out=

Field1     Field2     Field3     Field4     Field5     Field6

1             4             2            3             5             B      

How could this be done, given the fact that Field 6 is common amongst the two.

Thank you.

5 REPLIES 5
art297
Opal | Level 21

You could use something like:

data want (drop=_:);

  set have (rename=(field1-field6=_field1-_field6));

  array infield(*) $ _field1-_field6;

  array field(6) $;

  retain field:;

  do _i=1 to dim(infield);

    if mod(_n_,2) or input(infield(_i),? 12.) then field(_i)=infield(_i);

  end;

  if mod(_n_,2) eq 0 then output;

run;

Astounding
PROC Star

Under certain conditions, this is easy.

(1) There must be a variable (or a set of variables) that can be used to uniquely identify a row.

(2) There can be only one observation (per data set) for that unique identifier.

Then the program is simply:

data want;

   update file1 file2;

   by id;

run;

Both data sets need to be sorted to permit the BY statement to work.

And the trick is that the master set of values is in FILE2.  The values in FILE1 replace only the missing values in FILE2.

Good luck.

Ksharp
Super User

Just two rows ?

data have;
input (Field1     Field2     Field3     Field4     Field5     Field6 ) ($);
cards;
1             N/A        2             3             N/A         B
N/A         4            N/A         3             5             B  
;
run;
data temp;
 set have;
 array f{*} $ field:;
 do i=1 to dim(f);
  if f{i}='N/A' then call missing(f{i})  ;
 end;
 drop i;
run;
data want;
 set temp end=last;
 array f{*} $ field1-field6;
 array _f{*} $ _field1-_field6;
 retain _: ;
 do i=1 to dim(f);
  _f{i}=coalescec(_f{i},f{i});
 end;
 if last then output;
 keep _: ;
run;

Xia Keshan

J_C
Calcite | Level 5 J_C
Calcite | Level 5

Thanks for all your answers, I just revised the procedures leading up to my dataset and used a mapping table in an earlier step to avoid this problem.

data_null__
Jade | Level 19

If you are saying that FIELD 6 could be used for ID then you could use UPDATE like this.  Updating to the last non-missing value in the transaction file.

data trans;
   input Field1-Field5 field6 $;
   cards;
1   .        2         3             .         B
.   4        .         3             5         B          
.   4        .         4             5         C          
1   .        2         4             .         C
;;;;
   run;
data flat;
   update trans(obs=0) trans;
   by field6;
   run;
proc print;
  
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 2806 views
  • 0 likes
  • 5 in conversation