BookmarkSubscribeRSS Feed
almmotamedi
Obsidian | Level 7

Hi SAS users,

 

I would like to combine (!) two files having the same variables but with different values, as below:

 

Dataset1

 

Variable1    Variable2    Variable3 

    S                  A                 .         

 

 

 

Dataset2

 

Variable1    Variable2    Variable3 

    S                    .                  B

    S                    .                  C

 

 

 

and I would like to get this:

 

   

Variable1    Variable2    Variable3 

    S                    A                B

    S                    A                C

 

 

I tried MERGE BY statement for Variable1 but the result is not what I want. Because it gives me:

 

Variable1    Variable2    Variable3 

    S                    .                  B

    S                    .                  C

    S                    A                 .

 

 

Thank you for help.

  

7 REPLIES 7
Reeza
Super User

Post the code you used. 

You can probably try an update statement. 

almmotamedi
Obsidian | Level 7

I used just a simple Merge statement:

 

data dataset3;

merge dataset1 dataset2;

by Variable1;

run;

KevinViel
Pyrite | Level 9

Kurt directed you to the PDV.  I emphasize this when I teach intro classes.  In the very least, the order of your data sets in the MERGE statements is wrong: dataset2.VARIABLE2 will overwrite dataset1.VARIABLE1.  As in the SAS-L era, consider posting a description of your data and your goals.

 

When we need to assume, then we may speculate wildly.

 

HTH,

 

Kevin

Kurt_Bremser
Super User

When doing a MERGE, care must be taken with variables that share the same name. Which value takes precedence when the PDV is filled depends on the order in whcih records are read.

I recommend renaming the variables of one dataset and include code that forces correct assignments to the wanted variables in the output dataset.

LinusH
Tourmaline | Level 20
This is a perfect and a simple job for an SQL join. Especially if your real data is more complex than your sample.
Data never sleeps
Ksharp
Super User

It could be MERGE two datasets.

 

 

data Dataset1;
 input Variable1 $   Variable2 $   Variable3 $;
 cards; 
    S                  A                 .       
;  
data Dataset2;
 input Variable1  $  Variable2   $ Variable3 $;
 cards; 
    S               .                  B
    S                    .                  C
;
run;

data want;
 merge Dataset1 Dataset2(rename=(Variable1-Variable3=v1-v3));
 length new1-new3 temp1-temp3 $ 40;
 retain temp1-temp3;
 new1=coalescec(Variable1,v1) ;
 if missing(new1) then new1=temp1;
  else temp1=new1;
  
 new2=coalescec(Variable2,v2) ;
 if missing(new2) then new2=temp2;
  else temp2=new2;
  
 new3=coalescec(Variable3,v3) ;
 if missing(new3) then new3=temp3;
  else temp3=new3;
 keep new1-new3;
run;
KevinViel
Pyrite | Level 9

It would be better to have more details.  A MERGE, especially with a RENAME or KEEP data set option, is just one of the approaches.  We do not know the sizes of the datasets, the number of replicates, or the rules.  Given that, I might suggest a HASH because the data set(s) might not be sorted.  

 

data dataset1 ;
  variable1 = "S" ;
  variable2 = "A" ;
  variable3 = " " ;
run ;

 

data dataset2 ;
  variable1 = "S" ;
  variable2 = " " ;
  variable3 = "B" ;
  output ;
  variable1 = "S" ;
  variable2 = " " ;
  variable3 = "C" ;
  output ;
run ;

 

data dataset3
        ( drop = __: )
     ;

if 0 then set dataset1 ;

if _n_ = 1
then
  do ;
     declare hash _S ( dataset: "dataset1" ) ;
     __rc = _S.DefineKey ( "variable1" ) ;
     __rc = _S.DefineData( "variable2" ) ;
     __rc = _S.DefineDone() ;
  end ;

  set dataset2 ;

  __rc = _S.find( ) ;

run ;

 

Of course, we do not know if VARIABLE3 needs to be updated if missing.  Depending on the complexity of the rules (and data) another HASH can be defined or the KEY/VALUEs can be added using ADD or REF methods as needed.

 

HTH,

 

Kevin

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1232 views
  • 0 likes
  • 6 in conversation