BookmarkSubscribeRSS Feed
almmotamedi
Obsidian | Level 7

Hi SAS users,

 

I would like to combine (!) two files having the same variables but with different values, as below:

 

Dataset1

 

Variable1    Variable2    Variable3 

    S                  A                 .         

 

 

 

Dataset2

 

Variable1    Variable2    Variable3 

    S                    .                  B

    S                    .                  C

 

 

 

and I would like to get this:

 

   

Variable1    Variable2    Variable3 

    S                    A                B

    S                    A                C

 

 

I tried MERGE BY statement for Variable1 but the result is not what I want. Because it gives me:

 

Variable1    Variable2    Variable3 

    S                    .                  B

    S                    .                  C

    S                    A                 .

 

 

Thank you for help.

  

7 REPLIES 7
Reeza
Super User

Post the code you used. 

You can probably try an update statement. 

almmotamedi
Obsidian | Level 7

I used just a simple Merge statement:

 

data dataset3;

merge dataset1 dataset2;

by Variable1;

run;

KevinViel
Pyrite | Level 9

Kurt directed you to the PDV.  I emphasize this when I teach intro classes.  In the very least, the order of your data sets in the MERGE statements is wrong: dataset2.VARIABLE2 will overwrite dataset1.VARIABLE1.  As in the SAS-L era, consider posting a description of your data and your goals.

 

When we need to assume, then we may speculate wildly.

 

HTH,

 

Kevin

Kurt_Bremser
Super User

When doing a MERGE, care must be taken with variables that share the same name. Which value takes precedence when the PDV is filled depends on the order in whcih records are read.

I recommend renaming the variables of one dataset and include code that forces correct assignments to the wanted variables in the output dataset.

LinusH
Tourmaline | Level 20
This is a perfect and a simple job for an SQL join. Especially if your real data is more complex than your sample.
Data never sleeps
Ksharp
Super User

It could be MERGE two datasets.

 

 

data Dataset1;
 input Variable1 $   Variable2 $   Variable3 $;
 cards; 
    S                  A                 .       
;  
data Dataset2;
 input Variable1  $  Variable2   $ Variable3 $;
 cards; 
    S               .                  B
    S                    .                  C
;
run;

data want;
 merge Dataset1 Dataset2(rename=(Variable1-Variable3=v1-v3));
 length new1-new3 temp1-temp3 $ 40;
 retain temp1-temp3;
 new1=coalescec(Variable1,v1) ;
 if missing(new1) then new1=temp1;
  else temp1=new1;
  
 new2=coalescec(Variable2,v2) ;
 if missing(new2) then new2=temp2;
  else temp2=new2;
  
 new3=coalescec(Variable3,v3) ;
 if missing(new3) then new3=temp3;
  else temp3=new3;
 keep new1-new3;
run;
KevinViel
Pyrite | Level 9

It would be better to have more details.  A MERGE, especially with a RENAME or KEEP data set option, is just one of the approaches.  We do not know the sizes of the datasets, the number of replicates, or the rules.  Given that, I might suggest a HASH because the data set(s) might not be sorted.  

 

data dataset1 ;
  variable1 = "S" ;
  variable2 = "A" ;
  variable3 = " " ;
run ;

 

data dataset2 ;
  variable1 = "S" ;
  variable2 = " " ;
  variable3 = "B" ;
  output ;
  variable1 = "S" ;
  variable2 = " " ;
  variable3 = "C" ;
  output ;
run ;

 

data dataset3
        ( drop = __: )
     ;

if 0 then set dataset1 ;

if _n_ = 1
then
  do ;
     declare hash _S ( dataset: "dataset1" ) ;
     __rc = _S.DefineKey ( "variable1" ) ;
     __rc = _S.DefineData( "variable2" ) ;
     __rc = _S.DefineDone() ;
  end ;

  set dataset2 ;

  __rc = _S.find( ) ;

run ;

 

Of course, we do not know if VARIABLE3 needs to be updated if missing.  Depending on the complexity of the rules (and data) another HASH can be defined or the KEY/VALUEs can be added using ADD or REF methods as needed.

 

HTH,

 

Kevin

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1201 views
  • 0 likes
  • 6 in conversation