BookmarkSubscribeRSS Feed
mjalvarez
Calcite | Level 5

Hi Experts!

I need some help merging two tables.

 

Table A is a large dataset with 400+ columns and over 31,000 rows. Table B is smaller and shares some of the same columns (9) and rows (1600+) as Table A.

 

Both tables share a unique ID (unique_id) - I need to impute values from table B in to the corresponding missing field in table A by unique_id.

 

Here’s an example of what I want to do:

Table A

Unique_ID Var1 Var2 Var3 Var4 Var5 Var6……. Var400

MA_345   3   1  19  37.9   60   77

JM_909     .    .    .   40.2   55   67

TV_647   1   .   .   37.7   62   83

ED_331   7   5   . 38.0   65   88

 

Table 2

Unique_ID Var1 Var2 Var3 Var4 Var5 Var6

JM_909   3    1   10   40.2   55   67

TV_647   1   0   15   37.7   62   83

 

Result

Unique_ID Var1 Var2 Var3 Var4 Var5 Var6……. Var400

MA_345   3   1  19  37.9   60   77

JM_909   3    1   10   40.2   55   67

TV_647   1    0   15   37.7   62   83

ED_331   7    5   .    38.0   65   88

 

Any help or guidance on this?

 

Thank you!

2 REPLIES 2
svh
Lapis Lazuli | Level 10 svh
Lapis Lazuli | Level 10
There is a simple way to do this as long as the non-missing values for a particular UniqueID in Table 2 ALWAYS match the non-missing values in Table 1 (and if each table only has one row per Unique_ID). This appears to be the case from the data you have shared. E.g., subject JM_909 has the same non-missing values for Var4, Var5, and Var6 in both tables.
You might remember that a MERGE in a data set the values in the right-hand table overwrite the values in the left-hand table for the matched observations that are defined with the BY statement. You would first need to sort each data set by Unique_ID and then

Data Want;
merge Table1 Table2; /*For each value of UNIQUE_ID where the same variable exists in both tables, the value from Table2 will overwrite the value in Table1.*/
by Unique_ID;
run;
Tom
Super User Tom
Super User

You might want to do an UPDATE.  But you seem to want the reverse of the normal case it was designed to support.  The UPDATE statement is designed to apply transactions to an existing dataset.  Any missing value in the transaction dataset is ignored so the existing value is unchanged.  So the non-missing values in the transaction dataset "win".  You appear to want the reverse, where the non missing values in the original dataset "win".  So just treat the transactions as the original dataset and original dataset as the transactions.

 

data want;
  update table2 tableA;
  by unique_id;
run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 984 views
  • 0 likes
  • 3 in conversation