BookmarkSubscribeRSS Feed
mjalvarez
Calcite | Level 5

Hi Experts!

I need some help merging two tables.

 

Table A is a large dataset with 400+ columns and over 31,000 rows. Table B is smaller and shares some of the same columns (9) and rows (1600+) as Table A.

 

Both tables share a unique ID (unique_id) - I need to impute values from table B in to the corresponding missing field in table A by unique_id.

 

Here’s an example of what I want to do:

Table A

Unique_ID Var1 Var2 Var3 Var4 Var5 Var6……. Var400

MA_345   3   1  19  37.9   60   77

JM_909     .    .    .   40.2   55   67

TV_647   1   .   .   37.7   62   83

ED_331   7   5   . 38.0   65   88

 

Table 2

Unique_ID Var1 Var2 Var3 Var4 Var5 Var6

JM_909   3    1   10   40.2   55   67

TV_647   1   0   15   37.7   62   83

 

Result

Unique_ID Var1 Var2 Var3 Var4 Var5 Var6……. Var400

MA_345   3   1  19  37.9   60   77

JM_909   3    1   10   40.2   55   67

TV_647   1    0   15   37.7   62   83

ED_331   7    5   .    38.0   65   88

 

Any help or guidance on this?

 

Thank you!

2 REPLIES 2
svh
Lapis Lazuli | Level 10 svh
Lapis Lazuli | Level 10
There is a simple way to do this as long as the non-missing values for a particular UniqueID in Table 2 ALWAYS match the non-missing values in Table 1 (and if each table only has one row per Unique_ID). This appears to be the case from the data you have shared. E.g., subject JM_909 has the same non-missing values for Var4, Var5, and Var6 in both tables.
You might remember that a MERGE in a data set the values in the right-hand table overwrite the values in the left-hand table for the matched observations that are defined with the BY statement. You would first need to sort each data set by Unique_ID and then

Data Want;
merge Table1 Table2; /*For each value of UNIQUE_ID where the same variable exists in both tables, the value from Table2 will overwrite the value in Table1.*/
by Unique_ID;
run;
Tom
Super User Tom
Super User

You might want to do an UPDATE.  But you seem to want the reverse of the normal case it was designed to support.  The UPDATE statement is designed to apply transactions to an existing dataset.  Any missing value in the transaction dataset is ignored so the existing value is unchanged.  So the non-missing values in the transaction dataset "win".  You appear to want the reverse, where the non missing values in the original dataset "win".  So just treat the transactions as the original dataset and original dataset as the transactions.

 

data want;
  update table2 tableA;
  by unique_id;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 376 views
  • 0 likes
  • 3 in conversation