BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
asgee
Obsidian | Level 7

Hi all,

 

I have this dataset here which is a subset from a much larger master dataset:

Dataset A:

ID

Var1

Var2

Var3

Var4

Var5

1-2345

10

64.5

9.8

2.1

5

6-7890

10

5.4

7.5

10

6.4

9-8755

1.1

4.2

6.4

10.5

5.8

4-3210

10

15.4

3.1

11

13

 

I have new variables I want to add onto Dataset A based on this Dataset B: 

Dataset B:

Var1

Var2

Var3

Var4

Var5

Var6

Var7

10

64.5

9.8

2.1

5

7.7

5.5

10

5.4

7.5

10

6.4

9.4

6.8

1.1

4.2

6.4

10.5

5.8

8.2

10

10

15.4

3.1

11

13

1.2

51

 

The problem as you can see is that I don't have an "ID" variable on Dataset B (it's complicated) and so I don't have an identifier variable I can use to match and merge the two datasets together. 

 

I tried the code below but my understanding is that this prioritizes the variables I'm listing in order. I'm thinking this would be an issue since Var1 contains 3 rows of "10" which could confuse the merging since it won't know which ID to merge it to (?). 

proc sort data = dataA;
by var1 var2 var3 var4 var5;
run;

proc sort data = dataB;
by var1 var2 var3 var4 var5;
run;

data new;
merge dataA (in=A) dataB (in=B);
if A;
run;

How would I merge it so that it is the combination of values across "var1" - "var5" that I am merging by?

 

Since Dataset A originally came from Dataset B, the combination of unique values across "Var1" - "Var5" in both datasets are essentially the same. The only thing new is the "Var6" & "Var7" variables that I'm trying to add from Dataset B to Dataset A. 

 

Any help would be appreciated, thanks.

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

Let's say every record in A has a unique combination of var1/var2/var3/var4/var5.  And the same for B.  If so, it won't matter which is the major sort key.  But you would have to change

 

data new;
merge dataA (in=A) dataB (in=B);
if A;
run;

to

data new;
  merge dataA (in=A) dataB (in=B);
  by var1 var2 var3 var4 var5;
  if A;
run;

Of course this could be done in sql as a left join of a with b   on a.var1=b.var1 and a.var2=b.var2 ... and a.var5=b.var5.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

2 REPLIES 2
mkeintz
PROC Star

Let's say every record in A has a unique combination of var1/var2/var3/var4/var5.  And the same for B.  If so, it won't matter which is the major sort key.  But you would have to change

 

data new;
merge dataA (in=A) dataB (in=B);
if A;
run;

to

data new;
  merge dataA (in=A) dataB (in=B);
  by var1 var2 var3 var4 var5;
  if A;
run;

Of course this could be done in sql as a left join of a with b   on a.var1=b.var1 and a.var2=b.var2 ... and a.var5=b.var5.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
asgee
Obsidian | Level 7

Hi @mkeintz ! Thanks for the reply. I see what you mean. I think I like your idea of doing the SQL commands instead since that's exactly what I was trying to go for (same values across each of the 5 vars).

Thanks, will try it out myself! 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 861 views
  • 1 like
  • 2 in conversation