BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
noobnbrnoobest
Fluorite | Level 6

Hi everybody!

Suppose I have three tables named TABLE1, CODES1 and CODES:

 

----------------------------------------------------

TABLE1

name, category_broad, category_exact
BANANA, 1 (foods), 101 (fruits)
APPLE, 1 (foods), 101 (fruits)
CAR, 2 (things), 201 (cars)
HOUSE, 2 (things), 202 (houses)
BALL, 2 (things), 203 (balls)
BREAD, 1 (foods), 102 (breads)

----------------------------

CODES1

category_broad, code
1 (foods), 34
2 (things), 35

----------------------------

CODES2

category_exact, code
101 (fruits), 463
102 (breads), 464
202 (houses), 564
203 (balls), 565

----------------------------------------------------

 

What I would like to achieve is two merges:
-first merge where if the category_broad from codes1 matches with category_broad from table1, then the code is added to the row that matched, and only matched rows are included.
-second merge where if the category_exact from codes2 matches with category_exact from table1, then the code is added to the row that matched, and only matched rows are included. So for example the row with "car" would be left out from the merge with codes2 table.

So the desired resulting tables would be:

TABLE1xCODES1

name, category_broad, category_exact, code
BANANA, 1 (foods), 101 (fruits), 34
APPLE, 1 (foods), 101 (fruits), 34
CAR, 2 (things), 201 (cars), 35
HOUSE, 2 (things), 202 (houses), 35
BALL, 2 (things), 203 (balls), 35
BREAD, 1 (foods), 102 (breads), 34

TABLE1xCODES2

name, category_broad, category_exact, code
BANANA, 1 (foods), 101 (fruits), 463
APPLE, 1 (foods), 101 (fruits), 463
HOUSE, 2 (things), 202 (houses), 564
BALL, 2 (things), 203 (balls), 565
BREAD, 1 (foods), 102 (breads), 464


I know this should not be too difficult but I am having trouble in achieving this.
Regarding the second merge, I have so far tried using this:

-------------------------------------
proc sort data=table1;

by category_exact;

run;

 

proc sort data=codes2;

by category_exact;

run;

 

data merged;

merge table1 codes2;

by category_exact;

run;

-------------------------------------------



The problem with this approach is that it brings to the resulting table also the row that did not match, which is the CAR row. And in that row, the value for column "code" is missing. This is something I don't understand regarding the 'merge by' option in SAS, since the by-variable did not match there. What is the correct way to solve this? Thank you so much in advance! 🙂

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

@noobnbrnoobest wrote:

Hi everybody!

Suppose I have three tables named TABLE1, CODES1 and CODES:

 

----------------------------------------------------

TABLE1

name, category_broad, category_exact
BANANA, 1 (foods), 101 (fruits)
APPLE, 1 (foods), 101 (fruits)
CAR, 2 (things), 201 (cars)
HOUSE, 2 (things), 202 (houses)
BALL, 2 (things), 203 (balls)
BREAD, 1 (foods), 102 (breads)

----------------------------

CODES1

category_broad, code
1 (foods), 34
2 (things), 35

----------------------------

CODES2

category_exact, code
101 (fruits), 463
102 (breads), 464
202 (houses), 564
203 (balls), 565

----------------------------------------------------

 

What I would like to achieve is two merges:
-first merge where if the category_broad from codes1 matches with category_broad from table1, then the code is added to the row that matched, and only matched rows are included.
-second merge where if the category_exact from codes2 matches with category_exact from table1, then the code is added to the row that matched, and only matched rows are included. So for example the row with "car" would be left out from the merge with codes2 table.

So the desired resulting tables would be:

TABLE1xCODES1

name, category_broad, category_exact, code
BANANA, 1 (foods), 101 (fruits), 34
APPLE, 1 (foods), 101 (fruits), 34
CAR, 2 (things), 201 (cars), 35
HOUSE, 2 (things), 202 (houses), 35
BALL, 2 (things), 203 (balls), 35
BREAD, 1 (foods), 102 (breads), 34

TABLE1xCODES2

name, category_broad, category_exact, code
BANANA, 1 (foods), 101 (fruits), 463
APPLE, 1 (foods), 101 (fruits), 463
HOUSE, 2 (things), 202 (houses), 564
BALL, 2 (things), 203 (balls), 565
BREAD, 1 (foods), 102 (breads), 464


I know this should not be too difficult but I am having trouble in achieving this.
Regarding the second merge, I have so far tried using this:

-------------------------------------
proc sort data=table1;

by category_exact;

run;

 

proc sort data=codes2;

by category_exact;

run;

 

data merged;

merge table1 codes2;

by category_exact;

run;

-------------------------------------------



The problem with this approach is that it brings to the resulting table also the row that did not match, which is the CAR row. And in that row, the value for column "code" is missing. This is something I don't understand regarding the 'merge by' option in SAS, since the by-variable did not match there. What is the correct way to solve this? Thank you so much in advance! 🙂


data merged;
    merge table1(in=in1) codes2(in=in2);
    by category_exact;
    if in1 and in2;
run;
--
Paige Miller

View solution in original post

4 REPLIES 4
PaigeMiller
Diamond | Level 26

@noobnbrnoobest wrote:

Hi everybody!

Suppose I have three tables named TABLE1, CODES1 and CODES:

 

----------------------------------------------------

TABLE1

name, category_broad, category_exact
BANANA, 1 (foods), 101 (fruits)
APPLE, 1 (foods), 101 (fruits)
CAR, 2 (things), 201 (cars)
HOUSE, 2 (things), 202 (houses)
BALL, 2 (things), 203 (balls)
BREAD, 1 (foods), 102 (breads)

----------------------------

CODES1

category_broad, code
1 (foods), 34
2 (things), 35

----------------------------

CODES2

category_exact, code
101 (fruits), 463
102 (breads), 464
202 (houses), 564
203 (balls), 565

----------------------------------------------------

 

What I would like to achieve is two merges:
-first merge where if the category_broad from codes1 matches with category_broad from table1, then the code is added to the row that matched, and only matched rows are included.
-second merge where if the category_exact from codes2 matches with category_exact from table1, then the code is added to the row that matched, and only matched rows are included. So for example the row with "car" would be left out from the merge with codes2 table.

So the desired resulting tables would be:

TABLE1xCODES1

name, category_broad, category_exact, code
BANANA, 1 (foods), 101 (fruits), 34
APPLE, 1 (foods), 101 (fruits), 34
CAR, 2 (things), 201 (cars), 35
HOUSE, 2 (things), 202 (houses), 35
BALL, 2 (things), 203 (balls), 35
BREAD, 1 (foods), 102 (breads), 34

TABLE1xCODES2

name, category_broad, category_exact, code
BANANA, 1 (foods), 101 (fruits), 463
APPLE, 1 (foods), 101 (fruits), 463
HOUSE, 2 (things), 202 (houses), 564
BALL, 2 (things), 203 (balls), 565
BREAD, 1 (foods), 102 (breads), 464


I know this should not be too difficult but I am having trouble in achieving this.
Regarding the second merge, I have so far tried using this:

-------------------------------------
proc sort data=table1;

by category_exact;

run;

 

proc sort data=codes2;

by category_exact;

run;

 

data merged;

merge table1 codes2;

by category_exact;

run;

-------------------------------------------



The problem with this approach is that it brings to the resulting table also the row that did not match, which is the CAR row. And in that row, the value for column "code" is missing. This is something I don't understand regarding the 'merge by' option in SAS, since the by-variable did not match there. What is the correct way to solve this? Thank you so much in advance! 🙂


data merged;
    merge table1(in=in1) codes2(in=in2);
    by category_exact;
    if in1 and in2;
run;
--
Paige Miller
noobnbrnoobest
Fluorite | Level 6
Thank you, this looks like the way to solve this once and for all!
PaigeMiller
Diamond | Level 26

There are many data set options that can be used in the SET and MERGE statements which may come in handy at some time in the future. You might want to bookmark this list of data set options

 

https://documentation.sas.com/doc/en/pgmmvacdc/9.4/ledsoptsref/p1pczmnhbq4axpn1l15s9mk6mobp.htm

--
Paige Miller
noobnbrnoobest
Fluorite | Level 6

Well this was quickly resolved with this great documentation:
https://support.sas.com/resources/papers/proceedings18/1746-2018.pdf
(at "EXAMPLE 3: UNMATCHED BY VALUE")

 

Turned out there's a way to specify that the other table will have all the distinct possible values for the by-variable:

 

data table1xcodes2;
merge codes2 (in=isfound) table1;
by category_exact;
if isfound;
run;


EDIT: What finally helped me understand this code was the explanation, that the "(in=variable)" -option creates a temporary variable, that gets values 0 and 1 for all rows. Then when it is called in the if-statement it does the magic.
"Any records with a value of [the by-variable "category_exact"] that did not appear in the [table1] dataset will have a value of 0 for [isfound]"

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1197 views
  • 2 likes
  • 2 in conversation