BookmarkSubscribeRSS Feed
Gladis6680
Obsidian | Level 7

Hi,

question on merge. I would like to merge 2 data sets

DATA_1       
 HOME_aDESCRIPTIONVEDNOR_aBRAND_aDISTRIBUTOR_aSALES_aQUANTITY_a
 HOME 1MUFFINSVENDOR 1BRAND 1DIST 1$51910
 HOME 2MUFFINSVENDOR 1BRAND 1DIST 2$53610
TOTAL     $1,05520
DATA_2       
HOME_bDESCRIPTIONVEDNOR_bBRAND_bDISTRIBUTORQUANTITY_bSALES_bREBATES
HOME 2MUFFINSVENDOR 1BRAND 1DIST 210$536$17

 

the code I wrote is

data COMB;

merge data_1 (in=a)

           data_2(in=b);

by Description

if a=1 and b=1;

run;

 

I am getting 

HOME_aDESCRIPTIONVEDNOR_aBRAND_aDISTRIBUTOR_aSALES_aQUANTITY_aHOME_bDESCRIPTION_bVEDNOR_bBRAND_bQUANTITY_bSALES_bREBATES
HOME 1MUFFINSVENDOR 1BRAND 1DIST 1$51910HOME 2MUFFINSVENDOR 1BRAND 110$536$17
HOME 2MUFFINSVENDOR 1BRAND 1DIST 2$53610HOME 2MUFFINSVENDOR 1BRAND 110$536

$17

 What I want is 

HOMEDESCRIPTIONVEDNORBRANDDISTRIBUTORSALESQUANTITYREBATES
HOME 2MUFFINSVENDOR 1BRAND 1DIST 2$53610$17

I am not sure why the rebates and sales are getting duplicated.

Thanks,

 

4 REPLIES 4
SASKiwi
PROC Star

That's because DESCRIPTION contains duplicate values. You need to merge by other variables as well like HOME_2.

Gladis6680
Obsidian | Level 7

Ok, so it looks like i need to have something unique. I am working with transactional data, therefore I find it a little challenging 

 

DATA 1 has the transactional data with sales only and DATA 2 has transactional data with only rebates. Maybe I need to create a new variable and combine 2 variables to make it unique?

Tom
Super User Tom
Super User

You probably want to INTERLEAVE the observations instead of MERGING them.

Use SET instead of merge and you will get all of the observations from both datasets.

You can then apply your rules to combine them in a way that makes sense.

 

For example if you wanted to find the total cost after removing the rebates you might do something like to add the COST and subtract the REBATE into a new variable TOTAL and collapse to just one observation per ID group.

data want;
   set costs rebates;
   by id;
   if first.id then total=0;
   total+cost-rebate;
   if last.id then output;
run;
   

 

Patrick
Opal | Level 21

Looking at your sample data it appears the key for merging/joining needs to be:

HOME DESCRIPTION VEDNO BRAND DISTRIBUTOR

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 729 views
  • 0 likes
  • 4 in conversation