Hi everyone, I'm a bit new to SAS and I need a bit of help merging two datasets.
My first data set has a ticker symbol and multiple years for each ticker. Example: lib1.tickerdata
Ticker Year
APPL 2000
APPL 2001
APPL 2002
MSFT 2000
MSFT 2001
My second dataset holds an ID number for each ticker. Example: lib1.IDdata
Ticker ID number
APPL 0001
MSFT 0002
I would like to merge the two so that it looks like:
Ticker Year ID Number
APPL 2000 0001
APPL 2001 0001
APPL 2002 0001
MSFT 2000 0002
MSFT 2001 0002
How can I perform this task? Essentially, I'd like to keep lib1.Tickerdata the same and add a new variable, ID number, based on the ticker. Similar to what a VLOOKUP could do on excel. I used this code, but for some reason it gives me one extra observation than the original set (lib1.tickerdata). SAS also gives me the note: MERGE statement has more than one data set with repeats of BY values. Is this note okay given the fact that I'd like the keep the duplicates in lib1.tickerdata?
data lib1.merged
merge lib1.tickerdata (in=in1)
lib1.IDdata (in=in2)
by TICKER;
if in1;
run;
Any help would be greatly appreciated! Thanks in advance.
The SAS note indicates that you have duplicate ticker values in your second dataset. Is this what you expect? If so how do you want to handle the duplicates? If not how do you want to de-duplicate?
The SAS note indicates that you have duplicate ticker values in your second dataset. Is this what you expect? If so how do you want to handle the duplicates? If not how do you want to de-duplicate?
Your IDDATA dataset seems to have multiple entries for some tickers. Check if these unexpected duplicates are valid, and decide how to deal with them.
Your code looks fine. But the message is saying that there is more than one observation in the lookup table (IDDATA) for at least one value of TICKER that also has multiple observations for that value of TICKER.
If all of the repeating records in IDDATA have the same value of ID_NUMBER it will not matter. If they don't then you need to decide how to pick just one.
That makes sense! I'll take a look at the ID data set to find the repeating records.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.