DATA Step, Macro, Functions and more

Problem with merging datasets

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 10
Accepted Solution

Problem with merging datasets

Hey,

I am new to the SAS programming language (And programming generally)

I am doing my master thesis, and in that regard, i am going to use SAS a alot.

I have three datasets with 3 variables in each - (Ticker Year x/y/z).

The datasets are off different sizes.

I want to make one dataset (Ticker Year x y z) and only keep the rows that contains values in x y and z for the same ticker and year.

I tried the following, but it does not give me the right result, several observations that satisfy my criterias are left out:

data data.want;
merge
          data.ibes_april_zeroscore_clean_null (in = b)
          data.ibes_april_zeroscore_clean_one (in = m)
          data.ibes_april_zeroscore_clean_two (in = t);

if b & m & t;

run;

Best regards


Accepted Solutions
Solution
‎12-25-2014 12:27 PM
Respected Advisor
Posts: 3,156

Re: Problem with merging datasets

/*All of the data sets need to be presorted by the same variables*/

proc sort data=data.ibes_april_zeroscore_clean_null;

by Ticker Year;

run;

proc sort data=data.ibes_april_zeroscore_clean_one;

by Ticker Year;

run;

proc sort data=data.ibes_april_zeroscore_clean_two;

by Ticker Year;

run;

data data.want;

merge

          data.ibes_april_zeroscore_clean_null (in = b)

          data.ibes_april_zeroscore_clean_one (in = m)

          data.ibes_april_zeroscore_clean_two (in = t);

BY TICKER YEAR; /*this is what you have missed*/

if b & m & t;

run;

View solution in original post


All Replies
Respected Advisor
Posts: 3,156

Re: Problem with merging datasets

More information is needed before anything can be done.

1. Are all of your data sets presorted by Ticker, Year?

2. Is Ticker, Year unique (no duplicates) in each data set?

3. Explain "...contains value". Is the key missing (no corresponding Ticker, Year), or missing value (x/y/z take missing values, blank if Char, .(Dot) if number)?

4. It would be ideal if you can provide some sample input and expected outcome.

5. As far as your code goes, it missed "by" statement.

Happy Holidays!

Haikuo

Occasional Contributor
Posts: 10

Re: Problem with merging datasets

Hey Haikuo

Thanks for the quick answer.

I will try and answer your questions:

1) The first dataset is presorted by Ticker 

    The second and thirt dataset is presorted by date(year)

2) Each Ticker-Year observation is unique in each dataset, so no dublicates.

3) I only want the observations, that have an value in x y and z, after i have merged the dataset.  

4) Ticker Year   Value                          Ticker Year Value               Ticker Year Value               OUTPUT ==>      Ticker   Year     X   Y    Z

     AA     1990    X1                               AA    1990 Y1                    AA      1990 Z1                                              AA     1990     X1 Y1  Z1

     AA     1991    X2                               AA    1991 Y2

     AA     1992    X3

Simple example, i have a lot of observations for each ticker in different years, but the Ticker-YEAR combinations are not the same in each dataset.

5) Don't understand, should i add by statement instead of if?

Happy Holidays!

Solution
‎12-25-2014 12:27 PM
Respected Advisor
Posts: 3,156

Re: Problem with merging datasets

/*All of the data sets need to be presorted by the same variables*/

proc sort data=data.ibes_april_zeroscore_clean_null;

by Ticker Year;

run;

proc sort data=data.ibes_april_zeroscore_clean_one;

by Ticker Year;

run;

proc sort data=data.ibes_april_zeroscore_clean_two;

by Ticker Year;

run;

data data.want;

merge

          data.ibes_april_zeroscore_clean_null (in = b)

          data.ibes_april_zeroscore_clean_one (in = m)

          data.ibes_april_zeroscore_clean_two (in = t);

BY TICKER YEAR; /*this is what you have missed*/

if b & m & t;

run;

Occasional Contributor
Posts: 10

Re: Problem with merging datasets

Thanks Haikuo!

It worked!

I need to add one final touch to the dataset, then i am good to go!

I now have the dataset that we(you) created.

I have another dataset with two variables, ticker and gvkey, it contains approximately 17,000 distinct tickers with a corresponding distinct gvkey.

If i want to add one variable to my new made dataset with gvkey that match the ticker, how do i do that?

Respected Advisor
Posts: 3,156

Re: Problem with merging datasets

If "Ticker" is not unique in your new data set, then data step Merge may not be your best option. In general, one will opt for Proc SQL:

Proc sql;

create table want as

select a.*, b.gvkey from old a /*the created table*/

left join

new b /*the new comer*/

on a.ticker=b.ticker

;

quit;

Occasional Contributor
Posts: 10

Re: Problem with merging datasets

Thank you very much Haikuo, you saved my night Smiley Happy

Happy holidays (for now).

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 317 views
  • 0 likes
  • 2 in conversation