BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hamudi
Calcite | Level 5

Hey,

I am new to the SAS programming language (And programming generally)

I am doing my master thesis, and in that regard, i am going to use SAS a alot.

I have three datasets with 3 variables in each - (Ticker Year x/y/z).

The datasets are off different sizes.

I want to make one dataset (Ticker Year x y z) and only keep the rows that contains values in x y and z for the same ticker and year.

I tried the following, but it does not give me the right result, several observations that satisfy my criterias are left out:

data data.want;
merge
          data.ibes_april_zeroscore_clean_null (in = b)
          data.ibes_april_zeroscore_clean_one (in = m)
          data.ibes_april_zeroscore_clean_two (in = t);

if b & m & t;

run;

Best regards

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

/*All of the data sets need to be presorted by the same variables*/

proc sort data=data.ibes_april_zeroscore_clean_null;

by Ticker Year;

run;

proc sort data=data.ibes_april_zeroscore_clean_one;

by Ticker Year;

run;

proc sort data=data.ibes_april_zeroscore_clean_two;

by Ticker Year;

run;

data data.want;

merge

          data.ibes_april_zeroscore_clean_null (in = b)

          data.ibes_april_zeroscore_clean_one (in = m)

          data.ibes_april_zeroscore_clean_two (in = t);

BY TICKER YEAR; /*this is what you have missed*/

if b & m & t;

run;

View solution in original post

6 REPLIES 6
Haikuo
Onyx | Level 15

More information is needed before anything can be done.

1. Are all of your data sets presorted by Ticker, Year?

2. Is Ticker, Year unique (no duplicates) in each data set?

3. Explain "...contains value". Is the key missing (no corresponding Ticker, Year), or missing value (x/y/z take missing values, blank if Char, .(Dot) if number)?

4. It would be ideal if you can provide some sample input and expected outcome.

5. As far as your code goes, it missed "by" statement.

Happy Holidays!

Haikuo

hamudi
Calcite | Level 5

Hey Haikuo

Thanks for the quick answer.

I will try and answer your questions:

1) The first dataset is presorted by Ticker 

    The second and thirt dataset is presorted by date(year)

2) Each Ticker-Year observation is unique in each dataset, so no dublicates.

3) I only want the observations, that have an value in x y and z, after i have merged the dataset.  

4) Ticker Year   Value                          Ticker Year Value               Ticker Year Value               OUTPUT ==>      Ticker   Year     X   Y    Z

     AA     1990    X1                               AA    1990 Y1                    AA      1990 Z1                                              AA     1990     X1 Y1  Z1

     AA     1991    X2                               AA    1991 Y2

     AA     1992    X3

Simple example, i have a lot of observations for each ticker in different years, but the Ticker-YEAR combinations are not the same in each dataset.

5) Don't understand, should i add by statement instead of if?

Happy Holidays!

Haikuo
Onyx | Level 15

/*All of the data sets need to be presorted by the same variables*/

proc sort data=data.ibes_april_zeroscore_clean_null;

by Ticker Year;

run;

proc sort data=data.ibes_april_zeroscore_clean_one;

by Ticker Year;

run;

proc sort data=data.ibes_april_zeroscore_clean_two;

by Ticker Year;

run;

data data.want;

merge

          data.ibes_april_zeroscore_clean_null (in = b)

          data.ibes_april_zeroscore_clean_one (in = m)

          data.ibes_april_zeroscore_clean_two (in = t);

BY TICKER YEAR; /*this is what you have missed*/

if b & m & t;

run;

hamudi
Calcite | Level 5

Thanks Haikuo!

It worked!

I need to add one final touch to the dataset, then i am good to go!

I now have the dataset that we(you) created.

I have another dataset with two variables, ticker and gvkey, it contains approximately 17,000 distinct tickers with a corresponding distinct gvkey.

If i want to add one variable to my new made dataset with gvkey that match the ticker, how do i do that?

Haikuo
Onyx | Level 15

If "Ticker" is not unique in your new data set, then data step Merge may not be your best option. In general, one will opt for Proc SQL:

Proc sql;

create table want as

select a.*, b.gvkey from old a /*the created table*/

left join

new b /*the new comer*/

on a.ticker=b.ticker

;

quit;

hamudi
Calcite | Level 5

Thank you very much Haikuo, you saved my night Smiley Happy

Happy holidays (for now).

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 986 views
  • 0 likes
  • 2 in conversation