Hey,
I am new to the SAS programming language (And programming generally)
I am doing my master thesis, and in that regard, i am going to use SAS a alot.
I have three datasets with 3 variables in each - (Ticker Year x/y/z).
The datasets are off different sizes.
I want to make one dataset (Ticker Year x y z) and only keep the rows that contains values in x y and z for the same ticker and year.
I tried the following, but it does not give me the right result, several observations that satisfy my criterias are left out:
data data.want;
merge
data.ibes_april_zeroscore_clean_null (in = b)
data.ibes_april_zeroscore_clean_one (in = m)
data.ibes_april_zeroscore_clean_two (in = t);
if b & m & t;
run;
Best regards
/*All of the data sets need to be presorted by the same variables*/
proc sort data=data.ibes_april_zeroscore_clean_null;
by Ticker Year;
run;
proc sort data=data.ibes_april_zeroscore_clean_one;
by Ticker Year;
run;
proc sort data=data.ibes_april_zeroscore_clean_two;
by Ticker Year;
run;
data data.want;
merge
data.ibes_april_zeroscore_clean_null (in = b)
data.ibes_april_zeroscore_clean_one (in = m)
data.ibes_april_zeroscore_clean_two (in = t);
BY TICKER YEAR; /*this is what you have missed*/
if b & m & t;
run;
More information is needed before anything can be done.
1. Are all of your data sets presorted by Ticker, Year?
2. Is Ticker, Year unique (no duplicates) in each data set?
3. Explain "...contains value". Is the key missing (no corresponding Ticker, Year), or missing value (x/y/z take missing values, blank if Char, .(Dot) if number)?
4. It would be ideal if you can provide some sample input and expected outcome.
5. As far as your code goes, it missed "by" statement.
Happy Holidays!
Haikuo
Hey Haikuo
Thanks for the quick answer.
I will try and answer your questions:
1) The first dataset is presorted by Ticker
The second and thirt dataset is presorted by date(year)
2) Each Ticker-Year observation is unique in each dataset, so no dublicates.
3) I only want the observations, that have an value in x y and z, after i have merged the dataset.
4) Ticker Year Value Ticker Year Value Ticker Year Value OUTPUT ==> Ticker Year X Y Z
AA 1990 X1 AA 1990 Y1 AA 1990 Z1 AA 1990 X1 Y1 Z1
AA 1991 X2 AA 1991 Y2
AA 1992 X3
Simple example, i have a lot of observations for each ticker in different years, but the Ticker-YEAR combinations are not the same in each dataset.
5) Don't understand, should i add by statement instead of if?
Happy Holidays!
/*All of the data sets need to be presorted by the same variables*/
proc sort data=data.ibes_april_zeroscore_clean_null;
by Ticker Year;
run;
proc sort data=data.ibes_april_zeroscore_clean_one;
by Ticker Year;
run;
proc sort data=data.ibes_april_zeroscore_clean_two;
by Ticker Year;
run;
data data.want;
merge
data.ibes_april_zeroscore_clean_null (in = b)
data.ibes_april_zeroscore_clean_one (in = m)
data.ibes_april_zeroscore_clean_two (in = t);
BY TICKER YEAR; /*this is what you have missed*/
if b & m & t;
run;
Thanks Haikuo!
It worked!
I need to add one final touch to the dataset, then i am good to go!
I now have the dataset that we(you) created.
I have another dataset with two variables, ticker and gvkey, it contains approximately 17,000 distinct tickers with a corresponding distinct gvkey.
If i want to add one variable to my new made dataset with gvkey that match the ticker, how do i do that?
If "Ticker" is not unique in your new data set, then data step Merge may not be your best option. In general, one will opt for Proc SQL:
Proc sql;
create table want as
select a.*, b.gvkey from old a /*the created table*/
left join
new b /*the new comer*/
on a.ticker=b.ticker
;
quit;
Thank you very much Haikuo, you saved my night
Happy holidays (for now).
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.