BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sastuck
Pyrite | Level 9

Hello SAS community,

 

@Shmuel offered me the following code so that I could bring a variable into the data set I am running a regression on from another data set in my program:

 

*bring variable in from other data set; 
proc sql;
     create table paper.ceo_firm as 
      select  a.* , b.VOL
      from paper.ceo_firm as a
      left join paper.CSRP_Monthly_Stock_char4 as b
     on a.ticker = b.ticker and
          a.year = b.year;
quit;

I ran into an error and @Shmuel identified it, saying that:

 

"The difference in number of observations means that the combination of ticker and year are not unique.

SQL join makes a Cartesian join thus multiplies output.

If you know what other variables make it unique - add them to the conditions"

 

But I had trouble implementing this because, for one reason, I'm not sure how it would look in the code, and, secondly, I can't imagine how else to make this variable unique since basically my whole program has been revolving around the ticker*year combination.

 

Let me know if you can help take my data set to the next level! If this works to call VOL into my ceo_firm dataset, then I could use it for other variables as well, which could be very useful to me. 

 

Thanks for your time!

 

-SAStuck

1 ACCEPTED SOLUTION

Accepted Solutions
SASKiwi
PROC Star

So what do you want as an output dataset? If it needs to be unique by ticker / year, then what do you want VOL to be? The sum over each year, the average, the minimum or something else?

 

If you want the sum of vol, something like this should work:

 

proc sql;
     create table ceo_firm as 
      select  a.* , b.VOL
      from paper.ceo_firm as a
      left join
     (select ticker
                ,year
                ,sum(VOL) as VOL
      from paper.CSRP_Monthly_Stock_char4
      group by ticker
              ,year
     ) as b
     on a.ticker = b.ticker and
     a.year = b.year;
quit;

View solution in original post

7 REPLIES 7
SASKiwi
PROC Star

I think it would help to explain why both tables have duplicates by ticker plus year combination. At a guess your monthly stock table has monthly rows for each ticker/year. However why does your CEO firm table have multiple rows per ticker/year?

sastuck
Pyrite | Level 9

Would it help the situation if I drop rows with multiple ticker/years?

SASKiwi
PROC Star

Posting a few rows showing the duplicates would help us understand the data better otherwise we are just guessing what the best approach should be.

sastuck
Pyrite | Level 9

I ran the following code:

 

*find duplicates: ticker/year combo;
proc freq data=paper.CSRP_Monthly_Stock_char4;
	tables ticker * year / noprint 
		out=paper.multiple_tyCSRP (where=(count > 1));
run;

proc freq data=paper.ceo_firm;
	tables ticker * year / noprint 
		out=paper.multiple_tyfirm (where=(count > 1));
run;

And below I am posting examples of the output from the respective tables:

 

RowTickerYearCountPercent 
2A2011120.00198938 
3A2012120.00198938 
4A2013120.00198938 
5A2014120.00198938 
6A2015120.00198938 
7A2016120.00198938 
8A2017120.00198938 
9AA2010120.00198938 
10AA2011120.00198938 
11AA2012120.00198938 
12AA2013120.00198938 
13AA2014120.00198938 
14AA2015120.00198938 
15AA2016110.0018235984 
16AA2017120.00198938 
17AAAP2016120.00198938 
18AAAP2017120.00198938 
19AABA201760.00099469

 

 

That was paper.CSRP_Monthly_Stock_char4. 

 

paper.ceo_firm is empty (no duplicates). 

 

 

SASKiwi
PROC Star

So what do you want as an output dataset? If it needs to be unique by ticker / year, then what do you want VOL to be? The sum over each year, the average, the minimum or something else?

 

If you want the sum of vol, something like this should work:

 

proc sql;
     create table ceo_firm as 
      select  a.* , b.VOL
      from paper.ceo_firm as a
      left join
     (select ticker
                ,year
                ,sum(VOL) as VOL
      from paper.CSRP_Monthly_Stock_char4
      group by ticker
              ,year
     ) as b
     on a.ticker = b.ticker and
     a.year = b.year;
quit;
sastuck
Pyrite | Level 9

Your code worked. I will accept it as the solution. I did run into an error though. I know I am just making a simple mistake:

 

 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 70         
 71         proc reg data=paper.ceo_firm plots(maxpoints=none);
 72         model salary=annual_return bonus stock_awards age male vol;
 ERROR: Variable VOL not found.
 NOTE: The previous statement has been deleted.
 73         run;
 
 WARNING: No variables specified for an SSCP matrix. Execution terminating.
 NOTE: PROCEDURE REG used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              3408.59k
       OS Memory           33216.00k
       Timestamp           04/15/2018 01:24:17 PM
       Step Count                        89  Switch Count  0
       Page Faults                       0
       Page Reclaims                     481
       Page Swaps                        0
       Voluntary Context Switches        3
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           48
       
 
 74         
 75         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 87       

Should I adapt your code to name the table paper.ceo_firm? Or adapt my model to pull from the table your code created, work.ceo_firm? Let me know! Thanks!

SASKiwi
PROC Star

I deliberately created a new dataset rather than overwrite the existing one as it is not good practice to do so and you get a warning in your SAS log. I suggest you go with the new name. If you want a permanent dataset try something like paper.ceo_firm2.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1224 views
  • 0 likes
  • 2 in conversation