BookmarkSubscribeRSS Feed
sastuck
Pyrite | Level 9

Good afternoon!

 

The code for my term paper is a bit of a mess right now. I am hoping someone on this page could help me organize it/identify extraneous code. 

 

One of my main issues is that I want to simply merge two datasets, but I have ended up somehow with:

 

PAPER.COMPUSTAT_EXECUCOMP

WORK.CEO

PAPER.CSRP_MONTHLY_STOCK

WORK.CEO_SORTED

WORK.FIRMS_SORTED

WORK.CEO_FIRM

WORK.CEOS_SORTED_NEW

 

That is when I run all of my code. Here it is:

 

libname paper "~/425/425_Final_Paper";

*import compustat dataset;
PROC IMPORT out=paper.CompuStat_Execucomp
	datafile="~/425/425_Final_Paper/CompuStat_Execucomp.csv"
	DBMS=CSV replace;
	getnames= yes;
	guessingrows=2000;
run;

*keep only CEOs;
data CEO;
	set paper.Compustat_ExecuComp;
 	if CEOANN = 'CEO' then output CEO;
run;

*import csrp dataset;
PROC IMPORT out=paper.CSRP_Monthly_Stock
	datafile="~/425/425_Final_Paper/CSRP_MonthlyStock.csv"
	DBMS=CSV replace;
	getnames= yes;
/*guessingrows=max;*/
run;

*delete rows with missing cusips;
data paper.CSRP_Monthly_Stock;
	set paper.CSRP_Monthly_Stock;
	if nmiss(cusip) > 0 then delete;
run;

*create new variable year;
data paper.CSRP_Monthly_Stock;
	set paper.CSRP_Monthly_Stock;
	year = int( date/10000 );
run;

*delete rows with missing return data;
data paper.CSRP_Monthly_Stock;
	set paper.CSRP_Monthly_Stock;
	if nmiss(ret) > 0 then delete;
run;

*sort by ticker;
proc sort 
	data=paper.CSRP_Monthly_Stock;
	by ticker;
run;

data paper.CSRP_Monthly_Stock;
	set paper.CSRP_Monthly_Stock;
	by ticker date;
	retain annual_return 1;
	annual_return = annual_return * (1 + RET);
	month = int( mod(date,10000) / 100);
	if month = 12 or last.ticker;
	annual_return = (annual_return - 1) * 100;
	output;
	annual_return = 1;
run;

/*practice merging*/
libname paper "~/425/425_Final_Paper";

/*keep only CEOs*/
data CEO;
	set paper.Compustat_ExecuComp;
 	if CEOANN = 'CEO' then output CEO;
run;

* 1. Sort CEO data by cusip and year & save sorted file as ceo_sorted ; 
PROC SORT DATA=paper.Compustat_ExecuComp OUT=ceos_sorted; 
  BY year; 
RUN; 

libname paper "~/425/425_Final_Paper";

* 2. Sort firm data by cusip and year & save sorted file as firms_sorted ; 
PROC SORT DATA=paper.CSRP_Monthly_Stock OUT=firms_sorted; 
  BY year; 
RUN; 

libname paper "~/425/425_Final_Paper";

/*keep only CEOs*/
data CEO;
	set paper.Compustat_ExecuComp;
 	if CEOANN = 'CEO' then output CEO;
run;

* 3. Merge CEO data and firm data by year in a data step; 
DATA ceo_firm ; 
  MERGE ceos_sorted_new firms_sorted; 
  BY year; 
RUN; 

/*use proc contents to see if there is a type mismatch*/
proc contents 
 	data=paper.CSRP_Monthly_Stock;
run;

proc contents 
 	data=paper.Compustat_ExecuComp;
run;

* change cusip to numeric variable; 
data ceos_sorted_new;
set ceos_sorted (rename=(cusip=temp));
cusip = input(temp, 12.);
drop temp;
run;

 

 

Let me know what you think!

 

Thanks so much.

 

-SAStuck

1 REPLY 1
LinusH
Tourmaline | Level 20
Simple suggestion is to do more in each data step, like you have three rounds of CSRP_Monthly_Stock in the beginning that could be consolidated to one step.
Data never sleeps

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 1298 views
  • 1 like
  • 2 in conversation