Good afternoon!
The code for my term paper is a bit of a mess right now. I am hoping someone on this page could help me organize it/identify extraneous code.
One of my main issues is that I want to simply merge two datasets, but I have ended up somehow with:
PAPER.COMPUSTAT_EXECUCOMP
WORK.CEO
PAPER.CSRP_MONTHLY_STOCK
WORK.CEO_SORTED
WORK.FIRMS_SORTED
WORK.CEO_FIRM
WORK.CEOS_SORTED_NEW
That is when I run all of my code. Here it is:
libname paper "~/425/425_Final_Paper"; *import compustat dataset; PROC IMPORT out=paper.CompuStat_Execucomp datafile="~/425/425_Final_Paper/CompuStat_Execucomp.csv" DBMS=CSV replace; getnames= yes; guessingrows=2000; run; *keep only CEOs; data CEO; set paper.Compustat_ExecuComp; if CEOANN = 'CEO' then output CEO; run; *import csrp dataset; PROC IMPORT out=paper.CSRP_Monthly_Stock datafile="~/425/425_Final_Paper/CSRP_MonthlyStock.csv" DBMS=CSV replace; getnames= yes; /*guessingrows=max;*/ run; *delete rows with missing cusips; data paper.CSRP_Monthly_Stock; set paper.CSRP_Monthly_Stock; if nmiss(cusip) > 0 then delete; run; *create new variable year; data paper.CSRP_Monthly_Stock; set paper.CSRP_Monthly_Stock; year = int( date/10000 ); run; *delete rows with missing return data; data paper.CSRP_Monthly_Stock; set paper.CSRP_Monthly_Stock; if nmiss(ret) > 0 then delete; run; *sort by ticker; proc sort data=paper.CSRP_Monthly_Stock; by ticker; run; data paper.CSRP_Monthly_Stock; set paper.CSRP_Monthly_Stock; by ticker date; retain annual_return 1; annual_return = annual_return * (1 + RET); month = int( mod(date,10000) / 100); if month = 12 or last.ticker; annual_return = (annual_return - 1) * 100; output; annual_return = 1; run; /*practice merging*/ libname paper "~/425/425_Final_Paper"; /*keep only CEOs*/ data CEO; set paper.Compustat_ExecuComp; if CEOANN = 'CEO' then output CEO; run; * 1. Sort CEO data by cusip and year & save sorted file as ceo_sorted ; PROC SORT DATA=paper.Compustat_ExecuComp OUT=ceos_sorted; BY year; RUN; libname paper "~/425/425_Final_Paper"; * 2. Sort firm data by cusip and year & save sorted file as firms_sorted ; PROC SORT DATA=paper.CSRP_Monthly_Stock OUT=firms_sorted; BY year; RUN; libname paper "~/425/425_Final_Paper"; /*keep only CEOs*/ data CEO; set paper.Compustat_ExecuComp; if CEOANN = 'CEO' then output CEO; run; * 3. Merge CEO data and firm data by year in a data step; DATA ceo_firm ; MERGE ceos_sorted_new firms_sorted; BY year; RUN; /*use proc contents to see if there is a type mismatch*/ proc contents data=paper.CSRP_Monthly_Stock; run; proc contents data=paper.Compustat_ExecuComp; run; * change cusip to numeric variable; data ceos_sorted_new; set ceos_sorted (rename=(cusip=temp)); cusip = input(temp, 12.); drop temp; run;
Let me know what you think!
Thanks so much.
-SAStuck
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Get started using SAS Studio to write, run and debug your SAS programs.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.