BookmarkSubscribeRSS Feed
sastuck
Pyrite | Level 9

Hello!

 

I am trying merge two datasets by ticker and year. When I run the code, however, SAS brings up an issue with another variable, cusip. I understand that the issue is that one dataset has this variable as a character variable while the other has it as a numeric one. That being said, shouldn't this be unrelated if I am trying to merge by ticker and year?

 

Here is the code:

 

libname paper "~/425/425_Final_Paper";

*import compustat dataset;
PROC IMPORT out=paper.CompuStat_Execucomp
	datafile="~/425/425_Final_Paper/CompuStat_Execucomp.csv"
	DBMS=CSV replace;
	getnames= yes;
	guessingrows=2000;
run;

*keep only CEOs;
data CEO;
	set paper.Compustat_ExecuComp;
 	if CEOANN = 'CEO' then output CEO;
run;

*import csrp dataset;
PROC IMPORT out=paper.CSRP_Monthly_Stock
	datafile="~/425/425_Final_Paper/CSRP_MonthlyStock.csv"
	DBMS=CSV replace;
	getnames= yes;
/*guessingrows=max;*/
run;

*make new variable annual_return;
data paper.CSRP_Monthly_Stock;
	set paper.CSRP_Monthly_Stock;
	retain annual_return 1;
	annual_return = annual_return * (1 + RET);
	month = int( mod(date,10000) / 100);
	if month = 12 or last.ticker;
	annual_return = (annual_return - 1) * 100;
	output;
	annual_return = 1;
run;

* 1. Sort CEO data by ticker and year & save sorted file as ceo_sorted ; 
PROC SORT DATA=work.CEO OUT=ceos_sorted; 
  BY ticker year; 
RUN; 

* 2. Sort firm data by ticker and year & save sorted file as firms_sorted ; 
PROC SORT DATA=paper.CSRP_Monthly_Stock OUT=firms_sorted; 
  BY ticker year; 
RUN; 

* 3. Merge CEO data and firm data by ticker and year in a data step; 
DATA ceo_firm; 
  MERGE ceos_sorted firms_sorted; 
  BY ticker year; 
RUN; 

And here is the log:

 

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
70
71 * 3. Merge CEO data and firm data by ticker and year in a data step;
72 DATA ceo_firm;
73 MERGE ceos_sorted firms_sorted;
ERROR: Variable CUSIP has been defined as both character and numeric.
74 BY ticker year;
75 RUN;
 
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.CEO_FIRM may be incomplete. When this step was stopped there were 0 observations and 170 variables.
WARNING: Data set WORK.CEO_FIRM was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 4010.78k
OS Memory 37560.00k
Timestamp 03/16/2018 08:08:06 PM
Step Count 216 Switch Count 0
Page Faults 0
Page Reclaims 631
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 8
 
 
76
77 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
89
 
 
Any help is appreciated!
 
 
1 REPLY 1
ballardw
Super User

@sastuck wrote:

Hello!

 

I am trying merge two datasets by ticker and year. When I run the code, however, SAS brings up an issue with another variable, cusip. I understand that the issue is that one dataset has this variable as a character variable while the other has it as a numeric one. That being said, shouldn't this be unrelated if I am trying to merge by ticker and year?

 


Yes it is critical a data set variable may only have one type. Since you have two different types SAS does not know which data type you actually want.

I would guess the "cusip" is something like "customer ip" or similar. If the value is an identifier and is not going to have arithmetic performed with it then likely it should be character.

 

You can over come issues like this by: 1) Best is controlling you data at import/creation (in this case note that you used guessing rows in on proc import and not the other which is a likely cause in this case) such as actually reading data when practical using a documented description of the data file(s) in question.

2) lots of "fixit" code like; Assumes the firms_sorted is the set with the numeric version of the cuspid and a character version is desired.

 

DATA ceo_firm; 
  MERGE ceos_sorted 
        firms_sorted (rename=(cuspid=cuspnum) )
  ; 
  BY ticker year; 
  /* to keep the version from the firms_sorted set the way MERGE
would work
the 12 in the next line should match the lenght of the character version of cuspid */ if not missing(cuspnum) then cuspid = put(cuspnum,best12. -L); drop cuspnum; RUN;

Or if you don't need the cuspid for any further analysis:

 

DATA ceo_firm; 
  MERGE ceos_sorted  (drop=cuspid )
        firms_sorted (drop=cuspid )
  ; 
  BY ticker year; 
RUN; 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 1 reply
  • 687 views
  • 0 likes
  • 2 in conversation