BookmarkSubscribeRSS Feed
sastuck
Pyrite | Level 9

Hello!

 

I am trying merge two datasets by ticker and year. When I run the code, however, SAS brings up an issue with another variable, cusip. I understand that the issue is that one dataset has this variable as a character variable while the other has it as a numeric one. That being said, shouldn't this be unrelated if I am trying to merge by ticker and year?

 

Here is the code:

 

libname paper "~/425/425_Final_Paper";

*import compustat dataset;
PROC IMPORT out=paper.CompuStat_Execucomp
	datafile="~/425/425_Final_Paper/CompuStat_Execucomp.csv"
	DBMS=CSV replace;
	getnames= yes;
	guessingrows=2000;
run;

*keep only CEOs;
data CEO;
	set paper.Compustat_ExecuComp;
 	if CEOANN = 'CEO' then output CEO;
run;

*import csrp dataset;
PROC IMPORT out=paper.CSRP_Monthly_Stock
	datafile="~/425/425_Final_Paper/CSRP_MonthlyStock.csv"
	DBMS=CSV replace;
	getnames= yes;
/*guessingrows=max;*/
run;

*make new variable annual_return;
data paper.CSRP_Monthly_Stock;
	set paper.CSRP_Monthly_Stock;
	retain annual_return 1;
	annual_return = annual_return * (1 + RET);
	month = int( mod(date,10000) / 100);
	if month = 12 or last.ticker;
	annual_return = (annual_return - 1) * 100;
	output;
	annual_return = 1;
run;

* 1. Sort CEO data by ticker and year & save sorted file as ceo_sorted ; 
PROC SORT DATA=work.CEO OUT=ceos_sorted; 
  BY ticker year; 
RUN; 

* 2. Sort firm data by ticker and year & save sorted file as firms_sorted ; 
PROC SORT DATA=paper.CSRP_Monthly_Stock OUT=firms_sorted; 
  BY ticker year; 
RUN; 

* 3. Merge CEO data and firm data by ticker and year in a data step; 
DATA ceo_firm; 
  MERGE ceos_sorted firms_sorted; 
  BY ticker year; 
RUN; 

And here is the log:

 

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
70
71 * 3. Merge CEO data and firm data by ticker and year in a data step;
72 DATA ceo_firm;
73 MERGE ceos_sorted firms_sorted;
ERROR: Variable CUSIP has been defined as both character and numeric.
74 BY ticker year;
75 RUN;
 
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.CEO_FIRM may be incomplete. When this step was stopped there were 0 observations and 170 variables.
WARNING: Data set WORK.CEO_FIRM was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 4010.78k
OS Memory 37560.00k
Timestamp 03/16/2018 08:08:06 PM
Step Count 216 Switch Count 0
Page Faults 0
Page Reclaims 631
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 8
 
 
76
77 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
89
 
 
Any help is appreciated!
 
 
1 REPLY 1
ballardw
Super User

@sastuck wrote:

Hello!

 

I am trying merge two datasets by ticker and year. When I run the code, however, SAS brings up an issue with another variable, cusip. I understand that the issue is that one dataset has this variable as a character variable while the other has it as a numeric one. That being said, shouldn't this be unrelated if I am trying to merge by ticker and year?

 


Yes it is critical a data set variable may only have one type. Since you have two different types SAS does not know which data type you actually want.

I would guess the "cusip" is something like "customer ip" or similar. If the value is an identifier and is not going to have arithmetic performed with it then likely it should be character.

 

You can over come issues like this by: 1) Best is controlling you data at import/creation (in this case note that you used guessing rows in on proc import and not the other which is a likely cause in this case) such as actually reading data when practical using a documented description of the data file(s) in question.

2) lots of "fixit" code like; Assumes the firms_sorted is the set with the numeric version of the cuspid and a character version is desired.

 

DATA ceo_firm; 
  MERGE ceos_sorted 
        firms_sorted (rename=(cuspid=cuspnum) )
  ; 
  BY ticker year; 
  /* to keep the version from the firms_sorted set the way MERGE
would work
the 12 in the next line should match the lenght of the character version of cuspid */ if not missing(cuspnum) then cuspid = put(cuspnum,best12. -L); drop cuspnum; RUN;

Or if you don't need the cuspid for any further analysis:

 

DATA ceo_firm; 
  MERGE ceos_sorted  (drop=cuspid )
        firms_sorted (drop=cuspid )
  ; 
  BY ticker year; 
RUN; 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 1046 views
  • 0 likes
  • 2 in conversation