Solved: Re: Variable in list does not match type prescribed for this list

sastuck · Posted 04-07-2018 08:03 AM

Hello,

I have a variable I would like to include in my regression model "bonus" that is character and needs to be numeric. While I am looking for help with the data step that could fix this, I am also wondering how this will fit into the context of the rest of my program, since I have been renaming the data sets as I've been going. So one question I have, for example, is if we change the name of the new data set (with bonus as a numeric variable), then do we need to change the data set going into the merge statement (which I use to run the regression) so that this change is put into use?

Here's what bonus looks like:

36656478161452797741211945451505255534647956761056894889091751038798281

ADDRESS	Char	41	$41.	$41.
AGE	Num	8	BEST12.	BEST32.
ALLOTHPD	Char	1	$1.	$1.
ALLOTHTOT	Char	1	$1.	$1.
BECAMECEO	Num	8	BEST12.	BEST32.
BONUS	Char	12	$12.	$12.
CEOANN	Char	3	$3.	$3.
CFOANN	Char	7	$7.	$7.
CHG_CTRL_PYMT	Num	8	BEST12.	BEST32.
CITY	Char	15	$15.	$15.
COMMENT	Char	1	$1.	$1.
CONAME	Char	34	$34.	$34.
CO_PER_ROL	Char	5	$5.	$5.
CUSIP	Char	8	$8.	$8.
DEFER_BALANCE_TOT	Num	8	BEST12.	BEST32.
DEFER_CONTRIB_CO_TOT	Num	8	BEST12.	BEST32.
DEFER_CONTRIB_EXEC_TOT	Num	8	BEST12.	BEST32.
DEFER_EARNINGS_TOT	Num	8	BEST12.	BEST32.
DEFER_RPT_AS_COMP_TOT	Num	8	BEST12.	BEST32.
DEFER_WITHDR_TOT	Num	8	BEST12.	BEST32.
EIP_UNEARN_NUM	Num	8	BEST12.	BEST32.
EIP_UNEARN_VAL	Num	8	BEST12.	BEST32.
EXCHANGE	Char	3	$3.	$3.
EXECDIR	Char	2	$2.	$2.
EXECID	Num	8	BEST12.	BEST32.
EXECRANK	Char	8	$8.	$8.
EXECRANKANN	Num	8	BEST12.	BEST32.
EXEC_FNAME	Char	12	$12.	$12.
EXEC_FULLNAME	Char	614	$614.	$614.
EXEC_LNAME	Char	22	$22.	$22.
EXEC_MNAME	Char	9	$9.	$9.
GENDER	Char	6	$6.	$6.
GVKEY	Num	8	BEST12.	BEST32.
INDDESC	Char	44	$44.	$44.
INTERLOCK	Char	3	$3.	$3.
JOINED_CO	Num	8	BEST12.	BEST32.
LEFTCO	Num	8	BEST12.	BEST32.
LEFTOFC	Num

and here is what my program looks like:

libname paper "~/425/425_Final_Paper";

*import compustat dataset;
PROC IMPORT out=paper.CompuStat_Execucomp
datafile="~/425/425_Final_Paper/CompuStat_Execucomp.csv"
DBMS=CSV replace;
getnames= yes;
guessingrows=2000;
run;

*keep only CEOs;
data paper.Compustat_ExecuComp2;
set paper.Compustat_ExecuComp;
if CEOANN = 'CEO';
run;

*** extra DATA step for checking previous data step results by comparing number of records selected;
data check_CEOANN;
set paper.Compustat_ExecuComp;
if CEOANN =: 'CEO';
run;

proc sort data=paper.Compustat_ExecuComp2;
by ticker year;
run;

*import csrp dataset;
PROC IMPORT out=paper.CSRP_Monthly_Stock_char
datafile="~/425/425_Final_Paper/CSRP_MonthlyStock_char.csv"
DBMS=CSV replace;
getnames= yes;
guessingrows=max;
run;

*remove bad data;
 data paper.CSRP_Monthly_Stock_char2;
 set paper.CSRP_Monthly_Stock_char (rename=(ret=character_ret));
 drop character_ret;
 if cusip = ' ' then delete;
 ret = input(character_ret, ??8.);
 if ret = . then delete;
 date = input(put(date, z8.), yymmdd8.);
 format date yymmdd10.;
 year = year(date);
 month = month(date);
 if cusip =: '?' then cusip = substr(cusip, 2);
 run;
 
proc contents 
data=paper.CSRP_Monthly_Stock_char;
run;

proc contents 
data=paper.CSRP_Monthly_Stock_char2;
run;

proc contents 
data=paper.multiple_CEOs;
run;

proc sort data=paper.CSRP_Monthly_Stock_char2;
by ticker year;
run;

proc sort data=paper.Compustat_ExecuComp2;
by ticker year;
run;

*Remove all bad years from both data sources;
data paper.Compustat_ExecuComp3;
length ticker $5;
merge paper.Compustat_ExecuComp2
paper.multiple_CEOs (keep=ticker year in=had_multiple_CEOs);
by ticker year;
if had_multiple_CEOs then delete;
run;

data paper.CSRP_Monthly_Stock_char3;
merge paper.CSRP_Monthly_Stock_char2
paper.multiple_CEOs (keep=ticker year in=had_multiple_CEOs);
by ticker year;
if had_multiple_CEOs then delete;
run;

*find additional bad data: multiple return records for the same month/year;
proc freq data=paper.CSRP_Monthly_Stock_char3;
tables ticker * year * month / noprint out=paper.multiple_returns (where=(count > 1));
run;

*Remove all matching year data for multiple returns;
proc sort data=paper.multiple_returns out=multiple_returns (keep=ticker year) NODUPKEY;
by ticker year;
run;

data paper.Compustat_ExecuComp4;
merge paper.Compustat_ExecuComp3
multiple_returns (in=had_multiple_returns);
by ticker year;
if had_multiple_returns then delete;
run;

data paper.CSRP_Monthly_Stock_char4;
length ticker $5;
merge paper.CSRP_Monthly_Stock_char3
multiple_returns (in=had_multiple_returns);
by ticker year;
if had_multiple_returns then delete;
run;

proc contents 
data=paper.CSRP_Monthly_Stock_char3;
run;

proc contents 
data=paper.CSRP_Monthly_Stock_char4;
run;

*create new variable annualized growth;
data paper.CSRP_annual_returns;
set paper.CSRP_Monthly_Stock_char4;
by ticker year;
retain annual_return 1;
annual_return = annual_return * (1 + RET);
if month = 12 or last.ticker;
annual_return = (annual_return - 1) * 100;
output;
annual_return = 1;
keep ticker year annual_return;
run;

*use proc contents to see if there is a type mismatch;
proc contents 
data=paper.CSRP_annual_returns;
run;

proc contents 
data=paper.Compustat_ExecuComp4;
run;

*MERGE statement has more than one data set with repeats of BY values;
proc sort data=paper.CSRP_annual_returns
               out=paper.CSRP_annual_returns2
               nodupkey;
  by ticker;
run;

*Merge CEO data and firm data; 
DATA paper.ceo_firm ;
   length ticker $5;
  MERGE
    paper.CSRP_annual_returns2 (in=in1)
    paper.compustat_execucomp4 (in=in2)
  ;
  BY ticker;
  if in1 and in2;
RUN;

proc contents 
data=paper.ceo_firm;
run;

*remove missing return data; 
data paper.ceo_firm;
set paper.ceo_firm;
if annual_return = . then delete;
run;

proc means data=paper.CSRP_Monthly_Stock_char3   n nmiss;
var ret;
run;

proc means data=paper.Compustat_ExecuComp4   n nmiss;
var salary;
run;

proc means data=paper.CSRP_annual_returns   n nmiss;
var annual_return;
run;

*create dummy variable that indicates whether the firm's market value increases in a given year;
data paper.ceo_firm;
set paper.ceo_firm;
by ticker year;
last_return = lag(annual_return);
if not first.ticker
then increase = (annual_return >  last_return);
else increase = .;
run;

Thanks so much for the help!

Astounding · Posted 04-07-2018 09:20 AM

Given that BONUS is originally part of executive compensation, here would be an easy place to revise the program:

*Remove all bad years from both data sources;
data paper.Compustat_ExecuComp3;
length ticker $5;
merge paper.Compustat_ExecuComp2
paper.multiple_CEOs (keep=ticker year in=had_multiple_CEOs);
by ticker year;
if had_multiple_CEOs then delete;
run;

One way to change it:

*Remove all bad years from both data sources;
data paper.Compustat_ExecuComp3;
length ticker $5;
merge paper.Compustat_ExecuComp2 (rename=(bonus=bonus_char))
paper.multiple_CEOs (keep=ticker year in=had_multiple_CEOs);
by ticker year;
if had_multiple_CEOs then delete;
bonus = input(bonus_char, ??12.);
drop bonus_char;
run;

You might want to examine the character values for BONUS. This program changes them to missing values. It could be appropriate to change them to 0 or to some other value. It all depends on what those character values contain, and what those values mean. If you want to isolate them and inspect them, it could be done in this way:

proc freq data=paper.CompuStat_ExecuComp2;

tables bonus;

where bonus > ' ' and input(bonus, 12.) = .;

run;

View solution in original post

Astounding · Posted 04-07-2018 09:20 AM

Given that BONUS is originally part of executive compensation, here would be an easy place to revise the program:

*Remove all bad years from both data sources;
data paper.Compustat_ExecuComp3;
length ticker $5;
merge paper.Compustat_ExecuComp2
paper.multiple_CEOs (keep=ticker year in=had_multiple_CEOs);
by ticker year;
if had_multiple_CEOs then delete;
run;

One way to change it:

*Remove all bad years from both data sources;
data paper.Compustat_ExecuComp3;
length ticker $5;
merge paper.Compustat_ExecuComp2 (rename=(bonus=bonus_char))
paper.multiple_CEOs (keep=ticker year in=had_multiple_CEOs);
by ticker year;
if had_multiple_CEOs then delete;
bonus = input(bonus_char, ??12.);
drop bonus_char;
run;

You might want to examine the character values for BONUS. This program changes them to missing values. It could be appropriate to change them to 0 or to some other value. It all depends on what those character values contain, and what those values mean. If you want to isolate them and inspect them, it could be done in this way:

proc freq data=paper.CompuStat_ExecuComp2;

tables bonus;

where bonus > ' ' and input(bonus, 12.) = .;

run;

Variable in list does not match type prescribed for this list

Re: Variable in list does not match type prescribed for this list

Re: Variable in list does not match type prescribed for this list

Registration is open

SAS Training: Just a Click Away