DATA Step, Macro, Functions and more

Repeating code for different file

Accepted Solution Solved
Reply
Contributor
Posts: 48
Accepted Solution

Repeating code for different file

[ Edited ]

Hi I am new to sas and currently wrote some very simple code to do probit/lpogit model with the data I downloaded. I did the 11 years (2006 to 2016) one by one to get the result, which is very time consuming. So I am wondering is there any way to repeat the steps from import file til the end by code? I always feel my code is pretty redundant so any suggestion for simplifying would be appreciated. Thank you!

Here is my code:

PROC IMPORT OUT= work.Ebsd
DATAFILE= "C:\Users\Xusheng\Desktop\OneDrive\MSCM5P04\A2\Original_Div\Div_2006.csv"
DBMS=CSV REPLACE;*signify currency, year and month. if cite a word as the name, should be end up with '.', check &currency.,&year.,&i.;
GETNAMES=YES;
DATAROW=2;
RUN;
data work.EBSd;
set work.Ebsd;
rename at=TotalAsset;
rename bkvlps=BookValue;
rename capx=CapitalExpenditure;
rename dpc=Depreciation;
rename dt=TotalDebt;
rename dv=Dividend;
rename ebit=EBIT;
rename gp=GrossProfit;
rename ni=NetIncome;
rename seq=ShareholderEquity;
rename txt=IncomeTax;
rename wcapch=WorkingCapitalChange;
rename xint=InterestExpense;
rename xopr=OperatingExpense;
drop costat tie dp curcd tic datafmt popsrc consol indfmt fyear datadate gvkey;
run;
data work.EBSd;
set work.Ebsd;
if TotalAsset=. then delete;
if BookValue=. then BookValue=0;
if CapitalExpenditure=. then CapitalExpenditure=0;
if Depreciation=. then Depreciation=0;
if TotalDebt=. then TotalDebt=0;
if EBIT=. then EBIT=0;
if GrossProfit=. then GrossProfit=0;
if NetIncome=. then NetIncome=0;
if OperatingExpense=. then OperatingExpense=0;
if InterestExpense=. then InterestExpense=0;
if WorkingCapitalChange=. then WorkingCapitalChange=0;
if Dividend=0 then Dividend=0;
if Dividend>0 then Dividend=1;
if Dividend=. then Dividend=0;
run;
data work.EBSd;
set work.Ebsd;
CFO= EBIT + Depreciation - IncomeTax - WorkingCapitalChange;
FCF= CFO - CapitalExpenditure;
D_E= TotalDebt/ShareholderEquity;
ROE= NetIncome/ShareholderEquity;
drop TotalAsset CapitalExpenditure Depreciation TotalDebt EBIT GrossProfit NetIncome ShareholderEquity IncomeTax WorkingCapitalChange InterestExpense OperatingExpense CFO;
run;
Proc Export data=Work.Ebsd
Outfile= "C:\Users\Xusheng\Desktop\OneDrive\MSCM5P04\A2\Ready-For-Probit-Logit\Div_2006.csv"
DBMS=csv replace;
Putname=yes;
run;
*Logit;
proc logistic data=work.Ebsd descending;
model Dividend = BookValue FCF D_E ROE/ ctable pprob=0.5;
output out = lout predicted=plogit;
run;
proc qlim data=work.Ebsd;
model Dividend = BookValue FCF D_E ROE/ discrete (dist=logit);
output out=logmargin marginal;
run;
proc means data=logmargin mean std;
var Meff_l_BookValue Meff_l_FCF Meff_l_D_E Meff_l_ROE;
run;
proc means data=lout;
var Dividend plogit;
run;
*Probit;
proc logistic data=work.Ebsd descending;
model Dividend = BookValue FCF D_E ROE/ link=probit ctable pprob=0.5;
output out=Pout predicted=pprobit;
run;
proc qlim data=work.Ebsd;
model Dividend = BookValue FCF D_E ROE/ discrete (dist=normal);
output out=probmargin marginal;
run;
proc means data=probmargin mean std;
var Meff_l_BookValue Meff_l_FCF Meff_l_D_E Meff_l_ROE;
run;
proc means data=Pout;
var Dividend pprobit;
run;
*Create PDF;
filename sas "C:\Users\Xusheng\Desktop\OneDrive\MSCM5P04\A2\Div_Probit_Logit.sas";
ods listing close;
ODS pdf file="C:\Users\Xusheng\Desktop\OneDrive\MSCM5P04\A2\Div_Probit_Logit.pdf";
data _null_;
infile sas ;
input;
temp=_infile_;
file print ods;
put _ods_;
run;
ODS pdf CLOSE;
ods listing;


Accepted Solutions
Solution
‎02-28-2017 01:58 PM
Frequent Contributor
Posts: 75

Re: Repeating code for different file

One word: macros

 

I don't know if your pdf code is part of the processing for each of the 11 files, but if so, you would be overwriting it each time. If it needs to run for each of the 11 files, then you can include it at the end of the macro. I've left it out. But for running the rest of code repeatedly for the 11 files, all you need to do is wrap it in a macro, and since your years are sequential, a simple macro %do loop is used to loop through each of the 11 files.

%macro processyr;
%do yr=2006 %to 2016;
	PROC IMPORT OUT= work.Ebsd
	DATAFILE= "C:\Users\Xusheng\Desktop\OneDrive\MSCM5P04\A2\Original_Div\Div_&yr..csv"
	DBMS=CSV REPLACE;*signify currency, year and month. if cite a word as the name, should be end up with '.', check &currency.,&year.,&i.;
	GETNAMES=YES;
	DATAROW=2;
	RUN;
	data work.EBSd;
	set work.Ebsd;
	rename at=TotalAsset;
	rename bkvlps=BookValue;
	rename capx=CapitalExpenditure;
	rename dpc=Depreciation;
	rename dt=TotalDebt;
	rename dv=Dividend;
	rename ebit=EBIT;
	rename gp=GrossProfit;
	rename ni=NetIncome;
	rename seq=ShareholderEquity;
	rename txt=IncomeTax;
	rename wcapch=WorkingCapitalChange;
	rename xint=InterestExpense;
	rename xopr=OperatingExpense;
	drop costat tie dp curcd tic datafmt popsrc consol indfmt fyear datadate gvkey;
	run;
	data work.EBSd;
	set work.Ebsd;
	if TotalAsset=. then delete;
	if BookValue=. then BookValue=0;
	if CapitalExpenditure=. then CapitalExpenditure=0;
	if Depreciation=. then Depreciation=0;
	if TotalDebt=. then TotalDebt=0;
	if EBIT=. then EBIT=0;
	if GrossProfit=. then GrossProfit=0;
	if NetIncome=. then NetIncome=0;
	if OperatingExpense=. then OperatingExpense=0;
	if InterestExpense=. then InterestExpense=0;
	if WorkingCapitalChange=. then WorkingCapitalChange=0;
	if Dividend=0 then Dividend=0;
	if Dividend>0 then Dividend=1;
	if Dividend=. then Dividend=0;
	run;

	data work.EBSd;
	set work.Ebsd;
	CFO= EBIT + Depreciation - IncomeTax - WorkingCapitalChange;
	FCF= CFO - CapitalExpenditure;
	D_E= TotalDebt/ShareholderEquity;
	ROE= NetIncome/ShareholderEquity;
	drop TotalAsset CapitalExpenditure Depreciation TotalDebt EBIT GrossProfit NetIncome ShareholderEquity IncomeTax WorkingCapitalChange InterestExpense OperatingExpense CFO;
	run;
	Proc Export data=Work.Ebsd
	Outfile= "C:\Users\Xusheng\Desktop\OneDrive\MSCM5P04\A2\Ready-For-Probit-Logit\Div_&yr..csv"
	DBMS=csv replace;
	Putname=yes;
	run;
	*Logit;
	proc logistic data=work.Ebsd descending;
	model Dividend = BookValue FCF D_E ROE/ ctable pprob=0.5;
	output out = lout predicted=plogit;
	run;
	proc qlim data=work.Ebsd;
	model Dividend = BookValue FCF D_E ROE/ discrete (dist=logit);
	output out=logmargin marginal;
	run;
	proc means data=logmargin mean std;
	var Meff_l_BookValue Meff_l_FCF Meff_l_D_E Meff_l_ROE;
	run;
	proc means data=lout;
	var Dividend plogit;
	run;
	*Probit;
	proc logistic data=work.Ebsd descending;
	model Dividend = BookValue FCF D_E ROE/ link=probit ctable pprob=0.5;
	output out=Pout predicted=pprobit;
	run;
	proc qlim data=work.Ebsd;
	model Dividend = BookValue FCF D_E ROE/ discrete (dist=normal);
	output out=probmargin marginal;
	run;
	proc means data=probmargin mean std;
	var Meff_l_BookValue Meff_l_FCF Meff_l_D_E Meff_l_ROE;
	run;
	proc means data=Pout;
	var Dividend pprobit;
	run;
%end;
%mend;

%processyr;

View solution in original post


All Replies
Solution
‎02-28-2017 01:58 PM
Frequent Contributor
Posts: 75

Re: Repeating code for different file

One word: macros

 

I don't know if your pdf code is part of the processing for each of the 11 files, but if so, you would be overwriting it each time. If it needs to run for each of the 11 files, then you can include it at the end of the macro. I've left it out. But for running the rest of code repeatedly for the 11 files, all you need to do is wrap it in a macro, and since your years are sequential, a simple macro %do loop is used to loop through each of the 11 files.

%macro processyr;
%do yr=2006 %to 2016;
	PROC IMPORT OUT= work.Ebsd
	DATAFILE= "C:\Users\Xusheng\Desktop\OneDrive\MSCM5P04\A2\Original_Div\Div_&yr..csv"
	DBMS=CSV REPLACE;*signify currency, year and month. if cite a word as the name, should be end up with '.', check &currency.,&year.,&i.;
	GETNAMES=YES;
	DATAROW=2;
	RUN;
	data work.EBSd;
	set work.Ebsd;
	rename at=TotalAsset;
	rename bkvlps=BookValue;
	rename capx=CapitalExpenditure;
	rename dpc=Depreciation;
	rename dt=TotalDebt;
	rename dv=Dividend;
	rename ebit=EBIT;
	rename gp=GrossProfit;
	rename ni=NetIncome;
	rename seq=ShareholderEquity;
	rename txt=IncomeTax;
	rename wcapch=WorkingCapitalChange;
	rename xint=InterestExpense;
	rename xopr=OperatingExpense;
	drop costat tie dp curcd tic datafmt popsrc consol indfmt fyear datadate gvkey;
	run;
	data work.EBSd;
	set work.Ebsd;
	if TotalAsset=. then delete;
	if BookValue=. then BookValue=0;
	if CapitalExpenditure=. then CapitalExpenditure=0;
	if Depreciation=. then Depreciation=0;
	if TotalDebt=. then TotalDebt=0;
	if EBIT=. then EBIT=0;
	if GrossProfit=. then GrossProfit=0;
	if NetIncome=. then NetIncome=0;
	if OperatingExpense=. then OperatingExpense=0;
	if InterestExpense=. then InterestExpense=0;
	if WorkingCapitalChange=. then WorkingCapitalChange=0;
	if Dividend=0 then Dividend=0;
	if Dividend>0 then Dividend=1;
	if Dividend=. then Dividend=0;
	run;

	data work.EBSd;
	set work.Ebsd;
	CFO= EBIT + Depreciation - IncomeTax - WorkingCapitalChange;
	FCF= CFO - CapitalExpenditure;
	D_E= TotalDebt/ShareholderEquity;
	ROE= NetIncome/ShareholderEquity;
	drop TotalAsset CapitalExpenditure Depreciation TotalDebt EBIT GrossProfit NetIncome ShareholderEquity IncomeTax WorkingCapitalChange InterestExpense OperatingExpense CFO;
	run;
	Proc Export data=Work.Ebsd
	Outfile= "C:\Users\Xusheng\Desktop\OneDrive\MSCM5P04\A2\Ready-For-Probit-Logit\Div_&yr..csv"
	DBMS=csv replace;
	Putname=yes;
	run;
	*Logit;
	proc logistic data=work.Ebsd descending;
	model Dividend = BookValue FCF D_E ROE/ ctable pprob=0.5;
	output out = lout predicted=plogit;
	run;
	proc qlim data=work.Ebsd;
	model Dividend = BookValue FCF D_E ROE/ discrete (dist=logit);
	output out=logmargin marginal;
	run;
	proc means data=logmargin mean std;
	var Meff_l_BookValue Meff_l_FCF Meff_l_D_E Meff_l_ROE;
	run;
	proc means data=lout;
	var Dividend plogit;
	run;
	*Probit;
	proc logistic data=work.Ebsd descending;
	model Dividend = BookValue FCF D_E ROE/ link=probit ctable pprob=0.5;
	output out=Pout predicted=pprobit;
	run;
	proc qlim data=work.Ebsd;
	model Dividend = BookValue FCF D_E ROE/ discrete (dist=normal);
	output out=probmargin marginal;
	run;
	proc means data=probmargin mean std;
	var Meff_l_BookValue Meff_l_FCF Meff_l_D_E Meff_l_ROE;
	run;
	proc means data=Pout;
	var Dividend pprobit;
	run;
%end;
%mend;

%processyr;
Contributor
Posts: 48

Re: Repeating code for different file

Thank  you nehalsanghvi! It works.

Contributor
Posts: 48

Re: Repeating code for different file

Just for curious, if my same data is discrete years say missing 2007 and 2008. How could I create macro based on this if I still want 2006 - 2016? Thank you.

PROC Star
Posts: 7,363

Re: Repeating code for different file

Just add a line right after your do loop:

 

%if &i. ne 2007 and &i. ne 2008 %then %do;

 

and, of course, add another %end before the %mend statement.

 

Art, CEO, AnalystFinder.com

 

Frequent Contributor
Posts: 75

Re: Repeating code for different file

[ Edited ]

There can be many approaches based on what is known and what is unknown, but if you know for sure your start year is 2006 and end year is 2016, then within the same do loop, you can add a check to see if the file exists before running the rest of the code. If it exists, then proceed with the rest of the code. If it doesn't exist, print a message to the log.

 

Here's how:

%macro processyr;
%do yr=2006 %to 2016;
	%let myfileref = "C:\Users\Xusheng\Desktop\OneDrive\MSCM5P04\A2\Original_Div\Div_&yr..csv";

	%if %sysfunc(fileexist(&myfileref.)) %then %do;
		PROC IMPORT OUT= work.Ebsd
		DATAFILE= &myfileref.
		DBMS=CSV REPLACE;*signify currency, year and month. if cite a word as the name, should be end up with '.', check &currency.,&year.,&i.;
		GETNAMES=YES;
		DATAROW=2;
		RUN;

		/* THE REST OF YOUR CODE GOES HERE */

	%end;
	%else %put The external file &myfileref does not exist;
%end;
%mend;

%processyr;

 

 

This code will skip over any missing files for years between 2006 and 2016.

Contributor
Posts: 48

Re: Repeating code for different file

Hi nehalsanghvi, thanks a lot! That helps.

##- Please type your reply above this line. Simple formatting, no
attachments. -##
PROC Star
Posts: 7,363

Re: Repeating code for different file

I don't have time to fully review your code, but did notice the following:

 

(1) you do multiple datasteps on the same file. Why not combine them into one datasetp.

(2) when you have what are really if then else types, add the else conditions. e.g.:

/*have*/
if Dividend=0 then Dividend=0;
if Dividend>0 then Dividend=1;
if Dividend=. then Dividend=0;

/*should have:*/
if Dividend>0 then Dividend=1;
else Dividend=0;

(3) You could concatenate all of the files into one file, using the indsname option to identify the year if it doesn't already exist in the data. Then you could do all/most of your work by including a BY YEAR; statement in your code for each analysis.

 

HTH,

Art, CEO, AnalystFinder.com

 

Contributor
Posts: 48

Re: Repeating code for different file

Thank you art297, I'm not quite sure about what do you mean by "combine
them into one datastep", does that mean by "macro"?
I think I will understand your advise with my using of SAS.

##- Please type your reply above this line. Simple formatting, no
attachments. -##
PROC Star
Posts: 7,363

Re: Repeating code for different file

Right after your proc import you have 3 data steps .. all on the same dataset.

 

SAS has to re-read the data 3 times rather than once. I'd write it like:

 

data work.EBSd (drop TotalAsset CapitalExpenditure Depreciation    
  TotalDebt EBIT GrossProfit NetIncome ShareholderEquity IncomeTax  
  WorkingCapitalChange InterestExpense OperatingExpense CFO);
  set work.Ebsd (drop=costat tie dp curcd tic datafmt popsrc
                                       consol indfmt fyear datadate gvkey
                rename=( at=TotalAsset
                                  bkvlps=BookValue
                                 capx=CapitalExpenditure
                                 dpc=Depreciation
                                 dt=TotalDebt
                                 dv=Dividend
                                 ebit=EBIT
                                 gp=GrossProfit
                                 ni=NetIncome
                                 seq=ShareholderEquity
                                 txt=IncomeTax
                                 wcapch=WorkingCapitalChange
                                 xint=InterestExpense
                                 xopr=OperatingExpense));
  if TotalAsset=. then delete;
  if BookValue=. then BookValue=0;
  if CapitalExpenditure=. then CapitalExpenditure=0;
  if Depreciation=. then Depreciation=0;
  if TotalDebt=. then TotalDebt=0;
  if EBIT=. then EBIT=0;
  if GrossProfit=. then GrossProfit=0;
  if NetIncome=. then NetIncome=0;
  if OperatingExpense=. then OperatingExpense=0;
  if InterestExpense=. then InterestExpense=0;
  if WorkingCapitalChange=. then WorkingCapitalChange=0;
  if Dividend>0 then Dividend=1;
  else Dividend=0;
  CFO= EBIT + Depreciation - IncomeTax - WorkingCapitalChange;
  FCF= CFO - CapitalExpenditure;
  D_E= TotalDebt/ShareholderEquity;
  ROE= NetIncome/ShareholderEquity;
run;

Also, you reset a number of missing values to 0, but don't do that for IncomeTax. As such, if there are any records that have a missing value for IncomeTax, the CFO value will be missing. Unless you want that, either do the same as you did for the other variables or change the statement to: 

CFO= sum(EBIT, Depreciation, (- IncomeTax), (- WorkingCapitalChange));

 Art, CEO, AnalystFinder.com

 

Contributor
Posts: 48

Re: Repeating code for different file

Thank you art297, I've been testing your suggestion about simplification
since last afternoon. I got what you said and tried to combine them with my
attempt and your code. However, there are some problem that cause SAS
stopped after simplification. I will continue trying these days, if it's
still not working I will post another question.

##- Please type your reply above this line. Simple formatting, no
attachments. -##
PROC Star
Posts: 7,363

Re: Repeating code for different file

Review your log after you run the code. It usually shows you what you have to correct.

 

Art, CEO, AnalystFinder.com

 

Contributor
Posts: 48

Re: Repeating code for different file

Ok, will do that. Thank you.

##- Please type your reply above this line. Simple formatting, no
attachments. -##
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 259 views
  • 0 likes
  • 3 in conversation