Hi community,
I have two nearly identical sets of code that process two different variables. Each set consists of proc sort, proc means, a merge, proc rank, and a step that trims the data. The order of the steps is the same in both sets, and the only difference is the name of the variable.
Both sets give me the results I need, but having seen how elegant SAS code can be, I believe there should be a way to combine them, especially since they are so similar. Is it possible to merge these two sets of code into one? Could you suggest an approach, or something I can search for?
Thanks in advance.
If you have a lot of common code preceding or following these two distinct data steps, then you could embed them in a macro definition as below:
%macro mytask(dsn=);
  **** preceding common code here *****;
  %if &dsn=amihud %then %do;
    data amihud_;
      set finish_error;
      by Type;
      amihud=abs_r/trading_vol;
      year=year(date);
      if first.type then call missing(trading_vol,amihud);
    run;
  %end;
  %else %if &dsn=closing %then %do;
    data closing_;
      set finish_error;
      by Type;
      b=divide(pa_us-pb_us,mean_pa_pb);
      if b> 0.5*mean_pa_pb then b=.;
      year=year(date);
    run;
  %end;
  **** trailing common code here *****;
%mend mytask;
Note that I assume these two distinct steps occur in precisely the same relative position within the common code.
Then all you need to do is invoke the macro as needed. Below, the macro is invoked twice, once for each distinct step:
%mytask(dsn=amihud);
%mytask(dsn=closing);
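As a rough illustration of how the common code inside such a macro can pick up the variable-specific names, here is a sketch of the last step of the program posted in this thread (the amihud_final step), rewritten with &dsn. It assumes the second run should produce closing_final and obs_closing by the same naming pattern, which the thread does not confirm. The macro name %finalize and the small stand-in trim_price data set are hypothetical and exist only so the sketch can run on its own.

```sas
/* Tiny stand-in for TRIM_PRICE, for illustration only (values are made up) */
data trim_price;
  input Type $ year p_us lag_p_us rank obs;
  datalines;
A 2001 10 9 5 1
A 2002 11 10 7 2
;
run;

/* Hypothetical macro: one copy of the trailing step, with &dsn supplying the names */
%macro finalize(dsn=);
  data &dsn._final (drop=p_us lag_p_us rank rename=(obs=obs_&dsn));
    set trim_price;
  run;
%mend finalize;

%finalize(dsn=amihud);   /* creates amihud_final with obs renamed to obs_amihud */
%finalize(dsn=closing);  /* assumed naming: closing_final, obs_closing */
```

The period in &dsn._final marks the end of the macro variable name, so the call with dsn=amihud resolves to amihud_final.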
I'm not aware of any way to "merge code" other than by applying human intelligence and effort to the process.
This might be a case for macro processing, but we will need to see those codes first.
Hi @Kurt_Bremser, part of my code is below:
options compress=yes reuse=yes;
data finish_error;
  set 'D:\link_to_the_dataset';
run;
proc sort data=finish_error;
  by Type date;
run;
data amihud_;
  set finish_error;
  by Type;
  amihud=abs_r/trading_vol;
  year=year(date);
  if first.type then trading_vol=. and amihud=.;
run;
.
.
.
proc rank data=create_var groups=100 out=temp;
  by year;
  var lag_p_us;
  ranks rank;
run;
data trim_price;
  set temp;
  if 0 < rank < 99;
run;
proc sort data=trim_price;
  by Type;
run;
data amihud_final (drop= p_us lag_p_us rank rename=(obs=obs_amihud));
  set trim_price;
run;
Thank you!
The code above is for the variable amihud. I have similar code for another variable named "b". The code for the two is the same apart from the calculation here:
for amihud:
data amihud_;
  set finish_error;
  by Type;
  amihud=abs_r/trading_vol;
  year=year(date);
  if first.type then trading_vol=. and amihud=.;
run;
for b:
data closing_;
  set finish_error;
  by Type;
  b=divide(pa_us-pb_us,mean_pa_pb);
  if b> 0.5*mean_pa_pb then b=.;
  year=year(date);
run;
The rest of the code in the two sets is identical.
Thanks!
Is the question how to create both variables in one data step?
If so it looks like it should be possible.
Note that your first data step has either a mistake or a very strange syntax that should probably be commented to explain to yourself what it means.
data amihud_closing;
  set finish_error;
  by Type;
  year=year(date);
  * Calculate AMIHUD ;
  amihud=abs_r/trading_vol;
  * if first.type then trading_vol=. and amihud=.;
  if first.type then call missing(trading_vol,amihud);
  * Calculate B ;
  b=divide(pa_us-pb_us,mean_pa_pb);
  if b> 0.5*mean_pa_pb then b=.;
run;
Hi @Tom, can you please quote the part of the data step that you find strange? I will have a careful look at it. Thanks
I left the line in the code, only as a comment. This statement:
trading_vol=. and amihud=.;
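is setting TRADING_VOL to zero. Everything to the right of the first equal sign is a boolean expression, and SAS evaluates boolean expressions as either zero or one. Because the missing value on the left of AND counts as false, the result is always zero, and AMIHUD is not changed at all.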
I replaced it with:
call missing(trading_vol,amihud);
which will set both TRADING_VOL and AMIHUD to missing.
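To see the difference in the log, here is a minimal sketch that can be run on its own. The starting values 100 and 0.5 are made up for illustration. As far as I understand the SAS operator precedence, the comparison amihud=. is evaluated before AND, so the original statement is read as trading_vol = (. and (amihud=.)).

```sas
data _null_;
  /* Made-up starting values, for illustration only */
  trading_vol = 100;
  amihud = 0.5;

  /* Original statement: assigns the value of a boolean expression.
     The missing value left of AND counts as false, so the result is 0.
     AMIHUD is only compared, never assigned. */
  trading_vol = . and amihud = .;
  put "After the original statement: " trading_vol= amihud=;

  /* Replacement: sets both variables to missing */
  call missing(trading_vol, amihud);
  put "After CALL MISSING: " trading_vol= amihud=;
run;
```

The first PUT line should show trading_vol=0 with amihud still 0.5, and the second should show both variables as missing.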
Awesome, thank you @Tom for this comprehensive explanation