Hi all
Please need to simplify the following code or alternative more efficient way to do it:
proc sort data=example ;
by ISIN datadate;
run;
data example ;
set example;
lag_ISIN=lag(ISIN);
lag_PRCCD=lag(PRCCD);
If lag_ISIN=ISIN then RET=PRCCD/lag_PRCCD-1;else RET=.;
run;
data example ;
set example;
by ISIN;
RET=PRCCD/lag_PRCCD - 1;
if first.ISIN then RET=.;
format ret percent12.1;
run;
There really isn't much to simplify here that I can see, you still end up with roughly the same lines of code. You could possibly remove one by using IFN() but I don't think you gain any efficiency doing so.
@georgel wrote:
Hi all
Please need to simplify the following code or alternative more efficient way to do it:
proc sort data=example ; by ISIN datadate; run; data example ; set example; lag_ISIN=lag(ISIN); lag_PRCCD=lag(PRCCD); If lag_ISIN=ISIN then RET=PRCCD/lag_PRCCD-1;else RET=.; run;
data example ;
set example;
by ISIN;
RET=PRCCD/lag_PRCCD - 1;
if first.ISIN then RET=.;
format ret percent12.1;
run;
There really isn't much to simplify here that I can see, you still end up with roughly the same lines of code. You could possibly remove one by using IFN() but I don't think you gain any efficiency doing so.
@georgel wrote:
Hi all
Please need to simplify the following code or alternative more efficient way to do it:
proc sort data=example ; by ISIN datadate; run; data example ; set example; lag_ISIN=lag(ISIN); lag_PRCCD=lag(PRCCD); If lag_ISIN=ISIN then RET=PRCCD/lag_PRCCD-1;else RET=.; run;
It seems that you are dealing with data from CRSP or Compustat in Finance
I agree with solution from @Reeza , just want to highlight that you should calculate the return first then set the if later like this answer (because the way "if" plays quite tricky).
You may receive a lot of notes that "Missing ..." in your log, but it makes sense and (maybe) safe because there should be some missing obs somewhere in your dataset. If you want to be over this note, the norm is to set the full condition if you want to clarify any NOTE.
For example:
lag_prccd=lag(prccd);
may cause the note" missing..."
but
lag_prccd=ifn(prccd<=0,., lag(prccd));
won't result in such a note.
Remember Compustat put the average of Bid/Ask spread as closing price (prccd in your case) and missing prccd will be set as 0 rather than ".", so be careful when dealing with such a variable in your dataset
Warm regards,
Phil.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.