BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tparrent
Fluorite | Level 6

I have a very simple bit of code below that I setup to calculate percentage change between two periods.

 

The calculations appear fine except for the first valid record.

 

The data set is arranged so that there are multiple stock tickers in time series. The data is sorted by ticker and first_of_month date.

 

I check to see if the ticker is equal to the prior record ticker. If it is, then I calculate the percentage change in price. If it is not, that indicates one ticker series has ended and another begun so I do not calculate the change as there is no prior period for the new ticker. For debugging purposes, I fill in the price_change field with 666

 

As you can see from the output snippet, the first record appripriately gets a 666 but then the second record fails to calcualte the price_change at all. I have included other debugging variables that indicate while the code sees the prior record with the same ticker, it fails to see the lagged price and does not perform the calculation.

 

This failure only happens on the first two records of the data set. When the ticker changes later on in the data set, everyone performs fine. I also checked this on several different datasets and the behavior is the same - the first ticker misses out on the first opportunity for a price_change calculation but everything else works fine.

 


DATA pct_changes;
    SET ticker_files;
    IF lag1(ticker) = ticker THEN
    DO;
        price_change = dif1(close_price)/lag1(close_price)*100;
        lag1price = lag1(close_price);
        dif1price = dif1(close_price);
        lag2price = lag2(close_price);
        old_price = "I see it";
    END;
    ELSE
        price_change = 666;

 

ticker    first_of_month    close_price    price_change    lag1price    dif1price    lag2price    old_price
MMC    01/01/98    16.2454    666    .    .    .    
MMC    02/01/98    16.2575    .    .    .    .    I see it
MMC    03/01/98    18.9624    16.637859449    16.2575    2.7049    .    I see it
MMC    04/01/98    19.2384    1.4555119605    18.9624    0.276    16.2575    I see it
MMC    05/01/98    20.1771    4.8793038922    19.2384    0.9387    18.9624    I see it

 

I appreciate any suggestions

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You haven't explained what you are trying to do but I expect that you want to run the LAG functions on every observation instead of conditionally running it only on some of the observations. The LAG() function does not take the value from the previous observation. It takes it from the stack that is build from the values that are pushed down with in runs.  Skipping some observations prevents those values from making into the lag stack to be later retrieved. Try this little test.

data test;
  input a @@;
  laga=lag(a);
  if _N_> 1 then NOT_REALLY_laga=lag(a);
cards;
1 2 3 4 5 6 7 8 9
;

You need to run the LAG() every time.  

DATA pct_changes;
    SET ticker_files;
    by ticker ;
    lag1price = lag1(close_price);
    dif1price = dif1(close_price);
    lag2price = lag2(close_price);
    if first.ticker then call missing(of lag1price dif1price lag2price);
   else price_change = 100*(dif1price/lag1price);
run;
 

 

View solution in original post

4 REPLIES 4
Tom
Super User Tom
Super User

You haven't explained what you are trying to do but I expect that you want to run the LAG functions on every observation instead of conditionally running it only on some of the observations. The LAG() function does not take the value from the previous observation. It takes it from the stack that is build from the values that are pushed down with in runs.  Skipping some observations prevents those values from making into the lag stack to be later retrieved. Try this little test.

data test;
  input a @@;
  laga=lag(a);
  if _N_> 1 then NOT_REALLY_laga=lag(a);
cards;
1 2 3 4 5 6 7 8 9
;

You need to run the LAG() every time.  

DATA pct_changes;
    SET ticker_files;
    by ticker ;
    lag1price = lag1(close_price);
    dif1price = dif1(close_price);
    lag2price = lag2(close_price);
    if first.ticker then call missing(of lag1price dif1price lag2price);
   else price_change = 100*(dif1price/lag1price);
run;
 

 

art297
Opal | Level 21

I agree with Tom, but not only did you misuse the lag function (i.e., by applying it conditionally), you also misused the dif function. Here is a way to use both correctly:

data ticker_files;
  informat first_of_month mmddyy8.;
  input ticker $  first_of_month  close_price;
  cards;
MMC    01/01/98    16.2454
MMC    02/01/98    16.2575
MMC    03/01/98    18.9624
MMC    04/01/98    19.2384
MMC    05/01/98    20.1771
;

data pct_changes (drop=counter);
  SET ticker_files;
  by ticker;
  if first.ticker then counter=1;
  else counter+1;
  lag1price = ifn(counter gt 1,lag(close_price),.);
  dif1price = ifn(counter gt 1,dif1(close_price),.);
  lag2price = ifn(counter gt 2,lag2(close_price),.);
  old_price = "I see it";
  price_change = ifn(counter gt 1,dif(close_price)*100,666);
run;

Art, CEO, AnalystFinder.com

tparrent
Fluorite | Level 6

Thanks guys - I understand it much better now based on your explanations.

 

As a bonus, I learned about first.var, CALL MISSING and proper use of the BY statement!

 

Just to make sure I understand this completely, is the stack reset when the next ticker arrives due to the BY statement? That's what I want and the code seems to behave that wya but thought I should verify.

 

Thanks again! It's been too many years since I coded in SAS...

art297
Opal | Level 21

No, the stack isn't reset with a by statement. The code simply substitutes a missing value when the first record of a by group is being processed.

 

Art, CEO, AnalystFinder.com

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 783 views
  • 2 likes
  • 3 in conversation