DATA step failing to calculate first value

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 6
Accepted Solution

DATA step failing to calculate first value

I have a very simple bit of code below that I setup to calculate percentage change between two periods.

 

The calculations appear fine except for the first valid record.

 

The data set is arranged so that there are multiple stock tickers in time series. The data is sorted by ticker and first_of_month date.

 

I check to see if the ticker is equal to the prior record ticker. If it is, then I calculate the percentage change in price. If it is not, that indicates one ticker series has ended and another begun so I do not calculate the change as there is no prior period for the new ticker. For debugging purposes, I fill in the price_change field with 666

 

As you can see from the output snippet, the first record appripriately gets a 666 but then the second record fails to calcualte the price_change at all. I have included other debugging variables that indicate while the code sees the prior record with the same ticker, it fails to see the lagged price and does not perform the calculation.

 

This failure only happens on the first two records of the data set. When the ticker changes later on in the data set, everyone performs fine. I also checked this on several different datasets and the behavior is the same - the first ticker misses out on the first opportunity for a price_change calculation but everything else works fine.

 


DATA pct_changes;
    SET ticker_files;
    IF lag1(ticker) = ticker THEN
    DO;
        price_change = dif1(close_price)/lag1(close_price)*100;
        lag1price = lag1(close_price);
        dif1price = dif1(close_price);
        lag2price = lag2(close_price);
        old_price = "I see it";
    END;
    ELSE
        price_change = 666;

 

ticker    first_of_month    close_price    price_change    lag1price    dif1price    lag2price    old_price
MMC    01/01/98    16.2454    666    .    .    .    
MMC    02/01/98    16.2575    .    .    .    .    I see it
MMC    03/01/98    18.9624    16.637859449    16.2575    2.7049    .    I see it
MMC    04/01/98    19.2384    1.4555119605    18.9624    0.276    16.2575    I see it
MMC    05/01/98    20.1771    4.8793038922    19.2384    0.9387    18.9624    I see it

 

I appreciate any suggestions


Accepted Solutions
Solution
3 weeks ago
Super User
Super User
Posts: 6,318

Re: DATA step failing to calculate first value

You haven't explained what you are trying to do but I expect that you want to run the LAG functions on every observation instead of conditionally running it only on some of the observations. The LAG() function does not take the value from the previous observation. It takes it from the stack that is build from the values that are pushed down with in runs.  Skipping some observations prevents those values from making into the lag stack to be later retrieved. Try this little test.

data test;
  input a @@;
  laga=lag(a);
  if _N_> 1 then NOT_REALLY_laga=lag(a);
cards;
1 2 3 4 5 6 7 8 9
;

You need to run the LAG() every time.  

DATA pct_changes;
    SET ticker_files;
    by ticker ;
    lag1price = lag1(close_price);
    dif1price = dif1(close_price);
    lag2price = lag2(close_price);
    if first.ticker then call missing(of lag1price dif1price lag2price);
   else price_change = 100*(dif1price/lag1price);
run;
 

 

View solution in original post


All Replies
Solution
3 weeks ago
Super User
Super User
Posts: 6,318

Re: DATA step failing to calculate first value

You haven't explained what you are trying to do but I expect that you want to run the LAG functions on every observation instead of conditionally running it only on some of the observations. The LAG() function does not take the value from the previous observation. It takes it from the stack that is build from the values that are pushed down with in runs.  Skipping some observations prevents those values from making into the lag stack to be later retrieved. Try this little test.

data test;
  input a @@;
  laga=lag(a);
  if _N_> 1 then NOT_REALLY_laga=lag(a);
cards;
1 2 3 4 5 6 7 8 9
;

You need to run the LAG() every time.  

DATA pct_changes;
    SET ticker_files;
    by ticker ;
    lag1price = lag1(close_price);
    dif1price = dif1(close_price);
    lag2price = lag2(close_price);
    if first.ticker then call missing(of lag1price dif1price lag2price);
   else price_change = 100*(dif1price/lag1price);
run;
 

 

Esteemed Advisor
Posts: 7,288

Re: DATA step failing to calculate first value

I agree with Tom, but not only did you misuse the lag function (i.e., by applying it conditionally), you also misused the dif function. Here is a way to use both correctly:

data ticker_files;
  informat first_of_month mmddyy8.;
  input ticker $  first_of_month  close_price;
  cards;
MMC    01/01/98    16.2454
MMC    02/01/98    16.2575
MMC    03/01/98    18.9624
MMC    04/01/98    19.2384
MMC    05/01/98    20.1771
;

data pct_changes (drop=counter);
  SET ticker_files;
  by ticker;
  if first.ticker then counter=1;
  else counter+1;
  lag1price = ifn(counter gt 1,lag(close_price),.);
  dif1price = ifn(counter gt 1,dif1(close_price),.);
  lag2price = ifn(counter gt 2,lag2(close_price),.);
  old_price = "I see it";
  price_change = ifn(counter gt 1,dif(close_price)*100,666);
run;

Art, CEO, AnalystFinder.com

Occasional Contributor
Posts: 6

Re: DATA step failing to calculate first value

Thanks guys - I understand it much better now based on your explanations.

 

As a bonus, I learned about first.var, CALL MISSING and proper use of the BY statement!

 

Just to make sure I understand this completely, is the stack reset when the next ticker arrives due to the BY statement? That's what I want and the code seems to behave that wya but thought I should verify.

 

Thanks again! It's been too many years since I coded in SAS...

Esteemed Advisor
Posts: 7,288

Re: DATA step failing to calculate first value

No, the stack isn't reset with a by statement. The code simply substitutes a missing value when the first record of a by group is being processed.

 

Art, CEO, AnalystFinder.com

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 97 views
  • 2 likes
  • 3 in conversation