Phil_NZ
Barite | Level 11

Hi SAS Users!

 

This is a familiar topic, but today I ran into it myself and do not know how to trace it.

I have a couple of questions about my code below:

data argentinalag;
    set argentina;
    by type;

    lags7=lag(s7);
    if first.type then lags7=.;

    lags3=lag(s3);
    if first.type then lags3=.;

    lags18=lag(s18);
    if first.type then lags18=.;

    *log(0) is meaningless, so set these obs to missing;
    if (1+s27/s2)>0 and (1+s27/s2) ne .  then do;
        cf_ope_act= log(1 + s27/s2);
        lagcf_ope_act=lag(cf_ope_act);
        if first.type then lagcf_ope_act=.;
    end;

    lags29 = lag(s29);
    if first.type then lags29=.;

    lags22 = lag(s22);
    if first.type then lags22=.;

    lags43 = lag(s43);
    if first.type then lags43=.;
run;

And the log is as below:

63         	lags22 = lag(s22);
64         	if first.type then lags22=.;
65         
66         	lags43 = lag(s43);
67             if first.type then lags43=.;
68            run;

NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      238 at 52:7    238 at 52:11   

So my two questions are:

1. Can I merge the assignments into one statement, like this:

if first.type then lags3=. and lags18=. and lags29=.;

or must I keep them separate, as I did in my code?

2. How do I trace the missing values reported in the log above? What does "Number of times" = 238 mean?

Thank you!

Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
PeterClemmensen
Tourmaline | Level 20

1) Yes. You can do this instead.

 

if first.type then call missing(lags3, lags18, lags29);

2) You perform an operation on missing values 238 times in your data step.
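As a rough sketch of point 1, using SASHELP.CLASS as a stand-in for the poster's data (the names lag_age, lag_height, and lag_weight are made up for illustration): CALL MISSING resets several variables in one statement, and a DO/END block does the same with ordinary assignments. The last step hints at why chaining assignments with AND does not do what the question hopes; as far as I understand the parsing, the whole statement is read as a single assignment whose right-hand side is a logical expression.

```sas
proc sort data=sashelp.class out=class;
  by sex name;
run;

data lags;
  set class;
  by sex;

  /* call LAG on every row so each queue is updated every time */
  lag_age    = lag(age);
  lag_height = lag(height);
  lag_weight = lag(weight);

  /* option A: reset all three in one statement */
  if first.sex then call missing(lag_age, lag_height, lag_weight);

  /* option B: same effect with plain assignments (shown only for comparison) */
  if first.sex then do;
    lag_age    = .;
    lag_height = .;
    lag_weight = .;
  end;
run;

/* "a = . and b = ." is not two assignments:
   it is read as a = (. and (b = .)), a logical expression */
data _null_;
  a = 5;
  b = 5;
  a = . and b = .;
  put a= b=;   /* b keeps its value; a receives the 0/1 result of the expression */
run;
```

Either option A or option B alone is enough; both are shown in one step only so they can be compared side by side.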

Phil_NZ
Barite | Level 11

Hi @PeterClemmensen 

Thank you for your answer. I looked through the Argentina dataset but cannot find anything wrong at line 52, column 7.

I have also attached the Argentina file here. Can you please help me spot the problem?

Many thanks!

PeterClemmensen
Tourmaline | Level 20

Line 52, column 7 refers to line 52 of the SAS log, not to the data. That is where you need to look 🙂

Phil_NZ
Barite | Level 11

 Hi @PeterClemmensen !

 

The log is below. Can you please have a look? I cannot find anything unusual at line 52, column 7 or at line 52, column 11 of the log.

 

37           *Set up lag variables;
38           data argentinalag;
39             set argentina;
40             by type;
41         	lags7=lag(s7);
42             if first.type then lags7=.;
43         	/*https://communities.sas.com/t5/SAS-Programming/Condition-of-calculating-Lag-in-a-datastep/m-p/715520#M221015*/
2                                                          The SAS System                             20:10 Friday, January 29, 2021

44             /*create other lag variables here*/
45             lags3=lag(s3);
46             if first.type then lags3=.;
47         
48         	lags18=lag(s18);
49             if first.type then lags18=.;
50         
51             *log(0) in meaningless so set these obs equalling to missing;
52         	if (1+s27/s2)>0 and (1+s27/s2) ne .  then do;
53         	cf_ope_act= log(1 + s27/s2);
54             lagcf_ope_act=lag(cf_ope_act);
55             if first.type then lagcf_ope_act=.;
56         	end;
57         	
58         	/*https://communities.sas.com/t5/SAS-Programming/compute-natural-logarithm-i-e-LN-in-sas/m-p/694806#M211930*/
59         
60         	lags29 = lag(s29);
61             if first.type then lags29=.;
62         
63         	lags22 = lag(s22);
64         	if first.type then lags22=.;
65         
66         	lags43 = lag(s43);
67             if first.type then lags43=.;
68            run;

NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      238 at 52:7    238 at 52:11   
NOTE: There were 1300 observations read from the data set WORK.ARGENTINA.
NOTE: The data set WORK.ARGENTINALAG has 1300 observations and 78 variables.
NOTE: Compressing data set WORK.ARGENTINALAG decreased size by 26.67 percent. 
      Compressed is 11 pages; un-compressed would require 15 pages.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.03 seconds
Phil_NZ
Barite | Level 11
Hi @Kurt_Bremser !
Thank you!
Can I ask what columns 7 and 11 mean?
Many thanks!
Kurt_Bremser
Super User

@Phil_NZ wrote:
Hi @Kurt_Bremser !
Thank you!
Can I ask what columns 7 and 11 mean?
Many thanks!

These are also log positions, text columns within the line. So s27 and s2 seem to be missing in the same observation.

I would also check that s2 is not zero before doing the division.
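A minimal sketch of that check, on a made-up five-row dataset (DEMO, and the helper variable ratio, are hypothetical; only s27, s2, type, and cf_ope_act come from the original code). Counting characters in log line 52, with the leading tab as one column, suggests that column 11 is the "/" and column 7 is the "+", so the note would mean that the division and then the addition each produced a missing result 238 times. That reading is an assumption worth confirming against your own log. PROC MEANS with NMISS shows how many rows are missing each operand, and the guarded version avoids operating on missing values at all.

```sas
/* Hypothetical stand-in for ARGENTINA: only the variables used on log line 52 */
data demo;
  input type $ s27 s2;
  datalines;
A 10 100
A . 120
A 5 .
B 7 0
B -60 50
;
run;

/* Count present and missing values of the two operands */
proc means data=demo n nmiss;
  var s27 s2;
run;

data demo_safe;
  set demo;
  /* divide only when both operands are present and the divisor is not zero */
  if nmiss(s27, s2) = 0 and s2 ne 0 then ratio = 1 + s27/s2;
  /* a missing ratio fails this test, so LOG never sees missing or non-positive input */
  if ratio > 0 then cf_ope_act = log(ratio);
run;
```

Because ratio is created by an assignment, it is reset to missing at the start of every iteration, so a skipped division cannot leak a value from the previous row.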

PeterClemmensen
Tourmaline | Level 20

I'm not at a SAS station right now. However, this block of code

 

    if (1+s27/s2)>0 and (1+s27/s2) ne .  then do;
       cf_ope_act= log(1 + s27/s2);
       lagcf_ope_act=lag(cf_ope_act);
       if first.type then lagcf_ope_act=.;
    end;

worries me. Be very careful when you use the Lag Function conditionally. As I've mentioned earlier, the Lag Function is really a queue that you push values through each time it is called. Therefore, it is not really a lookback function as many programmers think.

 

So, when you use the Lag Function conditionally, you push values through the queue only when the condition is fulfilled. This usually leads to unexpected and incorrect results. 

 

Read the article Investigating the Lag Function to understand why 🙂
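To see the difference on data everyone has, here is a small sketch with SASHELP.CLASS. The condition age > 12 and the names log_age, lag_cond, and lag_all are invented for illustration; log_age plays the role of cf_ope_act, a value that exists only on some rows. It is not the poster's exact calculation, only the same pattern.

```sas
proc sort data=sashelp.class out=class;
  by sex name;
run;

data lag_demo;
  set class;
  by sex;

  /* value that exists only for some rows (stand-in for cf_ope_act) */
  if age > 12 then log_age = log(age);

  /* risky: this LAG runs, and updates its queue, only when the condition is true */
  if age > 12 then lag_cond = lag(log_age);

  /* safer: this LAG runs on every row, then the BY-group boundary is blanked */
  lag_all = lag(log_age);
  if first.sex then lag_all = .;
run;

proc print data=lag_demo;
  var name sex age log_age lag_cond lag_all;
run;
```

Each LAG call in the code has its own queue, so the two versions do not interfere with each other. In the printout, lag_cond holds the log_age of the previous row that also met the condition, not of the row directly above, while lag_all holds the value from the row directly above within the same sex.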

Phil_NZ
Barite | Level 11

Hi @PeterClemmensen 

Thank you for the article suggestion, the explanation, and the warning; point taken. I read the article once more, and there are some points in it that I do not fully understand:

https://sasnrd.com/sas-lag-function-by-group-example/

1. On the first page, you wrote:

Both elements are missing at the start of execution. Each time it executes, SAS returns the right-most value from the queue. Furthermore, the present value of the value we want to lag, is inserted into the queue from the left

I cannot picture what the right-most value mentioned there is. I read the example that follows but still did not fully get the idea. Do you mean that the right-most value is the result of lag2(x), and that the values entering the queue come from the original variable x?

2. Regarding conditional lagging: I had the same wrong expectation that you describe in your example. Even with the result in front of me, I still do not understand how we get these returned values:

/*
 
id  x   Queue content
 
1   1   [ . | . ]
1   2   [ 2 | . ] Returned value: .
1   3   [ 2 | . ]
2   4   [ 4 | 2 ] Returned value: .
2   5   [ 4 | 2 ]
2   6   [ 6 | 4 ] Returned value: 2
 
*/
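One way to make the queue in the table above visible is to rebuild it by hand next to the real LAG2 call. This is only a sketch: the dataset HAVE is typed in to match the id/x columns of the table, and q_left, q_right, and y_manual are hypothetical helper variables that imitate the two slots of the queue.

```sas
/* Six rows matching the id/x columns of the table */
data have;
  input id x;
  datalines;
1 1
1 2
1 3
2 4
2 5
2 6
;
run;

data trace;
  set have;
  retain q_left q_right .;     /* hand-made copy of the two-slot LAG2 queue */
  if mod(_n_, 2) = 0 then do;
    y        = lag2(x);        /* real LAG2, called on even rows only */
    y_manual = q_right;        /* the right-most value leaves the queue */
    q_right  = q_left;         /* the remaining value shifts to the right */
    q_left   = x;              /* the current x enters from the left */
  end;
run;

proc print data=trace noobs;
  var id x q_left q_right y y_manual;
run;
```

After each row, q_left and q_right should show the bracket contents from the table, and y should agree with y_manual. On odd rows nothing is pushed or returned, which is why the value 2 only comes out at the sixth row.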

3. And I also ran your code to see the result

data want;
   set have;
   if mod(_N_, 2) = 0 then y = lag2(x);
run;

the result is

 
 

Is it equivalent to the code below?

if mod(_N_, 2) = 0 then y = lag2(x);
else lag2(x) =.;

And thank you for introducing me to the IFN function; it is so powerful.

4. In the "Handling By-Groups" part, you have the code below:

data want;
   set have;
   by id;
   lagx = ifn(first.id, ., lag(x));
run;

It is really elegant, but I just want to cross-check whether it is equivalent to this code:

data want;
    set have;
    by id;
    lagx = lag(x);
    if first.id then lagx=.;
run;

Many thanks!

 

Tom
Super User

The best way to learn is to try it and see what happens. For example, here is a simple example using the SASHELP.CLASS dataset, which every SAS session should have access to. It lets you compare normal IF/THEN code with IFN() function code. I also included an example of what happens when you conditionally execute the LAG() function.

proc sort data=sashelp.class out=class ;
  by sex name ;
run;

data test;
  set class ;
  by sex;

  lag_age1 = lag(age) ;
  if first.sex then lag_age1 = .;

  lag_age2=ifn(first.sex,.,lag(age));

  if first.sex then lag_age_wrong = .;
  else lag_age_wrong=lag(age);

run;

proc print;
run;
                                                                            lag_age_
Obs    Name       Sex    Age    Height    Weight    lag_age1    lag_age2      wrong

  1    Alice       F      13     56.5       84.0        .           .           .
  2    Barbara     F      13     65.3       98.0       13          13           .
  3    Carol       F      14     62.8      102.5       13          13          13
  4    Jane        F      12     59.8       84.5       14          14          14
  5    Janet       F      15     62.5      112.5       12          12          12
  6    Joyce       F      11     51.3       50.5       15          15          15
  7    Judy        F      14     64.3       90.0       11          11          11
  8    Louise      F      12     56.3       77.0       14          14          14
  9    Mary        F      15     66.5      112.0       12          12          12
 10    Alfred      M      14     69.0      112.5        .           .           .
 11    Henry       M      14     63.5      102.5       14          14          15
 12    James       M      12     57.3       83.0       14          14          14
 13    Jeffrey     M      13     62.5       84.0       12          12          12
 14    John        M      12     59.0       99.5       13          13          13
 15    Philip      M      16     72.0      150.0       12          12          12
 16    Robert      M      12     64.8      128.0       16          16          16
 17    Ronald      M      15     67.0      133.0       12          12          12
 18    Thomas      M      11     57.5       85.0       15          15          15
 19    William     M      15     66.5      112.0       11          11          11

 

Phil_NZ
Barite | Level 11

Hi @Tom 

Many thanks for your detailed explanation. So the two safe ways to use LAG are: using IFN, or calling LAG first and applying the condition afterwards.

But I still cannot explain to myself why the result goes wrong when we put the condition before the LAG call, as in your code:

  if first.sex then lag_age_wrong = .;
  else lag_age_wrong=lag(age);

Could you help me understand why?

Many thanks!

Tom
Super User

Make yourself some manipulatives so you can see what is happening.  For example take a deck of cards. For simplicity just use the 2 through 5.  Order the deck by suit and order within suit.  Place the cards face up.  This is your input dataset.

Take a piece of paper and mark three other places to put cards.  Label these CARD,  LAG(CARD) and LAG_CARD.

 

Now run this data step. 

data want;
  set have ;
  by suit;
  lag_card=lag(card);
  if first.suit then lag_card=.;
run;

For the SET statement you will want to:

  • remove any card in the LAG_CARD location (LAG_CARD is not retained).
  • take the top card from the deck and move it to the CARD location.
  • record the value of CARD for this iteration now, since we are going to physically move the card but the code never modifies CARD.

For the assignment statement:

  • take the TOP card from the pile in the LAG(CARD) location and move it to the LAG_CARD location (this is the result of the assignment statement).
  • move the card in the CARD location to the BOTTOM of the pile in the LAG(CARD) location. This is pushing the value of CARD into the stack/queue.

If it is the first card of the suit, then:

  • remove the card from the LAG_CARD location.

This ends the iteration, so the current value of LAG_CARD is written to the dataset (as is the value of the CARD you pulled for this iteration).

 

Repeat for all of the cards in the deck.

 

Now let's do the experiment with the re-ordered statements.

data want;
  set have ;
  by suit;
  if first.suit then lag_card=.;
  else lag_card=lag(card);
run;

For the SET statement you will want to:

  • remove any card in the LAG_CARD location (LAG_CARD is not retained).
  • take the top card from the deck and move it to the CARD location.
  • record the value of CARD for this iteration now, since we are going to physically move the card but the code never modifies CARD.

Now let's see what happens in the IF/THEN/ELSE.

IF it is the first card in the suit THEN

  • remove the CARD from the LAG_CARD location. (Notice that this never actually does anything since you already removed that card as part of the SET statement).

IF it is NOT the first card in the suit then 

  • Take the TOP card from the pile in the LAG(CARD) location and move it to the LAG_CARD location (this is the result of the assignment statement).
  • Move the card in the CARD location to the BOTTOM of the pile in the LAG(CARD) location. This is pushing the value of CARD into the stack/queue.

This ends the iteration, so the current value of LAG_CARD is written to the dataset (as is the value of the CARD you pulled for this iteration).

 

Repeat for all of the cards in the deck.

 

Notice that in both versions the result for LAG_CARD on the first card of each suit is the same. But when you skip adding a card to the pile for the first card in a suit, that card is not there to pull back out when you get to the second card.

 

Try with some suits only having one card.

 

Here is a way to simulate the suit/card game with SAS code. The four numbers in the first DO statement are the number of cards in the four suits. Try it with some of them having only 1 card.

data have;
  do n=4,4,4,4;
    suit+1;
    do card=1 to n;
      output;
    end;
  end;
run;

data want;
 set have;
 by suit;
 lag_card=lag(card);
 if first.suit then lag_card=.;
run;

proc print;
run;

data want;
 set have;
 by suit;
 if not first.suit then lag_card=lag(card);
run;

proc print;
run;
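As a sketch of the "only one card" variation suggested above, the step below uses suit sizes 1, 3, 1, 2 (an arbitrary choice) and puts both versions in one dataset so they can be read side by side. The names lag_right and lag_wrong are made up for this comparison.

```sas
data have;
  do n = 1, 3, 1, 2;      /* number of cards in each of the four suits */
    suit + 1;
    do card = 1 to n;
      output;
    end;
  end;
  drop n;
run;

data compare;
  set have;
  by suit;

  /* LAG on every row, then blank the first card of each suit */
  lag_right = lag(card);
  if first.suit then lag_right = .;

  /* LAG skipped on the first card of each suit: its queue misses those cards */
  if not first.suit then lag_wrong = lag(card);
run;

proc print data=compare noobs;
run;
```

With these sizes, the second card of suit 4 should get lag_right = 1 (the first card of its own suit) but lag_wrong = 3 (the last card of suit 2), because the cards that started suits 3 and 4 never entered the second queue.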
Kurt_Bremser
Super User
  1. to combine multiple statements, use DO/END
  2. you had 238 observations that supplied (a) missing value(s) to your calculation. Look at line 52 of your log, and search for missing values for the variables used there.

