Desktop productivity for business analysts and programmers

LAG function (see previous obs): Missing values as a result of an IF statement

Accepted Solution Solved
Reply
New Contributor
Posts: 4
Accepted Solution

LAG function (see previous obs): Missing values as a result of an IF statement

[ Edited ]

Hi guys, can anyone help pls?

 

 I'm thinking that everything is ok with these IF's statements, It runs ok until obs number 6 when it starts to crash, I'm using lag functions to copy a part of the previous observarion to N variables and putting 1 to the rest of them until N_10.

 

 The only special thing that obs number 6 has is the fact that the var LEVEL has the same value on obs number 5, every time this happend it returns missing values... I thought that it couldn't cause any problem, but apparently it does.

 

 I expect to all the N variables filled with some value istead of this missing values in the middle of the want table:

 

 You need to copy and run the entire code to see better, I think the problem is on the IF's of the last dataset (want), the first data sets (have1 and have2) are OK:

 

Thanks and regards

 

data have1;
input COMPANY_ID LEVEL_1 SHARE1 LEVEL_2 SHARE2 LEVEL_3 SHARE3 LEVEL_4 SHARE4 LEVEL_5 SHARE5 LEVEL_6 SHARE6 LEVEL_7 SHARE7 LEVEL_8 SHARE8 LEVEL_9 SHARE9 LEVEL_10 SHARE10 LEVEL_11 SHARE11 LEVEL_12 SHARE12 LEVEL_13 SHARE13 LEVEL_14 SHARE14 LEVEL_15 SHARE15 LEVEL_16 SHARE16 LEVEL_17 SHARE17 LEVEL_18 SHARE18 LEVEL_19 SHARE19 LEVEL_20 SHARE20;
cards;
55555 1 8 1 92 2 50 3 99 4 20 4 20 4 20 4 20 4 20 3 1 4 1 4 99 2 50 3 99 3 1 0 0 0 0 0 0 0 0 0 0
77777 1 50 2 50 2 50 1 100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
;
run;
data have2 (keep = COMPANY_ID LEVEL SHARE);
set have1;
array LEVEL_ARRAY LEVEL_1 - LEVEL_20;
array SHARE_ARRAY SHARE1 - SHARE20;
do x=1 to 20;
	if LEVEL_ARRAY(X) ne 0 then do;
		LEVEL = LEVEL_ARRAY(X);
		SHARE = SHARE_ARRAY(X)/100;
		format SHARE 12.10;
	output;
	end;
end;
run;
data want (keep = COMPANY_ID LEVEL PREVIOUS SHARE N_1 N_2 N_3 N_4 N_5 N_6 N_7 N_8 N_9 N_10 RESULT);
attrib COMPANY_ID label = "";
attrib LEVEL label = "";
attrib PREVIOUS label = "";
attrib SHARE label = "";
set have2;
format N_1 12.10;
format N_2 12.10;
format N_3 12.10;
format N_4 12.10;
format N_5 12.10;
format N_6 12.10;
format N_7 12.10;
format N_8 12.10;
format N_9 12.10;
format N_10 12.10;
format RESULT 12.10;
PREVIOUS = LAG(LEVEL);
if LEVEL = 1 then do; N_1 = SHARE; N_2 = 1; N_3 = 1; N_4 = 1; N_5 = 1; N_6 = 1; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end; N1_LAG = LAG(N_1);
if LEVEL = 2 then do; N_1 = N1_LAG; N_2 = SHARE; N_3 = 1; N_4 = 1; N_5 = 1; N_6 = 1; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2);
if LEVEL = 3 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = SHARE; N_4 = 1; N_5 = 1; N_6 = 1; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); N3_LAG = LAG(N_3);
if LEVEL = 4 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = SHARE; N_5 = 1; N_6 = 1; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); N3_LAG = LAG(N_3); N4_LAG = LAG(N_4);
if LEVEL = 5 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = SHARE; N_6 = 1; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); N3_LAG = LAG(N_3); N4_LAG = LAG(N_4); N5_LAG = LAG(N_5);
if LEVEL = 6 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = N5_LAG; N_6 = SHARE; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end;  N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); N3_LAG = LAG(N_3); N4_LAG = LAG(N_4); N5_LAG = LAG(N_5); N6_LAG = LAG(N_6);
if LEVEL = 7 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = N5_LAG; N_6 = N6_LAG; N_7 = SHARE; N_8 = 1; N_9 = 1; N_10 = 1; end; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); N3_LAG = LAG(N_3); N4_LAG = LAG(N_4); N5_LAG = LAG(N_5); N6_LAG = LAG(N_6); N7_LAG = LAG(N_7);
if LEVEL = 8 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = N5_LAG; N_6 = N6_LAG; N_7 = N7_LAG; N_8 = SHARE; N_9 = 1; N_10 = 1; end; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); N3_LAG = LAG(N_3); N4_LAG = LAG(N_4); N5_LAG = LAG(N_5); N6_LAG = LAG(N_6); N7_LAG = LAG(N_7); N8_LAG = LAG(N_8);
if LEVEL = 9 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = N5_LAG; N_6 = N6_LAG; N_7 = N7_LAG; N_8 = N8_LAG; N_9 = SHARE; N_10 = 1; end; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); N3_LAG = LAG(N_3); N4_LAG = LAG(N_4); N5_LAG = LAG(N_5); N6_LAG = LAG(N_6); N7_LAG = LAG(N_7); N8_LAG = LAG(N_8); N9_LAG = LAG(N_9);
if LEVEL = 10 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = N5_LAG; N_6 = N6_LAG; N_7 = N7_LAG; N_8 = N8_LAG; N_9 = N9_LAG; N_10 = SHARE; end; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); N3_LAG = LAG(N_3); N4_LAG = LAG(N_4); N5_LAG = LAG(N_5); N6_LAG = LAG(N_6); N7_LAG = LAG(N_7); N8_LAG = LAG(N_8); N9_LAG = LAG(N_9); N10_LAG = LAG(N_10);
RESULT = N_1 * N_2 * N_3 * N_4 * N_5 * N_6 * N_7 * N_8 * N_9 * N_10;
run;

 


Accepted Solutions
Solution
‎03-14-2016 08:37 AM
Respected Advisor
Posts: 4,608

Re: LAG function (see previous obs): Missing values as a result of an IF statement

Wouldn't this give you what you want?

 


data want (keep = COMPANY_ID LEVEL PREVIOUS SHARE N_1 N_2 N_3 N_4 N_5 N_6 N_7 N_8 N_9 N_10 RESULT);
retain n_1-n_10;
array n n_1-n_10;
set have2;
PREVIOUS = LAG(LEVEL);
n{level} = share;
do i = level+1 to dim(n);
    n{i} = 1;
    end;
RESULT = N_1 * N_2 * N_3 * N_4 * N_5 * N_6 * N_7 * N_8 * N_9 * N_10;
run;

Added note: Are you sure you want to copy information from the previous observation when it is about a different company_id?

PG

View solution in original post


All Replies
Respected Advisor
Posts: 4,977

Re: LAG function (see previous obs): Missing values as a result of an IF statement

The smaller problem you are facing is the calculation of RESULT.  It will not be exactly correct, since SAS limits numeric storage to 8 bytes (yielding roughly 15 significant digits).

 

The bigger problem you are facing is that LAG does not return the value from the previous observation.  Rather, it returns the value from the last time that the LAG function executed.  Just as you compute PREVIOUS for every observation, you will need to compute N1_LAG, N2_LAG, N3_LAG, etc. on every observation.  That will force LAG to return the value from the previous observation.

New Contributor
Posts: 4

Re: LAG function (see previous obs): Missing values as a result of an IF statement

Thanks, how can I do it on every observation? I changed the code to but doesnt work.

 

data have1;
input COMPANY_ID LEVEL_1 SHARE1 LEVEL_2 SHARE2 LEVEL_3 SHARE3 LEVEL_4 SHARE4 LEVEL_5 SHARE5 LEVEL_6 SHARE6 LEVEL_7 SHARE7 LEVEL_8 SHARE8 LEVEL_9 SHARE9 LEVEL_10 SHARE10 LEVEL_11 SHARE11 LEVEL_12 SHARE12 LEVEL_13 SHARE13 LEVEL_14 SHARE14 LEVEL_15 SHARE15 LEVEL_16 SHARE16 LEVEL_17 SHARE17 LEVEL_18 SHARE18 LEVEL_19 SHARE19 LEVEL_20 SHARE20;
cards;
55555 1 8 1 92 2 50 3 99 4 20 4 20 4 20 4 20 4 20 3 1 4 1 4 99 2 50 3 99 3 1 0 0 0 0 0 0 0 0 0 0
77777 1 50 2 50 2 50 1 100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
;
run;
data have2 (keep = COMPANY_ID LEVEL SHARE);
set have1;
array LEVEL_ARRAY LEVEL_1 - LEVEL_20;
array SHARE_ARRAY SHARE1 - SHARE20;
do x=1 to 20;
	if LEVEL_ARRAY(X) ne 0 then do;
		LEVEL = LEVEL_ARRAY(X);
		SHARE = SHARE_ARRAY(X)/100;
		format SHARE 12.10;
	output;
	end;
end;
run;
data want /*(keep = COMPANY_ID LEVEL PREVIOUS SHARE N_1 N_2 N_3 N_4 N_5 N_6 N_7 N_8 N_9 N_10 RESULT)*/;
attrib COMPANY_ID label = "";
attrib LEVEL label = "";
attrib PREVIOUS label = "";
attrib SHARE label = "";
set have2;
format N_1 12.10;
format N_2 12.10;
format N_3 12.10;
format N_4 12.10;
format N_5 12.10;
format N_6 12.10;
format N_7 12.10;
format N_8 12.10;
format N_9 12.10;
format N_10 12.10;
format RESULT 12.10;
PREVIOUS = LAG(LEVEL);
N1_LAG = LAG(N_1);
N2_LAG = LAG(N_2);
N3_LAG = LAG(N_3);
N4_LAG = LAG(N_4);
N5_LAG = LAG(N_5);
N6_LAG = LAG(N_6);
N7_LAG = LAG(N_7);
N8_LAG = LAG(N_8);
N9_LAG = LAG(N_9);
N10_LAG = LAG(N_10);
if LEVEL = 1 then do; N_1 = SHARE; N_2 = 1; N_3 = 1; N_4 = 1; N_5 = 1; N_6 = 1; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end;
if LEVEL = 2 then do; N_1 = N1_LAG; N_2 = SHARE; N_3 = 1; N_4 = 1; N_5 = 1; N_6 = 1; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end;
if LEVEL = 3 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = SHARE; N_4 = 1; N_5 = 1; N_6 = 1; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end;
if LEVEL = 4 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = SHARE; N_5 = 1; N_6 = 1; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end;
if LEVEL = 5 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = SHARE; N_6 = 1; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end;
if LEVEL = 6 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = N5_LAG; N_6 = SHARE; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end;
if LEVEL = 7 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = N5_LAG; N_6 = N6_LAG; N_7 = SHARE; N_8 = 1; N_9 = 1; N_10 = 1; end;
if LEVEL = 8 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = N5_LAG; N_6 = N6_LAG; N_7 = N7_LAG; N_8 = SHARE; N_9 = 1; N_10 = 1; end;
if LEVEL = 9 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = N5_LAG; N_6 = N6_LAG; N_7 = N7_LAG; N_8 = N8_LAG; N_9 = SHARE; N_10 = 1; end;
if LEVEL = 10 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = N5_LAG; N_6 = N6_LAG; N_7 = N7_LAG; N_8 = N8_LAG; N_9 = N9_LAG; N_10 = SHARE; end;
RESULT = N_1 * N_2 * N_3 * N_4 * N_5 * N_6 * N_7 * N_8 * N_9 * N_10;
run;
New Contributor
Posts: 4

Re: LAG function (see previous obs): Missing values as a result of an IF statement

Solved with a single word at the end of the code that was: retain _all_; Thanks for your help. Heres the final code.

 

Regards!

 

data have1;
input COMPANY_ID LEVEL_1 SHARE1 LEVEL_2 SHARE2 LEVEL_3 SHARE3 LEVEL_4 SHARE4 LEVEL_5 SHARE5 LEVEL_6 SHARE6 LEVEL_7 SHARE7 LEVEL_8 SHARE8 LEVEL_9 SHARE9 LEVEL_10 SHARE10 LEVEL_11 SHARE11 LEVEL_12 SHARE12 LEVEL_13 SHARE13 LEVEL_14 SHARE14 LEVEL_15 SHARE15 LEVEL_16 SHARE16 LEVEL_17 SHARE17 LEVEL_18 SHARE18 LEVEL_19 SHARE19 LEVEL_20 SHARE20;
cards;
55555 1 8 1 92 2 50 3 99 4 20 4 20 4 20 4 20 4 20 3 1 4 1 4 99 2 50 3 99 3 1 0 0 0 0 0 0 0 0 0 0
77777 1 50 2 50 2 50 1 100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
;
run;
data have2 (keep = COMPANY_ID LEVEL SHARE);
set have1;
array LEVEL_ARRAY LEVEL_1 - LEVEL_20;
array SHARE_ARRAY SHARE1 - SHARE20;
do x=1 to 20;
	if LEVEL_ARRAY(X) ne 0 then do;
		LEVEL = LEVEL_ARRAY(X);
		SHARE = SHARE_ARRAY(X)/100;
		format SHARE 12.10;
	output;
	end;
end;
run;
data want (keep = COMPANY_ID LEVEL PREVIOUS SHARE N_1 N_2 N_3 N_4 N_5 N_6 N_7 N_8 N_9 N_10 RESULT);
attrib COMPANY_ID label = "";
attrib LEVEL label = "";
attrib PREVIOUS label = "";
attrib SHARE label = "";
set have2;
format N_1 12.10;
format N_2 12.10;
format N_3 12.10;
format N_4 12.10;
format N_5 12.10;
format N_6 12.10;
format N_7 12.10;
format N_8 12.10;
format N_9 12.10;
format N_10 12.10;
format RESULT 12.10;
PREVIOUS = LAG(LEVEL);
if LEVEL = 1 then do; N_1 = SHARE; N_2 = 1; N_3 = 1; N_4 = 1; N_5 = 1; N_6 = 1; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end; do; N1_LAG = LAG(N_1); end;
if LEVEL = 2 then do; N_1 = N1_LAG; N_2 = SHARE; N_3 = 1; N_4 = 1; N_5 = 1; N_6 = 1; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end; do; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); end;
if LEVEL = 3 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = SHARE; N_4 = 1; N_5 = 1; N_6 = 1; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end; do; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); N3_LAG = LAG(N_3); end;
if LEVEL = 4 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = SHARE; N_5 = 1; N_6 = 1; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end; do; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); N3_LAG = LAG(N_3); N4_LAG = LAG(N_4); end;
if LEVEL = 5 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = SHARE; N_6 = 1; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end; do; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); N3_LAG = LAG(N_3); N4_LAG = LAG(N_4); N5_LAG = LAG(N_5); end;
if LEVEL = 6 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = N5_LAG; N_6 = SHARE; N_7 = 1; N_8 = 1; N_9 = 1; N_10 = 1; end; do; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); N3_LAG = LAG(N_3); N4_LAG = LAG(N_4); N5_LAG = LAG(N_5); N6_LAG = LAG(N_6); end;
if LEVEL = 7 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = N5_LAG; N_6 = N6_LAG; N_7 = SHARE; N_8 = 1; N_9 = 1; N_10 = 1; end; do; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); N3_LAG = LAG(N_3); N4_LAG = LAG(N_4); N5_LAG = LAG(N_5); N6_LAG = LAG(N_6); N7_LAG = LAG(N_7); end;
if LEVEL = 8 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = N5_LAG; N_6 = N6_LAG; N_7 = N7_LAG; N_8 = SHARE; N_9 = 1; N_10 = 1; end; do; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); N3_LAG = LAG(N_3); N4_LAG = LAG(N_4); N5_LAG = LAG(N_5); N6_LAG = LAG(N_6); N7_LAG = LAG(N_7); N8_LAG = LAG(N_8); end;
if LEVEL = 9 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = N5_LAG; N_6 = N6_LAG; N_7 = N7_LAG; N_8 = N8_LAG; N_9 = SHARE; N_10 = 1; end; do; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); N3_LAG = LAG(N_3); N4_LAG = LAG(N_4); N5_LAG = LAG(N_5); N6_LAG = LAG(N_6); N7_LAG = LAG(N_7); N8_LAG = LAG(N_8); N9_LAG = LAG(N_9); end;
if LEVEL = 10 then do; N_1 = N1_LAG; N_2 = N2_LAG; N_3 = N3_LAG; N_4 = N4_LAG; N_5 = N5_LAG; N_6 = N6_LAG; N_7 = N7_LAG; N_8 = N8_LAG; N_9 = N9_LAG; N_10 = SHARE; end; do; N1_LAG = LAG(N_1); N2_LAG = LAG(N_2); N3_LAG = LAG(N_3); N4_LAG = LAG(N_4); N5_LAG = LAG(N_5); N6_LAG = LAG(N_6); N7_LAG = LAG(N_7); N8_LAG = LAG(N_8); N9_LAG = LAG(N_9); N10_LAG = LAG(N_10); end;
N1_LAG = LAG(N_1);  N2_LAG = LAG(N_2); N3_LAG = LAG(N_3); N4_LAG = LAG(N_4); N5_LAG = LAG(N_5); N6_LAG = LAG(N_6); N7_LAG = LAG(N_7); N8_LAG = LAG(N_8); N9_LAG = LAG(N_9); N10_LAG = LAG(N_10);
RESULT = N_1 * N_2 * N_3 * N_4 * N_5 * N_6 * N_7 * N_8 * N_9 * N_10;
retain _all_;
run;
Solution
‎03-14-2016 08:37 AM
Respected Advisor
Posts: 4,608

Re: LAG function (see previous obs): Missing values as a result of an IF statement

Wouldn't this give you what you want?

 


data want (keep = COMPANY_ID LEVEL PREVIOUS SHARE N_1 N_2 N_3 N_4 N_5 N_6 N_7 N_8 N_9 N_10 RESULT);
retain n_1-n_10;
array n n_1-n_10;
set have2;
PREVIOUS = LAG(LEVEL);
n{level} = share;
do i = level+1 to dim(n);
    n{i} = 1;
    end;
RESULT = N_1 * N_2 * N_3 * N_4 * N_5 * N_6 * N_7 * N_8 * N_9 * N_10;
run;

Added note: Are you sure you want to copy information from the previous observation when it is about a different company_id?

PG
New Contributor
Posts: 4

Re: LAG function (see previous obs): Missing values as a result of an IF statement

[ Edited ]

Thanks PGStats and Astounding it works.

Yes I'm sure, after I added "retain all" everything went good, there is no problem with other company's ID's (next company on the table) because every company always starts with Level 1, and Level 1 doesn't pick up any "N" from previous obs. I also tested with real data and more companys IDs! Thanks for asking.

Regards

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 409 views
  • 1 like
  • 3 in conversation