BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cwhit
Fluorite | Level 6

Hi, 

 

I am trying to create a new row from an existing row that is effectively 1 point in time divided by the next point in time. 

 

my table looks like this

 

FieldAFieldBFieldCT0PQ1PQ2PQ3....PQ32
ValueValue2Balance100989795...2

 

what I would like to do is create a new row where FieldA and FieldB are unchanged, FieldC will now be 'Decay' and we start dividing PQ1/T0  PQ2/PQ1 ...

 

FieldAFieldBFieldCT0PQ1PQ2PQ3....PQ32
ValueValue2Decay PQ1/T0PQ2/PQ1PQ3/PQ2...PQ32/PQ31

 

aka

 

FieldAFieldBFieldCT0PQ1PQ2PQ3....PQ32
ValueValue2Decay 98/10097/9895/97...2/somethign

 

and the final output would look like 

 

FieldAFieldBFieldCT0PQ1PQ2PQ3....PQ32
ValueValue2Balance100989795...2
ValueValue2Decay.0.980.9897959180.979381443 something

 

 

could you please help me figure out how to iterate through this process.

 

Best,

Chris w. 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @cwhit,

 


@cwhit wrote:

I created a new column PQ0 = T0 so the code would work


I think @Reeza meant T0--PQ32 (double dash), i.e. the list of variables from T0 through PQ32 in the program data vector (PDV). To be on the safe side (not assuming a certain variable order in the PDV), I would prefer T0 PQ1-PQ32. Either way, a new column is not necessary.

 


@cwhit wrote:

I just realized something strange is happening with the units ...

 

the output seems to be correct but the units are changing every other column. for example. Decay at PQ2 is 96.938 which is the right number but is somehow is not in the same units as Decay at PQ1 or PQ3 or PQ5.. very strange Any ideas as to why this is the case?  

 

FieldA FieldB FieldC T0 PQ1 PQ2 PQ3 PQ4 PQ5 PQ6 PQ0
a b balance 100 98 95 93 92 85 70 100
a b decay 100 0.98 96.93878 0.959368 95.89642 0.886373 78.97352 100

It's always good to investigate strange results. In this case, I'm afraid, it turns out that most of the results are incorrect (and not only some off by a factor of 100 as it may seem at first glance), e.g. PQ3=(93/95)*(98/100)=0.959368... and not 93/95=0.978947... what it should be.

 

Reason: The formula _rvars(i) = _rvars(i) / _rvars(i-1) is recursive. Currently, it overwrites the value which is needed as the denominator in the next iteration of the DO loop. To avoid this, you can perform the iteration in descending order (PQ32=..., PQ31=..., ...) or save the value (which would be lost otherwise) in a temporary variable, say h, and use the LAG function to return the previous value of h in each iteration. In the latter case I'd let the DO loop start at i=1 in order to get a nonmissing value from the LAG function for i=2 already. Variable T0 will then be set to missing because of the missing value from the LAG function in the first iteration, but this is exactly what you specified in your initial post. By using the DIVIDE function you can avoid the note "Missing values were generated ..." in the log.

 

Option 1: iteration in descending order:

data want(drop=i);
set have;
output;
FieldC = 'Decay';
array _rvars(*) T0 PQ1-PQ32;

do i=dim(_rvars) to 2 by -1;
  _rvars(i) = _rvars(i) / _rvars(i-1);
end;
_rvars(1)=.;

output;
run;

Option 2: LAG function with temporary variable:

data want(drop=i h);
set have;
output;
FieldC = 'Decay';
array _rvars(*) T0 PQ1-PQ32;

do i=1 to dim(_rvars);
  h=_rvars(i); /* save existing value for the next iteration */
  _rvars(i) = divide(_rvars(i),lag(h));
end;

output;
run;

 

View solution in original post

5 REPLIES 5
Reeza
Super User

 

 

data want;
set have;

FieldC = 'Balance';
output;

FieldC = 'Decay';
array _rvars(*) T0-PQ32;


do i=2 to dim(_rvars);
_rvars(i) = _rvars(i) / _rvars(i-1);
end;

output;

run;

 

I would highly recommend you reconsider this format and create a long format with one column for balance and a second for decay. 

SAS loops automatically and its worth taking advantage of that. 

 

Tutorial on Arrays if you have question about the code above:

https://stats.idre.ucla.edu/sas/seminars/sas-arrays/


@cwhit wrote:

Hi, 

 

I am trying to create a new row from an existing row that is effectively 1 point in time divided by the next point in time. 

 

my table looks like this

 

FieldA FieldB FieldC T0 PQ1 PQ2 PQ3 .... PQ32
Value Value2 Balance 100 98 97 95 ... 2

 

what I would like to do is create a new row where FieldA and FieldB are unchanged, FieldC will now be 'Decay' and we start dividing PQ1/T0  PQ2/PQ1 ...

 

FieldA FieldB FieldC T0 PQ1 PQ2 PQ3 .... PQ32
Value Value2 Decay   PQ1/T0 PQ2/PQ1 PQ3/PQ2 ... PQ32/PQ31

 

aka

 

FieldA FieldB FieldC T0 PQ1 PQ2 PQ3 .... PQ32
Value Value2 Decay   98/100 97/98 95/97 ... 2/somethign

 

and the final output would look like 

 

FieldA FieldB FieldC T0 PQ1 PQ2 PQ3 .... PQ32
Value Value2 Balance 100 98 97 95 ... 2
Value Value2 Decay . 0.98 0.989795918 0.979381443   something

 

 

could you please help me figure out how to iterate through this process.

 

Best,

Chris w. 


 

cwhit
Fluorite | Level 6

Hi, 

 

This was almost the solution - I just realized something strange is happening with the units (note, I created a new column PQ0 = T0 so the code would work)

 

the output seems to be correct but the units are changing every other column. for example. Decay at PQ2 is 96.938 which is the right number but is somehow is not in the same units as Decay at PQ1 or PQ3 or PQ5.. very strange Any ideas as to why this is the case?  

 

FieldAFieldBFieldCT0PQ1PQ2PQ3PQ4PQ5PQ6PQ0
abbalance100989593928570100
abdecay1000.9896.938780.95936895.896420.88637378.97352100
FreelanceReinh
Jade | Level 19

Hi @cwhit,

 


@cwhit wrote:

I created a new column PQ0 = T0 so the code would work


I think @Reeza meant T0--PQ32 (double dash), i.e. the list of variables from T0 through PQ32 in the program data vector (PDV). To be on the safe side (not assuming a certain variable order in the PDV), I would prefer T0 PQ1-PQ32. Either way, a new column is not necessary.

 


@cwhit wrote:

I just realized something strange is happening with the units ...

 

the output seems to be correct but the units are changing every other column. for example. Decay at PQ2 is 96.938 which is the right number but is somehow is not in the same units as Decay at PQ1 or PQ3 or PQ5.. very strange Any ideas as to why this is the case?  

 

FieldA FieldB FieldC T0 PQ1 PQ2 PQ3 PQ4 PQ5 PQ6 PQ0
a b balance 100 98 95 93 92 85 70 100
a b decay 100 0.98 96.93878 0.959368 95.89642 0.886373 78.97352 100

It's always good to investigate strange results. In this case, I'm afraid, it turns out that most of the results are incorrect (and not only some off by a factor of 100 as it may seem at first glance), e.g. PQ3=(93/95)*(98/100)=0.959368... and not 93/95=0.978947... what it should be.

 

Reason: The formula _rvars(i) = _rvars(i) / _rvars(i-1) is recursive. Currently, it overwrites the value which is needed as the denominator in the next iteration of the DO loop. To avoid this, you can perform the iteration in descending order (PQ32=..., PQ31=..., ...) or save the value (which would be lost otherwise) in a temporary variable, say h, and use the LAG function to return the previous value of h in each iteration. In the latter case I'd let the DO loop start at i=1 in order to get a nonmissing value from the LAG function for i=2 already. Variable T0 will then be set to missing because of the missing value from the LAG function in the first iteration, but this is exactly what you specified in your initial post. By using the DIVIDE function you can avoid the note "Missing values were generated ..." in the log.

 

Option 1: iteration in descending order:

data want(drop=i);
set have;
output;
FieldC = 'Decay';
array _rvars(*) T0 PQ1-PQ32;

do i=dim(_rvars) to 2 by -1;
  _rvars(i) = _rvars(i) / _rvars(i-1);
end;
_rvars(1)=.;

output;
run;

Option 2: LAG function with temporary variable:

data want(drop=i h);
set have;
output;
FieldC = 'Decay';
array _rvars(*) T0 PQ1-PQ32;

do i=1 to dim(_rvars);
  h=_rvars(i); /* save existing value for the next iteration */
  _rvars(i) = divide(_rvars(i),lag(h));
end;

output;
run;

 

Reeza
Super User

Thanks @FreelanceReinh, good catch on my mistakes there 🙂

 

@cwhit please change the correct answer to @FreelanceReinh 

cwhit
Fluorite | Level 6

Thank you both very much. This was very helpful and has saved me a lot of unnecessary data steps. 

 

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 4567 views
  • 6 likes
  • 3 in conversation