DATA Step, Macro, Functions and more

Calculate a special "SUMPRODUCT" (finite series) using values from a different dataset's variables

Reply
Contributor
Posts: 45

Calculate a special "SUMPRODUCT" (finite series) using values from a different dataset's variables

Dear all,

 

I have the following dataset, DSN_1

 

pic1.jpg

 

 

 

and a second DSN_2

pic2.jpg

 

 

I would like to create a new variable Calcl in DSN_1,

 

pic3.jpg

 

that will calculate the following quantities per each obs of DSN_1

 

 

for ID = 18798  Calcl =1,100  +  1,100 * 0.25 *( p15 + p15*p16 + p15*p16*p17 + p15*p16*p17*p18)

 

for ID = 21345  Calcl =2 * 250  +  250 * 0.25 *( g9 + g9*g10 + g9*g10*g11 + g9*g10*g11*g12 + g9*g10*g11*g12*g13 + g9*g10*g11*g12*g13*g14)

 

for ID = 32178  Calcl =1,000  +  1,000 * 0.25 *( k4 + k4*k5 + k4*k5*k6)

 

Each calculation will start at (DIFF_MNTHS_FROM + 1) obs. and will have (Tot_MNTHS - 1) terms

i.e. for the 1st Calcl. will start at the 15th obs. of DSN_2 for cat1 and it will have 4 terms

      for the 2nd Calcl. will start at the 4th obs. of DSN_2 for cat2 and it will have 3 terms

      for the 3rd Calcl. will start at the 9th obs. of DSN_2 for cat3 and it will have 6 terms

 

additionally when variable DBL_SM_FRST_MONT equals "YES" then the very first term will be doubled

i.e. 2 * 250 for the second Calcl.

 

Any hints will be more than welcome.

 

Thanking you in advance.

 

Kind regards

Contributor
Posts: 45

Re: Calculate a special "SUMPRODUCT" (finite series) using values from a different dataset

Posted in reply to Zeus_Olympous

Forgive me but the wanted calculations should read as follows :

 

=1,100  +  1,100 * ( p15 + (0.25)^1 * p15*p16 + (0.25)^ 2 * p15*p16*p17 + (0.25)^3 * p15*p16*p17*p18)

 

=2 * 250  +  250 *( g9 + (0.25)^1 * g9*g10 + (0.25)^2 * g9*g10*g11 + (0.25)^3 * g9*g10*g11*g12 + (0.25)^4 * g9*g10*g11*g12*g13 + (0.25)^5 * g9*g10*g11*g12*g13*g14)

 

=1,000  +  1,000 * ( k4 + (0.25)^1 * k4*k5 + (0.25)^2 * k4*k5*k6)

 

Thank you for your attention

Contributor
Posts: 45

Re: Calculate a special "SUMPRODUCT" (finite series) using values from a different dataset

Posted in reply to Zeus_Olympous

Dear,

 

Please find below the datasets I have DSN_1 and DSN_2 .

 

Additionally I have added the dataset WANT with the required CALCL variable with values produced by the posted summation+products.

 

In fact what I need is for each observation of DSN_1 to "lookup" table DSN_2  values for the appropriate CAT(I) (as dictated by variable FLG_A and perform the calculations

 

=1,100  +  1,100 * ( p15 + (0.25)^1 * p15*p16 + (0.25)^ 2 * p15*p16*p17 + (0.25)^3 * p15*p16*p17*p18)

 

=2 * 250  +  250 *( g9 + (0.25)^1 * g9*g10 + (0.25)^2 * g9*g10*g11 + (0.25)^3 * g9*g10*g11*g12 + (0.25)^4 * g9*g10*g11*g12*g13 + (0.25)^5 * g9*g10*g11*g12*g13*g14)

 

=1,000  +  1,000 * ( k4 + (0.25)^1 * k4*k5 + (0.25)^2 * k4*k5*k6)

 

in fact there are only two conditions

 

1st cond:

 

Each calculation will start at (DIFF_MNTHS_FROM + 1) obs. and will have (Tot_MNTHS - 1) terms

i.e. for the 1st  Calcl. will start at the 15th obs. of DSN_2 for cat1 and it will have 4 terms

 

  (=1100+1100*(0,9711+(0,25**1)*0,9711*0,9735 +(0,25**2)*0,9711*0,9735 *0,9755 +0,25^3*0,9711*0,9735 *0,9755 *0,9777 )

 

      for the 2nd Calcl. will start at the 9th obs. of  DSN_2 for cat3 and it will have 6 terms

 

(=2*250+250*(0,9711 +(0,25**1)*0,9711 *0,9734 +(0,25**2)*0,9711 *0,9734 *0,9755 +(0,25**3)*0,9711 *0,9734 *0,9755 *0,9775 +(0,25**4)*0,9711 *0,9734 *0,9755 *0,9775 *0,9794 +(0,25**5)*0,9711 *0,9734 *0,9755 *0,9775 *0,9794 *0,9812 )

 

      for the 3rd Calcl. will start at the 4th obs. of   DSN_2 for cat2 and it will have 3 terms

(=1000+1000*(0,9367 + (0,25**1)*0,9367*0,9404 +(0,25**2)*0,9367*0,9404 *0,9438 )

 

 2nd cond.

 

when variable DBL_SM_FRST_MONT equals "YES" then the very first term will be doubled

i.e. 2 * 250 for the second Calcl.

 

I hope this clarifies my question.

Thanking you in advance

 

==================================================

 

data DSN_1;
input ID $ Start_Date End_Date TODAY :YYMMDD10. No C1 $ Year2 $ C3 $ Date Smiley Very HappyDMMYY10.
       DIFF_MNTHS_FROM DIFF_MNTHS_TO Tot_MNTHS Amount Mult FLG_A $ DBL_SM_FRST_MONT $;
cards;
18798 2016/09/01 2018/04/07 2017/11/13 14 19 5    500  0,25 cat1 NO
21345 2017/03/11 2018/06/12 2017/11/13  8 15 7    250  0,25 cat3 YES
32178 2017/08/01 2017/12/01 2017/11/13  3  7 4  1.000  0,25 cat2 NO
....
....
....
....
;
run;


data DSN_2;
input nr cat1 cat2 cat3 ;
cards;
0  1,0000   1,0000   1,0000
1  0,9160   0,9755   0,9604
2  0,9334   0,7908   0,9525
3  0,9364   0,9329   0,9554
4  0,9394   0,9367   0,9582
5  0,9424   0,9404   0,9609
6  0,9455   0,9438   0,9636
7  0,9486   0,9469   0,9662
8  0,9516   0,9499   0,9687
9  0,9547   0,9527   0,9711
10  0,9577   0,9556   0,9734
11  0,9605   0,9586   0,9755
12  0,9633   0,9616   0,9775
13  0,9661   0,9649   0,9794
14  0,9687   0,9683   0,9812
15  0,9711   0,9717   0,9829
16  0,9735   0,9749   0,9844
17  0,9755   0,9778   0,9888
18  0,9777   0,9805   0,9899
19  0,9798   0,9827   0,9907
20  0,9818   0,9845   0,9916
....
....
....
....
;
run;


data WANT;
input ID $ Start_Date End_Date TODAY :YYMMDD10. No C1 $ Year2 $ C3 $ Date Smiley Very HappyDMMYY10.
       DIFF_MNTHS_FROM DIFF_MNTHS_TO Tot_MNTHS Amount Mult FLG_A $ DBL_SM_FRST_MONT $ CALCL;
cards;
18798 2016/09/01 2018/04/07 2017/11/13 14 19 5    500  0,25 cat1 NO    2.507,12
21345 2017/03/11 2018/06/12 2017/11/13  8 15 7    250  0,25 cat3 YES     820,86
32178 2017/08/01 2017/12/01 2017/11/13  3  7 4  1.000  0,25 cat2 NO    2.208,93
....
....
....
....
;
run;

Trusted Advisor
Posts: 1,022

Re: Calculate a special "SUMPRODUCT" (finite series) using values from a different dataset

Posted in reply to Zeus_Olympous

@Zeus_Olympous

 

Thanks for posting data as SAS data, but I suspect you apparently didn't run your sample DATA steps.  DSN1 has 16 vars defined, but only 11 data values per line.  WANT has 17 vars defined but only 12 data values per line.

 


Zeus_Olympous wrote:

Dear,

 

Please find below the datasets I have DSN_1 and DSN_2 .

 

Additionally I have added the dataset WANT with the required CALCL variable with values produced by the posted summation+products.

 

In fact what I need is for each observation of DSN_1 to "lookup" table DSN_2  values for the appropriate CAT(I) (as dictated by variable FLG_A and perform the calculations

 

=1,100  +  1,100 * ( p15 + (0.25)^1 * p15*p16 + (0.25)^ 2 * p15*p16*p17 + (0.25)^3 * p15*p16*p17*p18)

 

=2 * 250  +  250 *( g9 + (0.25)^1 * g9*g10 + (0.25)^2 * g9*g10*g11 + (0.25)^3 * g9*g10*g11*g12 + (0.25)^4 * g9*g10*g11*g12*g13 + (0.25)^5 * g9*g10*g11*g12*g13*g14)

 

=1,000  +  1,000 * ( k4 + (0.25)^1 * k4*k5 + (0.25)^2 * k4*k5*k6)

 

in fact there are only two conditions

 

1st cond:

 

Each calculation will start at (DIFF_MNTHS_FROM + 1) obs. and will have (Tot_MNTHS - 1) terms

i.e. for the 1st  Calcl. will start at the 15th obs. of DSN_2 for cat1 and it will have 4 terms

 

  (=1100+1100*(0,9711+(0,25**1)*0,9711*0,9735 +(0,25**2)*0,9711*0,9735 *0,9755 +0,25^3*0,9711*0,9735 *0,9755 *0,9777 )

 

      for the 2nd Calcl. will start at the 9th obs. of  DSN_2 for cat3 and it will have 6 terms

 

(=2*250+250*(0,9711 +(0,25**1)*0,9711 *0,9734 +(0,25**2)*0,9711 *0,9734 *0,9755 +(0,25**3)*0,9711 *0,9734 *0,9755 *0,9775 +(0,25**4)*0,9711 *0,9734 *0,9755 *0,9775 *0,9794 +(0,25**5)*0,9711 *0,9734 *0,9755 *0,9775 *0,9794 *0,9812 )

 

      for the 3rd Calcl. will start at the 4th obs. of   DSN_2 for cat2 and it will have 3 terms

(=1000+1000*(0,9367 + (0,25**1)*0,9367*0,9404 +(0,25**2)*0,9367*0,9404 *0,9438 )

 

 2nd cond.

 

when variable DBL_SM_FRST_MONT equals "YES" then the very first term will be doubled

i.e. 2 * 250 for the second Calcl.

 

I hope this clarifies my question.

Thanking you in advance

 

==================================================

 

data DSN_1;
input ID $ Start_Date End_Date TODAY :YYMMDD10. No C1 $ Year2 $ C3 $ Date Smiley Very HappyDMMYY10.
       DIFF_MNTHS_FROM DIFF_MNTHS_TO Tot_MNTHS Amount Mult FLG_A $ DBL_SM_FRST_MONT $;
cards;
18798 2016/09/01 2018/04/07 2017/11/13 14 19 5    500  0,25 cat1 NO
21345 2017/03/11 2018/06/12 2017/11/13  8 15 7    250  0,25 cat3 YES
32178 2017/08/01 2017/12/01 2017/11/13  3  7 4  1.000  0,25 cat2 NO
....
....
....
....
;
run;


data DSN_2;
input nr cat1 cat2 cat3 ;
cards;
0  1,0000   1,0000   1,0000
1  0,9160   0,9755   0,9604
2  0,9334   0,7908   0,9525
3  0,9364   0,9329   0,9554
4  0,9394   0,9367   0,9582
5  0,9424   0,9404   0,9609
6  0,9455   0,9438   0,9636
7  0,9486   0,9469   0,9662
8  0,9516   0,9499   0,9687
9  0,9547   0,9527   0,9711
10  0,9577   0,9556   0,9734
11  0,9605   0,9586   0,9755
12  0,9633   0,9616   0,9775
13  0,9661   0,9649   0,9794
14  0,9687   0,9683   0,9812
15  0,9711   0,9717   0,9829
16  0,9735   0,9749   0,9844
17  0,9755   0,9778   0,9888
18  0,9777   0,9805   0,9899
19  0,9798   0,9827   0,9907
20  0,9818   0,9845   0,9916
....
....
....
....
;
run;


data WANT;
input ID $ Start_Date End_Date TODAY :YYMMDD10. No C1 $ Year2 $ C3 $ Date Smiley Very HappyDMMYY10.
       DIFF_MNTHS_FROM DIFF_MNTHS_TO Tot_MNTHS Amount Mult FLG_A $ DBL_SM_FRST_MONT $ CALCL;
cards;
18798 2016/09/01 2018/04/07 2017/11/13 14 19 5    500  0,25 cat1 NO    2.507,12
21345 2017/03/11 2018/06/12 2017/11/13  8 15 7    250  0,25 cat3 YES     820,86
32178 2017/08/01 2017/12/01 2017/11/13  3  7 4  1.000  0,25 cat2 NO    2.208,93
....
....
....
....
;
run;


 

Contributor
Posts: 45

Re: Calculate a special "SUMPRODUCT" (finite series) using values from a different dataset

My apologies.

 

=============================================

DSN_1 should read as

 

data DSN_1;
input ID $ Start_Date End_Date TODAY :YYMMDD10.
       DIFF_MNTHS_FROM DIFF_MNTHS_TO Tot_MNTHS Amount Mult FLG_A $ DBL_SM_FRST_MONT $;
cards;
18798 2016/09/01 2018/04/07 2017/11/13 14 19 5    500  0,25 cat1 NO
21345 2017/03/11 2018/06/12 2017/11/13  8 15 7    250  0,25 cat3 YES
32178 2017/08/01 2017/12/01 2017/11/13  3  7 4  1.000  0,25 cat2 NO
....
....
....
....
;
run;

 

and dataset WANT as:

 

data WANT;
input ID $ Start_Date End_Date TODAY :YYMMDD10.
       DIFF_MNTHS_FROM DIFF_MNTHS_TO Tot_MNTHS Amount Mult FLG_A $ DBL_SM_FRST_MONT $ CALCL;
cards;
18798 2016/09/01 2018/04/07 2017/11/13 14 19 5    500  0,25 cat1 NO    2.507,12
21345 2017/03/11 2018/06/12 2017/11/13  8 15 7    250  0,25 cat3 YES     820,86
32178 2017/08/01 2017/12/01 2017/11/13  3  7 4  1.000  0,25 cat2 NO    2.208,93
....
....
....
....
;
run;

=============================================

Thanks

Trusted Advisor
Posts: 1,022

Re: Calculate a special "SUMPRODUCT" (finite series) using values from a different dataset

  1. It looks like you still haven't actually run the data steps with your sample data.  I get notes like the following from them (even after removing the "...." lines:
      

    1 data DSN_1;

    2 input ID $ Start_Date End_Date TODAY :YYMMDD10.

    3 DIFF_MNTHS_FROM DIFF_MNTHS_TO Tot_MNTHS Amount Mult FLG_A $ DBL_SM_FRST_MONT $;

    4 cards;

    NOTE: Invalid data for Start_Date in line 5 7-16.

    NOTE: Invalid data for End_Date in line 5 18-27.

    NOTE: Invalid data for Mult in line 5 56-59.

    RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8

    5 18798 2016/09/01 2018/04/07 2017/11/13 14 19 5 500 0,25 cat1 NO

    ID=18798 Start_Date=. End_Date=. TODAY=21136 DIFF_MNTHS_FROM=14 DIFF_MNTHS_TO=19 Tot_MNTHS=5 Amount=500 Mult=. FLG_A=cat1

    DBL_SM_FRST_MONT=NO _ERROR_=1 _N_=1

  2. Where do you get the 1100 in the first observation?
  3. This looks like a homework problem.  What have you tried so far?

 

Here's a hint:  consider using the POINT= options of the SET statement. Put that inside a loop that traverses the needed observations from DSN2  (and note that OBSERVATION 15 is for NR=14):

 

data want;

   set dsn1;

   if something then cacls=2*amount;

   else cacls=amount;

 

 

   do P=   somewhere to somewhere_else;

      set dsn2 point=p;

      if flga='CAT1' then x=cat1;
      else if flga='CAT2' then  x=cat2;

      else if flga='CAT3' then x=cat3;

     cacls=cacls + some-function-of-X and multiplier;

   end;

run;

    

Super User
Posts: 19,878

Re: Calculate a special "SUMPRODUCT" (finite series) using values from a different dataset

Posted in reply to Zeus_Olympous

Are there rules that come with this, or are these values hardcoded, i.e. you'e doing the calculation once without need to expand this beyond this exact example?

 

And you need to post data as text, not images. 


Zeus_Olympous wrote:

 

 

that will calculate the following quantities per each obs of DSN_1

 

 

for ID = 18798  Calcl =1,100  +  1,100 * 0.25 *( p15 + p15*p16 + p15*p16*p17 + p15*p16*p17*p18)

 

for ID = 21345  Calcl =2 * 250  +  250 * 0.25 *( g9 + g9*g10 + g9*g10*g11 + g9*g10*g11*g12 + g9*g10*g11*g12*g13 + g9*g10*g11*g12*g13*g14)

 

for ID = 32178  Calcl =1,000  +  1,000 * 0.25 *( k4 + k4*k5 + k4*k5*k6)

 

Each calculation will start at (DIFF_MNTHS_FROM + 1) obs. and will have (Tot_MNTHS - 1) terms

i.e. for the 1st Calcl. will start at the 15th obs. of DSN_2 for cat1 and it will have 4 terms

      for the 2nd Calcl. will start at the 4th obs. of DSN_2 for cat2 and it will have 3 terms

      for the 3rd Calcl. will start at the 9th obs. of DSN_2 for cat3 and it will have 6 terms

 

additionally when variable DBL_SM_FRST_MONT equals "YES" then the very first term will be doubled

i.e. 2 * 250 for the second Calcl.

 

Any hints will be more than welcome.

 

Thanking you in advance.

 

Kind regards


 

Contributor
Posts: 45

Re: Calculate a special "SUMPRODUCT" (finite series) using values from a different dataset

Hello ,

 

I provided some clarifications above.

 

Thznk you.

Trusted Advisor
Posts: 1,022

Re: Calculate a special "SUMPRODUCT" (finite series) using values from a different dataset

Posted in reply to Zeus_Olympous

Dear Zeus:

 

Please come down from Olympus and show us mortals SAS datasets that describe your problem, and desired solution.

 

Athena is no doubt lurking around with the necessary wisdom, but will likely be more helpful to a well-defined problem placed on the alter of the sas forum temple.

Ask a Question
Discussion stats
  • 8 replies
  • 171 views
  • 2 likes
  • 3 in conversation