I have a dataset with columns Month, A and Q as below:
Month | A | Q | Calc |
Jan-19 | 10.00% | ||
Feb-19 | 20.00% | 21.00% | 12.10% |
Mar-19 | 15.00% | 16.00% | 14.04% |
Apr-19 | 30.00% | 29.00% | 18.11% |
May-19 | 40.00% | 39.00% | 25.17% |
Jun-19 | 30.00% | 29.00% | 32.47% |
Jul-19 | 10.00% | 11.00% | 36.04% |
I want to create Calc, where first answer is 10%*(1+21%)=12.10%
The next row onwards uses the previous rows calc so 12.10%*(1+16%)=14.04%
What is the best way of doing this in SAS so it doesnt have to be done in Excel?
Thanks!
I am not sure what dataset you are running against and if you are able to follow the logic. Assuming your sample is a good representative sample, Here is my test using your sample.
data have;
input Month : monyy7. A : percent2. Q :percent2.;* Calc;
format month monyy7. a q percent.;
cards;
Jan-19 10.00% . .
Feb-19 20.00% 21.00% 12.10%
Mar-19 15.00% 16.00% 14.04%
Apr-19 30.00% 29.00% 18.11%
May-19 40.00% 39.00% 25.17%
Jun-19 30.00% 29.00% 32.47%
Jul-19 10.00% 11.00% 36.04%
;
data want;
set have;
retain calc;
if _n_=1 then do;
output;
calc=a;
end;
else do;
calc=calc*(1+q);
output;
end;
format calc percent8.2;
run;
proc print noobs;run;
RESULT:
Month | A | Q | calc |
---|---|---|---|
JAN2019 | 10% | . | . |
FEB2019 | 20% | 21% | 12.10% |
MAR2019 | 15% | 16% | 14.04% |
APR2019 | 30% | 29% | 18.11% |
MAY2019 | 40% | 39% | 25.17% |
JUN2019 | 30% | 29% | 32.47% |
JUL2019 | 10% | 11% | 36.04% |
.The calc variable is the new computed variable ( requirement)
Are the values of A and Q on row two 0.20 and 0.21 respectively, or are they character strings with a % sign at the end?
Hi Paige, they are numeric so 0.20 and 0.21
data have;
input Month : monyy7. A : percent2. Q :percent2.;* Calc;
format month monyy7. a q percent.;
cards;
Jan-19 10.00% . .
Feb-19 20.00% 21.00% 12.10%
Mar-19 15.00% 16.00% 14.04%
Apr-19 30.00% 29.00% 18.11%
May-19 40.00% 39.00% 25.17%
Jun-19 30.00% 29.00% 32.47%
Jul-19 10.00% 11.00% 36.04%
;
/*10%*(1+21%)=12.10%*/
data want;
set have;
retain calc ;
if _n_=1 then calc=a;
else calc=calc*(1+q);
format calc percent8.2;
run;
More precise
data have;
input Month : monyy7. A : percent2. Q :percent2.;* Calc;
format month monyy7. a q percent.;
cards;
Jan-19 10.00% . .
Feb-19 20.00% 21.00% 12.10%
Mar-19 15.00% 16.00% 14.04%
Apr-19 30.00% 29.00% 18.11%
May-19 40.00% 39.00% 25.17%
Jun-19 30.00% 29.00% 32.47%
Jul-19 10.00% 11.00% 36.04%
;
data want;
set have;
retain calc;
if _n_=1 then do;
output;
calc=a;
end;
else do;
calc=calc*(1+q);
output;
end;
format calc percent8.2;
run;
Hi, this doesn't appear to run correctly. Calc is the newly created calculation so do you just need a and q in the dataset?
Hi @PetePatel Here is my understanding
1. You have a dataset with 3 variables- Month, A and Q
2. You want to create a new variable by the name CALC using a logic/formula
3. I believe the the code that I wrote does that as I tested with the sample you gave me.
Am I missing something?
Hi @novinosrin ,
When I run the code through exactly as you've provided, it gives me a dataset with month (Jan, Feb and May only) with A, Q and Calc as missing (.)
I also wondered why you specified the calc values upfront.
I am not sure what dataset you are running against and if you are able to follow the logic. Assuming your sample is a good representative sample, Here is my test using your sample.
data have;
input Month : monyy7. A : percent2. Q :percent2.;* Calc;
format month monyy7. a q percent.;
cards;
Jan-19 10.00% . .
Feb-19 20.00% 21.00% 12.10%
Mar-19 15.00% 16.00% 14.04%
Apr-19 30.00% 29.00% 18.11%
May-19 40.00% 39.00% 25.17%
Jun-19 30.00% 29.00% 32.47%
Jul-19 10.00% 11.00% 36.04%
;
data want;
set have;
retain calc;
if _n_=1 then do;
output;
calc=a;
end;
else do;
calc=calc*(1+q);
output;
end;
format calc percent8.2;
run;
proc print noobs;run;
RESULT:
Month | A | Q | calc |
---|---|---|---|
JAN2019 | 10% | . | . |
FEB2019 | 20% | 21% | 12.10% |
MAR2019 | 15% | 16% | 14.04% |
APR2019 | 30% | 29% | 18.11% |
MAY2019 | 40% | 39% | 25.17% |
JUN2019 | 30% | 29% | 32.47% |
JUL2019 | 10% | 11% | 36.04% |
.The calc variable is the new computed variable ( requirement)
Thanks.
That's incredibly strange as when I run your code through without making changes I get:
FEB2019 | . | . | . |
MAY2019 | . | . | . |
I wonder if it's a SAS issue
@PetePatel wrote:
Thanks.
That's incredibly strange as when I run your code through without making changes I get:
Month A Q calc
FEB2019 . . . MAY2019 . . .
I wonder if it's a SAS issue
Most likely a copy and paste issue. Make sure that you have to inserted TAB characters into the text.
Hi @PetePatel Riight, I can sense where the heck the problem is . I believe you are executing in SAS EG am i right?
Like @Tom pointed the TAB chars, add to the code and try
infile cards expandtabs;
Full version
data have;
infile cards expandtabs;
input Month : monyy7. A : percent2. Q :percent2.;* Calc;
format month monyy7. a q percent.;
cards;
Jan-19 10.00% . .
Feb-19 20.00% 21.00% 12.10%
Mar-19 15.00% 16.00% 14.04%
Apr-19 30.00% 29.00% 18.11%
May-19 40.00% 39.00% 25.17%
Jun-19 30.00% 29.00% 32.47%
Jul-19 10.00% 11.00% 36.04%
;
data want;
set have;
retain calc;
if _n_=1 then do;
output;
calc=a;
end;
else do;
calc=calc*(1+q);
output;
end;
format calc percent8.2;
run;
Yes sorry should have mentioned SAS EG.
Works perfectly now thanks!
@PetePatel wrote:
Hi, this doesn't appear to run correctly. Calc is the newly created calculation so do you just need a and q in the dataset?
You do not want the new "retained" variable in the incoming dataset. If it is there then when SAS reads the next observation the retained value is overwritten with the value read from the dataset.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.