Hello,
I have values in a table which have met certain conditions. i'm in a situation where i have already joined the table on its selft to create an rolling total based on the next value by doing a N-1 type join, however i'm a bit stumped of how to add extra rules
the next problem is where i have grouped keys, i want identify the first key in the group (VAR1), hold that value, then when the column called "FLAG" has a "Y" sum them together, then move on to the next group. see example of data then what i want to achieve
HAVE
VAR1 | VAR9 | VAR10 | FLAG |
987654321 | 9202.74 | 73 | N |
987654321 | 8960.37 | 134 | N |
987654321 | 8223.05 | 195 | Y |
987654321 | 6698.29 | 256 | Y |
987654321 | 5159.22 | 315 | Y |
987654321 | 4433.15 | 347 | Y |
987654321 | 3935.12 | 406 | N |
987654321 | 3689.69 | 486 | N |
987654321 | 3285.74 | 530 | N |
987654321 | 2245.72 | 589 | N |
98777444 | 5132.63 | 53 | N |
98777444 | 5117.29 | 60 | N |
98777444 | 5096.84 | 67 | N |
98777444 | 5088.66 | 70 | N |
98777444 | 5070.25 | 77 | N |
98777444 | 5048.77 | 84 | N |
98777444 | 5025.25 | 91 | N |
98777444 | 5002.76 | 98 | N |
98777444 | 4995.6 | 100 | N |
98777444 | 4977.19 | 107 | N |
98777444 | 4954.69 | 114 | N |
98777444 | 4926.06 | 121 | N |
98777444 | 4902.54 | 128 | N |
98777444 | 4894.36 | 131 | N |
98777444 | 4874.93 | 138 | N |
98777444 | 4847.31 | 145 | N |
98777444 | 4811.52 | 152 | N |
98777444 | 4770.62 | 159 | N |
98777444 | 4738.91 | 166 | N |
98777444 | 4731.76 | 168 | N |
98777444 | 4691.87 | 175 | N |
98777444 | 4664.26 | 182 | Y |
WANT
VAR1 | VAR9 | VAR10 | FLAG | Perc_inc | Rec_No |
987654321 | 9202.74 | 73 | N | 11.9% | 1 |
987654321 | 8223.05 | 195 | Y | . | 2 |
98777444 | 5132.63 | 53 | N | 10.0% | 1 |
98777444 | 4664.26 | 182 | Y | . | 2 |
previously when i've only had 2 or so records per group i've dont something like
proc sql;
create table work.test as
select a.*, b.rec_no as next_rec,
b.val as next_val,
b.val2 as next_val2
from work.test as a
left join
work.test as b
on a.n eq (b.n - 1);
quit;
But i'm finding it tricky to get a lead value, in a grouping, not sure how to tackle it - could be staring me in the face, just spent too long on it now.
thanks in advance
Try this:
data have;
input VAR1 :$10. VAR9 VAR10 FLAG :$1.;
datalines;
987654321 9202.74 73 N
987654321 8960.37 134 N
987654321 8223.05 195 Y
987654321 6698.29 256 Y
987654321 5159.22 315 Y
987654321 4433.15 347 Y
987654321 3935.12 406 N
987654321 3689.69 486 N
987654321 3285.74 530 N
987654321 2245.72 589 N
98777444 5132.63 53 N
98777444 5117.29 60 N
98777444 5096.84 67 N
98777444 5088.66 70 N
98777444 5070.25 77 N
98777444 5048.77 84 N
98777444 5025.25 91 N
98777444 5002.76 98 N
98777444 4995.6 100 N
98777444 4977.19 107 N
98777444 4954.69 114 N
98777444 4926.06 121 N
98777444 4902.54 128 N
98777444 4894.36 131 N
98777444 4874.93 138 N
98777444 4847.31 145 N
98777444 4811.52 152 N
98777444 4770.62 159 N
98777444 4738.91 166 N
98777444 4731.76 168 N
98777444 4691.87 175 N
98777444 4664.26 182 Y
;
data want;
merge
have
have (
rename=(var9=_var9 var10=_var10 flag=_flag)
where=(_flag = 'Y')
)
;
by var1;
if first.var1;
format perc_inc percent7.1;
perc_inc = (var9 - _var9) / _var9;
rec_no = 1;
output;
flag = _flag;
var9 = _var9;
var10 = _var10;
perc_inc = .;
rec_no = 2;
output;
drop _:;
run;
So you need to compare the overall first value of a group with the first value within that group where flag="Y"? Meaning, the result will always have only 2 observations per VAR1.
Try this:
data have;
input VAR1 :$10. VAR9 VAR10 FLAG :$1.;
datalines;
987654321 9202.74 73 N
987654321 8960.37 134 N
987654321 8223.05 195 Y
987654321 6698.29 256 Y
987654321 5159.22 315 Y
987654321 4433.15 347 Y
987654321 3935.12 406 N
987654321 3689.69 486 N
987654321 3285.74 530 N
987654321 2245.72 589 N
98777444 5132.63 53 N
98777444 5117.29 60 N
98777444 5096.84 67 N
98777444 5088.66 70 N
98777444 5070.25 77 N
98777444 5048.77 84 N
98777444 5025.25 91 N
98777444 5002.76 98 N
98777444 4995.6 100 N
98777444 4977.19 107 N
98777444 4954.69 114 N
98777444 4926.06 121 N
98777444 4902.54 128 N
98777444 4894.36 131 N
98777444 4874.93 138 N
98777444 4847.31 145 N
98777444 4811.52 152 N
98777444 4770.62 159 N
98777444 4738.91 166 N
98777444 4731.76 168 N
98777444 4691.87 175 N
98777444 4664.26 182 Y
;
data want;
merge
have
have (
rename=(var9=_var9 var10=_var10 flag=_flag)
where=(_flag = 'Y')
)
;
by var1;
if first.var1;
format perc_inc percent7.1;
perc_inc = (var9 - _var9) / _var9;
rec_no = 1;
output;
flag = _flag;
var9 = _var9;
var10 = _var10;
perc_inc = .;
rec_no = 2;
output;
drop _:;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.