Hello,
I have values in a table which have met certain conditions. i'm in a situation where i have already joined the table on its selft to create an rolling total based on the next value by doing a N-1 type join, however i'm a bit stumped of how to add extra rules
the next problem is where i have grouped keys, i want identify the first key in the group (VAR1), hold that value, then when the column called "FLAG" has a "Y" sum them together, then move on to the next group. see example of data then what i want to achieve
HAVE
VAR1 | VAR9 | VAR10 | FLAG |
987654321 | 9202.74 | 73 | N |
987654321 | 8960.37 | 134 | N |
987654321 | 8223.05 | 195 | Y |
987654321 | 6698.29 | 256 | Y |
987654321 | 5159.22 | 315 | Y |
987654321 | 4433.15 | 347 | Y |
987654321 | 3935.12 | 406 | N |
987654321 | 3689.69 | 486 | N |
987654321 | 3285.74 | 530 | N |
987654321 | 2245.72 | 589 | N |
98777444 | 5132.63 | 53 | N |
98777444 | 5117.29 | 60 | N |
98777444 | 5096.84 | 67 | N |
98777444 | 5088.66 | 70 | N |
98777444 | 5070.25 | 77 | N |
98777444 | 5048.77 | 84 | N |
98777444 | 5025.25 | 91 | N |
98777444 | 5002.76 | 98 | N |
98777444 | 4995.6 | 100 | N |
98777444 | 4977.19 | 107 | N |
98777444 | 4954.69 | 114 | N |
98777444 | 4926.06 | 121 | N |
98777444 | 4902.54 | 128 | N |
98777444 | 4894.36 | 131 | N |
98777444 | 4874.93 | 138 | N |
98777444 | 4847.31 | 145 | N |
98777444 | 4811.52 | 152 | N |
98777444 | 4770.62 | 159 | N |
98777444 | 4738.91 | 166 | N |
98777444 | 4731.76 | 168 | N |
98777444 | 4691.87 | 175 | N |
98777444 | 4664.26 | 182 | Y |
WANT
VAR1 | VAR9 | VAR10 | FLAG | Perc_inc | Rec_No |
987654321 | 9202.74 | 73 | N | 11.9% | 1 |
987654321 | 8223.05 | 195 | Y | . | 2 |
98777444 | 5132.63 | 53 | N | 10.0% | 1 |
98777444 | 4664.26 | 182 | Y | . | 2 |
previously when i've only had 2 or so records per group i've dont something like
proc sql;
create table work.test as
select a.*, b.rec_no as next_rec,
b.val as next_val,
b.val2 as next_val2
from work.test as a
left join
work.test as b
on a.n eq (b.n - 1);
quit;
But i'm finding it tricky to get a lead value, in a grouping, not sure how to tackle it - could be staring me in the face, just spent too long on it now.
thanks in advance
Try this:
data have;
input VAR1 :$10. VAR9 VAR10 FLAG :$1.;
datalines;
987654321 9202.74 73 N
987654321 8960.37 134 N
987654321 8223.05 195 Y
987654321 6698.29 256 Y
987654321 5159.22 315 Y
987654321 4433.15 347 Y
987654321 3935.12 406 N
987654321 3689.69 486 N
987654321 3285.74 530 N
987654321 2245.72 589 N
98777444 5132.63 53 N
98777444 5117.29 60 N
98777444 5096.84 67 N
98777444 5088.66 70 N
98777444 5070.25 77 N
98777444 5048.77 84 N
98777444 5025.25 91 N
98777444 5002.76 98 N
98777444 4995.6 100 N
98777444 4977.19 107 N
98777444 4954.69 114 N
98777444 4926.06 121 N
98777444 4902.54 128 N
98777444 4894.36 131 N
98777444 4874.93 138 N
98777444 4847.31 145 N
98777444 4811.52 152 N
98777444 4770.62 159 N
98777444 4738.91 166 N
98777444 4731.76 168 N
98777444 4691.87 175 N
98777444 4664.26 182 Y
;
data want;
merge
have
have (
rename=(var9=_var9 var10=_var10 flag=_flag)
where=(_flag = 'Y')
)
;
by var1;
if first.var1;
format perc_inc percent7.1;
perc_inc = (var9 - _var9) / _var9;
rec_no = 1;
output;
flag = _flag;
var9 = _var9;
var10 = _var10;
perc_inc = .;
rec_no = 2;
output;
drop _:;
run;
So you need to compare the overall first value of a group with the first value within that group where flag="Y"? Meaning, the result will always have only 2 observations per VAR1.
Try this:
data have;
input VAR1 :$10. VAR9 VAR10 FLAG :$1.;
datalines;
987654321 9202.74 73 N
987654321 8960.37 134 N
987654321 8223.05 195 Y
987654321 6698.29 256 Y
987654321 5159.22 315 Y
987654321 4433.15 347 Y
987654321 3935.12 406 N
987654321 3689.69 486 N
987654321 3285.74 530 N
987654321 2245.72 589 N
98777444 5132.63 53 N
98777444 5117.29 60 N
98777444 5096.84 67 N
98777444 5088.66 70 N
98777444 5070.25 77 N
98777444 5048.77 84 N
98777444 5025.25 91 N
98777444 5002.76 98 N
98777444 4995.6 100 N
98777444 4977.19 107 N
98777444 4954.69 114 N
98777444 4926.06 121 N
98777444 4902.54 128 N
98777444 4894.36 131 N
98777444 4874.93 138 N
98777444 4847.31 145 N
98777444 4811.52 152 N
98777444 4770.62 159 N
98777444 4738.91 166 N
98777444 4731.76 168 N
98777444 4691.87 175 N
98777444 4664.26 182 Y
;
data want;
merge
have
have (
rename=(var9=_var9 var10=_var10 flag=_flag)
where=(_flag = 'Y')
)
;
by var1;
if first.var1;
format perc_inc percent7.1;
perc_inc = (var9 - _var9) / _var9;
rec_no = 1;
output;
flag = _flag;
var9 = _var9;
var10 = _var10;
perc_inc = .;
rec_no = 2;
output;
drop _:;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.