BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
teelov
Quartz | Level 8

Hello,

I have values in a table which have met certain conditions. i'm in a situation where i have already joined the table on its selft to create an rolling total based on the next value by doing a N-1 type join, however i'm a bit stumped of how to add extra rules

 

the next problem is where i have grouped keys, i want identify the first key in the group (VAR1), hold that value, then when the column called "FLAG" has a "Y" sum them together, then move on to the next group. see example of data then what i want to achieve

 

HAVE

VAR1VAR9VAR10FLAG
9876543219202.7473N
9876543218960.37134N
9876543218223.05195Y
9876543216698.29256Y
9876543215159.22315Y
9876543214433.15347Y
9876543213935.12406N
9876543213689.69486N
9876543213285.74530N
9876543212245.72589N
987774445132.6353N
987774445117.2960N
987774445096.8467N
987774445088.6670N
987774445070.2577N
987774445048.7784N
987774445025.2591N
987774445002.7698N
987774444995.6100N
987774444977.19107N
987774444954.69114N
987774444926.06121N
987774444902.54128N
987774444894.36131N
987774444874.93138N
987774444847.31145N
987774444811.52152N
987774444770.62159N
987774444738.91166N
987774444731.76168N
987774444691.87175N
987774444664.26182Y

 

WANT

VAR1VAR9VAR10FLAGPerc_incRec_No
9876543219202.7473N11.9%1
9876543218223.05195Y.2
987774445132.6353N10.0%1
987774444664.26182Y.2

 

previously when i've only had 2 or so records per group i've dont something like

 

proc sql;
	create table work.test as
		select a.*, b.rec_no as next_rec,
			b.val as next_val,
			b.val2 as next_val2
		from work.test as a
			left join
				work.test as b
				on a.n eq (b.n - 1);
quit;

But i'm finding it tricky to get a lead value, in a grouping, not sure how to tackle it - could be staring me in the face, just spent too long on it now.

 

thanks in advance

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Try this:

data have;
input VAR1 :$10. VAR9  VAR10 FLAG :$1.;
datalines;
987654321 9202.74 73  N
987654321 8960.37 134 N
987654321 8223.05 195 Y
987654321 6698.29 256 Y
987654321 5159.22 315 Y
987654321 4433.15 347 Y
987654321 3935.12 406 N
987654321 3689.69 486 N
987654321 3285.74 530 N
987654321 2245.72 589 N
98777444  5132.63 53  N
98777444  5117.29 60  N
98777444  5096.84 67  N
98777444  5088.66 70  N
98777444  5070.25 77  N
98777444  5048.77 84  N
98777444  5025.25 91  N
98777444  5002.76 98  N
98777444  4995.6  100 N
98777444  4977.19 107 N
98777444  4954.69 114 N
98777444  4926.06 121 N
98777444  4902.54 128 N
98777444  4894.36 131 N
98777444  4874.93 138 N
98777444  4847.31 145 N
98777444  4811.52 152 N
98777444  4770.62 159 N
98777444  4738.91 166 N
98777444  4731.76 168 N
98777444  4691.87 175 N
98777444  4664.26 182 Y
;

data want;
merge
  have
  have (
    rename=(var9=_var9 var10=_var10 flag=_flag)
    where=(_flag = 'Y')
  )
;
by var1;
if first.var1;
format perc_inc percent7.1;
perc_inc = (var9 - _var9) / _var9;
rec_no = 1;
output;
flag = _flag;
var9 = _var9;
var10 = _var10;
perc_inc = .;
rec_no = 2;
output;
drop _:;
run;

View solution in original post

4 REPLIES 4
Kurt_Bremser
Super User

So you need to compare the overall first value of a group with the first value within that group where flag="Y"? Meaning, the result will always have only 2 observations per VAR1.

teelov
Quartz | Level 8
sorry i should change my example you can have many "Y" - i'm just interested in the first
Kurt_Bremser
Super User

Try this:

data have;
input VAR1 :$10. VAR9  VAR10 FLAG :$1.;
datalines;
987654321 9202.74 73  N
987654321 8960.37 134 N
987654321 8223.05 195 Y
987654321 6698.29 256 Y
987654321 5159.22 315 Y
987654321 4433.15 347 Y
987654321 3935.12 406 N
987654321 3689.69 486 N
987654321 3285.74 530 N
987654321 2245.72 589 N
98777444  5132.63 53  N
98777444  5117.29 60  N
98777444  5096.84 67  N
98777444  5088.66 70  N
98777444  5070.25 77  N
98777444  5048.77 84  N
98777444  5025.25 91  N
98777444  5002.76 98  N
98777444  4995.6  100 N
98777444  4977.19 107 N
98777444  4954.69 114 N
98777444  4926.06 121 N
98777444  4902.54 128 N
98777444  4894.36 131 N
98777444  4874.93 138 N
98777444  4847.31 145 N
98777444  4811.52 152 N
98777444  4770.62 159 N
98777444  4738.91 166 N
98777444  4731.76 168 N
98777444  4691.87 175 N
98777444  4664.26 182 Y
;

data want;
merge
  have
  have (
    rename=(var9=_var9 var10=_var10 flag=_flag)
    where=(_flag = 'Y')
  )
;
by var1;
if first.var1;
format perc_inc percent7.1;
perc_inc = (var9 - _var9) / _var9;
rec_no = 1;
output;
flag = _flag;
var9 = _var9;
var10 = _var10;
perc_inc = .;
rec_no = 2;
output;
drop _:;
run;
teelov
Quartz | Level 8
i like this method, but there are some nice things to learn from here, i'm not sure how it will perform on a 200M dataset with 159vars but the techniques are great, thank you for your help... AS ALWAYS!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 757 views
  • 1 like
  • 2 in conversation