Solved: Re: Using first.variable and last.variable to get sum of 2 observation...

jawhitmire · Posted 07-10-2019 06:31 PM

SAS Version 9.4

Good day and thank you for looking at my question.

data work.have;
	infile datalines dlm=' ';
	input CN $1. @5 SEN $1. @9 RT $1. @12 Value;
	datalines;
x	p	d	5
x	p	b	7
x	u	d	6
x 	u	b	8
y	t	d	2
y	t	b	8
z	t	d	3
z	t	b	9
q	p	d	4
q	p	b	6
;
run;

proc sort data=work.have; by cn sen; run;

The sorted dataset, work.have, is identical to the original.

For each unique CN, I need to check that the first observation of SEN is the same as the second observation of SEN.

Also, I need the first RT to be D or B; if D then the second RT should be B; if B, then the second RT should be D.

Finally, I need to calculate the sum of the 2 VALUES that meet these conditions.

The resulting dataset should look like this:

My attempts have failed, but I have this:

data work.want;
	set work.have;
	by CN SEN;
	sumit=0;
	if first.CN and first.sen then do;
		count=0;
		save_value1=value;
	end;
	count+1;
	sumit=save_value1;
	if last.cn and last.sen then save_value2=value;
	sumit+save_value2;
run;
proc print data=work.want; run;

Thank you for assistance.

Jane

Tom · Posted 07-11-2019 11:58 AM

Note that if your BY statement has two variables like:

by CN SEN;

Then whenever you are at the first observation for a given value of CN you are also by definition at the first observation for any variable after it in the BY statement. So your IF condition can be simplified by eliminating the FIRST.CN.

if first.SEN

View solution in original post

heffo · Posted 07-10-2019 08:59 PM

I tried, but the thing I don't get is why the first sumit should be 0 and the rest of them should be the sum of all the value in the group. But, I got the rest. If you really need a 0 there, you would have to use a temporary variable that you retain and then make sure to only use that one if you are on row two in each group.

data work.want;
	set work.have;
	by CN SEN;
	if first.sen then do;
		sumit=0; 
		count=0;
		save_value1=value;
	end;
	count+1;
	sumit+value;
	if last.sen then save_value2=value;
run;

I think you got a bit confused with the first.CN. You basically don't need that one.

If you have first.CN then by definition first.SEN will be true. As an example.

CN SEN

A A

A B

B B

The third row will have have both first.CN and first.SEN true.

jawhitmire · Posted 07-11-2019 11:36 AM

The problem is that CN may have 4 identical characters. The following is my current version and is summing as desired, but fails to check one RT value is "d" and the other is "b":

data work.have;
	infile datalines dlm=' ';
	input CN $1. @5 SEN $1. @9 RT $1. @12 Value;
	datalines;
x	p	d	5
x	p	b	7
x	u	d	6
x 	u	b	8
y	t	d	2
y	t	b	8
z	t	d	3
z	t	b	9
q	p	d	4
q	p	b	6
;
run;

proc sort data=work.have; by CN SEN; run;
data work.want2 ;
	set work.have;
	by CN SEN;
	if first.CN or first.SEN then cumscore=value;
	else cumscore+value;
run;
proc print data=work.want2; title 'want2'; run;

proc print data=work.want; run;

output:

Tom · Posted 07-11-2019 11:58 AM

Note that if your BY statement has two variables like:

by CN SEN;

Then whenever you are at the first observation for a given value of CN you are also by definition at the first observation for any variable after it in the BY statement. So your IF condition can be simplified by eliminating the FIRST.CN.

if first.SEN

Using first.variable and last.variable to get sum of 2 observations

Re: Using first.variable and last.variable to get sum of 2 observations

Re: Using first.variable and last.variable to get sum of 2 observations

Re: Using first.variable and last.variable to get sum of 2 observations

Re: Using first.variable and last.variable to get sum of 2 observations