average of consecutive numbers in a column

peatjohnston · Posted 02-06-2011 11:19 AM

Hello,

I have a set of data given to me by a student who entered the data in the following format:

sub$ d v
a 5 2
a 19 3
a 20 2
a 21 5
a 23 2
a 18 3
a 9 .
b 6 2
b 18 1
b 19 3
b 18 4
b 11 2
b 3 .

I need to take the average of two consecutive numbers in column 'd' and put them in a new column so that I can multiply them by column 'v' for each of the subjects (column 'sub$').
The result of taking the average (ave) should look like this:

sub$ d v ave
a 5 2 12
a 19 3 19.5
a 20 2 20.5
a 21 5 22
a 23 2 20.5
a 18 3 13.5
a 9 . 0
b 6 2 12
b 18 1 18.5
b 19 3 18.5
b 18 4 14.5
b 11 2 7
b 3 . 0

My first thought was to transpose the data (I have other variables that can be used to transpose each row into a column for all subjects), and apply the operations on the newly created variables. But if there is an easier way, I would be interested to learn; I have spent a day looking through online documentation.

Thanks in advance.

Peter_C · Posted 02-06-2011 11:41 AM

here is a simple merge-without-by
data t1 ;
input sub $ d v ;
list; datalines;
a 5 2
a 19 3
a 20 2
a 21 5
a 23 2
a 18 3
a 9 .
b 6 2
b 18 1
b 19 3
b 18 4
b 11 2
b 3 .
;
data aves ;
merge t1 t1( firstobs=2 keep= sub d rename=( sub=sub2 d=d2 )) ;
* notice, no By statement ;
if sub ne sub2 then do ;
* handle last for SUB ;
ave = 0 ;
end ;
else
ave = (d + d2 ) /2 ;
drop sub2 d2 ;
put ( sub d v ave )( $3. 3*best6. );
run ;
which puts this report into the log[pre]a 5 2 12
a 19 3 19.5
a 20 2 20.5
a 21 5 22
a 23 2 20.5
a 18 3 13.5
a 9 . 0
b 6 2 12
b 18 1 18.5
b 19 3 18.5
b 18 4 14.5
b 11 2 7
b 3 . 0
NOTE: There were 12 observations read from the data set WORK.T1.
NOTE: There were 13 observations read from the data set WORK.T1.
NOTE: The data set WORK.AVES has 13 observations and 4 variables.
NOTE: DATA statement used[/pre]Without the BY statement the data step merges the "next" row columns SUB and D - renamed to ensure the step has next D in a different name. With no BY statement there is no LAST.SUB available so SUB2 (the next SUB) clarifies the end of a SUB when these differ.

The next stage AVE * V could be done in the same step, but you want to work on that yourself...... 😉

Ksharp · Posted 02-17-2011 12:14 AM

Hi.
Peter.C 's code is great.
Peter ,I have not think 'merge' statement can be used in that way. 🙂

[pre]
data temp;
input sub$ d v;
datalines;
a 5 2
a 19 3
a 20 2
a 21 5
a 23 2
a 18 3
a 9 .
b 6 2
b 18 1
b 19 3
b 18 4
b 11 2
b 3 .
;
run;
data result;
set temp;
by sub;
ave=mean(lag(d),d);
if not first.sub then do;
obs=_n_-1;
set temp point=obs;
output;
end;
if last.sub then do;
ave=0;
set temp point=_n_;
output;
end;
run;

[/pre]

Ksharp Message was edited by: Ksharp

Peter_C · Posted 02-17-2011 12:26 PM

thank you kSharp
it is the classic "read-ahead"
should be around the archives in more than one posting
frequently brought out when a poster wants LEAD instead of LAG

peterC

Ksharp · Posted 02-18-2011 01:27 AM

Hi.Peter.
I know the usage of 'merge' statement without 'by' statement,But I forget the dataset option 'firstobs' which can do it. It enlight my brain, and give me one more approach to process these data.Thanks.

Best Regards.
Ksharp

average of consecutive numbers in a column

Re: average of consecutive numbers in a column

Re: average of consecutive numbers in a column

Re: average of consecutive numbers in a column

Re: average of consecutive numbers in a column

Registration is open

average of consecutive numbers in a column

Re: average of consecutive numbers in a column

Re: average of consecutive numbers in a column

Re: average of consecutive numbers in a column

Re: average of consecutive numbers in a column

Registration is open

SAS Training: Just a Click Away