Help using Base SAS procedures

average of consecutive numbers in a column

Reply
Occasional Contributor
Posts: 17

average of consecutive numbers in a column

Hello,

I have a set of data given to me by a student who entered the data in the following format:

sub$ d v
a 5 2
a 19 3
a 20 2
a 21 5
a 23 2
a 18 3
a 9 .
b 6 2
b 18 1
b 19 3
b 18 4
b 11 2
b 3 .

I need to take the average of two consecutive numbers in column 'd' and put them in a new column so that I can multiply them by column 'v' for each of the subjects (column 'sub$').
The result of taking the average (ave) should look like this:

sub$ d v ave
a 5 2 12
a 19 3 19.5
a 20 2 20.5
a 21 5 22
a 23 2 20.5
a 18 3 13.5
a 9 . 0
b 6 2 12
b 18 1 18.5
b 19 3 18.5
b 18 4 14.5
b 11 2 7
b 3 . 0

My first thought was to transpose the data (I have other variables that can be used to transpose each row into a column for all subjects), and apply the operations on the newly created variables. But if there is an easier way, I would be interested to learn; I have spent a day looking through online documentation.

Thanks in advance.
Valued Guide
Posts: 2,177

Re: average of consecutive numbers in a column

Posted in reply to peatjohnston
here is a simple merge-without-by
data t1 ;
input sub $ d v ;
list; datalines;
a 5 2
a 19 3
a 20 2
a 21 5
a 23 2
a 18 3
a 9 .
b 6 2
b 18 1
b 19 3
b 18 4
b 11 2
b 3 .
;
data aves ;
merge t1 t1( firstobs=2 keep= sub d rename=( sub=sub2 d=d2 )) ;
* notice, no By statement ;
if sub ne sub2 then do ;
* handle last for SUB ;
ave = 0 ;
end ;
else
ave = (d + d2 ) /2 ;
drop sub2 d2 ;
put ( sub d v ave )( $3. 3*best6. );
run ;
which puts this report into the log[pre]a 5 2 12
a 19 3 19.5
a 20 2 20.5
a 21 5 22
a 23 2 20.5
a 18 3 13.5
a 9 . 0
b 6 2 12
b 18 1 18.5
b 19 3 18.5
b 18 4 14.5
b 11 2 7
b 3 . 0
NOTE: There were 12 observations read from the data set WORK.T1.
NOTE: There were 13 observations read from the data set WORK.T1.
NOTE: The data set WORK.AVES has 13 observations and 4 variables.
NOTE: DATA statement used[/pre]Without the BY statement the data step merges the "next" row columns SUB and D - renamed to ensure the step has next D in a different name. With no BY statement there is no LAST.SUB available so SUB2 (the next SUB) clarifies the end of a SUB when these differ.

The next stage AVE * V could be done in the same step, but you want to work on that yourself...... ;-)
Super User
Posts: 10,020

Re: average of consecutive numbers in a column

Posted in reply to peatjohnston
Hi.
Peter.C 's code is great.
Peter ,I have not think 'merge' statement can be used in that way. Smiley Happy


[pre]
data temp;
input sub$ d v;
datalines;
a 5 2
a 19 3
a 20 2
a 21 5
a 23 2
a 18 3
a 9 .
b 6 2
b 18 1
b 19 3
b 18 4
b 11 2
b 3 .
;
run;
data result;
set temp;
by sub;
ave=mean(lag(d),d);
if not first.sub then do;
obs=_n_-1;
set temp point=obs;
output;
end;
if last.sub then do;
ave=0;
set temp point=_n_;
output;
end;
run;

[/pre]



Ksharp Message was edited by: Ksharp
Valued Guide
Posts: 2,177

Re: average of consecutive numbers in a column

thank you kSharp
it is the classic "read-ahead"
should be around the archives in more than one posting
frequently brought out when a poster wants LEAD instead of LAG

peterC
Super User
Posts: 10,020

Re: average of consecutive numbers in a column

Hi.Peter.
I know the usage of 'merge' statement without 'by' statement,But I forget the dataset option 'firstobs' which can do it. It enlight my brain, and give me one more approach to process these data.Thanks.


Best Regards.
Ksharp
Ask a Question
Discussion stats
  • 4 replies
  • 261 views
  • 0 likes
  • 3 in conversation