turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- average of consecutive numbers in a column

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-06-2011 11:19 AM

Hello,

I have a set of data given to me by a student who entered the data in the following format:

sub$ d v

a 5 2

a 19 3

a 20 2

a 21 5

a 23 2

a 18 3

a 9 .

b 6 2

b 18 1

b 19 3

b 18 4

b 11 2

b 3 .

I need to take the average of two consecutive numbers in column 'd' and put them in a new column so that I can multiply them by column 'v' for each of the subjects (column 'sub$').

The result of taking the average (ave) should look like this:

sub$ d v ave

a 5 2 12

a 19 3 19.5

a 20 2 20.5

a 21 5 22

a 23 2 20.5

a 18 3 13.5

a 9 . 0

b 6 2 12

b 18 1 18.5

b 19 3 18.5

b 18 4 14.5

b 11 2 7

b 3 . 0

My first thought was to transpose the data (I have other variables that can be used to transpose each row into a column for all subjects), and apply the operations on the newly created variables. But if there is an easier way, I would be interested to learn; I have spent a day looking through online documentation.

Thanks in advance.

I have a set of data given to me by a student who entered the data in the following format:

sub$ d v

a 5 2

a 19 3

a 20 2

a 21 5

a 23 2

a 18 3

a 9 .

b 6 2

b 18 1

b 19 3

b 18 4

b 11 2

b 3 .

I need to take the average of two consecutive numbers in column 'd' and put them in a new column so that I can multiply them by column 'v' for each of the subjects (column 'sub$').

The result of taking the average (ave) should look like this:

sub$ d v ave

a 5 2 12

a 19 3 19.5

a 20 2 20.5

a 21 5 22

a 23 2 20.5

a 18 3 13.5

a 9 . 0

b 6 2 12

b 18 1 18.5

b 19 3 18.5

b 18 4 14.5

b 11 2 7

b 3 . 0

My first thought was to transpose the data (I have other variables that can be used to transpose each row into a column for all subjects), and apply the operations on the newly created variables. But if there is an easier way, I would be interested to learn; I have spent a day looking through online documentation.

Thanks in advance.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-06-2011 11:41 AM

here is a simple merge-without-by

data t1 ;

input sub $ d v ;

list; datalines;

a 5 2

a 19 3

a 20 2

a 21 5

a 23 2

a 18 3

a 9 .

b 6 2

b 18 1

b 19 3

b 18 4

b 11 2

b 3 .

;

data aves ;

merge t1 t1( firstobs=2 keep= sub d rename=( sub=sub2 d=d2 )) ;

* notice, no By statement ;

if sub ne sub2 then do ;

* handle last for SUB ;

ave = 0 ;

end ;

else

ave = (d + d2 ) /2 ;

drop sub2 d2 ;

put ( sub d v ave )( $3. 3*best6. );

run ;

which puts this report into the log[pre]a 5 2 12

a 19 3 19.5

a 20 2 20.5

a 21 5 22

a 23 2 20.5

a 18 3 13.5

a 9 . 0

b 6 2 12

b 18 1 18.5

b 19 3 18.5

b 18 4 14.5

b 11 2 7

b 3 . 0

NOTE: There were 12 observations read from the data set WORK.T1.

NOTE: There were 13 observations read from the data set WORK.T1.

NOTE: The data set WORK.AVES has 13 observations and 4 variables.

NOTE: DATA statement used[/pre]Without the BY statement the data step merges the "next" row columns SUB and D - renamed to ensure the step has next D in a different name. With no BY statement there is no LAST.SUB available so SUB2 (the next SUB) clarifies the end of a SUB when these differ.

The next stage AVE * V could be done in the same step, but you want to work on that yourself...... ;-)

data t1 ;

input sub $ d v ;

list; datalines;

a 5 2

a 19 3

a 20 2

a 21 5

a 23 2

a 18 3

a 9 .

b 6 2

b 18 1

b 19 3

b 18 4

b 11 2

b 3 .

;

data aves ;

merge t1 t1( firstobs=2 keep= sub d rename=( sub=sub2 d=d2 )) ;

* notice, no By statement ;

if sub ne sub2 then do ;

* handle last for SUB ;

ave = 0 ;

end ;

else

ave = (d + d2 ) /2 ;

drop sub2 d2 ;

put ( sub d v ave )( $3. 3*best6. );

run ;

which puts this report into the log[pre]a 5 2 12

a 19 3 19.5

a 20 2 20.5

a 21 5 22

a 23 2 20.5

a 18 3 13.5

a 9 . 0

b 6 2 12

b 18 1 18.5

b 19 3 18.5

b 18 4 14.5

b 11 2 7

b 3 . 0

NOTE: There were 12 observations read from the data set WORK.T1.

NOTE: There were 13 observations read from the data set WORK.T1.

NOTE: The data set WORK.AVES has 13 observations and 4 variables.

NOTE: DATA statement used[/pre]Without the BY statement the data step merges the "next" row columns SUB and D - renamed to ensure the step has next D in a different name. With no BY statement there is no LAST.SUB available so SUB2 (the next SUB) clarifies the end of a SUB when these differ.

The next stage AVE * V could be done in the same step, but you want to work on that yourself...... ;-)

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-17-2011 12:14 AM

Hi.

Peter.C 's code is great.

Peter ,I have not think 'merge' statement can be used in that way.

[pre]

data temp;

input sub$ d v;

datalines;

a 5 2

a 19 3

a 20 2

a 21 5

a 23 2

a 18 3

a 9 .

b 6 2

b 18 1

b 19 3

b 18 4

b 11 2

b 3 .

;

run;

data result;

set temp;

by sub;

ave=mean(lag(d),d);

if not first.sub then do;

obs=_n_-1;

set temp point=obs;

output;

end;

if last.sub then do;

ave=0;

set temp point=_n_;

output;

end;

run;

[/pre]

Ksharp Message was edited by: Ksharp

Peter.C 's code is great.

Peter ,I have not think 'merge' statement can be used in that way.

[pre]

data temp;

input sub$ d v;

datalines;

a 5 2

a 19 3

a 20 2

a 21 5

a 23 2

a 18 3

a 9 .

b 6 2

b 18 1

b 19 3

b 18 4

b 11 2

b 3 .

;

run;

data result;

set temp;

by sub;

ave=mean(lag(d),d);

if not first.sub then do;

obs=_n_-1;

set temp point=obs;

output;

end;

if last.sub then do;

ave=0;

set temp point=_n_;

output;

end;

run;

[/pre]

Ksharp Message was edited by: Ksharp

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-17-2011 12:26 PM

thank you kSharp

it is the classic "read-ahead"

should be around the archives in more than one posting

frequently brought out when a poster wants LEAD instead of LAG

peterC

it is the classic "read-ahead"

should be around the archives in more than one posting

frequently brought out when a poster wants LEAD instead of LAG

peterC

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-18-2011 01:27 AM

Hi.Peter.

I know the usage of 'merge' statement without 'by' statement,But I forget the dataset option 'firstobs' which can do it. It enlight my brain, and give me one more approach to process these data.Thanks.

Best Regards.

Ksharp

I know the usage of 'merge' statement without 'by' statement,But I forget the dataset option 'firstobs' which can do it. It enlight my brain, and give me one more approach to process these data.Thanks.

Best Regards.

Ksharp