Desktop productivity for business analysts and programmers

Selecting (and averaging) the first few and last few records of a dataset

Reply
Contributor GDA
Contributor
Posts: 21

Selecting (and averaging) the first few and last few records of a dataset

Hello all SAS guru's

I have a data set that has 4 columns.

PART_NUM (character)
SUB_ASSEMBLY (numeric)
DATE (date)
VALUE (numeric)

What I'm looking to do is get an average of VALUE from the first 2 and an average of VALUE from the last 2 records of the data set based on DATE and SUB_ASSEMBLY for each PART_NUM.
Each part_num can have a different number of sub_assembly values.

For example if the data set was: (data is in columns. doesn't copy very well with just text)
PART_NUM SUB_ASSEMBLY DATE VALUE
A 0 3/1/2010 5
A 1 3/1/2010 4
A 2 3/1/2010 5
A 0 3/2/2010 4
A 1 3/2/2010 4
A 2 3/2/2010 5
B 0 3/1/2010 5
B 1 3/1/2010 4
B 0 3/2/2010 4
B 1 3/2/2010 4
...
A 0 3/20/2010 6
A 1 3/20/2010 7
A 2 3/20/2010 4
A 0 3/21/2010 4
A 1 3/21/2010 5
A 2 3/21/2010 3
B 0 3/20/2010 6
B 1 3/20/2010 7
B 0 3/21/2010 3
B 1 3/21/2010 5

This would yield a table with 4 columns with unique PART_NUM, SUB_ASSEMBLY and two "averaged" columns.
PART_NUM SUB_ASSEMBLY Average(First2) Average(Second2)
A 0 4.5 5.0
A 1 4.0 6.0
A 2 5.0 3.5
B 0 4.5 4.5
B 1 4.0 6.0

As an example, the first row of the output table with PART_NUM "A" and SUB_ASSEMBLY "0"
the average(First 2) is the avg of 5 on 3/1/2010 and 4 on 3/2/2010
the average(Second 2) is the avg of 6 on 3/20/2010 and 3 on 3/20/2010

I am using Enterprise Guide and was hoping that there was a way to do this within that environment. I can't figure one out.
So I'm hoping that some somebody either knows how or can give some advice on SAS code that will solve this.

I'm also hoping I explained this clearly enough.

thanks
N/A
Posts: 0

Re: Selecting (and averaging) the first few and last few records of a dataset

data tmp;
attrib date format=ddmmyy10.;

input PART_NUM $1. SUB_ASSEMBLY DATE mmddyy10. VALUE;
cards;
A 0 03/01/2010 5
A 1 03/01/2010 4
A 2 03/01/2010 5
A 0 03/02/2010 4
A 1 03/02/2010 4
A 2 03/02/2010 5
B 0 03/01/2010 5
B 1 03/01/2010 4
B 0 03/02/2010 4
B 1 03/02/2010 4
A 0 3/20/2010 6
A 1 3/20/2010 7
A 2 3/20/2010 4
A 0 3/21/2010 4
A 1 3/21/2010 5
A 2 3/21/2010 3
B 0 3/20/2010 6
B 1 3/20/2010 7
B 0 3/21/2010 3
B 1 3/21/2010 5

;
run;

proc sort data=tmp;
by PART_NUM SUB_ASSEMBLY DATE;
run;

data tmp_avg;
attrib PART_NUM format=$1.
SUB_ASSEMBLY format=8.
avg_first format=comma22.2
avg_second format=comma22.2
;
set tmp;
by PART_NUM SUB_ASSEMBLY DATE;
array var{2};
if first.PART_NUM or first.SUB_ASSEMBLY then do;
num_obs=0;
do i = 1 to 2;
var(i)=0;
end;
end;
num_obs + 1;

if num_obs le 2 then var(1)+VALUE;
if num_obs ge 3 then var(2)+VALUE;

if last.PART_NUM or last.SUB_ASSEMBLY then do;
avg_first=var(1)/2;
avg_second=var(2)/2;
output;
end;

keep PART_NUM SUB_ASSEMBLY avg_first avg_second;

run;
Ask a Question
Discussion stats
  • 1 reply
  • 102 views
  • 0 likes
  • 2 in conversation