BookmarkSubscribeRSS Feed
Cho8
Calcite | Level 5

Average of first 12 non-missing values vertically.

 

XY0Y1Y2Y3Y4Y5Y6Y7Y8Y9Y10
2012011%0%1%0%0%(0%).....
2012023%(1%)2%1%0%0%1%....
2012031%0%1%0%(0%)0%0%0%...
2012042%1%0%1%2%0%(0%)(0%)1%..
2012051%0%0%2%(0%)0%1%0%3%1%.
2012060%1%1%2%0%(0%)0%(0%)1%3%2%
2012071%2%(0%)(0%)0%0%3%0%2%(0%)0%
2012081%(0%)0%1%1%0%2%2%(0%)1%3%
2012091%1%(0%)3%0%2%1%(0%)2%3%2%
2012102%(0%)(0%)1%4%3%2%1%4%4%0%
2012111%2%1%(0%)0%(0%)0%3%0%0%6%
2012122%0%1%(0%)1%6%0%2%3%(0%)(0%)
2013012%2%0%(0%)0%0%0%(0%)2%(0%)1%
2013020%0%1%0%0%1%(0%)1%5%(1%)1%
2013031%1%0%1%0%1%0%1%(0%)1%4%
2013040%4%1%1%1%3%2%4%(1%)1%4%
AVG0.0129210.0048530.0054520.0081070.0069150.0096270.0082070.0072930.0188980.011633 
9 REPLIES 9
PaigeMiller
Diamond | Level 26

This would be much easier if your data is in a long data set instead of a wide data set (wide data sets are rarely required and much harder to program).

 

As an example (to save myself some typing) here I get the average for the first THREE non-missing records. Note, this code will work for how ever many Y variables you may have, and with minor modifications will work for any number of non-missing records.

 

data have;
    input x y1 y2;
	cards;
	201201 0.1 .
	201202 0.3 .
	201203 -0.2 0.2
	201204 0 0.25
	201205 0.1 0.4
	201206 -0.1 -0.1
;

/* CONVERT TO A LONG DATA SET */
proc transpose data=have out=have_t;
	by x;
	var y:;
run;
proc sort data=have_t;
	by _name_ x;
run;
data want;
	set have_t;
	by _name_;
	if first._name_ then do;
		sum=0;
		counter=0;
	end;
	if not missing(col1) then do;
		counter+1;
		if counter<=3 then sum+col1;
	end;
	if last._name_ then do;
		mean=sum/3;
		output;
	end;
run;
--
Paige Miller
Cho8
Calcite | Level 5

its giving 0 for all sum ,counter and mean.

 

missing value for Col1..

 

when i use mean=sum/12

PaigeMiller
Diamond | Level 26

@Cho8 wrote:

its giving 0 for all sum ,counter and mean.

 

missing value for Col1..

 

when i use mean=sum/12


It does not give 0 when I run it. You must not be running the same code.

--
Paige Miller
ballardw
Super User

If you want to average percentages you better have both the numerator and denominator used to calculate the percentage as unless the denominator is exactly the same for every percentage then the "average" is suspect at best a possibly terribly misleading at worst.

Suppose I drive a car 1 mile and get 100 miles per gallon because I was able to do a lot of coasting. Then the next leg measure was 100 miles and I get 30 mpg because of more typical driving conditions. Is the "averager" mgp (100+30)/2 = 65 mpg?

 

Even worse, I strongly suspect the values you show are rounded, which even further reduces the reliability of the "average".

mkeintz
PROC Star

For each variable, you want to calculate the average of the first 12 non-missing values.  This program (untested since you haven't generated your sample data in the form of a working data step) is one approach using the data in its current wide format:

 

data want (drop=i);
  set have end=end_of_have;
  array tmpsum {10} _temporary_;
  array tmpn   {10} _temporary_;

  array y {*} y: ;

  do i=1 to 10;
    if y{i}=. or tmpn{i}=12 then continue;
    tmpn{i}+1;
	tmpsum{i}+y{i};
  end;
  output;
  if end_of_have;
  do i=1 to 10;
    y{i}=tmpsum{i}/12;
  end;
  output;
run;

 The tmpsum and tmpn array keep track of sums and counts of the 10 Y variables.  For each iteration of the loop the CONTINUE statement means to go to the next iteration without performing subsequent within-loop statements.  The provides a way to only add the first 12 values of each Y in the corresponding tmpsum/tmpn arrays.

 

There are two explicit OUTPUT statements - one for each obs in the data set, and one which is performed only at the end of data set HAVE, which writes on the desired average values.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Cho8
Calcite | Level 5

where do u print the vertical average for each variable

mkeintz
PROC Star

So now you want to PRINT the results?  My program (did you actually run it and examine the resulting dataset?) appends an observation to the end of the dataset, with vars Y1-Y10 containing the desired average. 

 

So run a proc print on that dataset.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User

Is it what you are looking for ?

 

data have;
infile datalines expandtabs truncover;
input y  (d1-d9) (: percent8.);
datalines;
201201	1%	0%	1%	0%	0%	(0%)	.	.	.	.	.
201202	3%	(1%)	2%	1%	0%	0%	1%	.	.	.	.
201203	1%	0%	1%	0%	(0%)	0%	0%	0%	.	.	.
201204	2%	1%	0%	1%	2%	0%	(0%)	(0%)	1%	.	.
201205	1%	0%	0%	2%	(0%)	0%	1%	0%	3%	1%	.
201206	0%	1%	1%	2%	0%	(0%)	0%	(0%)	1%	3%	2%
201207	1%	2%	(0%)	(0%)	0%	0%	3%	0%	2%	(0%)	0%
201208	1%	(0%)	0%	1%	1%	0%	2%	2%	(0%)	1%	3%
201209	1%	1%	(0%)	3%	0%	2%	1%	(0%)	2%	3%	2%
201210	2%	(0%)	(0%)	1%	4%	3%	2%	1%	4%	4%	0%
201211	1%	2%	1%	(0%)	0%	(0%)	0%	3%	0%	0%	6%
201212	2%	0%	1%	(0%)	1%	6%	0%	2%	3%	(0%)	(0%)
201301	2%	2%	0%	(0%)	0%	0%	0%	(0%)	2%	(0%)	1%
201302	0%	0%	1%	0%	0%	1%	(0%)	1%	5%	(1%)	1%
201303	1%	1%	0%	1%	0%	1%	0%	1%	(0%)	1%	4%
201304	0%	4%	1%	1%	1%	3%	2%	4%
;

proc transpose data=have out=temp name=y prefix=_;
var d1-d9;
id y;
run;

data temp;
 set temp;
 array x{*} _:;
 count=0;
 do i=1 to dim(x);
  if not missing(x{i})  then do;
  count+1;
  sum=sum(sum,x{i});
  if count=12 then leave;
 end;
 end;
 _avg=sum/count;
 drop i  count sum;
run;

proc transpose data=temp out=want;
var _:;
id y;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1715 views
  • 3 likes
  • 6 in conversation