turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- General Programming
- /
- mean value in time series with array for multible ...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-30-2016 04:20 AM - edited 08-30-2016 04:25 AM

Hi guys!

i have a time series with daily values for soil moisture (var6-var371) for a couple of years (var5) and different stations (key). Now i want to calculate the mean values for each day over the years for each station

my raw-data looks like this

i tried to build an array, but sas is not happy with my code (ERROR: Array subscript out of range at Zeile 1053 Spalte 19) i guess it is something with my variable names?!

```
data test3;
set test2;
by key;
array varkum [365];
do i = 1 to 365;
varkum[i]=6666;
end;
array var [366] var6-var371;
if first.key then do i = 6 to 371;
varkum[i] = varkum[i]+var[i];
end;
run;
```

as result i imagine a dataset like the one below, with cumulative values for each day. i woult further proceed with adding a row with the counts (years) of each station, then divide the cumulative values by the counts.

and the something like this to extract the the means (marked values in the table above)

```
proc sql;
create table neu as
select * from test3
group by key
having var5=max(var5);
quit;
```

so... can youl tell me wats wrong with my array-code? or any idea of a smoother way?

thanks, sandra

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-30-2016 04:34 AM

If you want to build the mean over the years, then why does your result dataset still have multiple years per station?

---------------------------------------------------------------------------------------------

Maxims of Maximally Efficient SAS Programmers

Maxims of Maximally Efficient SAS Programmers

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-30-2016 04:37 AM

that woul just be a intermediate step becaus i have no better idea as to cumulate the values over the years an then extract the last observation for each station which then is the mean for the years

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-30-2016 04:48 AM

Building the mean over the years is easily achieved with proc means:

```
proc means data=have noprint;
by key;
var var6-var371;
output out=want mean(var6-var371)=;
run;
```

---------------------------------------------------------------------------------------------

Maxims of Maximally Efficient SAS Programmers

Maxims of Maximally Efficient SAS Programmers

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-30-2016 04:43 AM

Hi,

Well, please post test data as a datastep in the text box in future, I will not download Excel files from the web.

Now, first I comment on your code:

data have; key=187999;var5=1986; var6=54 ; var7=32; run; data want; set have; by key; array varknum{2}; array var{2} var6-var7; do i=1 to 2; varknum{i}=6666; /* Why this? */ end; if first.key then do; /* Note the do end block here */ do i=1 to 2; varknum{i}=varknum{i}+var{i}; /* will add var to 6666 first time? */ end; end; run;

However, then looking at it again, we note that the above code is a bit pointless. Simply:

data want; set have; array var{2} var6-var7; mean=mean(of var{*}); run;

Will get you the mean of the array. However we can take this one further and look at the actual structure of the data - which is the real underlying problem here. For programming, it is far easier to program with normalised (vertical data), as an example:

data have; key=187999;var5=1986; var6=54 ; var7=32; run; proc transpose data=have out=inter; by key var5; var var6-var7; run; proc sql; create table WANT as select A.*, B.MEAN, B.N from INTER A left join (select KEY,VAR5,mean(COL1) as MEAN,count(distinct VAR5) as N from INTER group by KEY,VAR5) B on A.KEY=B.KEY and A.VAR5=B.VAR5; quit;

In the above we don't need to know how many elements are present, we simply apply aggregates. For out reports, you can simply transpose up again.