turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- Obtaining mean for variable number of variables pe...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-19-2016 03:49 PM

Dear All,

I am working on a project for which I have to get mean levels of a disease marker (let's call it DM). The issue is that the mean must only include measures before the individual developed disease, and the time each person may have developed disease differs. So, for example, if ID=1 has 7 time points where DM was assessed and this person developed disease at time 5, I would need the mean of DM from time 1 to time 4. If ID=2 has DM measured from time 1 to 6 and the person developed disease at time 3, I'd need the mean DM from time 1 to time 2. If another person did not develop disease, I'd get the mean of all the available DM measures. I am really stuck on how to do this. Any help would be greatly appreciated!

Many thanks!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-19-2016 03:57 PM

It all depends on your data structure. Please post an example.

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-19-2016 04:07 PM

Dear PGStats,

I have a wide data set (an example is given below).

**data** test;

input id dm1-dm7 disease time;

cards;

1 35 38 40 43 53 55 55 1 7

2 25 29 20 . 25 30 28 0 .

3 38 40 40 50 48 30 35 1 5

4 44 43 41 39 38 . 37 1 3

5 28 25 30 28 30 27 31 0 .

6 38 40 45 . 50 29 28 1 5

;

run;

Above, ID 1 developed disease at the 7th time point, so I'd need to get the means of DM1-DM6.

ID 2 did not develop disease, so I'd need the means of all available DM values.

ID 3 developed disease at time 5 so I'd get the mean of DM from times 1 to 4, etc.

Thank you for any help!

TDC.

**run**;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-19-2016 04:29 PM

In case there is only one record per ID (and MEAN means AVERAGE)

than you can do it by:

**data** test;

input id dm1-dm7 disease time;

array dm {*} dm1-dm7;

sum_dm = 0;

if time > 1 then do;

do i=1 to (time -1);

sum_dm = sum(of sum_dm, dm(i));

end;

MEAN = sum_dm / (time -1);

end;

else MEAN = .;

cards;

1 35 38 40 43 53 55 55 1 7

2 25 29 20 . 25 30 28 0 .

3 38 40 40 50 48 30 35 1 5

4 44 43 41 39 38 . 37 1 3

5 28 25 30 28 30 27 31 0 .

6 38 40 45 . 50 29 28 1 5

;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-19-2016 05:48 PM

@Shmuel, your denominator will be wrong if missing dm values occur before time. If you want to calculate your own average, you must keep track of the count of non-missing dm values as well as the sum of dm values.

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-19-2016 04:29 PM

Something like this perhaps:

```
data want;
set test;
array dm dm1-dm7;
if disease and not missing(time) then do;
do i=1 to time;
dmsum=sum(dmsum,dm[i]);
end;
dmmean= dmsum/time;
end;
drop i;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-19-2016 04:30 PM

Transpose and use proc means.

```
proc sort data=test;
by id disease time;
run;
proc transpose data=test out=long;
by id disease time;
var dm1-dm7;
run;
data long;
set long;
```

*calculate measurement time from variable name, generally bad practice to store data in column headers;
dm_time=input(compress(_name_, , 'kd'), 8.);
run;
proc means data=long nway mean n std;
class id disease ;
var col1;
where dm_time < time or time=.;
run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-19-2016 05:43 PM

If you want to keep it wide:

```
data want;
set test;
array dm dm1-dm7;
array x{7};
do i = 1 to coalesce(time, dim(dm)+1) - 1;
x{i} = dm{i};
end;
meanDm = mean(of x{*});
drop i x:;
run;
```

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-20-2016 08:10 AM

Thank you All so much!! I really appreciate your help!!

TDC.