Solved: Aggregate the data values

mehul4frnds · Posted 03-30-2018 05:53 AM

HI,

I need help to get codes for data aggregating

I have a dataset of lab results where some of the values are repeating on the same date. I would like to create a new variable where I can keep mean/median value if two or three data values are present otherwise it should be same as old variable.

My dataset looks like this:

Patient Id date result age gender

1 1/1/2011 3.50 23 1

1 1/1/2011 3.60 23 1

1 2/1/2011 3.00 23 1

1 2/1/2011 2.82 23 1

2 1/1/2011 3.42 20 1

2 2/1/2011 3.23 20 1

2 3/1/2011 4.11 20 1

2 3/1/2011 2.82 20 1

3 1/1/2011 3.18 33 2

3 2/1/2011 4.21 33 2

3 3/1/2011 2.12 33 2

I want new dataset look like this:

Patient Id date result age gender Re_results

1 1/1/2011 3.50 23 1 3.55

1 1/1/2011 3.60 23 1 3.55

1 2/1/2011 3.00 23 1 3.00

1 3/1/2011 2.82 23 1 2.82

2 1/1/2011 3.42 20 1 3.42

2 2/1/2011 3.23 20 1 3.23

2 3/1/2011 4.11 20 1 3.47

2 3/1/2011 2.82 20 1 3.47

3 1/1/2011 3.18 33 2 3.18

3 2/1/2011 4.21 33 2 4.21

3 3/1/2011 2.12 33 2 2.12

Kindly suggest

Tom · Posted 03-30-2018 02:57 PM

PROC SQL makes this easy. It will remerge aggregate results back to the detail records automatically.

Most statistics on a single value will return that value as the result. min, max, mean, median.

data have;
  input id date result age gender ;
  informat date mmddyy. ;
  format date yymmdd10. ;
cards;
1 1/1/2011 3.50 23 1
1 1/1/2011 3.60 23 1
1 2/1/2011 3.00 23 1
1 2/1/2011 2.82 23 1
2 1/1/2011 3.42 20 1
2 2/1/2011 3.23 20 1
2 3/1/2011 4.11 20 1
2 3/1/2011 2.82 20 1
3 1/1/2011 3.18 33 2
3 2/1/2011 4.21 33 2
3 3/1/2011 2.12 33 2
;

proc sql ;
  create table want as 
  select *,mean(result) as re_results
  from have
  group by id,date
  ;
quit;

Obs    id       date       result    age    gender    re_results

  1     1    2011-01-01     3.60      23       1         3.550
  2     1    2011-01-01     3.50      23       1         3.550
  3     1    2011-02-01     3.00      23       1         2.910
  4     1    2011-02-01     2.82      23       1         2.910
  5     2    2011-01-01     3.42      20       1         3.420
  6     2    2011-02-01     3.23      20       1         3.230
  7     2    2011-03-01     4.11      20       1         3.465
  8     2    2011-03-01     2.82      20       1         3.465
  9     3    2011-01-01     3.18      33       2         3.180
 10     3    2011-02-01     4.21      33       2         4.210
 11     3    2011-03-01     2.12      33       2         2.120

View solution in original post

LinusH · Posted 03-30-2018 02:01 PM

For mean youbwould use SQL avg(result) , and by not specifying all other columns in the GROUP BY, SAS will automatically emerge stats with the detail records.

Data never sleeps

Tom · Posted 03-30-2018 02:57 PM

PROC SQL makes this easy. It will remerge aggregate results back to the detail records automatically.

Most statistics on a single value will return that value as the result. min, max, mean, median.

data have;
  input id date result age gender ;
  informat date mmddyy. ;
  format date yymmdd10. ;
cards;
1 1/1/2011 3.50 23 1
1 1/1/2011 3.60 23 1
1 2/1/2011 3.00 23 1
1 2/1/2011 2.82 23 1
2 1/1/2011 3.42 20 1
2 2/1/2011 3.23 20 1
2 3/1/2011 4.11 20 1
2 3/1/2011 2.82 20 1
3 1/1/2011 3.18 33 2
3 2/1/2011 4.21 33 2
3 3/1/2011 2.12 33 2
;

proc sql ;
  create table want as 
  select *,mean(result) as re_results
  from have
  group by id,date
  ;
quit;

Obs    id       date       result    age    gender    re_results

  1     1    2011-01-01     3.60      23       1         3.550
  2     1    2011-01-01     3.50      23       1         3.550
  3     1    2011-02-01     3.00      23       1         2.910
  4     1    2011-02-01     2.82      23       1         2.910
  5     2    2011-01-01     3.42      20       1         3.420
  6     2    2011-02-01     3.23      20       1         3.230
  7     2    2011-03-01     4.11      20       1         3.465
  8     2    2011-03-01     2.82      20       1         3.465
  9     3    2011-01-01     3.18      33       2         3.180
 10     3    2011-02-01     4.21      33       2         4.210
 11     3    2011-03-01     2.12      33       2         2.120

mehul4frnds · Posted 04-02-2018 05:28 AM

Thank you so much for the detailed reply

Aggregate the data values

Re: Aggregate the data values

Re: Aggregate the data values

Re: Aggregate the data values

Re: Aggregate the data values

Registration is open

Call for Content EXTENDED

Aggregate the data values

Re: Aggregate the data values

Re: Aggregate the data values

Re: Aggregate the data values

Re: Aggregate the data values

Registration is open

Call for Content EXTENDED

SAS Training: Just a Click Away