Aggregate the data values

Accepted Solution Solved
Reply
Contributor
Posts: 21
Accepted Solution

Aggregate the data values

HI,

 

I need help to get codes for data aggregating 

 

I have a dataset of lab results where some of the values are repeating on the same date. I would like to create a new variable where I can keep mean/median value if two or three data values are present otherwise it should be same as old variable.

 

My dataset looks like this:

Patient Id      date     result     age    gender 

1          1/1/2011        3.50           23         1

1          1/1/2011        3.60           23         1

1          2/1/2011        3.00           23         1

1          2/1/2011        2.82           23         1

2          1/1/2011        3.42           20         1

2          2/1/2011        3.23           20         1

2          3/1/2011        4.11           20         1

2          3/1/2011        2.82           20         1

3          1/1/2011        3.18           33         2

3          2/1/2011        4.21           33         2

3          3/1/2011        2.12           33         2

 

 

I want new dataset look like this:

Patient Id      date     result     age    gender  Re_results

1          1/1/2011        3.50           23         1       3.55

1          1/1/2011        3.60           23         1       3.55

1          2/1/2011        3.00           23         1       3.00

1          3/1/2011        2.82           23         1       2.82

2          1/1/2011        3.42           20         1       3.42

2          2/1/2011        3.23           20         1       3.23

2          3/1/2011        4.11           20         1       3.47

2          3/1/2011        2.82           20         1       3.47

3          1/1/2011        3.18           33         2       3.18

3          2/1/2011        4.21           33         2       4.21

3          3/1/2011        2.12           33         2       2.12

 

Kindly suggest 

 


Accepted Solutions
Solution
‎04-02-2018 05:28 AM
Super User
Super User
Posts: 8,070

Re: Aggregate the data values

Posted in reply to mehul4frnds

PROC SQL makes this easy. It will remerge aggregate results back to the detail records automatically.

Most statistics on a single value will return that value as the result.  min, max, mean, median.

data have;
  input id date result age gender ;
  informat date mmddyy. ;
  format date yymmdd10. ;
cards;
1 1/1/2011 3.50 23 1
1 1/1/2011 3.60 23 1
1 2/1/2011 3.00 23 1
1 2/1/2011 2.82 23 1
2 1/1/2011 3.42 20 1
2 2/1/2011 3.23 20 1
2 3/1/2011 4.11 20 1
2 3/1/2011 2.82 20 1
3 1/1/2011 3.18 33 2
3 2/1/2011 4.21 33 2
3 3/1/2011 2.12 33 2
;

proc sql ;
  create table want as 
  select *,mean(result) as re_results
  from have
  group by id,date
  ;
quit;
Obs    id       date       result    age    gender    re_results

  1     1    2011-01-01     3.60      23       1         3.550
  2     1    2011-01-01     3.50      23       1         3.550
  3     1    2011-02-01     3.00      23       1         2.910
  4     1    2011-02-01     2.82      23       1         2.910
  5     2    2011-01-01     3.42      20       1         3.420
  6     2    2011-02-01     3.23      20       1         3.230
  7     2    2011-03-01     4.11      20       1         3.465
  8     2    2011-03-01     2.82      20       1         3.465
  9     3    2011-01-01     3.18      33       2         3.180
 10     3    2011-02-01     4.21      33       2         4.210
 11     3    2011-03-01     2.12      33       2         2.120

View solution in original post


All Replies
Super User
Posts: 5,876

Re: Aggregate the data values

Posted in reply to mehul4frnds
For mean youbwould use SQL avg(result) , and by not specifying all other columns in the GROUP BY, SAS will automatically emerge stats with the detail records.
Data never sleeps
Solution
‎04-02-2018 05:28 AM
Super User
Super User
Posts: 8,070

Re: Aggregate the data values

Posted in reply to mehul4frnds

PROC SQL makes this easy. It will remerge aggregate results back to the detail records automatically.

Most statistics on a single value will return that value as the result.  min, max, mean, median.

data have;
  input id date result age gender ;
  informat date mmddyy. ;
  format date yymmdd10. ;
cards;
1 1/1/2011 3.50 23 1
1 1/1/2011 3.60 23 1
1 2/1/2011 3.00 23 1
1 2/1/2011 2.82 23 1
2 1/1/2011 3.42 20 1
2 2/1/2011 3.23 20 1
2 3/1/2011 4.11 20 1
2 3/1/2011 2.82 20 1
3 1/1/2011 3.18 33 2
3 2/1/2011 4.21 33 2
3 3/1/2011 2.12 33 2
;

proc sql ;
  create table want as 
  select *,mean(result) as re_results
  from have
  group by id,date
  ;
quit;
Obs    id       date       result    age    gender    re_results

  1     1    2011-01-01     3.60      23       1         3.550
  2     1    2011-01-01     3.50      23       1         3.550
  3     1    2011-02-01     3.00      23       1         2.910
  4     1    2011-02-01     2.82      23       1         2.910
  5     2    2011-01-01     3.42      20       1         3.420
  6     2    2011-02-01     3.23      20       1         3.230
  7     2    2011-03-01     4.11      20       1         3.465
  8     2    2011-03-01     2.82      20       1         3.465
  9     3    2011-01-01     3.18      33       2         3.180
 10     3    2011-02-01     4.21      33       2         4.210
 11     3    2011-03-01     2.12      33       2         2.120
Contributor
Posts: 21

Re: Aggregate the data values

Thank you so much for the detailed reply

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 88 views
  • 0 likes
  • 3 in conversation