Median of one variable by values of another

Hi,

I have data composed of an ID variable, a continuous variable (values 1-100), and a categorical variable (200 possible values), like this:

ID   cont_var   cat_var
1    30         1
2    25         2
3    42         2
4    97         1
5    55         1
6    12         2

The data are unique at the ID level.

What I want is to add a variable that holds the median of cont_var within each level of cat_var, like this:

ID   cont_var   cat_var   new_var
1    30         1         55
2    25         2         25
3    42         2         25
4    97         1         55
5    55         1         55
6    12         2         25

I know how to do this by collapsing the data to levels of the categorical variable, but since I ultimately want to keep my dataset at the individual ID level, I'm hoping there's a way to do it without collapsing and rejoining.

Any help is much appreciated.


Accepted Solutions
Solution
‎12-01-2014 03:48 PM
Re: Median of one variable by values of another

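If you have SAS 9.4, where MEDIAN works as a PROC SQL summary function, then:

/* Sample data from the question */
data have;
     input ID cont_var cat_var;
     cards;
1 30 1
2 25 2
3 42 2
4 97 1
5 55 1
6 12 2
;

/* Group median is attached to every row of its cat_var level */
proc sql;
     create table want as
           select *, median(cont_var) as median
           from have
           group by cat_var
           order by id;
quit;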
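
A note on why the query above keeps one row per ID: when the SELECT list contains columns that are not in the GROUP BY clause, PROC SQL remerges the group statistic back onto the detail rows and writes a NOTE about remerging to the log. The sketch below is the same query with the column named new_var, as in the question; it assumes SAS 9.4 or later and the HAVE data set created above. For the sample data it should give 55 for cat_var 1 and 25 for cat_var 2, matching the table in the question.

```sas
/* Sketch only: same technique as the accepted answer, with the result
   column renamed to new_var (the name used in the question). */
proc sql;
    create table want as
    select *, median(cont_var) as new_var
    from have
    group by cat_var   /* median is computed within each cat_var level */
    order by ID;       /* remerged detail rows are returned in ID order */
quit;

/* Quick visual check of the result */
proc print data=want noobs;
run;
```

On releases before 9.4, MEDIAN in PROC SQL is evaluated row by row rather than as a summary function, so the summarize-and-join approach is the safer route there.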

All Replies

Re: Median of one variable by values of another

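/* One row per cat_var level; median= keeps the name cont_var */
proc summary data=have nway;
     class cat_var;
     var cont_var;
     output out=med median=;
run;

/* Join the medians back onto the ID-level data */
proc sql;
     create table want as
     select a.*, b.cont_var as new_var
     from have as a left join med as b on
          a.cat_var=b.cat_var
     order by ID;
quit;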
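
As a variation on the steps above, the statistic can be named directly in the OUTPUT statement and merged back with a DATA step instead of a join. This is only a sketch: the data set name have_sorted is made up for illustration, and the drop= option simply discards the automatic _TYPE_ and _FREQ_ variables that PROC SUMMARY adds.

```sas
/* Sketch only: summarize, then merge back by cat_var. */
proc summary data=have nway;
    class cat_var;
    var cont_var;
    /* median=new_var names the statistic; drop the automatic variables */
    output out=med(drop=_type_ _freq_) median=new_var;
run;

/* MERGE with BY needs the detail data sorted by cat_var
   (have_sorted is a hypothetical name) */
proc sort data=have out=have_sorted;
    by cat_var;
run;

/* One-to-many merge: each ID row picks up its group's median */
data want;
    merge have_sorted med;
    by cat_var;
run;

/* Restore the original ID order */
proc sort data=want;
    by ID;
run;
```

The output of PROC SUMMARY with a CLASS statement is already ordered by cat_var, so only the detail data needs sorting before the merge.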
