BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

My current data is below. I would like to create a new column called 'median for category+date', where it is equal to the median by date and category. That is order the dataset by date and category and then compute the median and store it in the column 'median for category+date'.

current:

Date Category Value
1-Jan-19 C 17
1-Jan-19 B 21
1-Jan-19 B 12
1-Jan-19 B 12
1-Jan-19 A 17
1-Jan-19 A 6
1-Feb-19 D 19
1-Feb-19 C 23
1-Feb-19 D 24
1-Feb-19 D 21
1-Feb-19 A 16
1-Mar-19 A 9
1-Mar-19 D 24
1-Mar-19 D 15
1-Mar-19 A 7
1-Mar-19 D 21
1-Mar-19 D 5
1-Apr-19 C 9
1-Apr-19 A 15
1-Apr-19 A 5
1-Apr-19 C 15
1-Apr-19 A 19
1-May-19 D 17
1-May-19 C 9
1-May-19 A 7
1-May-19 B 13
1-May-19 C 10
1-May-19 A 14

 

desired:

Date Category Value Median for category+date
1-Jan-19 A 17 11.5
1-Jan-19 A 6 11.5
1-Jan-19 B 21 12
1-Jan-19 B 12 12
1-Jan-19 B 12 12
1-Jan-19 C 17 17
1-Feb-19 A 16 16
1-Feb-19 C 23 3
1-Feb-19 D 19 21
1-Feb-19 D 24 21
1-Feb-19 D 21 21
1-Mar-19 A 9 8
1-Mar-19 A 7 8
1-Mar-19 D 24 18
1-Mar-19 D 15 18
1-Mar-19 D 21 18
1-Mar-19 D 5 18
1-Apr-19 A 15 15
1-Apr-19 A 5 15
1-Apr-19 A 19 15
1-Apr-19 C 9 12
1-Apr-19 C 15 12
1-May-19 A 7 10.5
1-May-19 A 14 10.5
1-May-19 B 13 13
1-May-19 C 9 9.5
1-May-19 C 10 9.5
1-May-19 D 17 17
1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

Like this?

proc sql;
  select DATE, CATEGORY, VALUE, median(VALUE) as MEDIAN
  from TABLE
  group by DATE, CATEGORY 
  order by DATE, CATEGORY;

 

View solution in original post

12 REPLIES 12
Sajid01
Meteorite | Level 14

Please post your current data as a datastep

sasprogramming
Quartz | Level 8

The data is csv formatted and was imported using 'proc import'

sasprogramming
Quartz | Level 8
proc import
file="C:/Users/ggqeo/Documents/My SAS Files/current.csv"
out=current
dbms=csv;
run;
Sajid01
Meteorite | Level 14
Thanks.
ChrisNZ
Tourmaline | Level 20

Like this?

proc sql;
  select DATE, CATEGORY, VALUE, median(VALUE) as MEDIAN
  from TABLE
  group by DATE, CATEGORY 
  order by DATE, CATEGORY;

 

sasprogramming
Quartz | Level 8

This works thanks!

Just need the quit; at the end

sasprogramming
Quartz | Level 8

What is the function within sql I could use to also create a Q1 (as 25% percentile) column?

 

Thanks

sasprogramming
Quartz | Level 8

I tried this and it didn't work as intended, this was the result.

proc sql;
  select DATE, CATEGORY, VALUE, PCTL(75, VALUE) AS Q3
  from current
  group by DATE, CATEGORY 
  order by DATE, CATEGORY;
quit;

sasprogramming_0-1626847455530.png

 

ChrisNZ
Tourmaline | Level 20

Mm you're right.

It's back to proc means then, unless someone who actually uses statistics is more knowledgeable than me.

ballardw
Super User

I don't believe Proc SQL treats PCTL function as summary function. At least from this bit in the log:

24   proc sql;
25     create table want as
26     select sex,age, pctl(25,height) as p25
27     from sashelp.class
28     group by sex, age
29     order by sex,age
30     ;
NOTE: A GROUP BY clause has been discarded because neither the SELECT clause nor the optional
      HAVING clause of the associated table-expression referenced a summary function.
NOTE: Table WORK.WANT created, with 19 rows and 3 columns.

So since not a summary function it is applying the function to each row. And guess what the 25th (or 75th or 10th ...) percentile is of a single value?

 

Note that the documentation examples look like:

lower_quartile=PCTL(25, 2, 4, 1, 3);

including for DS2 and FedSQL. So not a summary across records function by design.

FreelanceReinh
Jade | Level 19

Hi @sasprogramming,

 

There is currently no PROC SQL summary function for percentiles other than the median. The MEDIAN function (as a PROC SQL summary function) was introduced in SAS 9.4. Usage Note 12133 describes this and also that other Base SAS functions (such as PCTL) are only applied to the value(s) on the current row.

 

So, for the first and third quartile PROC MEANS (or PROC SUMMARY) is the procedure of choice, as Chris said.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 3607 views
  • 3 likes
  • 5 in conversation