Edit: Seems to be caused by this, as found by PaigeMiller below: https://support.sas.com/kb/66/896.html
Hello, I am trying to get the median height of each person/ID (they all have multiple height measurements over time). I want to merge this median for each ID back into the dataset so I have the median height for each patient across their multiple records. This is my code:
proc sql;
create table get_median_heights as
select *, median(height) as median_height
from height_records
group by ID;
quit;
But this does not seem to be giving the median per person/ID, for example here is a toy example of one of the results I get:
ID | height | median_height |
001 | 67 | 69 |
001 | . | 69 |
001 | 67 | 69 |
001 | 67 | 69 |
001 | . | 69 |
002 | 64 | 60 |
002 | 65 | 60 |
002 | 65 | 60 |
002 | 65 | 60 |
003 | . | . |
003 | 70 | . |
003 | 70 | . |
003 | 70 | . |
003 | . | . |
003 | 70 | . |
003 | 70 | . |
003 | . | . |
003 | 70 | . |
003 | . | . |
Why is ID 001 returning 69 instead of 67 and 002 is returning 60 instead of 65? And why is 003 returning as missing? I could have sworn this code used to work (I wrote it years ago), did something change with the behavior?
Yes, you have the version of SAS that is impacted by the bug. You can try downloading and installing the Hot Fix mentioned there, but only if your company/university allows you to do so (my company does not allow me to do this). If you can't install the Hot Fix, I would use PROC MEANS/PROC SUMMARY to compute medians and then the problem goes away.
I have a different version of SAS, so my output is not affected by the bug.
I cannot duplicate this behavior of PROC SQL. To me, it seems as if the medians are correct from my program below, and different than what you show. Please, from now on, provide the data not as a screen capture of a table, as you have done, but please do provide the data as working SAS data step code (instructions and examples), as I suspect your data is not really what you are showing.
data have;
input id $ height;
cards;
001 67
001 .
001 67
001 67
001 .
002 64
002 65
002 65
002 65
003 .
003 70
003 70
003 70
003 .
003 70
003 70
003 .
003 70
003 .
;
proc sql;
create table abc as select *,median(height) as median_height
from have
group by id;
quit;
Hi, thank you for the reply and sorry about that, will note that for the future. However I tried running your code example and it again does not seem to return the correct values:
There seems to be something very wrong with my organization's SAS implementation perhaps?
Please run the command below and report the numbers and letters and punctuation written to the log directly below the command.
%put &sysvlong4;
Hello, it returns this: 9.04.01M7P08052020, is that version impacted by this bug? I'll admit I am not so sure looking at the version listed on that page...
Also, for some reason proc sorting by ID first seems to fix the issue? This does not seem to be listed as a fix on that page you link to.
data have; input id $ height; cards; 001 67 001 . 001 67 001 67 001 . 002 64 002 65 002 65 002 65 003 . 003 70 003 70 003 70 003 . 003 70 003 70 003 . 003 70 003 . ; proc sort data = have; by id; run; proc sql; create table abc as select id, height, median(height) as median_height from have group by id; quit;
Very weird...
Yes, you have the version of SAS that is impacted by the bug. You can try downloading and installing the Hot Fix mentioned there, but only if your company/university allows you to do so (my company does not allow me to do this). If you can't install the Hot Fix, I would use PROC MEANS/PROC SUMMARY to compute medians and then the problem goes away.
I have a different version of SAS, so my output is not affected by the bug.
Thank you for all your help!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.