Is it possible or does it make sense to calculate an interquartile range across variables? Below is a sample data set:
data one;
input response_id $ A1 A2 RF SE1 SE2 SE3 SE4 I1 I2 I3 I4 RPS;
datalines;
1 77 86 81 84 86 83 76 81 84 83 85 82.4
2 81 86 85 83 79 78 71 80 83 82 86 81.4
;
run;
proc print data=one;
run;
I want to find the interquartile range of the 11 values in A1 through I4, and essentially create a new variable next to RPS that holds it. For clarity, RPS was generated by adding the values of A1 through I4 and dividing by the number of values (11). Also, I would like the IQR expressed as a range from the first quartile to the third quartile rather than as the difference between them. Is this possible to do?
You should provide an actual example of the output you want when you say the IQR should be "expressed as a range". IQR is DEFINED as a difference.
You can get the percentiles with the PCTL function but you don't get a single number for the range.
data one;
input response_id $ A1 A2 RF SE1 SE2 SE3 SE4 I1 I2 I3 I4 RPS;
iqr = iqr(A1, A2, RF, SE1, SE2, SE3, SE4, I1, I2, I3, I4);
p25 = pctl(25, A1, A2, RF, SE1, SE2, SE3, SE4, I1, I2, I3, I4);
p75 = pctl(75, A1, A2, RF, SE1, SE2, SE3, SE4, I1, I2, I3, I4);
datalines;
1 77 86 81 84 86 83 76 81 84 83 85 82.4
2 81 86 85 83 79 78 71 80 83 82 86 81.4
;
run;
If you want to bodge p25 and p75 into a single value, you will need to create a character variable in whatever layout you want.
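As a rough sketch of that last step, the quartiles could be glued into one character variable. The name iqr_range, the "to" separator, the $ 20 length and the best8. format are all assumptions made for illustration; nothing in the thread specifies the layout you need.

```sas
data one;
  input response_id $ A1 A2 RF SE1 SE2 SE3 SE4 I1 I2 I3 I4 RPS;
  /* first and third quartiles across the 11 score variables */
  p25 = pctl(25, A1, A2, RF, SE1, SE2, SE3, SE4, I1, I2, I3, I4);
  p75 = pctl(75, A1, A2, RF, SE1, SE2, SE3, SE4, I1, I2, I3, I4);
  /* iqr_range is a hypothetical name; it holds text, not a number */
  length iqr_range $ 20;
  iqr_range = strip(put(p25, best8.)) || ' to ' || strip(put(p75, best8.));
  datalines;
1 77 86 81 84 86 83 76 81 84 83 85 82.4
2 81 86 85 83 79 78 71 80 83 82 86 81.4
;
run;

proc print data=one;
  var response_id RPS p25 p75 iqr_range;
run;
```

Because iqr_range is character, it is only good for display. Keep p25 and p75 as numeric variables if you need to sort, filter, or compute with the quartiles later.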
@ballardw writes:
iqr = iqr( A1, A2, RF, SE1, SE2, SE3, SE4, I1, I2, I3, I4);
or even simpler if appropriate
iqr = iqr( of A1--I4);
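One caveat on the double-dash list: A1--I4 picks up every variable stored between A1 and I4 in the data set, in position order, so it only matches the explicit list when those 11 variables sit next to each other. A minimal sketch of a guarded version, assuming data set one from the question already exists; the names one_iqr and n_used are made up for illustration.

```sas
/* list variables in position order to confirm what A1--I4 covers */
proc contents data=one varnum;
run;

data one_iqr;                    /* hypothetical output data set name */
  set one;
  /* n_used counts the nonmissing values in the positional list */
  n_used = n(of A1--I4);
  /* compute the IQR only when all 11 values are present */
  if n_used = 11 then iqr = iqr(of A1--I4);
run;
```

Rows where n_used is not 11 are left with a missing iqr, which makes it easy to spot an unexpected variable order or incomplete responses.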
So the answer is, YES it is possible and others have provided code. But just because it is possible, that does not mean you SHOULD do this. The real question is, does it make any sense to do this given the source and meaning of these variables? Only you understand the problem well enough to know what these values represent, and whether it makes sense to find statistics across all of these variables. In general, I would be skeptical of doing such a calculation across different variables, without clear explanation and justification.