Solved
Contributor
Posts: 45

# calculating descriptive statistics (quartiles) by hand - an exercise

So I am reviewing some basic stats concepts, but rather than using a procedure or Excel, I would like to get some SAS programming practice and program them from scratch. At the moment I am struggling with the following: I have calculated the positions for quartiles Q1, Q2 and Q3. How can I now use these positions to actually pick the corresponding observations in the dataset days?

Your help would be greatly appreciated. Thanks.

    ******;
    DATA days;
      INPUT days;
      DATALINES;
    9
    13
    13
    13
    14
    15
    15
    15
    25
    25
    25
    26
    36
    36
    49
    ;
    run;

    proc sort data = days; run;

    /*identifies the number of observations in a dataset*/
    %let dsid=%sysfunc(open(days));
    %let num=%sysfunc(attrn(&dsid,nlobs));
    %let rc=%sysfunc(close(&dsid));
    %put There are &num. observations in dataset days.;

    data temp;
      set days;
      q1_pos = ceil((25/100)*&num.);
      q2_pos = ceil((50/100)*&num.);
      q3_pos = ceil((75/100)*&num.);
    run;

Accepted Solutions
Solution
02-03-2014 06:41 PM
PROC Star
Posts: 8,164

## Re: calculating descriptive statistics (quartiles) by hand - an exercise

Even though you've indicated that you have found your correct answer, and I'm not really sure what you are trying to accomplish, I don't think that your code will work.

You call PROC SORT without a BY statement, so it will not sort the data.

I think you are trying to accomplish something like the following:

    proc sort data = days;
      by days;
    run;

    /*identifies the number of observations in a dataset*/
    %let dsid=%sysfunc(open(days));
    %let num=%sysfunc(attrn(&dsid,nlobs));
    %let rc=%sysfunc(close(&dsid));
    %put There are &num. observations in dataset days.;

    /*labels each row of the sorted data with its quartile group (1-4)*/
    data temp;
      set days;
      quartile=ifn(_n_ le ceil((25/100)*&num.),1,
               ifn(_n_ le ceil((50/100)*&num.),2,
               ifn(_n_ le ceil((75/100)*&num.),3,
               4)));
    run;
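
For anyone who wants the value sitting at each quartile position rather than a group label on every row, here is a minimal sketch, not a definitive approach. It assumes days is already sorted by days and relies on the SET statement's POINT= and NOBS= options; the names picked, pct, p, n, position and value are made up for illustration.

```sas
/* Sketch only: assumes WORK.DAYS is already sorted by days.          */
/* picked, pct, p, n, position and value are illustrative names.      */
data picked (keep = pct position value);
  do pct = 25, 50, 75;
    /* n comes from NOBS=, which is set at compile time, so it can    */
    /* be used before the SET statement executes                      */
    p = ceil((pct/100)*n);
    set days point=p nobs=n;   /* read observation number p directly  */
    position = p;              /* POINT= variables are not written out, so copy it */
    value = days;
    output;
  end;
  stop;                        /* needed: with POINT= there is no end of file to stop the step */
run;

proc print data=picked noobs;
run;
```

With the 15 values from the question this should list positions 4, 8 and 12, holding the values 13, 15 and 26. NOBS= counts physical observations, so on a dataset with deleted rows it can differ from the nlobs count used above.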

All Replies
Contributor
Posts: 45

## Re: calculating descriptive statistics (quartiles) by hand - an exercise

OK, I got something going, for those who are newbies and curious like me. The code below does the job; alternatively, just use PROC UNIVARIATE or something else. Thanks.

    DATA days;
      INPUT days;
      DATALINES;
    9
    13
    13
    13
    14
    15
    15
    15
    25
    25
    25
    26
    36
    36
    49
    ;
    run;

    proc sort data = days; by days; run;

    /*identifies the number of observations in a dataset*/
    %let dsid=%sysfunc(open(days));
    %let num=%sysfunc(attrn(&dsid,nlobs));
    %let rc=%sysfunc(close(&dsid));
    %put There are &num. observations in dataset days.;

    /*quartile positions, ceil(p*n); the same values are repeated on every row*/
    data temp;
      set days;
      q1_pos = ceil((25/100)*&num.);
      q2_pos = ceil((50/100)*&num.);
      q3_pos = ceil((75/100)*&num.);
    run;

    /*getting the quartile positions into macro variables*/
    /*symputx strips the leading blanks that symput leaves on numeric values*/
    data _null_;
      set temp;
      call symputx("q1_pos",q1_pos);
      call symputx("q2_pos",q2_pos);
      call symputx("q3_pos",q3_pos);
    run;

    %put position of Q1 is &q1_pos.;
    %put position of Q2 is &q2_pos.;
    %put position of Q3 is &q3_pos.;

    /*creating one dataset each for Q1 and Q3*/
    data Q1 (keep = q1) Q3 (keep = q3);
      set temp;
      if _N_ in(&q1_pos.) then do;
        Q1= days;
        output Q1;
      end;
      if _N_ in(&q3_pos.) then do;
        Q3= days;
        output Q3;
      end;
    run;

    /*Calculating the interquartile range, and upper and lower bounds for outliers*/
    /*the two SET statements read the single row of each dataset side by side*/
    data iqr;
      set q1;
      set q3;
      IQR = q3-q1;
      LL = q1 - 1.5*IQR;
      UL = q3 + 1.5*IQR;
    run;
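
As a cross-check on the hand calculation, a sketch using PROC UNIVARIATE is below. As far as I know, the ceil(p*n) rule used above corresponds to percentile definition 3 (PCTLDEF=3), whereas the procedure's default is definition 5, which averages two neighbouring values when p*n is a whole number. The output dataset name check is just illustrative.

```sas
/* Cross-check sketch: "check" is an illustrative dataset name.        */
/* PCTLDEF=3 is assumed to match the ceil(p*n) position rule above.    */
proc univariate data=days pctldef=3 noprint;
  var days;
  output out=check q1=q1 median=q2 q3=q3 qrange=iqr;
run;

/* Same outlier bounds as in the hand-rolled iqr dataset */
data check;
  set check;
  LL = q1 - 1.5*iqr;
  UL = q3 + 1.5*iqr;
run;

proc print data=check noobs;
run;
```

For these 15 values p*n is never a whole number, so both definitions agree: Q1 = 13, median = 15 and Q3 = 26, which gives IQR = 13, LL = -6.5 and UL = 45.5. Only the value 49 falls outside those bounds.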

Contributor
Posts: 45

## Re: calculating descriptive statistics (quartiles) by hand - an exercise

Good catch, Arthur. The proc sort passed without the by statement (which I forgot) because the data was already sorted. My simple code does what I want. But so does yours in a much more elegant way! Thanks Arthur.

🔒 This topic is solved and locked.