I am reviewing some basic stats concepts, but rather than using a procedure or Excel, I would like to get some SAS programming practice and program them from scratch. At the moment I am struggling with the following: I have calculated the positions of the quartiles Q1, Q2 and Q3. How can I now use these positions to pick the corresponding observations in the dataset days?
Your help would be greatly appreciated. Thanks.
******;
DATA days;
INPUT days;
DATALINES;
9
13
13
13
14
15
15
15
25
25
25
26
36
36
49
;
run;
proc sort data = days; run;
/*identifies the number of observations in a dataset*/
%let dsid=%sysfunc(open(days));
%let num=%sysfunc(attrn(&dsid,nlobs));
%let rc=%sysfunc(close(&dsid));
%put There are &num. observations in dataset days.;
data temp;
set days;
q1_pos = ceil((25/100)*&num.);
q2_pos = ceil((50/100)*&num.);
q3_pos = ceil((75/100)*&num.);
run;
OK, I got something working, for newbies who are curious like me. The code below does the job; alternatively, just use PROC UNIVARIATE or a similar procedure. Thanks.
DATA days;
INPUT days;
DATALINES;
9
13
13
13
14
15
15
15
25
25
25
26
36
36
49
;
run;
proc sort data = days; by days; run;
/*identifies the number of observations in a dataset*/
%let dsid=%sysfunc(open(days));
%let num=%sysfunc(attrn(&dsid,nlobs));
%let rc=%sysfunc(close(&dsid));
%put There are &num. observations in dataset days.;
data temp;
set days;
q1_pos = ceil((25/100)*&num.);
q2_pos = ceil((50/100)*&num.);
q3_pos = ceil((75/100)*&num.);
run;
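/*getting the quartile positions into macro variables*/
data _null_;
set temp (obs=1); /*positions are identical on every row, so one row is enough*/
/*symputx strips the leading blanks that symput leaves when converting numbers*/
call symputx("q1_pos",q1_pos);
call symputx("q2_pos",q2_pos);
call symputx("q3_pos",q3_pos);
run;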
%put position of Q1 is &q1_pos.;
%put position of Q2 is &q2_pos.;
%put position of Q3 is &q3_pos.;
/*creating one dataset each for Q1 and Q3*/
data Q1 (keep = q1) Q3 (keep = q3);
set temp;
if _N_ in(&q1_pos.) then do;
Q1= days;
output Q1;
end;
if _N_ in(&q3_pos.) then do;
Q3= days;
output Q3;
end;
run;
/*Calculating the interquartile range, and upper and lower bounds for outliers*/
data iqr;
set q1;
set q3;
IQR = q3-q1;
LL = q1 - 1.5*IQR;
UL = q3 + 1.5*IQR;
run;
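As a cross-check on the hand-rolled result, here is a minimal sketch using PROC UNIVARIATE. It assumes the days dataset above; the output dataset name check is made up for illustration. PCTLDEF=3 (the empirical distribution function definition) is chosen because, as far as I can tell, it selects the observation at position ceil(n*p), the same rule as the code above. With 15 sorted values the positions are 4, 8 and 12, so the hand calculation gives Q1 = 13, Q2 = 15, Q3 = 26 and IQR = 13, which puts the bounds at LL = -6.5 and UL = 45.5. The procedure should agree.

```sas
/* Cross-check of the manual quartiles.
   Assumption: PCTLDEF=3 matches the ceil(n*p) position rule used above.
   "check" is a hypothetical output dataset name. */
proc univariate data=days pctldef=3 noprint;
  var days;
  output out=check q1=q1 median=q2 q3=q3 qrange=iqr;
run;

/* Show the single row of results */
proc print data=check noobs;
run;
```

If the default definition (PCTLDEF=5) is used instead, results can differ for other sample sizes, because that definition averages two neighbouring values whenever n*p is a whole number.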
Even though you've indicated that you have found your answer, and I'm not entirely sure what you are trying to accomplish, I don't think that your code will work.
You call proc sort with no by variable, so it won't have any effect.
I think you are trying to accomplish something like the following:
proc sort data = days;
by days;
run;
/*identifies the number of observations in a dataset*/
%let dsid=%sysfunc(open(days));
%let num=%sysfunc(attrn(&dsid,nlobs));
%let rc=%sysfunc(close(&dsid));
%put There are &num. observations in dataset days.;
data temp;
set days;
quartile=ifn(_n_ le ceil((25/100)*&num.),1,
ifn(_n_ le ceil((50/100)*&num.),2,
ifn(_n_ le ceil((75/100)*&num.),3,
4)));
run;
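One way to connect these quartile groups back to the original question is to keep the last row of groups 1 to 3: because temp is sorted by days, those rows sit at positions ceil(0.25*n), ceil(0.50*n) and ceil(0.75*n). This is only a sketch, and the dataset name cutpoints is made up for illustration.

```sas
/* Keep the last observation of quartile groups 1, 2 and 3.
   Assumes temp was built by the step above, so quartile never
   decreases from one row to the next and BY processing is valid.
   "cutpoints" is a hypothetical dataset name. */
data cutpoints;
  set temp;
  by quartile;
  /* last.quartile flags the final row of each group; group 4 is skipped */
  if last.quartile and quartile < 4;
run;
```

For the 15 values in this thread, the kept rows are observations 4, 8 and 12, that is, days = 13, 15 and 26.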
Good catch, Arthur. The proc sort passed without the by statement (which I forgot) because the data was already sorted. My simple code does what I want. But so does yours in a much more elegant way! Thanks Arthur.
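For anyone who wants to pick an observation directly by its position, as the original question asked, the POINT= option of the SET statement is one possibility. The following is a minimal sketch, not a definitive implementation: it assumes days has already been sorted by days, and the names quartiles, pos and n are made up for illustration.

```sas
/* Read the quartile observations by position with POINT=.
   Assumes days is already sorted by days.
   "quartiles", "pos" and "n" are hypothetical names. */
data quartiles (keep=q1 q2 q3 iqr ll ul);
  /* never executes, but NOBS= fills n at compile time */
  if 0 then set days nobs=n;

  /* each SET reads one observation and overwrites the variable days */
  pos = ceil(0.25 * n);  set days point=pos;  q1 = days;
  pos = ceil(0.50 * n);  set days point=pos;  q2 = days;
  pos = ceil(0.75 * n);  set days point=pos;  q3 = days;

  /* interquartile range and outlier bounds, as in the code above */
  iqr = q3 - q1;
  ll  = q1 - 1.5 * iqr;
  ul  = q3 + 1.5 * iqr;

  output;
  stop;  /* POINT= never reaches end of file, so stop explicitly */
run;
```

This avoids the macro variables and the intermediate Q1 and Q3 datasets, at the cost of relying on direct access, which needs a dataset that supports reading by observation number.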