- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I have a range of numbers for a variable, with a minimum of 316 and a maximum value of 9999. I would like to split the volume for that variable, application_score, into ten bands. However, 9999 is a special value and is going to be put in a separate band - is there code to split the records, minus records that contain 9999, into ten bands without dropping/excluding 9999 permanently from the dataset?
Question: There are 23,020 records for the variable and 245 have the 9999 value, so is there code to split the remaining 22,775 records into 10 bands?
My code so far (which splits all of the records into 10 bands for the variable, instead of all records minus records that contain 9999, then splitting the remaining records into 10 bands):
data test;
set RC.MOD_LP_CRDS_FND_DB;
run;
proc sort data=test;
by application_score;
run;
/* Request deciles with the PCTLPTS= option */
proc univariate data=test noprint;
var application_score;
output out=out1 pctlpts=10 to 100 by 10 pctlpre=P;
run;
ods noproctitle;
proc print data=out1;
title 'PROC UNIVARIATE Results';
run;
proc transpose data=out1 out=out2 (rename=(col1=end));
run;
/* Create the CNTLIN data set */
data crfmt;
set out2 end=last;
if _N_=1 then hlo='L';
start=lag(end);
label='>' || catx(' - ',start,end);
if hlo='L' then label=catx(' - ','low',end);
fmtname='xfmt';
eexcl='N';
if last then do;
hlo='H';
label='>' || catx('- ',start,'high');
end;
run;
proc print data=crfmt;
var fmtname start end label eexcl hlo;
title 'CNTLIN data set';
run;
/* Create a format based on deciles */
proc format cntlin=crfmt;
select xfmt;
run;
/* Use the format in a procedure step */
proc freq data=test;
tables application_score;
format application_score xfmt.;
title 'Decile Counts';
run;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
where application_score ne 9999;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
PROC RANK with option GROUPS=10 will split the data into 10 groups. If you specify as the first line in the PROC
proc rank data=mydata(where=(application_score^=9999)) groups=10 out=outputdataset;
then it splits all the records where application_score is not 9999 into 10 groups.
Then you can combine the results where application_score equals 9999 with the results from PROC RANK to get your 10 groups plus another group where application_score is 9999.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
Thanks for the reply. My output requires the lower and upper band limits to be observed as I will use them to to create the ten bands later on.
Example: After looking at the outputs of my code with the lower and upper band limits, I will add this code:
if application_score=9999, then BUR_CC='11';
else if -999<=application_score<=426 then BUR_CC='11';
else if 316<=application_score<=483 then BUR_CC='10';
else if 484<=application_score<=504 then BUR_CC='09';
else if 505<=application_score<=515 then BUR_CC='08';
else if 516<=application_score<=523 then BUR_CC='07';
else if 524<=application_score<=531 then BUR_CC='06';
else if 532<=application_score<=540 then BUR_CC='05';
else if 541<=application_score<=551 then BUR_CC='04';
else if 552<=application_score<=563 then BUR_CC='03';
else if 564<=application_score<=580 then BUR_CC='02';
else if 581<=application_score<=9998 then BUR_CC='01';
I have added your code to my existing code but I still get the same outputs as before (screenshots are below) - is there any way for the Results/output to show me the updated lower and upper band limits?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
where application_score ne 9999;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content