Hi, I have data with two variables: one is score and the other is a binary variable, "label".
score label
245 1
261 0
250 1
300 1
270 0
I want to count the 1s in the label variable within manually defined bins. I have written the code below, but it only gives the frequency of scores in each bin:
proc iml;
use work.prtf_data;
read all var {Score} into x;
close work.prtf_data;
cutpts = {.M 261 273 283 291 298 305 312 330 341 342 .I};
r = bin(x, cutpts);
call tabulate(BinNumber, Freq, r);
lbls = {"< 261" "262-273" "274-283" "284-291" "292-298" "299-305" "306-312" "313-330" "331-341" "> 342"};
print Freq[colname=lbls];
I want my results to look like this:
score    freq of 1
<261 2
262-273 0
.....
....
Please suggest a solution. Thanks.
data class;
input score label;
cards;
245 1
261 0
250 1
300 1
270 0
200 0
;
run;
proc iml;
use class;
read all var {score label} ;
close ;
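cutpts = {.M 262 .I};        /* two bins: score < 262 and score >= 262 */
r = bin(score, cutpts);      /* bin number for each observation */
do i=1 to ncol(cutpts)-1;    /* visit every bin, including empty ones */
idx=loc(r=i);                /* observations that fall in bin i */
group=group//i;
if ncol(idx)>0 then freq=freq//sum(label[idx]);   /* number of 1s in bin i */
else freq=freq//0;           /* empty bin: avoid subscripting with an empty index */
end;
lbls = t( {"< 262" ">= 262"} );   /* labels match the cut points above */
print lbls group freq;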
quit;
So you only want to count the data for which label=1? If so, use a WHERE clause to subset the data when you read it in:
proc iml;
use work.prtf_data where(label=1);
read all var {Score} into x;
close work.prtf_data;
I can't tell from your example if the LABEL variable is numeric or character. If it is character, then use
WHERE(label='1')
For the general question "how do I count the number of observations in uneven bins," see "Bin observations by using custom cut points and unevenly spaced bins."
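To put the pieces together, here is a minimal sketch that combines the WHERE clause with the BIN and TABULATE calls from the original post. It assumes LABEL is numeric and reuses the data set name and cut points from the question (12 cut points, so 11 bins). The names nBins, counts, and bins are only illustrative. Because TABULATE returns only the bins that actually occur in the data, the counts are copied into a zero-filled vector so that empty bins still show up with a count of 0.

```sas
proc iml;
/* Read only the scores whose label is 1 (assumes LABEL is numeric) */
use work.prtf_data where(label=1);
read all var {Score} into x;
close work.prtf_data;

cutpts = {.M 261 273 283 291 298 305 312 330 341 342 .I};
nBins  = ncol(cutpts) - 1;            /* 12 cut points define 11 bins */

/* Start every bin at zero so that empty bins still appear in the output */
counts = j(1, nBins, 0);
if nrow(x) > 0 then do;
   b = bin(x, cutpts);                /* bin number for each score */
   call tabulate(binNumber, freq, b); /* returns only the non-empty bins */
   counts[binNumber] = freq;          /* copy counts into their bin positions */
end;

bins   = t(1:nBins);
counts = t(counts);
print bins counts;
quit;
```

If you want text labels instead of bin numbers, supply one label per bin; the label vector needs the same number of entries as there are bins.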