Hello everyone,
My purpose is to divide the table 3 equal groups depending on one variable. On the other hand, I would like to see my Target variable 1 and 0 values. Additively, I also want to see ranges(Min and Max) for every groups. My desired output as below;
I prepared a sample data, first I used PROC RANK procedure and then I used PROC MEANS I can get Min and Max ranges and Total Count but I cannot get count values for 1 and 0 for my Target variable.
Data Have;
Length Score 8 Target 8;
Infile Datalines missover;
Input Score Target;
Datalines;
1200 1
1210 0
1220 1
1230 0
1240 1
1250 0
1260 1
1270 0
1280 1
1290 0
1300 0
1310 1
1320 0
1330 0
1340 1
1350 0
1360 0
1370 1
1380 0
1390 0
1400 1
1410 0
1420 0
1430 0
1440 1
1450 0
1460 0
1470 0
1480 1
1490 0
1500 0
;
Run;
PROC RANK DATA=Have Groups=3 Out=Ranked_Have;
Var Score;
Ranks Bucket;
Run;
PROC MEANS DATA=Ranked_Have NWAY NOPRINT;
CLASS BUCKET;
VAR SCORE;
OUTPUT OUT=RANGES MIN=MIN MAX=MAX;
RUN;
Then I tried PROC TABULATE procudure and some addition steps to reach my aim. However, this time, I could not get ranges(Max and Min).
PROC TABULATE DATA=Ranked_Have Out=DefaultCounts(DROP=_TYPE_ _PAGE_ _TABLE_);
CLASS Target / Order=Unformatted Missing;
CLASS Bucket / Order=Unformatted Missing;
TABLE Bucket,
Target*N ALL={LABEL="Total (ALL)"}*N;
RUN;
PROC SORT DATA=DefaultCounts(KEEP=N BUCKET) OUT=SORTED_COUNT;
BY BUCKET;
RUN;
PROC TRANSPOSE DATA=SORTED_COUNT OUT=SORTED_COUNT2(DROP=_NAME_ RENAME=(TARGET1=NonDefault TARGET2=Default_Count TARGET3=Total_Count))
PREFIX=TARGET;
BY BUCKET;
VAR N;
RUN;
Can anybody help me about this situation, please?
Thank you,
but I cannot get count values for 1 and 0 for my Target variable
Number of 1s is the SUM of the TARGET variable.
Number of 0s is the N-SUM of the TARGET variable.
Add those additional statistics to your PROC MEANS and then in a data step do the subtraction to the get the number of 0's.
Or if you want to add a different step you could try PROC FREQ.
Thank you,
I tried something but it does not work. Could you tell where should I put N, NMISS and SUM function in the following procedure?
PROC MEANS DATA=Ranked_Have NOPRINT MIN MAX N NMISS SUM ;
CLASS BUCKET;
VAR SCORE;
VAR Target;
OUTPUT OUT=RANGES SUM=Target NMISS=Target MIN=MIN MAX=MAX ;
RUN;
PROC MEANS DATA=Ranked_Have NWAY N NMISS MEAN MIN MAX STACKODS;
CLASS BUCKET;
VAR SCORE TARGET ;
ods output summary=ranked_stats;
RUN;
@turcay wrote:
Thank you,
I tried something but it does not work. Could you tell where should I put N, NMISS and SUM function in the following procedure?
PROC MEANS DATA=Ranked_Have NOPRINT MIN MAX N NMISS SUM ; CLASS BUCKET; VAR SCORE; VAR Target; OUTPUT OUT=RANGES SUM=Target NMISS=Target MIN=MIN MAX=MAX ; RUN;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.