I have a file. Columns of each student's ID card, city of residence, educational institution, column on whether or not he dropped out of school. If we drop then a column gets 1 if we don't drop then a column gets zero.
For example
I want to prepare a table with a calculation of how many students there are in each group of cities by type of institution. How many students dropped out by city and type of institution. And what is the percentage of dropouts by city and type of institution. I want to prepare the table using SAS with the TABULATE function
Example of results
The goal is to calculate the number and percentage of dropouts without also showing the non-dropouts. That is, not to show the complements. I was able to calculate the amounts but not percentages. This is what I tried:
1.
proc tabulate data=a missing;
class City Educational_Institution ;
var Dropped_Out;
table City* Educational_Institution, n*Dropped_Out Dropped_Out*sum Dropped_Out*pctn<n> ;
run;
2.
proc tabulate data=a missing;
class City Educational_Institution ;
var Dropped_Out;
table City* Educational_Institution, n*Dropped_Out Dropped_Out*sum Dropped_Out*pctn<Dropped_Out> ;
run;
The highlighted part does not work. I also tried all kinds of options with PCTSUM. What am I missing?
I would appreciate help. Thanks
Doesn't work is awful vague.
Are there errors in the log?: Post the code and log in a code box opened with the "</>" to maintain formatting of error messages.
No output? Post any log in a code box.
Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "</>" icon or attached as text to show exactly what you have and that we can test code against.
Not even going to try to diagnose PCTN or PCTSUM issues in Tabulate without an actual example data set.
Note that if you VAR variable that is numeric and has values of 1/0 then the statistic N is the number of non-missing values, SUM is the number of observations where the variable has the value of 1 and Mean is the percentage as a decimal of the 1s.
Hi, here is a code I tried and log:
71 data a; 72 INPUT id:4. city: $10. type_inst: $10. drooped; 73 datalines; NOTE: The data set WORK.A has 20 observations and 4 variables. NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds 94 ; 95 run; 96 /*1*/ 97 proc tabulate data= a missing s=[just=c] format=comma7.1; 98 class city type_inst ; 99 var drooped ; 100 table city=''*type_inst='', n*drooped drooped*sum drooped*pctn<n> /row=float; 101 run; ERROR: n is invalid in the following denominator nesting : n. NOTE: The SAS System stopped processing this step because of errors. NOTE: PROCEDURE TABULATE used (Total process time): real time 0.00 seconds cpu time 0.00 seconds 101 ! /*ERROR: n is invalid in the following denominator nesting : n*/ 102 /*2*/ 103 104 proc tabulate data= a missing s=[just=c] format=comma7.1; 105 class city type_inst ; 106 var drooped ; 107 table city=''*type_inst='', n*drooped drooped*sum drooped*pctn<drooped> /row=float; 108 run; NOTE: There were 20 observations read from the data set WORK.A. NOTE: PROCEDURE TABULATE used (Total process time): real time 0.04 seconds cpu time 0.04 seconds 108 ! /*all 100%*/ 109 110 /*3*/ 111 proc tabulate data= a missing s=[just=c] format=comma7.1; 112 class city type_inst ; 113 var drooped ; 114 table city=''*type_inst='', n*drooped drooped*sum drooped*pctsum<all> /row=float; 115 run; WARNING: Invalid denominator nesting element: all. NOTE: There were 20 observations read from the data set WORK.A. NOTE: PROCEDURE TABULATE used (Total process time): real time 0.04 seconds cpu time 0.04 seconds 115 ! /*all 100%*/ 116
Provide the data step reading the datalines as the CODE from the editor. The Log strips out the actual datalines so can't recreate your data.
Here is an example of building an example data set with some random values.
One variable is coded 1/0 to demonstrate how to get nice single summaries.
data junk; do cat = 'XXX','YYY'; do subcat='A','B','C'; /* generate random number of values of each cat value*/ do i=1 to (rand('integer',15,60)); /* generate some random 1 and 0 values*/ x= rand('bernoulli',0.3); output; end; end; end; run; proc tabulate data=junk; class cat subcat; var x; table cat all='Total', x *(n='Number nonmissing' sum='Number of 1'*f=best5. mean='% of 1'*f=percent8.1) ; table cat *(subcat all='Cat Total') all='Overall Total', x *(n='Number nonmissing' sum='Number of 1'*f=best5. mean='% of 1'*f=percent8.1) ; run;
The 0.3 in the call to Rand('Bernoulli', 0.3) means that with enough values generated about 0.3 or 30 percent will have a value of 1, the remainder will have 0.
The Rand('integer') used that way returns value between 15 and 60 as integers with uniform distribution(equal probability of any of the values.
This is simple enough that you should be able to eyeball the N and Sum values and verify that the Mean (as percent) has the correct percent for the Subcat.
Percentages and counts are much easier to calculate via PROC FREQ, and after that you can print out the information in whatever format you want.
proc sort data=a;
by educationa_institution city;
run;
proc freq data=a;
by educationa_institution city;
tables dropped_out/noprint out=want;
run;
Hi @Sheril1
Have a look at this SESUG paper This Works on So Many Levels: Building Summary Tables in SAS® with Hash Programming
It may provide you with a way to get closer to your goal.
Hope this helps,
Ahmed
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.