BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SAS_new
Calcite | Level 5

Hello everyone-

I have a dataset for students and I would like to summarize the data by race ethnicity and by offense.  I had attempted to produce a two way table using Proc Freq but the results did not happen as I expected.   

The first table shows how the data is currently stored.  A row contain information on student and there could be multiple students in the dataset with multiple offense codes.  I would like the results to summarize all student records by race and by offense.  I understand that the counts will be duplicated.  Is there a way to produce this type of table? 

studentraceoffense
12595WHITEW1
1857ASIAND1
11589BLACKV1
54812WHITEW1
636884HISPANICD1
115887HISPANICD1
12595WHITEW1
11589BLACKV1

W1D1V1
WHITE300
ASIAN010
BLACK002
HISPANIC020
1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

sas_new: I haven't kept up with this thread, thus may be missing something, but it sure sounds like you are looking for something like:

data have;

  input (school student race offense) ($);

  cards;

A 12595 WHITE W1

A 1857 ASIAN D1

A 11589 BLACK V1

A 54812 WHITE W1

A 636884 HISPANIC D1

A 115887 HISPANIC D1

A 12595 WHITE W1

A 11589 BLACK V1

B 12595 WHITE W1

B 1857 ASIAN D1

B 11589 BLACK V1

B 54812 WHITE W1

B 636884 HISPANIC D1

B 115887 HISPANIC D1

B 12595 WHITE W1

B 11589 BLACK V1

;

ods output crosstabfreqs =  xtab;

Proc Freq data=have;

  tables school*race*offense/

         out=temp missprint;

run;

proc sort data=xtab;

  by school race offense;

run;

proc transpose data=xtab (where=

        (not(missing(offense) or missing(race))))

         out=want (drop=_:);

  var frequency;

  by school race;

  id offense;

run;

View solution in original post

22 REPLIES 22
Reeza
Super User

What was your code?

Proc Freq data=have;

table race*offense;

run;

SAS_new
Calcite | Level 5

That was my code-

Proc Freq data=have;

     tables race*offense/list missing;run;

I need the offense codes to be columns and summarized by race.  That is where I am having the difficulty.

Reeza
Super User

The LIST options tells it to be rows not columns. Removing it would give you what you wanted.

Or are you looking for a dataset in that format.

SAS_new
Calcite | Level 5

I also tried removing the 'list' option.  The dataset is huge and I am not able to get an output. If possible, I would like the code to produce a separate table with the summarized data so that I can merge with some other tables.

ps. I have also listed school as one of the variable so its a 3 way frequency-- like this:

Proc Freq data=have;

     tables school*race*offense; run;

Reeza
Super User

How big is your dataset?

SAS_new
Calcite | Level 5

I have over a million

Reeza
Super User

SAS_new wrote:

I also tried removing the 'list' option.  The dataset is huge and I am not able to get an output.

The general procedure is run a proc freq, save the output and proc transpose it.

But if you can't get an output that's an issue. A million rows shouldn't be much of an issue really.

You could try sorting it first, by school, race and offense.

You could also show the error you're getting, ie. why you say no output.

ballardw
Super User

SAS_new wrote:

I also tried removing the 'list' option.  The dataset is huge and I am not able to get an output. If possible, I would like the code to produce a separate table with the summarized data so that I can merge with some other tables.

ps. I have also listed school as one of the variable so its a 3 way frequency-- like this:

Proc Freq data=have;

     tables school*race*offense; run;

You may be looking for something more like;

proc freq data=have noprint;

table school*race*offense/ out= want ; /* you may want to play with the order of the variables depending on what you want*/

run;

SAS_new
Calcite | Level 5

Yes Reeza, I think I am looking for a dataset in that format.  How would I go about doing it?

SAS_new
Calcite | Level 5

I have used the noprint option and also created an output datafile:

Proc freq data=have;

     tables schoolcode*race*offense/out=separate;run;

The problem is that the output is not coming out the way I am expecting. I would like one row per race and one column per offense so it wold look like this:

W1D1V1
WHITE300
ASIAN010
BLACK002
HISPANIC020

It currently looks somewhat like this:

WHITED10
WHITE W13
WHITEV10
ASIANWI0
ASIAND11
ASIANV10
BLACKW10
BLACKD10
BLACKV12
HISPANICW10
HISPANIC D12
HISPANIC V10
Reeza
Super User

Use proc transpose now. 

SAS_new
Calcite | Level 5

Proc transpose does not work.  The data table has multiple rows of the same race by different offenses.

code::

PROC TRANSPOSE DATA=TST OUT=TEST;

BY SCHOOLCODE ;

ID RACE;

VAR FCODE; RUN;

SAS error message log shows this:

ERROR: The ID value "WHITE" occurs twice in the same BY group.

NOTE: The above message was for the following BY group:

SchoolCode=0100040

ERROR: The ID value "HISPANIC" occurs twice in the same BY group.

ERROR: The ID value "HISPANIC" occurs twice in the same BY group.

NOTE: The above message was for the following BY group:

SchoolCode=0100255

ERROR: The ID value "HISPANIC" occurs twice in the same BY group.

ERROR: The ID value "HISPANIC" occurs twice in the same BY group.

ERROR: The ID value "HISPANIC" occurs twice in the same BY group.

ERROR: The ID value "HISPANIC" occurs twice in the same BY group.

Reeza
Super User

The table you initially posted doesn't have the school information, so would you like a different table for every school? or another variable in the data indicating school.

My guess is the second, so do by school race in the proc transpose instead of just by school.

SAS_new
Calcite | Level 5

It does not appear to be summarizing?  I need the count of kids who fall into each cell.   Would I need to create a new variable or use the frequency count as the variable and somehow use that into the Proc Transpose code?  How would I do it though?

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 22 replies
  • 1733 views
  • 4 likes
  • 4 in conversation