Hi all,
I'm trying to clean up a table that I created using PROC FREQ. Essentially I want to create separate datasets by the defined type, but the type options change with each run.
I'm trying to avoid listing every type:
If Type="Reading" then output x;
If Type="Writing, Speaking, And Listening" then output x1;
etc because depending on the run, the type would change.
For example:
I'd want the yellow highlighted part to be a dataset, and the green to be another. Any ideas?
Thanks in advance!
If your rate represents a score or something that you want to create a row per value and you have multiple values of type with the same rate then this may be one way:
proc tabulate data=have; class rate type; table rate, type*(n='count' colpctn rowpctn pctn) ; run;
Tabulate will do different percentages like proc freq will with a tables involving var1*var2, which one depends on whether you want the overall (pctn), percentage within the row grouping (rowpctn) or the column (colpctn).
Some actual data, in the form of a data step, might get us further. So far your limited example would have counts of 1 for every combination of rate and type.
If your problem is as simple as you describe then this code will do 🙂
data x x1;
set have;
if type in ('Reading') then output x;
if type in ('Writing', 'Speaking', 'Listening') then output x1;
run;
Thanks, but I was trying to avoid having to list out the types though, because it changes. I'm looking for a way to automate it more.
Can you tell us how you will use those multiple data sets?
A great many times the additional data sets just create more work later.
Oh, good point!
I was hoping to make a table going out by type, not down. So right now, it's in this format:
Type | Rate | Count | Percent |
Reading | |||
Reading | |||
Reading | |||
Reading | |||
Writing, Speaking, And Listening | |||
Writing, Speaking, And Listening | |||
Writing, Speaking, And Listening | |||
Writing, Speaking, And Listening |
But I wanted to change it to this:
Rate | Reading Count | Reading Percent | Writing, Speaking, And Listening Count | Writing, Speaking, And Listening Percent |
The only way I know how to do it is by separating out by type and merging it back by rate.
Hi:
There are several procedures that will do what you want. However, the bigger issue is that you have unique values on every row for rate, so even if you summarized to get the count and the percent, you'd only have a 1 for every rate. So I'm not entirely clear on what you want, since your example of desire output only showed header information and there were no numbers in the cells to link back to your example of data.
Cynthia
This is just a snapshot of the data, as I'm not able to post actual full data. There are about 30 different types (beyond what's listed), all with a range of 1 to 105. The actual data has repeated rates, so the merge would be exactly what I need. My only thought was to have a more automated way of separating a large dataset, into subsets and merging them back together by rate. I'm hoping to find a way to do this, without listing each of the different type categories, because they're not static with each run. (i.e. If I run is on test x, the types are reading, writing. If I run it on test y, the types are reading, math, english.)
Hopefully this clears it up a bit:
I need to change this:
Type | Rate | Count | Percent |
Reading | 1 | 10 | |
Reading | 5 | 11 | |
Reading | 8 | 12 | |
Reading | 105 | 13 | |
Writing, Speaking, And Listening | 1 | 10 | |
Writing, Speaking, And Listening | 2 | 11 | |
Writing, Speaking, And Listening | 8 | 12 | |
Writing, Speaking, And Listening | 105 | 13 |
To this:
Rate | Reading Count | Reading Percent | Writing, Speaking, And Listening Count | Writing, Speaking, And Listening Percent |
1 | 10 | 10 | ||
2 | . | 11 | ||
5 | 11 | . | ||
8 | 12 | 12 | ||
105 | 13 | 13 |
If your rate represents a score or something that you want to create a row per value and you have multiple values of type with the same rate then this may be one way:
proc tabulate data=have; class rate type; table rate, type*(n='count' colpctn rowpctn pctn) ; run;
Tabulate will do different percentages like proc freq will with a tables involving var1*var2, which one depends on whether you want the overall (pctn), percentage within the row grouping (rowpctn) or the column (colpctn).
Some actual data, in the form of a data step, might get us further. So far your limited example would have counts of 1 for every combination of rate and type.
This is even better than I was hoping for!! Thank you so much!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.