BookmarkSubscribeRSS Feed
byal9000
Calcite | Level 5
With subtitle: "And if so, How?"

Hello!

I have a bit of an odd problem. I just started with my company and am currently working update some old code. It uses a lot of Proc Freqs to help us check out the validity of our data. We have a LOT of variables, but most of them share between three or four valid value ranges. It was asked if it would be possible to condense the proc freqs into a table with a list of variables with the same valid value range comprising the rows, and the values that they could take across the top (and any invalid values that may be in there). All we want is a display of the counts of each different response for each variable.

I have been working on this by myself for a while now and haven't been able to get very far. I have had a few ideas on how to attempt this, but so far none of them have worked. I've been focusing mainly on proc report as we use it in other sections of the code, but just recently began to look at proc tabulate as well.

I apologize if this is unclear, if more information is requested I would be more than happy to provide it. Thanks for any help!
8 REPLIES 8
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Investigate using PROC TRANSPOSE to turn vertical info into a horizontal perspective.

Scott Barry
SBBWorks, Inc.
byal9000
Calcite | Level 5
Okay, thanks for the tip. I'll look into it and post whether I succeed or not.

Chris
byal9000
Calcite | Level 5
I'm not sure that will solve my problem. With it I can change each variable into an observation, but then I have 100k new variables from all of the observations. I would need some way of consolidating all of those variables into a smaller range of variables, one variable containing a count for each specific value or value range that appeared, correct? And even if I do that, how do I account for the invalid values? In general, I can't create a generic variable that contains invalid counts, as there are times when it is important to know what the invalid values actually are.


I guess my original question may have been this. Is it possible to use proc report or proc tabulate to create a table where each row is a different variable and the columns are predefined ranges and any invalid values in the variables (the predefined ranges I believe would come from a proc format)? Each cell on the table would then contain a count of how many observations contained that value or range for that variable. The output from running proc freq on 100-200 similar variables (3-4 times) is lengthy and cumbersome to analyze. The hope was that we could just use another procedure to consolidate the information, and to save paper.

I apologize, I am fairly new to SAS. I had great success teaching myself macros and a lot of other things, but this seems like it should be easy and is really giving me a difficult time. Thanks again for any help!
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
It would help to illustrate in a post-reply exactly the INPUT side and desired OUTPUT side which you would like to see happen in your SAS application/program.

Consider that you also have full-control with DATA step programming to generate your output statistics/summary information.

I understood that you had many observations, for example, with a variable ID, and you would like to generate a "collapsed" summary report showing counts of ID, represented as columns (and possibly to consider other key-variables with separate rows/observations). With this objective, I encourage you to explore PROC TRANSPOSE to transform your vertical data to horizontal, then summarize, if needed, and then you can use either PROC PRINT or if desired PROC REPORT to externalize your data in a report.

Scott Barry
SBBWorks, Inc.
ChrisNZ
Tourmaline | Level 20
Desired output has one row per variable or 1 row per (list of variables sharing the same range)?
If the latter, how do you want your counts summarised?
Peter_C
Rhodochrosite | Level 12
do the values of the variables you need to summarise have similar values?
Or are some numeric continuous others discrete and others of varying string length and content?
ballardw
Super User
If I am understanding your problem I believe that proc tabulate may have a good shot at what you need.

A skeleton program would look like:

Proc tabulate data=;
class /missing;
/*the missing is important as otherwise the default behavior for tabulate
will only count the records that all have nonmissing values for the class
variables. */
/*one table per variable*/
table n, ;
table n, ;
.
.
.
/* or one row with all of the variables across the page */
table n, ;

run;


Depending on the number of categories for each variable I would look at how many columns you want across a page and select the number of variables per row. It may be that you want to use tabulate with a noprint option to create an output dataset and then proc transpose to arrange data for proc print.
byal9000
Calcite | Level 5
I apologize for disappearing for a while. We had a lot of work all of a sudden.

To answer the earlier questions:

Desired output would be 1 row per variable.

Also, all of the variables that I would put in one table would have similar values. I would put all the numeric continuous in one report, all of the discrete in another, and all of the character variables with the same valid values in another.

I will once again look into proc tabulate (and the data step/proc transpose ideas) as I did not have much time to look at it before my attention was needed elsewhere. Thanks for all the help!

Chris

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 860 views
  • 0 likes
  • 5 in conversation