I'm stumped by this one, and am unsure how to even get started. I have an extremely large dataset which I need to analyze by counting the values for each variable, and have the results written to an output dataset. I need a detailed count of the number of times each value occurs for each variable. Starting with a dataset constructed like this: data have;
input id $ color $ class $ size $;
datalines;
A1 Blue A XL
B2 Red B XL
C3 Blue A L
D4 Yellow C M
E5 Red C S
F6 Blue B L
G7 Yellow B M
H8 Blue C S
I9 Red B S
J10 Yellow A L
;
run; I need to get to an output that looks like this: attribute value count color Blue 4 color Red 3 color Yellow 3 class A 3 class B 4 class C 3 size S 3 size M 2 size L 3 size XL 2 My input table is 300+ columns by millions of rows (unfortunately I cannot get this changed), and somewhat dynamic. New variables could appear or existing variable names could change. Also, the allowed values in any given column could change from time to time. Thus, I really don't want to hard-code the variable names or allowed values. I tried using proc transpose to at least get the variable names read into my 'attribute' column, but then was unable to figure out how to read the allowed values for each. I'm looking for a method to loop through each variable, pick up all the values for each, and output the frequency count. Sounded simple to me at first, but I can't even figure out how to get started. I'm on SAS 9.4/EG 7.1. Thanks for any ideas you can provide.
... View more