Here's the problem.
I am dealing with a table that contains 271 variables and 3.5 million observations.
Moreover, I have no information about this table, such as the meaning of each variable or the values each variable can take, especially the character variables.
I ran a proc contents to obtain the number of observations and the list of variables.
I am interested in the values the character variables can take.
Usually I run a proc sort nodupkey by variable x to obtain the list of possible values for that variable.
My question is: is there a better way to do this for many variables at once, restricted to the character variables?
Regards,
Alain LePage
It depends on your specific needs, but based on what you describe I'd use something like this:
ods select NLevels;
proc freq data = sashelp.class nlevels;
table _character_;
run;
Well, first tip: you don't need to run a proc contents; you can just query sashelp.vtable and sashelp.vcolumn to find that info. As for the data, if you don't know anything about it, what do you intend to do with it? Without any documentation, data is just a waste of disk space. If your list comes back with "var1 contains XYZ, DEF, OPT", is that going to help? Anyway, proc freq is probably your best bet here.
A basic example:
proc freq data=sashelp.class;
  tables _character_;
  /* OUT= on TABLES would keep only the last variable's table */
  ods output OneWayFreqs=temp;
run;
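For the first tip above, here is a minimal sketch of listing only the character variables from the dictionary view. It assumes the table is WORK.MYTABLE, which is a hypothetical name; substitute your own library and table. Note that libname and memname are stored in upper case in sashelp.vcolumn.

```sas
/* List the character variables of one table from the dictionary view. */
/* WORK and MYTABLE are placeholder names, not from the original post. */
proc sql;
  select name, length, format, label
    from sashelp.vcolumn
    where libname = 'WORK'
      and memname = 'MYTABLE'
      and type = 'char';
quit;
```

The label and format columns are sometimes the only hint of what an undocumented variable means, so they are worth checking before running frequencies.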
If you have a table that is not documented, and no one wants to or can describe it for you, throw it away. Especially if it has that many variables.