Desktop productivity for business analysts and programmers

Asking the expert - How to get all the possible values a variable could take in a huge dataset.

Contributor
Posts: 59

Here's the problem.

I am dealing with a table that contains 271 variables and 3.5 million observations.

Moreover, I have no information about this table, such as the meaning of each variable or the values each variable can take, especially the character variables.

I ran PROC CONTENTS to obtain the number of observations and the list of variables.

I am interested in the values the character variables can take.

Usually I run PROC SORT with NODUPKEY by variable x, which gives me the list of possible values that variable can take.

My question is: is there a better way to do that for many variables, and only the character variables?

Regards,

Alain LePage

 


All Replies
Super User
Posts: 8,174


Well, first tip: you don't need to run PROC CONTENTS; you can just query sashelp.vtable and sashelp.vcolumn to find that information. As for the data, if you don't know anything about it, what do you intend to do with it? Without any documentation, data is just a waste of disk space. If your list comes back with var1 containing XYZ, DEF, OPT, is that going to help? Anyway, PROC FREQ is probably your best bet here.

A basic example:

proc freq data=sashelp.class;
  /* Note: OUT= keeps only the last table requested (here, the last character variable) */
  tables _character_ / out=temp;
run;
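
The dictionary views mentioned above can be queried with PROC SQL. This is only a sketch: it uses sashelp.class as a stand-in, so the 'SASHELP' and 'CLASS' literals and the output name work.char_vars are placeholders to replace with your own library and table names.

```sas
/* Number of observations and variables, taken from sashelp.vtable. */
/* LIBNAME and MEMNAME are stored in upper case in these views.     */
proc sql;
    select nobs, nvar
    from sashelp.vtable
    where libname = 'SASHELP'
      and memname = 'CLASS';
quit;

/* List only the character variables, taken from sashelp.vcolumn.  */
/* work.char_vars is a hypothetical output name.                    */
proc sql;
    create table work.char_vars as
    select name, length, label
    from sashelp.vcolumn
    where libname = 'SASHELP'
      and memname = 'CLASS'
      and type = 'char'
    order by varnum;
quit;
```

On a session with many assigned libraries these views can be slow to query, so filtering on libname and memname as above is usually worthwhile.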
Solution
2 weeks ago
PROC Star
Posts: 311


It depends on your specific needs, but based on what you describe I'd use something like this:

 

/* Show only the NLevels table: the number of distinct values per variable */
ods select NLevels;

proc freq data = sashelp.class nlevels;
    table _character_;
run;
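
The NLevels table gives the number of distinct values, not the values themselves. One possible follow-up, sketched here under assumptions, is to capture that table with ODS OUTPUT and then print frequency tables only for the variables with few levels. The data set name work.levels, the macro variable lowcard, and the cutoff of 5 are all hypothetical choices, and sashelp.class again stands in for the real table.

```sas
/* Step 1: save the NLevels table as a data set (work.levels is a hypothetical name). */
/* NOPRINT suppresses the one-way frequency tables; only NLevels is produced.         */
ods output NLevels = work.levels;
proc freq data = sashelp.class nlevels;
    tables _character_ / noprint;
run;

/* Step 2: collect the names of variables with at most 5 distinct values. */
/* The cutoff of 5 is an arbitrary assumption; adjust it to your data.     */
proc sql noprint;
    select TableVar into :lowcard separated by ' '
    from work.levels
    where NLevels <= 5;
quit;

/* Step 3: list the actual values of those variables, including missing.  */
/* This step fails if no variable met the cutoff, because &lowcard would  */
/* then be undefined.                                                     */
proc freq data = sashelp.class;
    tables &lowcard / missing;
run;
```

Skipping high-cardinality variables (IDs, free text) in the last step keeps the output readable on a table with hundreds of columns.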

 

 

Contributor
Posts: 59


Posted in reply to collinelliot
Thanks for your help
Super User
Posts: 5,490


If you have a table that is not documented, and no one wants to or can describe it for you, throw it away. Especially if it has that many variables.

Data never sleeps
☑ This topic is solved.
