Use this logic: create a new data set from the existing one using the NODUPKEY (or NODUP) option, so that the output data set has the duplicate values eliminated. Then open that data set to get a DSID and count the number of observations. This is what you need.
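A minimal sketch of those steps, assuming the options are applied through PROC SORT. The names work.have, work.distinct and var1 are hypothetical placeholders, not taken from the original question:

```sas
/* Assumed names: work.have is the input data set, var1 the variable to count. */
/* Keep only the variable of interest and drop duplicate key values.           */
proc sort data=work.have(keep=var1) out=work.distinct nodupkey;
  by var1;
run;

/* Open the de-duplicated data set and read its observation count. */
%let dsid = %sysfunc(open(work.distinct));
%let n_distinct = %sysfunc(attrn(&dsid, nlobs));
%let rc = %sysfunc(close(&dsid));

%put Distinct values of var1: &n_distinct;
```

The count comes from the data set header through ATTRN, so the output data set is not read a second time. The sort itself still has to process every observation of the input.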
Suggestion to Peter.C -- go easy -- clearly, the OP is looking for various options to achieve a specific result (generate a set / count of distinct variable values), although not necessarily within your personal interest area(s). Better to contribute when providing some idea, suggestion, or question (for clarification) -- rather than just being critical.
The SAS language is quite challenging and powerful, as some know and others have heard, so it is quite reasonable for forum members to want to expand their arsenal of SAS solution techniques through self-training, formal education and discussion forums.
I would like to encourage posters to explain the context for such constraints. Otherwise we appear to be answering, and thwarting, teachers' questions designed to promote learning through the poster's personal effort.
I expect you are right and I should be more gentle.
How, gently, would you encourage the poster of a teacher's question to "rtfm"?
We are in total agreement that some posters on this forum lack interest in learning by first trying the R-T-F(ine)-M approach -- instead, some individuals care more about just getting the answer. Considering the wealth of "hosted" documentation and technical reference information provided by SAS Institute, as well as SAS System Help resources, one would expect at least a cursory attempt to find an answer or suitable response before generating a post. For some, possibly, and for others, likely not.
I am trying to solve this issue because the data is 100 GB with 500 variables.
If I run PROC SQL for a distinct count, it takes more time. That's why I am searching for an alternative solution. Using PROC FREQ, I got the count in less time than with PROC SQL.
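For readers with the same problem, one way PROC FREQ can report a distinct count is its NLEVELS option. This is only a sketch: the thread does not show which PROC FREQ statements were actually used, and work.have, work.level_counts and var1 are hypothetical names:

```sas
/* Assumed names: work.have is the input data set, var1 the variable to count. */
/* Capture the "Number of Variable Levels" table as a data set.                */
ods output nlevels=work.level_counts;

proc freq data=work.have nlevels;
  /* NOPRINT suppresses the full frequency table; the NLevels table is still produced. */
  tables var1 / noprint;
run;

/* The NLevels column holds the number of distinct values per variable. */
proc print data=work.level_counts;
run;
```

Listing several variables in the TABLES statement returns one row per variable from a single pass over the data.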
Thank you for adding the information describing why you asked for a non-SQL and non-data-step solution.
It may have been possible to define an SQL query so that it ran as efficiently as PROC FREQ. I expect the same is true of a data step approach. However, without the information about data size (number of observations, total size and key definitions) we could not offer any optimisation.
I'm glad you have found a practical solution.
( who is sorry he misinterpreted your first question )