Help with creating a summary of distinct values for each variable

Super Contributor
Posts: 371

Hi Everyone,

I have a dataset with n variables. I want to create a summary file of the distinct/unique values for each variable.


data have;
  input a0 a1 a2 a3 a4 a5 a6;
  datalines;
0 5 9 1 0 8 1
1 4 0 1 0 5 0
1 8 0 1 0 1 1
;

As you can see, a0 takes only the values 0 and 1; a1 takes 5, 4, and 8; a2 takes 9 and 0; a3 takes only the value 1, and so on.

The output file should be:
a0 a1 a2 a3 a4 ...
1  5  9  1  0...
0  4  0
   8


I wonder if there is a simple way to do this. Right now I can only do it by running PROC SORT with NODUPKEY on each variable one at a time and then merging the results, which is not efficient.

Thank you so much for your help.

HHC


Accepted Solutions
Solution
‎01-28-2014 09:40 AM
Respected Advisor
Posts: 3,777

Re: Help with create a summary of distinct value for each variable

This looks about right.

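data have;
  input a0 a1 a2 a3 a4 a5 a6;
  datalines;
0 5 9 1 0 8 1
1 4 0 1 0 5 0
1 8 0 1 0 1 1
;;;;
run;

/* WAYS 1 keeps only the one-way classifications: one row per distinct value of each CLASS variable. */
/* LEVELS adds _LEVEL_, which numbers the rows within each _TYPE_ starting at 1. */
proc summary data=have missing chartype;
  class a:;
  ways 1;
  output out=distinct(drop=_type_ _freq_) / levels;
run;
proc print;
run;

proc sort data=distinct;
  by _level_;
run;

/* UPDATE against an empty master collapses each _LEVEL_ group into one row; */
/* missing values in the transaction rows do not overwrite nonmissing ones.  */
data flat;
  update distinct(obs=0) distinct;
  by _level_;
run;
proc print;
run;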
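
The UPDATE step is the part of this solution that may be least obvious, so here is a small sketch of it in isolation. The dataset names (trans, collapsed) and variables (id, x, y) are made up for illustration and are not from the thread. Each row of trans fills in only one variable; UPDATE with an empty master applies all rows of a BY group in turn and writes one row per group, and by default missing transaction values leave the current value alone.

```sas
/* Hypothetical transaction data: each row carries a value for only one variable */
data trans;
  input id x y;
  datalines;
1 10 .
1 . 20
2 30 .
;
run;

/* trans(obs=0) is an empty master that only supplies the variable layout. */
/* All rows sharing an id are applied in order, and one row per id is output. */
data collapsed;
  update trans(obs=0) trans;
  by id;
run;

proc print data=collapsed;
run;
```

With this input, collapsed should hold two rows: id=1 with x=10 and y=20, and id=2 with x=30 and y missing. The same mechanism is what lines up the k-th distinct value of every variable on row k of FLAT.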



All Replies
Super User
Posts: 5,257

Re: Help with create a summary of value for each variable

Not really a summary, but rather distinct values?

Without knowing the underlying requirement, it seems like an odd request.

However, I would transpose the data and then do a select distinct or PROC SORT NODUPKEY. And if it's really necessary, transpose it back.
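
The transpose idea above could be sketched roughly as follows. This is only one possible reading of it, assuming the HAVE dataset from the question and that all variables are numeric (so they can share one array). The names long, wide, name, value, and seq are made up for illustration.

```sas
/* Long form: one row per (variable name, value) pair */
data long;
  set have;
  array v{*} a0-a6;
  length name $32;
  do i = 1 to dim(v);
    name  = vname(v{i});
    value = v{i};
    output;
  end;
  keep name value;
run;

/* Keep one row per distinct value of each variable */
proc sort data=long nodupkey;
  by name value;
run;

/* Number the distinct values within each variable */
data long;
  set long;
  by name;
  if first.name then seq = 0;
  seq + 1;
run;

/* Transpose back: row k holds the k-th distinct value of every variable */
proc sort data=long;
  by seq name;
run;

proc transpose data=long out=wide(drop=_name_);
  by seq;
  id name;
  var value;
run;
```

Variables with fewer distinct values than others simply come out missing on the later rows of wide.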

Data never sleeps
Super Contributor
Posts: 371

Re: Help with create a summary of distinct value for each variable

Thank you so much, Data_null_!
One quick question since you are here. If the variable names share nothing in common, say a1, f4, var1, date..., I cannot use "class a:;".

What should we put after "class" so that SAS runs PROC SUMMARY on all variables in the file?

Thank you again,

HHC

Valued Guide
Posts: 2,175

Re: Help with create a summary of distinct value for each variable

Something like

proc summary missing data=your.data;
  class _all_;
  output out=results;
run;
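
As written, this requests every combination of the class variables, which can grow very large. A possible way to combine class _all_ with the accepted solution is sketched below, keeping WAYS 1 and the LEVELS option so that only the distinct values of each variable are produced. The sample data, including the character variable grp, is hypothetical and only meant to show unrelated variable names.

```sas
/* Hypothetical data with unrelated variable names, one of them character */
data have;
  input a1 f4 var1 grp $;
  datalines;
0 5 9 x
1 4 0 y
1 8 0 x
;
run;

/* _ALL_ puts every variable in the dataset on the CLASS statement */
proc summary data=have missing chartype;
  class _all_;
  ways 1;
  output out=distinct(drop=_type_ _freq_) / levels;
run;

proc sort data=distinct;
  by _level_;
run;

/* Collapse to one row per _LEVEL_, as in the accepted solution */
data flat;
  update distinct(obs=0) distinct;
  by _level_;
run;

proc print data=flat;
run;
```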

Respected Advisor
Posts: 3,777

Re: Help with create a summary of distinct value for each variable

Like Peter says, you can use _ALL_, a SAS Variable List, one of the most powerful features of the SAS language.
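
For readers who have not met variable lists before, here is a brief sketch of the common forms, using PROC PRINT on the HAVE dataset from the question purely as a convenient place to show them. The same lists can be used on a CLASS statement.

```sas
data have;
  input a0 a1 a2 a3 a4 a5 a6;
  datalines;
0 5 9 1 0 8 1
1 4 0 1 0 5 0
1 8 0 1 0 1 1
;
run;

proc print data=have; var a1-a3;     run;  /* numbered range: a1, a2, a3 */
proc print data=have; var a0--a3;    run;  /* name range: a0 through a3 by position in the dataset */
proc print data=have; var a:;        run;  /* name prefix: every variable starting with "a" */
proc print data=have; var _numeric_; run;  /* all numeric variables */
proc print data=have; var _all_;     run;  /* every variable */
```

There is also _character_ for all character variables, which this particular dataset does not have.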

Super Contributor
Posts: 371

Re: Help with create a summary of distinct value for each variable

Thank you, Peter and Data_null_.

Lesson learned.

HHC
