PetePatel
Quartz | Level 8

Hi,

 

I have a large dataset (3m records) with around 10,000 columns (variables).

 

I need to find the number of distinct values in each column as shown below:

 

From this:

ID   Name       Num
123  last name  10000
123  last name  20000
345  s drop     30000
456  s drop     40000
123  s drop     40000

 

To this:

ID    3
Name  2
Num   4

 

What is the most efficient way of generating these results for c. 10,000 columns?

 

Cheers

Sathish_jammy
Lapis Lazuli | Level 10

 

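Try the code below to get the number of distinct values for each variable. Note that COUNT(DISTINCT ...) is needed here; a plain COUNT(ID) returns the number of non-missing rows, not the number of distinct values.

proc sql;
  select count(distinct ID)   as ID,
         count(distinct Name) as Name,
         count(distinct Num)  as Num
  from dataset_name;
quit;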
PetePatel
Quartz | Level 8

Thanks, is there a quicker way than having a really long script with 10,000 vars?
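
One way to avoid typing the query by hand is to generate the SELECT list from the column metadata. The following is only a sketch, not something from the original posts: it assumes the data set is SASHELP.CLASS, and the names count_list and want_sql are made up for illustration. COUNT(DISTINCT ...) ignores missing values, and because macro variables are limited to 65,534 characters, a single macro variable like this is unlikely to hold the list for 10,000 columns, so the approach suits more modest column counts.

```sas
proc sql noprint;
  /* Build "count(distinct Name ) as Name, ..." from the column metadata. */
  /* LIBNAME and MEMNAME are stored in uppercase in DICTIONARY.COLUMNS.   */
  select catx(' ', 'count(distinct', name, ') as', name)
    into :count_list separated by ', '
    from dictionary.columns
    where libname = 'SASHELP' and memname = 'CLASS';

  /* One row, one column per variable, each holding its distinct count. */
  /* want_sql is a hypothetical output name.                            */
  create table want_sql as
    select &count_list
    from sashelp.class;
quit;
```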

Ksharp
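Super User (accepted solution)
/* Suppress printed output; ODS OUTPUT below still captures the table */
ods select none;
/* Save the NLevels table (one row per variable) to the data set WANT */
ods output nlevels=want;
proc freq data=sashelp.class nlevels;
  /* _ALL_ covers every variable; MISSING treats missing values as a valid level */
  tables _all_ / missing;
run;
/* Restore normal output */
ods select all;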

jeffharris
Calcite | Level 5

Why does this work?

Ksharp
Super User
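Did you run the code and check the WANT table? The NLEVELS option makes PROC FREQ report the number of levels (distinct values) of each variable, and ODS OUTPUT saves that report as a data set: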


Var NLevels
Name 19
Sex 2
Age 6
Height 17
Weight 15
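
To tie this back to the original question, here is a minimal sketch that rebuilds the five example rows and runs the same NLEVELS approach on them. The data set name have and the column lengths are assumptions made for illustration. TableVar and NLevels are the columns of the NLevels ODS table that hold the variable name and its count of levels.

```sas
/* Recreate the example rows from the question (HAVE is a hypothetical name). */
data have;
  infile datalines dlm=',';
  length ID 8 Name $20 Num 8;
  input ID Name $ Num;
  datalines;
123,last name,10000
123,last name,20000
345,s drop,30000
456,s drop,40000
123,s drop,40000
;
run;

/* Same technique as above: capture the NLevels table without printing. */
ods select none;
ods output nlevels=want;
proc freq data=have nlevels;
  tables _all_ / missing;
run;
ods select all;

/* Show one row per variable with its number of distinct values. */
proc print data=want noobs;
  var TableVar NLevels;
run;
```

For these rows the WANT table should report 3 levels for ID, 2 for Name, and 4 for Num, which matches the result requested in the question.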
