BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SASuserlot
Barite | Level 11

I tried the following code. I came  to know I want not able to find the distinct count if the variable missing in SQL is there a way to circumvent this issue? I have the data with 2 distinct names ( alpha, " ") but when I do distinct count I am only getting one count. Thanks

data xx;
name= "alpha";
output;
name=" ";
output;
run;

proc sql;
create table xx1 as select   count(distinct name) as cnt from xx ;
quit;
1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @SASuserlot,

 

You could add 1 for missing values:

select count(distinct name)+max(name=" ") as cnt

 

Or count distinct hexadecimal representations in your character variable of length 5:

select count(distinct put(name,$hex10.)) as cnt

View solution in original post

11 REPLIES 11
AMSAS
SAS Super FREQ

Is this what you are attempting to do?

Note I changed your xx dataset, as I suspect you want a count of the distinct values  ie Alpha count = 5 _blank_ count = 3 in my example

 

data have;data xx;
	do i=1 to 5 ;
		name= "alpha";
		output;
	end ;
	do i=1 to 3 ;
		name=" ";
		output;
	end ;
run;

proc sql;
	create table xx1 as 
	select distinct name, count(*)
	from xx 
	group by name ;
quit;

 

If this isn't want you are looking for can you provide an example with a have and want dataset (ie what you have, and what you want after processing)
Also provide more detailed explanation of what you are attempting to do 

SASuserlot
Barite | Level 11

Your example is good but not the quite I am looking for . You have  two different names  (alpha, ' ')out of 10 observations. I was looking for how many different names out of 10 observations not count for 'alpha' or count of missing. @FreelanceReinh  response is working for me as of now

FreelanceReinh
Jade | Level 19

Hi @SASuserlot,

 

You could add 1 for missing values:

select count(distinct name)+max(name=" ") as cnt

 

Or count distinct hexadecimal representations in your character variable of length 5:

select count(distinct put(name,$hex10.)) as cnt
SASuserlot
Barite | Level 11

Thanks for your response.  your both solutions working for me as of now. Additional question in second solutions:  how $hex10. doing the job? By any chance if my character length for 'name' changes to 200 in future, is $hex10. works?

FreelanceReinh
Jade | Level 19

@SASuserlot wrote:

Additional question in second solutions:  how $hex10. doing the job? By any chance if my character length for 'name' changes to 200 in future, is $hex10. works?


The width w of the $HEXw. format must be greater than or equal to twice the defined length of the character variable to be formatted. So use $hex400. for a length-200 variable (Edit: assuming the usual single-byte characters, not Unicode, etc.). The other solution is independent of the length of variable NAME.

SASuserlot
Barite | Level 11

Thank you. How about if variable is numeric type?

FreelanceReinh
Jade | Level 19

@SASuserlot wrote:

How about if variable is numeric type?


You might need to know what values you can expect in the numeric variable: Are there special missing values (like .A, .B, ..., ._) and would you like to count these as distinct? Are there floating point values which may contain tiny rounding errors? In this case would you like to count, say, 3 and 3.00000000000001 as different numbers?

 

If no special missing values are involved (only the usual numeric missing value .), you can use

select count(distinct numvar)+max(numvar=.)

 

To include distinct special missing values in the count, use

select count(distinct put(numvar,hex16.))

(This time the width of the format is always 16.)

 

Both of the above solutions would distinguish between 3 and 3.00000000000001, etc. If you don't want this, use the ROUND function or apply a format that suits your precision goal.

Example:

select count(distinct put(numvar,best12.))

This would also include distinct special missing values, if any.

PaigeMiller
Diamond | Level 26

PROC FREQ has no problems

 

proc freq data=xx nlevels;
    ods output nlevels=xx1;
    tables name;
run;
--
Paige Miller
SASuserlot
Barite | Level 11

Thanks for your response. Yes PROC freq don't have any issues. I was looking  to explore with  SQL. I think @FreelanceReinh  response is working  at present.

PaigeMiller
Diamond | Level 26

@SASuserlot wrote:

Thanks for your response. Yes PROC freq don't have any issues. I was looking  to explore with  SQL. I think @FreelanceReinh  response is working  at present.


That's fine. I presented the PROC FREQ answer for people reading along who aren't limiting themselves to PROC SQL. Or for people who prefer straightforward and readable and easily understandable code (as you can see PROC SQL requires a little bit more complicated and less intuitive code).

--
Paige Miller
SASuserlot
Barite | Level 11

@PaigeMiller  thank you, I really appreciate that. Just wanted let you  I communicated correctly. I really appreciate your quick responses to my questions all the time.😀

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 8123 views
  • 7 likes
  • 4 in conversation