Hi, this might be a really dumb question, but I haven't found a way around this one.
Whenever I create distinct values from a category, it counts all the missings from that category as 1, asi it should. However, I can't think of a SIMPLE way of removing those 'missing' values from the distinct calculation (e.g: Column A values values: 1, 2, 3, missing, missing. Distinct by group (column A) = 4, I would like it to be 3). I can imagine some workarounds, marking as 1 or 0 if a certain value has certain properties and then do a sum by group, but that is what I'm trying to avoid. Has anybody come across this and can provide a solution?
Thanks in advance!
Santiago
I think this may want a small example start data set and what the final results should look like.
Hi Santiago,
you can use Data Source Filter, it remove blank or missing values from table level.
Teja Surapaneni.
Hi guys, thank you all for the answers so far.
I could remove the missings from the dataset, but given the workflow within the company, it is a request someone else should do and this takes a long time. To avoid this is why I'm looking for a for a way to do this in Visual Analytics using only calculated elements and filters. Sadly, I cannot use a where clause when making a new aggregated measure (distint by group in this case). BTW, I'm aware that this might not be possible, I'm just wondering if someone had a handy solution without modifying the dataset.
Here's an example of the table I'm working with:
Well, this:
data have;
infile cards;
input
category :$1.
pid :1.
cid :1.
sid :1.
;
cards;
A 1 1 1
A 2 1 .
A 3 2 .
A . . .
B 2 1 2
B 3 2 .
B 4 3 3
;
run;
proc sql;
create table want as
select
count(distinct pid) as pid, count(distinct cid) as cid, count(distinct sid) as sid
from have;
quit;
gives me exactly the result you wanted:
pid | cid | sid |
----------------------------
4 | 3 | 3 |
What keeps you from using a where condition to exclude the missing values?
Hi
Where are you seeing this , in a SAS dataset? if you perform a proc sql over a SAS dataset, then it'll exclude the null/missing values while using Count(). If you are querying from a Database, then you may want to use a function to handle the NULLs before you can perform a count().
you can use ISNULL() , IFNULL() or COALESCE() etc depending on which function your database is supporting.
Thanks
Karthik
Hi starnassi,
Did you ever work out a solution, I just came across a problem that now has this problem too.
I was thinking you could do the distinct count -1 but not sure I can be sure there will always be a MISSING value.
I too need to do in SASVA rather than data prep as I am looking to do the distinct count on an calculated item with an IF ELSE statement.
Thanks
Worked it out.. This just checks to see if there is 1 or more missing using same aggregate and the -1 if there is.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
See how to use one filter for multiple data sources by mapping your data from SAS’ Alexandria McCall.
Find more tutorials on the SAS Users YouTube channel.