BookmarkSubscribeRSS Feed
starnassi
Calcite | Level 5

Hi, this might be a really dumb question, but I haven't found a way around this one.

Whenever I create distinct values from a category, it counts all the missings from that category as 1, asi it should. However, I can't think of a SIMPLE way of removing those 'missing' values from the distinct calculation (e.g: Column A values values: 1, 2, 3, missing, missing. Distinct by group (column A) = 4, I would like it to be 3). I can imagine some workarounds, marking as 1 or 0 if a certain value has certain properties and then do a sum by group, but that is what I'm trying to avoid. Has anybody come across this and can provide a solution?

Thanks in advance!

Santiago

9 REPLIES 9
ballardw
Super User

I think this may want a small example start data set and what the final results should look like.

TejaSurapaneni
Lapis Lazuli | Level 10

Hi Santiago,

you can use Data Source Filter, it remove blank or missing values from table level.

Teja Surapaneni.

starnassi
Calcite | Level 5

Hi guys, thank you all for the answers so far.

I could remove the missings from the dataset, but given the workflow within the company, it is a request someone else should do and this takes a long time. To avoid this is why I'm looking for a for a way to do this in Visual Analytics using only calculated elements and filters. Sadly, I cannot use a where clause when making a new aggregated measure (distint by group in this case). BTW, I'm aware that this might not be possible, I'm just wondering if someone had a handy solution without modifying the dataset.

Here's an example of the table I'm working with:

Modelo Tabla.PNG

Kurt_Bremser
Super User

Well, this:

data have;

infile cards;

input

  category :$1.

  pid :1.

  cid :1.

  sid :1.

;

cards;

A 1 1 1

A 2 1 .

A 3 2 .

A . . .

B 2 1 2

B 3 2 .

B 4 3 3

;

run;

proc sql;

create table want as

select

count(distinct pid) as pid, count(distinct cid) as cid, count(distinct sid) as sid

from have;

quit;

gives me exactly the result you wanted:

pid  cid  sid

----------------------------

4    3    3
Karthikeyan
Fluorite | Level 6

Hi

Where are you seeing this , in a SAS dataset?  if you perform a proc sql over a SAS dataset, then it'll exclude the null/missing values while using Count(). If you are querying from a Database, then you may want to use a function to handle the NULLs before you can perform a count().

you can use ISNULL() , IFNULL() or COALESCE() etc depending on which function your database is supporting.

Thanks

Karthik

McGoo
Calcite | Level 5

Hi starnassi,

 

Did you ever work out a solution, I just came across a problem that now has this problem too. 

 

I was thinking you could do the distinct count -1 but not sure I can be sure there will always be a MISSING value.

 

I too need to do in SASVA rather than data prep as I am looking to do the distinct count on an calculated item with an IF ELSE statement.

 

Thanks

McGoo
Calcite | Level 5

Worked it out..  This just checks to see if there is 1 or more missing using same aggregate and the -1 if there is.

 

sasva.png

mcgarrick
Calcite | Level 5
works perfectly. Thankyou!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

Tips for filtering data sources in SAS Visual Analytics

See how to use one filter for multiple data sources by mapping your data from SAS’ Alexandria McCall.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 9 replies
  • 7376 views
  • 12 likes
  • 7 in conversation