SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Information Map Expressions CASE WHEN and COUNT DISTINCT

Reply
Contributor
Posts: 71

Information Map Expressions CASE WHEN and COUNT DISTINCT

[ Edited ]

Hi SAS Experts,

I am struggling a bit with Information Map Expressions.
I want to only count distinct for if another value is in (12,13,14).
So far I did not get very close.

I tried:

CASE WHEN <<T_STA.PNMG>>eq 12 THEN count(distinct(<<T_STA.STA_ID>>)) END

WHEN <<T_STA.PNMG>> = 12 THEN count(distinct(<<T_STA.STA_ID>>))

CASE WHEN <<T_STA.PNMG>> CONTAINS (12,13,14) THEN count(distinct(<<T_STA.STA_ID>>)) END

but there is always an error telling me that:

ERROR 22-322: Syntax error, expected one of the following: !, !!, &, (, *, **, +, -, /, <, <=, <>, =, >, >=, ?, AND, BETWEEN,
              CONTAINS, EQ, EQT, GE, GET, GT, GTT, LE, LET, LIKE, LT, LTT, NE, NET, OR, ^=, |, ||, ~=

Super User
Posts: 5,256

Re: Information Map Expressions CASE WHEN and COUNT DISTINCT

Long time since I played with information map expressions but in normal SQL END should be at the end of the CASE statement, not after WHEN.

Also, it seems not optimal to have aggregate functions here, it could really slow down performance.
Data never sleeps
Contributor
Posts: 71

Re: Information Map Expressions CASE WHEN and COUNT DISTINCT

Hi LinusH

I see. I keep playing around with it.

The IM is the base for a web report and Web Report Studio does not offer aggregation functions with count distinct so is there any other place where I can put them?

User want to build ad-hoc reports with different count-distinct's depending on the requirements. I can define the count-distinct (thanks to information map studio) but that is it right?

Respected Advisor
Posts: 3,892

Re: Information Map Expressions CASE WHEN and COUNT DISTINCT

As @LinusH already pointed out, performance could become an issue. You need to test with your full volume data.

 

As for IMs: I haven't done a lot with it but I always understood them as some sort of SQL view on metadata level. In order to figure out what SQL syntax would work, you could first just play around with code and then only build it in metadata once you've got a coding solution. 

 

I believe for what you're after, you would have to come up with something as "ugly" as below:

libname t_sta (work);

data T_STA.have;
  do PNMG=12 to 12;
    do STA_ID=1 to 3;
      output;
      output;
    end;
  end;
  do PNMG=13 to 14;
    do STA_ID=1 to 6;
      output;
    end;
  end;
run;


proc sql;
  select 
    sta_id,
    PNMG,
    CASE 
      WHEN have.PNMG = 12 THEN select COUNT(DISTINCT STA_ID) from t_sta.have where pnmg=12 
      WHEN have.PNMG = 13 THEN select COUNT(DISTINCT STA_ID) from t_sta.have where pnmg=13
      else .
    END
    as id_cnt
  from t_sta.have
  ;
quit;

libname t_sta clear;

Contributor
Posts: 71

Re: Information Map Expressions CASE WHEN and COUNT DISTINCT

OK I got it working but the data does not aggregate nor does it count correctly.

CASE
WHEN (<<T_STAM.PNMG>> IN ('12','15','78'))
THEN COUNT(DISTINCT <<T_STAM.PRONR>>)
END

the sql generated by the IM looks like this:



PROC SQL;
Create table WORK.%RESULT_TABLE_NAME% as
SELECT
table0.year AS DIR_2 LABEL='year' FORMAT=$4.,
table1.PARAMETER AS DIR_3 LABEL='Parameter' FORMAT=$30.,
( CASE
WHEN (table0.PNMG IN ('12','15','78'))
THEN COUNT(DISTINCT table0.PROBNR)
END

) AS DIR_4 LABEL='counting'
FROM
T_STA( compress=YES) table0 Inner join T_ERG( compress=YES) table1 on table0.STA_ID = table1.STA_ID
GROUP BY
1,
2;
quit;
Super User
Posts: 5,256

Re: Information Map Expressions CASE WHEN and COUNT DISTINCT

OLAP cubes have a specific statistic for count distinct : NUNIQUE. Then you could in the IM specify for a calculated entry something like;

CASE WHEN <<T_STA.PNMG>> IN (12,13,14)
THEN <<NUNIQUE_from_cube>>
ELSE 0 (or . depending on your requirement)
END
Data never sleeps
Contributor
Posts: 71

Re: Information Map Expressions CASE WHEN and COUNT DISTINCT

The big problem now is that as soon as I wrap the count distinct in a CASE WHEN/THEN
the result is NOT grouped. That makes is useless. Every row is shown by itself.
As soon as I take the CASE WHEN/THEN out the group by works.
How can I fix that? I noticed a "warning" saying that the aggregation option have had to be refreshed
however when I look at them they look the same.
Contributor
Posts: 71

Re: Information Map Expressions CASE WHEN and COUNT DISTINCT

[ Edited ]

I think I found a work around by first creating an IM Studio variable filling it with distinct sta_ID case when xxxx=xyy and then using that variable to do a count distinct on it (creating a classification variable and a counting variable)

 

Ask a Question
Discussion stats
  • 7 replies
  • 289 views
  • 1 like
  • 3 in conversation