BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
IgawaKei29
Quartz | Level 8

I am having an issue where I am getting duplicate line output.  I am pretty sure it is how I have my case statements situated, but I cannot seem to find a solution for this.

 

proc sql;
create table work.test as
select distinct

ID,
Name,

CASE 
when Active in ('Y') then 'YES'
else 'NO'
end as Active_Flag,

CASE
when IDENTIFIED in ('Y') then count(distinct Flags)
else 0
end as IDENTIFIED_Count,

CASE
when ONLINE in ('Y') then count(distinct Flags)
else 0
end as ONLINE_Count,

count(distinct GOAL) as Goal_Count

 

I would think that this would give me 2 lines of output.  Which would be something like this.

 

ID       Name               Active_Flag      IDENTIFIED_Count  ONLINE_Count    Goal_Count

1    Test_Account                Y                                      12                         42                             53

1     Test_Account               N                                      12                         42                             53

 

What I am getting is something like this

 

ID       Name               Active_Flag      IDENTIFIED_Count  ONLINE_Count    Goal_Count

1    Test_Account                Y                                      12                         0                               0

1     Test_Account               Y                                      12                         42                             0

1     Test_Account               Y                                      12                         0                              53

1    Test_Account                Y                                      12                         42                             53

 

1     Test_Account               N                                      12                         0                               0

1     Test_Account               N                                      12                         42                             0

1     Test_Account               N                                      12                         0                             53

1     Test_Account               N                                      12                         42                             53

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Would this do it?

proc sql;
/*create table want as*/
select
  name,
  case
    when active = 'Y'
    then 'YES'
    else 'NO'
  end as active_flag,
  count(distinct(case when identified = 'Y' then flag else '' end)) as identified_count,
  count(distinct(case when online = 'Y' then flag else '' end)) as online_count,
  count(distinct goal) as goal_count
from have
group by name, calculated active_flag;
quit;

Result is this:

                                    identified_
Name                  active_flag         count  online_count  goal_count
-------------------------------------------------------------------------
Account_1             NO                      1             1           1
Account_1             YES                     3             2           5

View solution in original post

9 REPLIES 9
novinosrin
Tourmaline | Level 20

Hello @IgawaKei29  You have shown your outputs, please show your input sample too

Kurt_Bremser
Super User

The case is executed for every incoming observation, so the summary function count() does not have it's desired effect.

 

Please supply some example input data in usable form (data step with datalines) and what you expect to get out of it.

IgawaKei29
Quartz | Level 8

Sure, its a massive table, but here is an example of the data.

 

ID       Name           Active        Flag     IDENTIFIED     ONLINE   GOAL

1      Account_1         Y              01               Y                      Y         102

2      Account_1         N              02               Y                     Y           112

3      Account_1         Y              04                N                     Y            90

4      Account_1         Y              08               Y                      N            55

5      Account_1         Y              04                Y                    N             60

6      Account_1         Y             04                N                     Y               84

Kurt_Bremser
Super User

"Usable" means THIS:

data have;
input ID $ Name :$20. Active :$1. Flag :$2. IDENTIFIED :$1. ONLINE :$1. GOAL;
datalines;
1 Account_1 Y 01 Y Y 102
2 Account_1 N 02 Y Y 112
3 Account_1 Y 04 N Y 90
4 Account_1 Y 08 Y N 55
5 Account_1 Y 04 Y N 60
6 Account_1 Y 04 N Y 84
;

as you were already shown in https://communities.sas.com/t5/General-SAS-Programming/SAS-Duplicate-Indicator/m-p/304424

 

And what do you expect out of it?

IgawaKei29
Quartz | Level 8
Thank you, trying to wrap my head around this has been fun.

So I need to roll up the accounts basically. So the output would look like this. (Subtract out the ID for simplicity)

Account_1 YES 3 4 5
Account_1 NO 1 1 1
Kurt_Bremser
Super User

Would this do it?

proc sql;
/*create table want as*/
select
  name,
  case
    when active = 'Y'
    then 'YES'
    else 'NO'
  end as active_flag,
  count(distinct(case when identified = 'Y' then flag else '' end)) as identified_count,
  count(distinct(case when online = 'Y' then flag else '' end)) as online_count,
  count(distinct goal) as goal_count
from have
group by name, calculated active_flag;
quit;

Result is this:

                                    identified_
Name                  active_flag         count  online_count  goal_count
-------------------------------------------------------------------------
Account_1             NO                      1             1           1
Account_1             YES                     3             2           5
IgawaKei29
Quartz | Level 8

Thank you for the help.  Looks like I will need to look at these nested count case statements, rather than use a traditional case statement.  

Kurt_Bremser
Super User

In SQL, you first need to create something in order to count it. So the summary function will be on the "outside", and the case/when on the "inside".

In your case, it's a little more tricky as you use one column to determine if another needs to be counted. What comes in handy is the fact that missing values are not taken into account, so this is what my case statements create when the condition is not met.

Kurt_Bremser
Super User

PS mind that this solution was something I have never used before, but I have others seen doing similar things here on the communities. Your post let me learn something new.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2047 views
  • 0 likes
  • 3 in conversation