BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
vegan_renegade
Obsidian | Level 7

I have a long code, but made a mock one for our purposes here, which creates the same error. I have 5 out of 10 patients that were evaluated, yet I get 60% instead of 50% evaluated when using "else 0". But when I use "else .", I get 50%, which is correct. I'm wondering why? Problem solved after 2 days, but perhaps someone can help me understand why using else 0 vs. else .  would make a difference?

 

Mock data:

 


data test;
   input patient_id evaluated $;
   datalines;
1 Yes
2 Yes
3 Yes
4 Yes
5 Yes
6 No
7 No
8 No
9 No
10 No
;
run;

test.PNG

 

First code that gives me 60% with "else 0" in the second subquery:

 

proc sql;
create table test1 as
select "Evaluation Complete" as Variable,

	(select (Count(distinct patient_id))
		from test
		where evaluated = "Yes") as Count,

	(select (Count(distinct case when evaluated="Yes" then patient_id else 0 end)/(Count(distinct patient_id)*1.0))
		from test) as Percent format percent8.1

from test(obs=1);
quit;

test1.PNG

 

Second code where we replace "else 0" with "else ." and gives us 50% for patients evaluated (correct):

 

proc sql;
create table test2 as
select "Evaluation Complete" as Variable,

	(select (Count(distinct patient_id))
		from test
		where evaluated = "Yes") as Count,

	(select (Count(distinct case when evaluated="Yes" then patient_id else . end)/(Count(distinct patient_id)*1.0))
		from test) as Percent format percent8.1

from test(obs=1);
quit;

test2.PNG

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

You're assigning the value of 0 to all ones you don't want to count, which is the extra value so you get 6 distinct values (5 actual IDs plus the 0).

Missing is not counted so you get 5 values.

 

 

proc sql;
create table test1 as
select "Evaluation Complete" as Variable,

 *, case when evaluated="Yes" then patient_id else 0 end as variable0,
 case when evaluated="Yes" then patient_id else . end as variable_missing
from test;
quit;

Reeza_1-1714062916344.png


@vegan_renegade wrote:

I have a long code, but made a mock one for our purposes here, which creates the same error. I have 5 out of 10 patients that were evaluated, yet I get 60% instead of 50% evaluated when using "else 0". But when I use "else .", I get 50%, which is correct. I'm wondering why? Problem solved after 2 days, but perhaps someone can help me understand why using else 0 vs. else .  would make a difference?

 

Mock data:

 


data test;
   input patient_id evaluated $;
   datalines;
1 Yes
2 Yes
3 Yes
4 Yes
5 Yes
6 No
7 No
8 No
9 No
10 No
;
run;

test.PNG

 

First code that gives me 60% with "else 0" in the second subquery:

 

proc sql;
create table test1 as
select "Evaluation Complete" as Variable,

	(select (Count(distinct patient_id))
		from test
		where evaluated = "Yes") as Count,

	(select (Count(distinct case when evaluated="Yes" then patient_id else 0 end)/(Count(distinct patient_id)*1.0))
		from test) as Percent format percent8.1

from test(obs=1);
quit;

test1.PNG

 

Second code where we replace "else 0" with "else ." and gives us 50% for patients evaluated (correct):

 

proc sql;
create table test2 as
select "Evaluation Complete" as Variable,

	(select (Count(distinct patient_id))
		from test
		where evaluated = "Yes") as Count,

	(select (Count(distinct case when evaluated="Yes" then patient_id else . end)/(Count(distinct patient_id)*1.0))
		from test) as Percent format percent8.1

from test(obs=1);
quit;

test2.PNG

 


 

View solution in original post

3 REPLIES 3
Reeza
Super User

You're assigning the value of 0 to all ones you don't want to count, which is the extra value so you get 6 distinct values (5 actual IDs plus the 0).

Missing is not counted so you get 5 values.

 

 

proc sql;
create table test1 as
select "Evaluation Complete" as Variable,

 *, case when evaluated="Yes" then patient_id else 0 end as variable0,
 case when evaluated="Yes" then patient_id else . end as variable_missing
from test;
quit;

Reeza_1-1714062916344.png


@vegan_renegade wrote:

I have a long code, but made a mock one for our purposes here, which creates the same error. I have 5 out of 10 patients that were evaluated, yet I get 60% instead of 50% evaluated when using "else 0". But when I use "else .", I get 50%, which is correct. I'm wondering why? Problem solved after 2 days, but perhaps someone can help me understand why using else 0 vs. else .  would make a difference?

 

Mock data:

 


data test;
   input patient_id evaluated $;
   datalines;
1 Yes
2 Yes
3 Yes
4 Yes
5 Yes
6 No
7 No
8 No
9 No
10 No
;
run;

test.PNG

 

First code that gives me 60% with "else 0" in the second subquery:

 

proc sql;
create table test1 as
select "Evaluation Complete" as Variable,

	(select (Count(distinct patient_id))
		from test
		where evaluated = "Yes") as Count,

	(select (Count(distinct case when evaluated="Yes" then patient_id else 0 end)/(Count(distinct patient_id)*1.0))
		from test) as Percent format percent8.1

from test(obs=1);
quit;

test1.PNG

 

Second code where we replace "else 0" with "else ." and gives us 50% for patients evaluated (correct):

 

proc sql;
create table test2 as
select "Evaluation Complete" as Variable,

	(select (Count(distinct patient_id))
		from test
		where evaluated = "Yes") as Count,

	(select (Count(distinct case when evaluated="Yes" then patient_id else . end)/(Count(distinct patient_id)*1.0))
		from test) as Percent format percent8.1

from test(obs=1);
quit;

test2.PNG

 


 

ballardw
Super User

I think the issue is that COUNT will count any valid value.

 

Try:

proc sql;
create table test1 as
select "Evaluation Complete" as Variable,

	(select (Count(distinct patient_id))
		from test
		where evaluated = "Yes") as Count,

	(select (Count(distinct case when evaluated="Yes" then patient_id else . end)/(Count(distinct patient_id)*1.0))
		from test) as Percent format percent8.1

from test(obs=1);
quit;

Your example doesn't include any repeated patient_id. If your data doesn't have any such I would question why do this instead of Proc Freq.

PaigeMiller
Diamond | Level 26

I would advise you to use PROC FREQ whenever you want to count things, instead of creating your own somewhat complicated SQL method, which clearly didn't work. SAS has put a huge amount of effort into validating the PROC FREQ results and tested it in a bazillion real world situations so you don't have to worry about getting the wrong output.

--
Paige Miller

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 919 views
  • 3 likes
  • 4 in conversation