BookmarkSubscribeRSS Feed
vegan_renegade
Obsidian | Level 7

I have a long code, but made a mock one for our purposes here, which creates the same error. I have 5 out of 10 patients that were evaluated, yet I get 60% instead of 50% evaluated when using "else 0". But when I use "else .", I get 50%, which is correct. I'm wondering why? Problem solved after 2 days, but perhaps someone can help me understand why using else 0 vs. else .  would make a difference?

 

Mock data:

 


data test;
   input patient_id evaluated $;
   datalines;
1 Yes
2 Yes
3 Yes
4 Yes
5 Yes
6 No
7 No
8 No
9 No
10 No
;
run;

test.PNG

 

First code that gives me 60% with "else 0" in the second subquery:

 

proc sql;
create table test1 as
select "Evaluation Complete" as Variable,

	(select (Count(distinct patient_id))
		from test
		where evaluated = "Yes") as Count,

	(select (Count(distinct case when evaluated="Yes" then patient_id else 0 end)/(Count(distinct patient_id)*1.0))
		from test) as Percent format percent8.1

from test(obs=1);
quit;

test1.PNG

 

Second code where we replace "else 0" with "else ." and gives us 50% for patients evaluated (correct):

 

proc sql;
create table test2 as
select "Evaluation Complete" as Variable,

	(select (Count(distinct patient_id))
		from test
		where evaluated = "Yes") as Count,

	(select (Count(distinct case when evaluated="Yes" then patient_id else . end)/(Count(distinct patient_id)*1.0))
		from test) as Percent format percent8.1

from test(obs=1);
quit;

test2.PNG

 

3 REPLIES 3
Reeza
Super User

You're assigning the value of 0 to all ones you don't want to count, which is the extra value so you get 6 distinct values (5 actual IDs plus the 0).

Missing is not counted so you get 5 values.

 

 

proc sql;
create table test1 as
select "Evaluation Complete" as Variable,

 *, case when evaluated="Yes" then patient_id else 0 end as variable0,
 case when evaluated="Yes" then patient_id else . end as variable_missing
from test;
quit;

Reeza_1-1714062916344.png


@vegan_renegade wrote:

I have a long code, but made a mock one for our purposes here, which creates the same error. I have 5 out of 10 patients that were evaluated, yet I get 60% instead of 50% evaluated when using "else 0". But when I use "else .", I get 50%, which is correct. I'm wondering why? Problem solved after 2 days, but perhaps someone can help me understand why using else 0 vs. else .  would make a difference?

 

Mock data:

 


data test;
   input patient_id evaluated $;
   datalines;
1 Yes
2 Yes
3 Yes
4 Yes
5 Yes
6 No
7 No
8 No
9 No
10 No
;
run;

test.PNG

 

First code that gives me 60% with "else 0" in the second subquery:

 

proc sql;
create table test1 as
select "Evaluation Complete" as Variable,

	(select (Count(distinct patient_id))
		from test
		where evaluated = "Yes") as Count,

	(select (Count(distinct case when evaluated="Yes" then patient_id else 0 end)/(Count(distinct patient_id)*1.0))
		from test) as Percent format percent8.1

from test(obs=1);
quit;

test1.PNG

 

Second code where we replace "else 0" with "else ." and gives us 50% for patients evaluated (correct):

 

proc sql;
create table test2 as
select "Evaluation Complete" as Variable,

	(select (Count(distinct patient_id))
		from test
		where evaluated = "Yes") as Count,

	(select (Count(distinct case when evaluated="Yes" then patient_id else . end)/(Count(distinct patient_id)*1.0))
		from test) as Percent format percent8.1

from test(obs=1);
quit;

test2.PNG

 


 

ballardw
Super User

I think the issue is that COUNT will count any valid value.

 

Try:

proc sql;
create table test1 as
select "Evaluation Complete" as Variable,

	(select (Count(distinct patient_id))
		from test
		where evaluated = "Yes") as Count,

	(select (Count(distinct case when evaluated="Yes" then patient_id else . end)/(Count(distinct patient_id)*1.0))
		from test) as Percent format percent8.1

from test(obs=1);
quit;

Your example doesn't include any repeated patient_id. If your data doesn't have any such I would question why do this instead of Proc Freq.

PaigeMiller
Diamond | Level 26

I would advise you to use PROC FREQ whenever you want to count things, instead of creating your own somewhat complicated SQL method, which clearly didn't work. SAS has put a huge amount of effort into validating the PROC FREQ results and tested it in a bazillion real world situations so you don't have to worry about getting the wrong output.

--
Paige Miller

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 185 views
  • 2 likes
  • 4 in conversation