BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
shami
Fluorite | Level 6

I would like to count the number of tests collected. For each observation, a sample collected would be counted as 1. For example if visit_id 01 had samples 1, 2, 3 and 4 collected that would be counted as one. If visit_id 06 has zero samples collected that would be 0. I'm guessing I have to create a new variable that does that, I'm just not sure how. Thank you.

DATA COUNT;
INPUT VISIT_ID SAMPLE_1 SAMPLE_2 SAMPLE_3 SAMPLE_4;
CARDS;

01 1 1 0 1
02 1 0 1 1
03 0 0 1 0
04 1 1 1 1
05 1 0 0 0
06 0 0 0 0 ; RUN;
1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@shami wrote:

I would like to count the number of tests collected. For each observation, a sample collected would be counted as 1. For example if visit_id 01 had samples 1, 2, 3 and 4 collected that would be counted as one. If visit_id 06 has zero samples collected that would be 0. I'm guessing I have to create a new variable that does that, I'm just not sure how. Thank you.


It really helps to provide a general rule stating variable names and values. Such as Count the observations where any of the variables Sample1 through Sample4 have at least one value > 0. Then show an example of the desired output based on your given example.

 

For my description above:

proc means data=count n;
where max(SAMPLE_1, SAMPLE_2, SAMPLE_3, SAMPLE_4)> 0;
var visit_id;
run;

Which shows an output like:

Analysis
Variable
: VISIT_ID
N
5

The Where statement means that SAS will use any observation where there is at least one value greater than 0 for the procedure.

 

Caveat: this would include 0.0001 as well if by chance one of the variables had such.

 

View solution in original post

9 REPLIES 9
PaigeMiller
Diamond | Level 26

Your words and subject line are unclear. You say "new variable that counts across" but later you say "For example if visit_id 01 had samples 1, 2, 3 and 4 collected that would be counted as one." These two are contradictory. I don't really know what you want

 

Nevertheless, if you have 0s and 1s, then "counts across", if that is what you want, is done like this.

 

data want;
     set have;
     count=sum(of sample_1-sample_4);
run;
--
Paige Miller
shami
Fluorite | Level 6

My apologies for the unclear subject line. The body describes exactly what I'm trying to do. I don't want to count across. I'm sorry for the confusion.

PaigeMiller
Diamond | Level 26

@shami wrote:

My apologies for the unclear subject line. The body describes exactly what I'm trying to do. I don't want to count across. I'm sorry for the confusion.


So I showed how to compute this for the sum. You wanted 1 or 0. Got it. In my example, instead of sum, here's a thought question for you ... what statistic, instead of sum, gives you a 1 or 0 depending on whether there is a 1 in the row, or all zeros in the row?

--
Paige Miller
ballardw
Super User

@shami wrote:

I would like to count the number of tests collected. For each observation, a sample collected would be counted as 1. For example if visit_id 01 had samples 1, 2, 3 and 4 collected that would be counted as one. If visit_id 06 has zero samples collected that would be 0. I'm guessing I have to create a new variable that does that, I'm just not sure how. Thank you.


It really helps to provide a general rule stating variable names and values. Such as Count the observations where any of the variables Sample1 through Sample4 have at least one value > 0. Then show an example of the desired output based on your given example.

 

For my description above:

proc means data=count n;
where max(SAMPLE_1, SAMPLE_2, SAMPLE_3, SAMPLE_4)> 0;
var visit_id;
run;

Which shows an output like:

Analysis
Variable
: VISIT_ID
N
5

The Where statement means that SAS will use any observation where there is at least one value greater than 0 for the procedure.

 

Caveat: this would include 0.0001 as well if by chance one of the variables had such.

 

Tom
Super User Tom
Super User

Your descriptions are confusing.

 

If you want to count the total number of SAMPLES then you want to sum both ways.  Order should not matter. 

 

So assuming that the values of the SAMPLE_n variables represents the number of test done on each sample you could use the SAS function SUM(,) to add the values in each observation and then the SQL aggregate function SUM() to find the grand total across all observations.

proc sql;
  select sum( sum(sample_1,sample_2,sample_3_sample_4)) as grand_count
  from count
  ;
quit;

But since the values you show are only 1 or 0 perhaps the SAMPLE_n variables indicate whether or not any samples were taken?  (If so then why was the dataset named COUNT??)  And perhaps what you want to find out is how many observations had ANY samples taken?  In that case for boolean 1/0 variables the MAX() will indicate if any of them was true or not.

 

In that case you might want to do something like this using the SAS function MAX(,) and then the SQL function SUM().

proc sql
select sum(max(sample_1,sample_2,sample_3,sample_4)) 
   as nvisits label='Number of visits with any samples'
from count;
quit;
shami
Fluorite | Level 6

I would like to count the number of tests collected. For each observation, a sample collected would be counted as 1. For example if visit_id 01 had samples 1, 2, 3 and 4 collected that would be counted as one. If visit_id 06 has zero samples collected that would be 0. I'm guessing I have to create a new variable that does that, I'm just not sure how. Thank you.

DATA COUNT;
INPUT VISIT_ID SAMPLE_1 SAMPLE_2 SAMPLE_3 SAMPLE_4;
CARDS;

01 1 1 0 1
02 1 0 1 1
03 0 0 1 0
04 1 1 1 1
05 1 0 0 0
06 0 0 0 0 ; RUN;
antonbcristina
SAS Employee

You might want to change the code based on your logic, but it sounds like you want to check if the sum of the indicator values (sample_i ) for each observation are greater than 0. If it is, then assign a value of 1, otherwise 0. 

 

data want;
	set count;

	tests=0;
	if sum(of sample_:)>0 then tests=1;
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 9 replies
  • 719 views
  • 1 like
  • 6 in conversation