DATA Step, Macro, Functions and more

Counting specific values

Accepted Solution Solved
Reply
Contributor
Posts: 45
Accepted Solution

Counting specific values

Howdy Y'all,

 

I would like to add a new column to a dataset but I am not sure how to do so. My dataset has a variable called KEYVAR (character variable) with three different values. A participant can appear multiple times in my dataset, with each row containing a similar or different value for KEYVAR. What I want to do is create a new variable call NEWVAR that counts how many times a participant has a specific value for KEYVAR; when a participant does not have an observation for that specific value, I want NEWVAR to have a result of zero.

 

The first code I used was:

 

PROC SQL;

         CREATE TABLE WANT AS

         SELECT*, COUNT (PARTICIPANT) AS NEWVAR

         FROM HAVE

         WHERE KEYVAR = 'Y';

QUIT;

 

While the code does create a new column, it has a major problem. It creates a new dataset but with only observations containing the specific value I want; however, I need a dataset that retains all my observations. Secondly, the WHERE statement also restricts the count to the specific value, so any participant lacking the 'Y' in KEYVAR is ignored completely.

 

Here's an example of the dataset I would like (in this example, I want to count every instance of "Y"):

 

PARTICIPANTKEYVARNEWVAR
AY1
AN1
BY3
BY3
BY3
CW0
CN0
CW0
DY2
DN2
DN2
DY2
DW2

 

I am using SAS 9.4

 

Thanks for the help y'all! 


Accepted Solutions
Solution
‎03-21-2017 03:48 PM
Super User
Posts: 17,868

Re: Counting specific values


TXSASneophyte wrote:

Howdy Y'all,

 

I would like to add a new column to a dataset but I am not sure how to do so. My dataset has a variable called KEYVAR (character variable) with three different values. A participant can appear multiple times in my dataset, with each row containing a similar or different value for KEYVAR. What I want to do is create a new variable call NEWVAR that counts how many times a participant has a specific value for KEYVAR; when a participant does not have an observation for that specific value, I want NEWVAR to have a result of zero.

 

The first code I used was:

 

PROC SQL;

         CREATE TABLE WANT AS

         SELECT*, COUNT (PARTICIPANT) AS NEWVAR

         FROM HAVE

         WHERE KEYVAR = 'Y';

QUIT;

 

While the code does create a new column, it has a major problem. It creates a new dataset but with only observations containing the specific value I want; however, I need a dataset that retains all my observations. Secondly, the WHERE statement also restricts the count to the specific value, so any participant lacking the 'Y' in KEYVAR is ignored completely.

 

Here's an example of the dataset I would like (in this example, I want to count every instance of "Y"):

 

PARTICIPANT KEYVAR NEWVAR
A Y 1
A N 1
B Y 3
B Y 3
B Y 3
C W 0
C N 0
C W 0
D Y 2
D N 2
D N 2
D Y 2
D W 2

 

I am using SAS 9.4

 

Thanks for the help y'all! 


Use SQL SUM() instead.

 

PROC SQL;
         CREATE TABLE WANT AS
         SELECT 
			*, 
			sum(KEYVAR='Y') AS COUNT_VAR
         FROM HAVE
         GROUP BY PARTICIPANT;
QUIT;

View solution in original post


All Replies
Respected Advisor
Posts: 4,654

Re: Counting specific values

 

PROC SQL;
         CREATE TABLE WANT AS
         SELECT 
			*, 
			COUNT (*) AS NEWVAR
         FROM HAVE
         GROUP BY PARTICIPANT, KEYVAR;
QUIT;
PG
Contributor
Posts: 45

Re: Counting specific values

Thank you for the code. Unfortunately it does not do what I need. Here's a table of what your code did, comparing the column I want to create in blue and the column your code created in green:

 

PARTICIPANTKEYVARNEWVARPGStats_CODE 
AY11
AN11
BY33
BY33
BY33
CW02
CN01
CW02
DY22
DN22
DN22
DY22
DW21

 

Your code gives me a count for every value in KEYVAR but I am only interested in the count for the "Y" value in KEYVAR, and if a certain participant has no "Y" at all in any of their observations, then I want a value of zero for them in NEWVAR (look at participant C). 

 

 

Valued Guide
Posts: 505

Re: Counting specific values

Not sure I understand fully the issue

Assigning the value of zero for groups missing a cetain category

data have;
input
PARTICIPANT $ KEYVAR $ NEWVAR;
cards4;
A Y 1
A N 1
B Y 3
B Y 3
B Y 3
C W 0
C N 0
C W 0
D Y 2
D N 2
D N 2
D Y 2
D W 2
;;;;
run;quit;

HAVE
====

Up to 40 obs WORK.HAVE total obs=13

Obs    PARTICIPANT    KEYVAR    NEWVAR

  1         A           Y          1
  2         A           N          1
  3         B           Y          3
  4         B           Y          3
  5         B           Y          3
  6         C           W          0
  7         C           N          0
  8         C           W          0
  9         D           Y          2
 10         D           N          2
 11         D           N          2
 12         D           Y          2
 13         D           W          2


WANT
====

Obs    KEYVAR     N     W     Y

 1      A         1     0     1
 2      B         0     0     3
 3      C         1     2     0    * C has no Y;
 4      D         2     1     2

FULL SOLUTION
=============

Ods Exclude All;
Ods Output Observed=want(Rename=Label=Keyvar);
Proc Corresp Data=have Observed dim=1;
Table PARTICIPANT, KEYVAR;
*weight newvar;  * turn this on if newvar is a count;
run;quit;
Ods Select All;
proc print data=want;
run;quit;
Contributor
Posts: 45

Re: Counting specific values

My computer is taking forever to run this code; I'll run again tonight to see what it does. But thanks for the suggestion!

Respected Advisor
Posts: 4,654

Re: Counting specific values

In your original example, on the second line, NEWVAR=1 with KEYVAR="N" ? How so?

PG
Contributor
Posts: 45

Re: Counting specific values

The second line has NEWVAR = 1 with KEYVAR = 'N' because I set up the dataset assuming a group by statement was possible. The second line doesn't mean that I wanted 1 for 'N' but that participant "A" had at least one 'Y'. 

Solution
‎03-21-2017 03:48 PM
Super User
Posts: 17,868

Re: Counting specific values


TXSASneophyte wrote:

Howdy Y'all,

 

I would like to add a new column to a dataset but I am not sure how to do so. My dataset has a variable called KEYVAR (character variable) with three different values. A participant can appear multiple times in my dataset, with each row containing a similar or different value for KEYVAR. What I want to do is create a new variable call NEWVAR that counts how many times a participant has a specific value for KEYVAR; when a participant does not have an observation for that specific value, I want NEWVAR to have a result of zero.

 

The first code I used was:

 

PROC SQL;

         CREATE TABLE WANT AS

         SELECT*, COUNT (PARTICIPANT) AS NEWVAR

         FROM HAVE

         WHERE KEYVAR = 'Y';

QUIT;

 

While the code does create a new column, it has a major problem. It creates a new dataset but with only observations containing the specific value I want; however, I need a dataset that retains all my observations. Secondly, the WHERE statement also restricts the count to the specific value, so any participant lacking the 'Y' in KEYVAR is ignored completely.

 

Here's an example of the dataset I would like (in this example, I want to count every instance of "Y"):

 

PARTICIPANT KEYVAR NEWVAR
A Y 1
A N 1
B Y 3
B Y 3
B Y 3
C W 0
C N 0
C W 0
D Y 2
D N 2
D N 2
D Y 2
D W 2

 

I am using SAS 9.4

 

Thanks for the help y'all! 


Use SQL SUM() instead.

 

PROC SQL;
         CREATE TABLE WANT AS
         SELECT 
			*, 
			sum(KEYVAR='Y') AS COUNT_VAR
         FROM HAVE
         GROUP BY PARTICIPANT;
QUIT;
Contributor
Posts: 45

Re: Counting specific values

This does exactly what I need, thanks for the help Smiley Very Happy

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 257 views
  • 0 likes
  • 4 in conversation