Help using Base SAS procedures

How to sum a text variable?

Reply
Occasional Contributor
Posts: 8

How to sum a text variable?

I'm wondering how to sum 3 variables. Each variable either has a "yes", "no" or "maybe". I want to create a new dataset where I sum up those variables to find the total number of yes, no and maybe answers. How do you do that? Thanks so much

Trusted Advisor
Posts: 1,926

Re: How to sum a text variable?

PROC FREQ

Occasional Contributor
Posts: 8

Re: How to sum a text variable?

Posted in reply to PaigeMiller

Shouldn't there be a sum command at some point?

Super User
Super User
Posts: 7,055

Re: How to sum a text variable?

You aren't "summing" but instead are "counting".  PROC FREQ data=mytable; TABLES myvar ; RUN;  will produce a report of the number of observations for each value of MYVAR.  So if you variable has values yes/no/maybe then it will produce three rows of output.  The freq (or count) is what you are calling the sum.

Occasional Contributor
Posts: 8

Re: How to sum a text variable?

Thanks. That gave me 3 tables each giving me the count and percentage of the yes, no, maybe answers for each variable. But I want to add them up and get the total number of yes, no and maybes across all of them plus the percentage of each. There must be a way to do this w/o me doing it by hand?

Trusted Advisor
Posts: 1,926

Re: How to sum a text variable?

Hi, Nikknock ... you are not adding or summing with text variables ... let's get those words out of the discussion ... we are counting.

You could use a SAS data step before PROC FREQ to convert your three variables into a single variable. Something like this:

data want;

     set have;

     value=x1; output;

     value=x2; output;

     value=x3; output;

run;

proc freq;

tables value;

run;

Super User
Super User
Posts: 7,055

Re: How to sum a text variable?

You need to clarify what you want to count then.

If you want to count the number of yes's in the 3 variables for this particular observation then there is not a good single command.

For three is probably easiest to just list them as in this response below from JohnW. 

You could use array processing to loop over the variables.

array resp var1-var3 ;

yes=0;

no=0;

maybe=0;

do over resp;

   if resp='yes' then yes=yes+1;

   else if resp='no' then no=no+1;

   else if resp='maybe' then maybe=maybe+1;

run;

Trusted Advisor
Posts: 1,926

Re: How to sum a text variable?

nikknock wrote:

Shouldn't there be a sum command at some point?

What is the sum of five different observations that are in the following sequence: "Yes", "Yes", "No", "Yes", "Maybe"??

Occasional Contributor
Posts: 18

Re: How to sum a text variable?

Not sure if you're looking to sum across variables, but if you are, you could do something like:

 

DATA
INDATA;

INPUT VAR1 $ VAR2 $ VAR3 $;

DATALINES;

Yes No Maybe

No Yes Maybe

No No Yes

;

RUN;

DATA Counts;

SET INDATA;

Yes   = (VAR1='Yes')  +(VAR2='Yes')   +(VAR3='Yes');

No    = (VAR1='No')   +(VAR2='No')    +(VAR3='No');

Maybe = (VAR1='Maybe')+(VAR2='Maybe') +(VAR3='Maybe');

RUN;

PROC MEANS

DATA=Counts

NOPRINT;

VAR  Yes No Maybe;

OUTPUT     OUT=MEANS SUM(Yes No Maybe)=Yes No Maybe;

RUN;

Occasional Contributor
Posts: 18

Re: How to sum a text variable?

Paige and Tom, those are much more elegant responses than my brute force technique!

The only other thing I will add is a data quality check...if you find (perhaps from an initial PROC FREQ as suggested above) that your data are like the following:

Yes YES yes Y

No NO no N

etc.,

it may be easiest to create a format to bin these together to cleaned up values with PROC FORMAT and apply the format in one of your steps.  You may also benefit from the TRIM function if there are extra spaces making otherwise similar looking responses show up as different rows.  If your data don't suffer from this problem, then be sure to thank the person who gave it to you!

Happy counting.

Trusted Advisor
Posts: 1,926

Re: How to sum a text variable?

Mixed case shouldn't be a problem, just convert all characters to Upper (or Lower) case before you run PROC FREQ

If sometimes the word YES appears and other times the single letter Y appears (and similarly for NO and MAYBE), then you could easily truncate all strings to the first character, and count those.

The only real problem would be if there's even less consistency in the data, and YES can also appear as sí or DA or JA or OUI, then you might have to resort to formats

Frequent Contributor
Frequent Contributor
Posts: 83

Re: How to sum a text variable?

Hi Nikknock

The following code works for me.

DATA INDATA;

INPUT VAR1 $ VAR2 $ VAR3 $;

DATALINES;
Yes No Maybe
No Yes Maybe
No No Yes
;

RUN;


data indata;
set indata;
id=_n_;
run;
proc sort data=indata;
by id;
run;
proc transpose data=indata out=outdata;
by id;
var VAR1   VAR2   VAR3  ;
run;
data outdata;
set outdata;
count=1;
drop id;
rename _name_=VAR;
run;
proc sort data=outdata;
by VAR col1;
;
run;
proc summary data=outdata ;
by VAR col1;
var count;
output out=want (drop=_Smiley Happysum=;
run;

and the output is


VAR


COL1


count


VAR1


No


2


VAR1


Yes


1


VAR2


No


2


VAR2


Yes


1


VAR3


Maybe


2


VAR3


Yes


1

New Contributor
Posts: 3

Re: How to sum a text variable?

Hi Plz check this and reply back. If this is not what you want I can learn from this.

DATA a;

INPUT VAR1 $ VAR2 $ VAR3 $;

DATALINES;

Yes No Maybe

No Yes Maybe

No No Yes

;

RUN;

proc tabulate data=a out=b;

class var1 var2 var3;

table var1 var2 var3;

run;

data c;

set b;

if var1='' and var2 ne '' then var1=var2;

if var1='' and var3 ne '' then var1=var3;

drop var2 var3;

run;

proc sort data=c;by _type_;run;

options missing=0;

proc transpose data=c out=d(drop=_name_ _type_);

id var1;

by _TYPE_;

var n;

run;

Ask a Question
Discussion stats
  • 12 replies
  • 940 views
  • 2 likes
  • 6 in conversation