turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- Horizontal count of number of variables that meet ...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-06-2012 12:01 PM

I have 923 variables in my dataset and 6345 observations.

For each observation, I need to calculate the number of times any of the 923 variables has a value of "9".

To help illustrate what I need to accomplish, the example below is simplified with just a small handful of observations and a small handful of variables. As you can see, some of the variables allow string text values.

In observation #3219, there are 6 variables for which the value is "9".

In observation #3220, there are 5 variables for which the value is "9".

And so forth for all 923 observations.

I need to be able to perform this calculation in SAS. I already know how to perform this calculation in MSExcel quickly and easily using "COUNTIF" but I am not aware of something that could accomplish this in SAS. Thank you in advance for your kind assistance.

Please respond if you know how to perform this calculation in SAS.

Accepted Solutions

Solution

07-06-2012
12:15 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Bautista

07-06-2012 12:15 PM

Assume you have both numeric and character variables:

data have;

input var1-var5 (var6-var9)($);

cards;

1 2 3 4 9 a b c d

9 1 2 3 4 9 9 e d

9 9 9 1 9 9 9 9 d

;

data want(drop=i);

set have;

n9=0; c9=0;

array _n(*) _numeric_;

array _c(*) _character_;

do i=1 to dim(_n);

if _n(i)=9 then n9+1;

end;

do i=1 to dim(_c);

if _c(i)='9' then c9+1;

end;

number9=n9+c9;

run;

proc print;run;

Obs var1 var2 var3 var4 var5 var6 var7 var8 var9 n9 c9 number9

1 1 2 3 4 9 a b c d 1 0 1

2 9 1 2 3 4 9 9 e d 1 2 3

3 9 9 9 1 9 9 9 9 d 4 3 7

Linlin

All Replies

Solution

07-06-2012
12:15 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Bautista

07-06-2012 12:15 PM

Assume you have both numeric and character variables:

data have;

input var1-var5 (var6-var9)($);

cards;

1 2 3 4 9 a b c d

9 1 2 3 4 9 9 e d

9 9 9 1 9 9 9 9 d

;

data want(drop=i);

set have;

n9=0; c9=0;

array _n(*) _numeric_;

array _c(*) _character_;

do i=1 to dim(_n);

if _n(i)=9 then n9+1;

end;

do i=1 to dim(_c);

if _c(i)='9' then c9+1;

end;

number9=n9+c9;

run;

proc print;run;

Obs var1 var2 var3 var4 var5 var6 var7 var8 var9 n9 c9 number9

1 1 2 3 4 9 a b c d 1 0 1

2 9 1 2 3 4 9 9 e d 1 2 3

3 9 9 9 1 9 9 9 9 d 4 3 7

Linlin

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Linlin

07-06-2012 12:59 PM

hi ... using Linlin's data (assuming all you have are single digit numbers) ... you have a choice (_numeric_, _character_, _all_) ...

**data want;**

**set have;**

**num9 = countc(catt(of _numeric_),'9');**

**char9 = countc(catt(of _character_),'9');**

**all9 = countc(catt(of _all_),'9')**

**run;**

**num9 char9 all9 **

** 1 0 1**

** 1 2 3**

** 4 3 7**

if all you have are numbers, just do the _numeric_ part

Message was edited by: Mike Zdeb ... added "of _all_"

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to MikeZdeb

07-09-2012 01:37 PM

Thank you, Mike. Please correct me if I am wrong, but I think** all9=****countc(catt(of _all_),'9')** would give me a total of all the 9's that appear in the entire observation and not the number of variables that contain just the number 9 as the value of the variable, yes? If I am not mistaken, it would treat numbers like 19, 29, 39, 49, 59, 69, 79, and 89 as instances where there is one 9. And it would treat values like 199, 299, 399 and 499 as instances where there are two 9's.

When I tried your code, it told me there were 2043 instances of the number 9 in the first observation of my dataset, despite the fact that my dataset only has 923 variables. So that tells me it is counting the number of instances of the number 9 within each variable value.

I am trying Lin Lin's code but am having some questions about the variable names. I will post a question about that on her post.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Bautista

07-09-2012 06:17 PM

hi ... the original posting showed single characters and that's when COUNTC works

you already have some replies that work (like CATS and SUM in SQL), here's another that does work with COUNTC in a data step

**proc format;**

**value nn 9 ='1' other='0';**

**value $n '9'='1' other='0';**

**run;**

**filename x dummy;**

**data want;**

**set have;**

**file x;**

**put (_numeric_) (nn.) (_character_) ($n.) @;**

**nines = countc(_file_,'1');**

**put;**

**run;**

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Linlin

07-09-2012 02:57 PM

Thank you, LinLin. This worked. I needed to study your code, but after I did and I understood what each part means I was able to modify it slightly so it fulfills my needs, with my real working database.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Bautista

07-09-2012 03:37 PM

If you don't mind using proc sql, here is another way: (data borrowed from linlin)

data have;

input var1-var5 (var6-var9)($);

cards;

1 2 3 4 9 a b c d

9 1 2 3 4 9 9 e d

9 9 9 1 9 9 9 9 d

;

proc sql;

select cats('cats(',name,')','=','"9"') into :lst separated by ',' from dictionary.columns where libname='WORK' AND MEMNAME='HAVE';

select sum(&lst) as ct from have;

quit;

Haikuo

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Bautista

07-09-2012 10:56 PM

You can use function COUNTW() and don't forget to make this variable long enough to hold the whole line.

**data want;**

**set have;**

**all9 = countw(catx(' ',of _all_),'9') ;**

**run;**

Ksharp

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

07-10-2012 12:24 AM

hi ... you can do that, but it doesn't work

COUNTW counts the number of words, not the occurrences of specified words (or characters)

the '9' in the COUNTW function is used as a word delimiter, not as a search string

**data have;**

**input a b c d e;**

**datalines;**

**1 2 3 4 5**

**9 8 9 8 9**

**9 9 9 9 9**

**99 9 9999 9 9 9**

**89 89 89 89 89**

**;**

**data want;**

**set have;**

**all9 = countw(catx(' ',of _all_),'9') ;**

**run;**

** a b c d e all9**

** 1 2 3 4 5 1**

** 9 8 9 8 9 3**

** 9 9 9 9 9 5**

**99 9 9999 9 9 5**

**89 89 89 89 89 6**

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to MikeZdeb

07-10-2012 01:08 AM

It is obviously that I make a mistake here.

NOT testing my code.

Thanks. Mike

Ksharp

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to MikeZdeb

07-10-2012 01:21 AM

A remedy for my mistake.

data have; input a b c d e; datalines; 1 2 3 4 5 9 8 9 8 9 9 9 9 9 9 99 9 9999 9 9 9 89 89 89 89 89 ; run; data want; set have; all9 = countc(prxchange('s/\S{2,}//o',-1,catx(' ',of _all_)),'9') ; run;

Ksharp

Message was edited by: xia keshan Make code more robust.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

07-10-2012 12:08 PM

hi ... nice remedy

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

07-12-2012 05:54 PM

Your idea works great, but I need to do the same with two character criteria ('49'). I´m a new SAS user, and I tried your syntax and it had funny results... ¿could you please explain me why this happened?

Thanks in advance,

Andrea

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to AndreaAb

07-12-2012 11:17 PM

COUNTC() only take care of one character . I don't understand what is two character criteria ('49') ?

Do you mean a string '49' or two single character '4' or '9' ?

You'd better post an example and output to clarify your problem.

Ksharp

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

07-12-2012 11:43 PM

Sorry. The criteria I needed was an specific number with two digits (49), and what I really needed is to know whether there is any of the expected number in any of a set of variables. I just finded out that I can do this with an array, works really good as long as each variable in the array has the same format (in my case, numeric). This is an example:

data want; set have;

array test{*}p_5_1--p_5_10;

do i=1 to dim(test);

if test(i)= 1 then ind1 = 1;

else if test(i)= 2 then Ind2 = 1;

else if test(i)= 3 then Ind3 = 1;

etc...

else if test(i)= 68 then Ind68 = 1;

end; drop i; run;

these are things you find yourself doing when someone has become reaaaally original when generating a data base... I´m trying to stay positive about this...