How to count different values in rows

Yurie · Posted 08-05-2015 05:10 PM

Hello,

I need to count different values in a row for a longitudinal database. For example, I have

data First;

input ID type$ rev1 rev2 rev3 rev4;

cards;

101 a 0 0 . 1

101 b 1 1 0 0

101 c 0 . 1 0

102 b 1 0 1 0

102 c 1 0 . 1

102 d . 1 1 0

;

run;

I need the result 1:

ID Type count_0 count_missing count_1

101 a 2 1 1

101 b 2 0 2

101 c 2 1 1

102 b 2 0 2

102 c 1 1 2

102 d 1 1 2

Result 2:

ID Total_0 Total_missing Total_1

101 6 2 4

102 4 2 6

Thank you so much for any help!!

I will be very appreciate for any help!

mastropi · Posted 08-06-2015 04:22 AM

Hello Yurie,

One way is to use the SQL procedure:

data First;

input ID type$ rev1 rev2 rev3 rev4;

cards;

101 a 0 0 . 1

101 b 1 1 0 0

101 c 0 . 1 0

102 b 1 0 1 0

102 c 1 0 . 1

102 d . 1 1 0

;

proc sql;

create table First_counts as

select

id

,type

,(case when rev1 = 0 then 1 else 0 end) +

(case when rev2 = 0 then 1 else 0 end) +

(case when rev3 = 0 then 1 else 0 end) +

(case when rev4 = 0 then 1 else 0 end) as count_0

,(case when missing(rev1) then 1 else 0 end) +

(case when missing(rev2) then 1 else 0 end) +

(case when missing(rev3) then 1 else 0 end) +

(case when missing(rev4) then 1 else 0 end) as count_miss

,(case when rev1 = 1 then 1 else 0 end) +

(case when rev2 = 1 then 1 else 0 end) +

(case when rev3 = 1 then 1 else 0 end) +

(case when rev4 = 1 then 1 else 0 end) as count_1

from First;

select * from First_counts;

select

id

,sum(count_0) as total_0

,sum(count_miss) as total_miss

,sum(count_1) as total_1

from First_counts

group by id

order by id;

quit;

For result 1: if the number of rev* variables increase you could write a macro to generate the lines that compute count_0, count_miss, count_1.

Otherwise you could use ARRAYs in a DATA step and compute the counts "manually" by iterating on the different rev* variables. I leave this as an exercise.

HTH

Daniel

EricHoogenboom · Posted 08-06-2015 04:29 AM

Hello Yurie,

Use arrays for your goal.

Below a start. Things may be written shorter, but this is easy to read.

data want (drop=_:);

set first;

count_0 = 0;

count_missing = 0;

count_1 = 0;

array a_rev rev:;

do _i = 1 to dim(a_rev);

if a_rev(_i) = 0 then count_0 = count_0 + 1;

etc;

end;

run;

Hope this helps,

Eric

RW9 · Posted 08-06-2015 04:52 AM

Well, your quite correct about using arrays. However as the poster is counting certain characters, the easiest way would be to create a string of all elements and then countc() to count each one:

data want (drop=tmp rev1-rev4);

set first;

array rev{4};

tmp=cats(of rev{*});

count_miss=countc(tmp,".");

count_0=countc(tmp,"0");

count_1=countc(tmp,"1");

run;

ballardw · Posted 08-06-2015 10:53 AM

A transpose and report procedure approach:

data First;
input ID type$ rev1 rev2 rev3 rev4;
cards;
101 a 0 0 . 1
101 b 1 1 0 0
101 c 0 . 1 0
102 b 1 0 1 0
102 c 1 0 . 1
102 d . 1 1 0
;
run;

proc transpose data=first out=transfirst;
by id type;
run;

proc tabulate data=transfirst;
class id type;
class col1 / missing;
table id* type, col1='Value'*n=''*f=best4.;
table id, col1='Value'*n=''*f=best4.;
run;

Yurie · Posted 08-06-2015 12:04 PM

Hello, Ballardw, RW9, Eric, and Mastropi

I am so grateful for your answers. All answers are great and provide me a big help! It's really amazing!

Best & blessings,

Yurie

MikeZdeb · Posted 08-06-2015 06:22 PM

Hi. If the goal is two data sets (assumes the data set FIRST is sorted by ID ...

data result1 (keep=id type count:) result2 (keep=id total:);

call missing(total_1, total_0, total_miss);

do until (last.id);

set first;

by id;

count_1 = sum(of rev:);

count_miss = nmiss(of rev:);

count_0 = n(of rev:) - count_1;

total_1 + count_1;

total_0 + count_0;

total_miss + count_miss;

output result1;

end;

output result2;

run;

RESULT1 ...

Obs id type count_0 count_miss count_1

1 101 a 2 1 1

2 101 b 2 0 2

3 101 c 2 1 1

4 102 b 2 0 2

5 102 c 1 1 2

6 102 d 1 1 2

RESULT2 ...

Obs id total_0 total_miss total_1

1 101 6 2 4

2 102 4 2 6

ChrisNZ · Posted 08-09-2015 05:38 AM

data _V/view=_V;

set FIRST;

COUNT_1 = sum(of REV:);

COUNT_MISS = nmiss(of REV:);

COUNT_0 = n(of REV:) - COUNT_1;

drop REV:;

run;

proc means data=_V noprint;

class ID TYPE ;

var COUNT:;

types ID ID*TYPE;

output out=SUM sum=;

run;

proc print data=SUM noobs;

where _TYPE_=2;

var ID COUNT:;

proc print data=SUM noobs;

where _TYPE_=3;

var ID TYPE COUNT:;

run;

ID	COUNT_1	COUNT_MISS	COUNT_0
101	4	2	6
102	6	2	4

ID	type	COUNT_1	COUNT_MISS	COUNT_0
101	a	1	1	2
101	b	2	0	2
101	c	1	1	2
102	b	2	0	2
102	c	2	1	1
102	d	2	1	1

High-Performance SAS Coding - Third Edition

MikeZdeb · Posted 08-09-2015 07:12 PM

Hi. Thirteen statements versus fifteen in just a data step ... a virtual tie (with desired variable names from the data step).

ps Could have simplified the CALL MISSING statement by placing it after the appearance of the various TOTAL variables ...

data result1 (keep=id type count:) result2 (keep=id total:);

do until (last.id);

set first;

by id;

count_1 = sum(of rev:);

count_miss = nmiss(of rev:);

count_0 = n(of rev:) - count_1;

total_1 + count_1;

total_0 + count_0;

total_miss + count_miss;

output result1;

end;

output result2;

call missing(of total:);

run;

ChrisNZ · Posted 08-09-2015 08:43 PM

What do you mean a tie? Two fewer! And I can rename the variables too, and put the drop statement as an option!

Seriously, this was more about separating the data preparation from the calculations.

No better or worse, just different.

High-Performance SAS Coding - Third Edition

MadhuKorni · Posted 08-10-2015 04:27 AM

data First;

input ID type$ rev1 rev2 rev3 rev4;

dat = _infile_;

var = substr(dat,index(dat," "));

cnt_0 = count(var,'0');

cnt_m = count(var,'.');

cnt_1 = count(var,'1');

drop dat var rev:;

cards;

101 a 0 0 . 1

101 b 1 1 0 0

101 c 0 . 1 0

102 b 1 0 1 0

102 c 1 0 . 1

102 d . 1 1 0

;

proc sql;

create table want as

select id, sum(cnt_0) as cnt_0,sum(cnt_m) as cnt_m,sum(cnt_1) as cnt_1 from First group by id;

quit;

ChrisNZ · Posted 08-10-2015 05:37 AM

data FIRST;

input ID TYPE $ REV1 REV2 REV3 REV4;

cards;

101 a 0 0 . 1

101 b 1 1 0 0

101 c 0 . 1 0

102 b 1 0 1 0

102 c 1 0 . 1

102 d . 1 1 0

run;

proc sql;

create table SUM1(drop=TOTAL_N) as

select ID

,sum(sum ( REV1, REV2, REV3, REV4)) as TOTAL_1

,sum(cmiss( REV1, REV2, REV3, REV4)) as TOTAL_M

,sum(n ( REV1, REV2, REV3, REV4)) as TOTAL_N

,calculated TOTAL_N - calculated TOTAL_1 as TOTAL_0

from FIRST

group by ID;

create table SUM2(drop=TOTAL_N) as

select ID

,TYPE

,sum(sum ( REV1, REV2, REV3, REV4)) as TOTAL_1

,sum(cmiss( REV1, REV2, REV3, REV4)) as TOTAL_M

,sum(n ( REV1, REV2, REV3, REV4)) as TOTAL_N

,calculated TOTAL_N - calculated TOTAL_1 as TOTAL_0

from FIRST

group by ID, TYPE;

quit;

High-Performance SAS Coding - Third Edition

MikeZdeb · Posted 08-10-2015 05:59 PM

Hi ... since you can use CAT functions with numeric data, how about a different data step ...

data First;

input ID :$3. type :$1. rev1-rev4;

cnt_0 = count(catt(of rev:),'0');

cnt_m = count(catt(of rev:),'.');

cnt_1 = count(catt(of rev:),'1');

drop rev:;

cards;

101 a 0 0 . 1

101 b 1 1 0 0

101 c 0 . 1 0

102 b 1 0 1 0

102 c 1 0 . 1

102 d . 1 1 0

;

Searching for Variable Values with CAT Functions: An Alternative to Arrays and Loops

http://www.albany.edu/~msz03/epi697/ho/cat_find_rev.pdf

and ...

http://www.sas.com/content/dam/SAS/en_ca/User%20Group%20Presentations/TASS/Zdeb-CATFunctions.pdf

How to count different values in rows

Re: How to count different values in rows

Re: How to count different values in rows

Re: How to count different values in rows

Re: How to count different values in rows

Re: How to count different values in rows

Re: How to count different values in rows

Re: How to count different values in rows

Re: How to count different values in rows

Re: How to count different values in rows

Re: How to count different values in rows

Re: How to count different values in rows

Re: How to count different values in rows

SAS Innovate 2026 Registration is Open

SAS Training: Just a Click Away