Help using Base SAS procedures

How to count different values in rows

Reply
Contributor
Posts: 32

How to count different values in rows

Hello,

I need to count different values in a row for a longitudinal database. For example, I have

data First;

input ID type$ rev1 rev2 rev3 rev4;

cards;

101 a 0 0 . 1

101 b 1 1 0 0

101 c 0 .  1 0

102 b 1 0 1 0

102 c 1 0 .  1

102 d .  1 1 0

;

run;

I need the result 1:

ID    Type  count_0  count_missing  count_1

101    a        2           1                      1

101    b        2           0                      2

101    c        2           1                      1

102    b        2           0                      2

102    c        1           1                      2

102    d        1           1                      2

Result 2:

ID     Total_0  Total_missing   Total_1

101         6            2                     4

102         4            2                     6

Thank you so much for any help!!

I will be very appreciate for any help!

Occasional Contributor
Posts: 8

Re: How to count different values in rows

Hello Yurie,

One way is to use the SQL procedure:

data First;

input ID type$ rev1 rev2 rev3 rev4;

cards;

101 a 0 0 . 1

101 b 1 1 0 0

101 c 0 .  1 0

102 b 1 0 1 0

102 c 1 0 .  1

102 d .  1 1 0

;

proc sql;

    create table First_counts as

    select

        id

        ,type

        ,(case when rev1 = 0 then 1 else 0 end) +

         (case when rev2 = 0 then 1 else 0 end) +

         (case when rev3 = 0 then 1 else 0 end) +

         (case when rev4 = 0 then 1 else 0 end) as count_0

        ,(case when missing(rev1) then 1 else 0 end) +

         (case when missing(rev2) then 1 else 0 end) +

         (case when missing(rev3) then 1 else 0 end) +

         (case when missing(rev4) then 1 else 0 end) as count_miss

        ,(case when rev1 = 1 then 1 else 0 end) +

         (case when rev2 = 1 then 1 else 0 end) +

         (case when rev3 = 1 then 1 else 0 end) +

         (case when rev4 = 1 then 1 else 0 end) as count_1

    from First;

    select * from First_counts;

    select

        id

        ,sum(count_0) as total_0

        ,sum(count_miss) as total_miss

        ,sum(count_1) as total_1

    from First_counts

    group by id

    order by id;

quit;

For result 1: if the number of rev* variables increase you could write a macro to generate the lines that compute count_0, count_miss, count_1.

Otherwise you could use ARRAYs in a DATA step and compute the counts "manually" by iterating on the different rev* variables. I leave this as an exercise.

HTH

Daniel

Occasional Contributor
Posts: 11

Re: How to count different values in rows

Hello Yurie,

Use arrays for your goal.

Below a start. Things may be written shorter, but this is easy to read.

data want (drop=_Smiley Happy;

   set first;

   count_0 = 0;

   count_missing = 0;

   count_1 = 0;

   array a_rev rev:;

   do _i = 1 to dim(a_rev);

      if a_rev(_i) = 0 then count_0 = count_0 + 1;

      etc;

   end;

run;

Hope this helps,

Eric

Super User
Super User
Posts: 7,970

Re: How to count different values in rows

Posted in reply to EricHoogenboom

Well, your quite correct about using arrays.  However as the poster is counting certain characters, the easiest way would be to create a string of all elements and then countc() to count each one:

data want (drop=tmp rev1-rev4);

  set first;

  array rev{4};

  tmp=cats(of rev{*});

  count_miss=countc(tmp,".");

  count_0=countc(tmp,"0");

  count_1=countc(tmp,"1");

run;

Super User
Posts: 11,343

Re: How to count different values in rows

A transpose and report procedure approach:

data First;
input ID type$ rev1 rev2 rev3 rev4;
cards;
101 a 0 0 . 1
101 b 1 1 0 0
101 c 0 .  1 0
102 b 1 0 1 0
102 c 1 0 .  1
102 d .  1 1 0
;
run;


proc transpose data=first out=transfirst;
by id type;
run;

proc tabulate data=transfirst;
class id type;
class col1 / missing;
table id* type, col1='Value'*n=''*f=best4.;
table id, col1='Value'*n=''*f=best4.;
run;

Contributor
Posts: 32

Re: How to count different values in rows

Hello, Ballardw, RW9, Eric, and Mastropi

I am so grateful for your answers. All answers are great and provide me a big help! It's really amazing!

Best & blessings,

Yurie

Valued Guide
Posts: 765

Re: How to count different values in rows

Hi.  If the goal is two data sets (assumes the data set FIRST is sorted by ID ...

data result1 (keep=id type countSmiley Happy result2 (keep=id totalSmiley Happy;

call missing(total_1, total_0, total_miss);

do until (last.id);

set first;

by id;

  count_1    = sum(of revSmiley Happy;

  count_miss = nmiss(of revSmiley Happy;

  count_0    = n(of revSmiley Happy - count_1;

  total_1    + count_1;

  total_0    + count_0;

  total_miss + count_miss;

  output result1;

end;

output result2;

run;

RESULT1 ...

Obs    id    type    count_0  count_miss     count_1

1     101     a         2          1           1

2     101     b         2          0           2

3     101     c         2          1           1

4     102     b         2          0           2

5     102     c         1          1           2

6     102     d         1          1           2

RESULT2 ...

Obs     id    total_0   total_miss     total_1

1     101       6          2             4

2     102       4          2             6

PROC Star
Posts: 1,760

Re: How to count different values in rows

data _V/view=_V;

  set FIRST;

  COUNT_1 = sum(of REVSmiley Happy;

  COUNT_MISS = nmiss(of REVSmiley Happy;

  COUNT_0 = n(of REVSmiley Happy - COUNT_1;

  drop REV:;

run;

proc means data=_V noprint;

  class ID TYPE ;

  var COUNT:;

  types ID   ID*TYPE;

  output out=SUM sum=;

run;

proc print data=SUM noobs;

  where _TYPE_=2;

  var ID COUNT:;

proc print data=SUM noobs;

  where _TYPE_=3;

  var ID TYPE COUNT:;

run;

ID

COUNT_1

COUNT_MISS

COUNT_0

101

4

2

6

102

6

2

4

ID

type

COUNT_1

COUNT_MISS

COUNT_0

101

a

1

1

2

101

b

2

0

2

101

c

1

1

2

102

b

2

0

2

102

c

2

1

1

102

d

2

1

1

Valued Guide
Posts: 765

Re: How to count different values in rows

Hi.  Thirteen statements versus fifteen in just a data step ... a virtual tie (with desired variable names from the data step).

ps Could have simplified the CALL MISSING statement by placing it after the appearance of the various TOTAL variables ...

data result1 (keep=id type countSmiley Happy result2 (keep=id totalSmiley Happy;

do until (last.id);

set first;

by id;

  count_1    = sum(of revSmiley Happy;

  count_miss = nmiss(of revSmiley Happy;

  count_0    = n(of revSmiley Happy - count_1;

  total_1    + count_1;

  total_0    + count_0;

  total_miss + count_miss;

  output result1;

end;

output result2;

call missing(of totalSmiley Happy;

run;

PROC Star
Posts: 1,760

Re: How to count different values in rows

What do you mean a tie? Two fewer! And I can rename the variables too, and put the drop statement as an option! Smiley Wink

Seriously, this was more about separating the data preparation from the calculations.

No better or worse, just different. Smiley Happy

Contributor
Posts: 65

Re: How to count different values in rows

data First;

input ID type$ rev1 rev2 rev3 rev4;

dat = _infile_;

var = substr(dat,index(dat," "));

cnt_0 = count(var,'0');

cnt_m = count(var,'.');

cnt_1 = count(var,'1');

drop dat var rev:;

cards;

101 a 0 0 . 1

101 b 1 1 0 0

101 c 0 .  1 0

102 b 1 0 1 0

102 c 1 0 .  1

102 d .  1 1 0

;

proc sql;

create table want as

select id, sum(cnt_0) as cnt_0,sum(cnt_m) as cnt_m,sum(cnt_1) as cnt_1 from First group by id;

quit;

PROC Star
Posts: 1,760

Re: How to count different values in rows

Posted in reply to MadhuKorni

data FIRST;

input ID TYPE $ REV1 REV2 REV3 REV4;

cards;

101 a 0 0 . 1

101 b 1 1 0 0

101 c 0 .  1 0

102 b 1 0 1 0

102 c 1 0 .  1

102 d .  1 1 0

run;

proc sql;

  create table SUM1(drop=TOTAL_N) as

  select ID

        ,sum(sum  ( REV1, REV2, REV3, REV4)) as TOTAL_1

        ,sum(cmiss( REV1, REV2, REV3, REV4)) as TOTAL_M

        ,sum(n    ( REV1, REV2, REV3, REV4)) as TOTAL_N

        ,calculated TOTAL_N - calculated TOTAL_1 as TOTAL_0

  from FIRST

  group by ID;

  create table SUM2(drop=TOTAL_N) as

  select ID

        ,TYPE

        ,sum(sum  ( REV1, REV2, REV3, REV4)) as TOTAL_1

        ,sum(cmiss( REV1, REV2, REV3, REV4)) as TOTAL_M

        ,sum(n    ( REV1, REV2, REV3, REV4)) as TOTAL_N

        ,calculated TOTAL_N - calculated TOTAL_1 as TOTAL_0

  from FIRST

  group by ID, TYPE;

quit;

Valued Guide
Posts: 765

Re: How to count different values in rows

Posted in reply to MadhuKorni

Hi ... since you can use CAT functions with numeric data, how about a different data step ...

data First;

input ID :$3. type :$1. rev1-rev4;

cnt_0 = count(catt(of revSmiley Happy,'0');

cnt_m = count(catt(of revSmiley Happy,'.');

cnt_1 = count(catt(of revSmiley Happy,'1');

drop rev:;

cards;

101 a 0 0 . 1

101 b 1 1 0 0

101 c 0 .  1 0

102 b 1 0 1 0

102 c 1 0 .  1

102 d .  1 1 0

;

Searching for Variable Values with CAT Functions: An Alternative to Arrays and Loops

http://www.albany.edu/~msz03/epi697/ho/cat_find_rev.pdf

and ...

http://www.sas.com/content/dam/SAS/en_ca/User%20Group%20Presentations/TASS/Zdeb-CATFunctions.pdf

Ask a Question
Discussion stats
  • 12 replies
  • 683 views
  • 8 likes
  • 8 in conversation