Help using Base SAS procedures

Counting cumulatively

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 12
Accepted Solution

Counting cumulatively

Hello,

I would like some help using the "COUNT" function or similar.

My problem:

I have data (Table 1) that I would like to convert into Table 2 by: Creating a new column for each "illness", cumulatively counting each illness by equipment no., the hours are "summarized"

Table 1:

HoursEquipment No.Illness
010A
1010A
1010C
1010D
10010B
10010E
100010A
100010B
100010C
10020A
100020B
100020C

Table 2:

HoursEquipment No.ABCDE
01010000
101020100
1001021101
10001032201
02000000
102000000
1002010000
10002011100

Any thoughts?

Thank you


Accepted Solutions
Solution
‎04-06-2012 12:36 AM
Super User
Posts: 9,662

Re: Counting cumulatively

data have;
input hours eq ill $;
retain n 1;
datalines;
0 10 B
25 10 A
86 10 A
86 10 B
86 10 D
225 10 A
225 10 B 
225 10 D
1264 10 A
1264 10 B
1264 10 D
1264 10 E
1264 10 C
0 20 A
16 20 A
57 20 A
57 20 B
57 20 D
57 20 C
161 20 A
161 20 B
161 20 C
161 20 E
887 20 A
887 20 C
887 20 E
;
run;

proc sort data=have ;by eq hours ;run;
proc transpose data=have out=x(drop=_:);
by eq hours ;
id ill;
var n;
run;
proc sql noprint;
 select distinct cats('_', ill,'+',ill) into : count separated by ';' from have ;
 select distinct cats('_', ill,'=0') into : zero separated by ';' from have ;
quit;
proc stdize data=x reponly missing=0 out=xx;run; 
data want(keep=eq hours _:);
 set xx;
 by eq;
 if first.eq then do; &zero; end;
 &count ;
run;



Ksharp

View solution in original post


All Replies
PROC Star
Posts: 7,356

Counting cumulatively

Not sure why you want to do this, and there is probably a more direct route, but here is one of many possibilities:

proc format;

  value $ill

      'A'='A'

      'B'='B'

      'C'='C'

      'D'='D'

      'E'='E'

  ;

  value hrs

     0=0

    10=10

    100=100

    1000=1000

  ;

run;

data have;

  input Hours Equipment_No Illness $;

  format Hours hrs.;

  format Illness $Ill.;

  cards;

0          10          A

10          10          A

10          10          C

10          10          D

100          10          B

100          10          E

1000          10          A

1000          10          B

1000          10          C

100          20          A

1000          20          B

1000          20          C

;

proc summary completetypes data=have nway;

  class equipment_no;

  class hours / preloadfmt;

  class illness / preloadfmt;

  output out=temp(drop=_type_ rename=_freq_=count);

run;

proc transpose data=temp

  out=want (drop=_Smiley Happy;

  var count;

  by Equipment_no hours;

run;

data want (drop=colSmiley Happy;

  set want;

  retain a b c d e;

  array columns col:;

  array ncol a--e;

  by Equipment_No;

  if first.Equipment_No then do i=1 to 5;

    ncol(i)=columns(i);

  end;

  else do i=1 to 5;

    ncol(i)+columns(i);

  end;

run;

Respected Advisor
Posts: 3,124

Counting cumulatively

Here is another mainly data step approach, I don't feel slick about this approach either.

data have1;

input Hours $ equipmentno$     Illness $;

cards;

0     10     A

10     10     A

10     10     C

10     10     D

100     10     B

100     10     E

1000     10     A

1000     10     B

1000     10     C

100     20     A

1000     20     B

1000     20     C

;

proc sql;

select distinct illness into :var separated by ' ' from have1;

select distinct hours into :hours separated by ',' from have1;

select distinct equipmentno into :eq separated by ',' from have1;

quit;

data have2;

length equipmentno hours $8;

  do equipmentno=&eq;

     do hours=&hours;

output;

end;

  end;

run;

data have2;

set have2;

equipmentno=left(equipmentno);

hours=left(hours);

run;

data have3;

merge have1 have2;

by equipmentno hours;

run;

     data want(drop=illSmiley Happy;

retain &var 0;

do until (last.equipmentno);

          set have3;

by equipmentno hours notsorted;

          array v &var;

do over v;

              if illness=vname(v) then v+1;

end;

if last.hours then output;

end;

        do over v;

          v=0;

        end;

run;

 

proc print;run;

Regards,

Haikuo

Respected Advisor
Posts: 4,640

Re: Counting cumulatively

My proposal:

data have;
input hours eq ill$;
datalines;
0 10 A
10 10 A
10 10 C
10 10 D
100 10 B
100 10 E
1000 10 A
1000 10 B
1000 10 C
100 20 A
1000 20 B
1000 20 C
;

proc freq data=have noprint;
tables eq*ill*hours / sparse out=have0(drop=PERCENT);
run;

data have1;
set have0;
by eq ill;
if first.ill then i = 0;
i + COUNT;
run;

proc sort data=have1; by eq hours ill;run;

proc transpose data=have1 out=want(drop=_Smiley Happy;
by eq hours;
id ill;
var i;
run;

proc print data=want; run;

PG

PG
Respected Advisor
Posts: 3,124

Re: Counting cumulatively

Very good, PG! a lot slicker than mine.

Haikuo

Occasional Contributor
Posts: 12

Re: Counting cumulatively

Thank you, all 3 proposals work!

Though I think i simplified the input data set too much to put here. Each equipment is checked at different hours so the intervals are different for each. A sample of an actual set of checks is below (Though I have about 250 different equipment No.s)

HoursEquipment no.Illness
010B
2510A
8610A
8610B
8610D
22510A
22510B
22510D
126410A
126410B
126410D
126410E
126410C
020A
1620A
5720A
5720B
5720D
5720C
16120A
16120B
16120C
16120E
88720A
88720C
88720E


So I would like to produce a summary table with only the hours relevant to each equipment.

Occasional Contributor
Posts: 12

Re: Counting cumulatively

So I've extended PG's proposal and I get what I'm looking for but it seems a roundabout way of doing it, any ideas of something "shorter" ?

data have;
input hours eq ill$;
datalines;
0 10 B
25 10 A
86 10 A
86 10 B
86 10 D
225 10 A
225 10 B 
225 10 D
1264 10 A
1264 10 B
1264 10 D
1264 10 E
1264 10 C
0 20 A
16 20 A
57 20 A
57 20 B
57 20 D
57 20 C
161 20 A
161 20 B
161 20 C
161 20 E
887 20 A
887 20 C
887 20 E

;

proc freq data=have noprint;
tables eq*ill*hours / sparse out=have0(drop=PERCENT);
run;

data have1;
set have0;
by eq ill;
if first.ill then i = 0;
i + COUNT;
run;

proc sort data=have1; by eq hours ill ;run;

proc transpose data=have1 out=want(drop=_Smiley Happy;
by eq hours;
id ill;
var i;
run;

data want2;
set want;
unique=eq||hours;
run;

data have2 (drop=ill);
set have;
UNIQUE=eq||hours;
proc sort nodupkey;
by eq hours;
run;

PROC SQL;
CREATE TABLE MERGED AS
SELECT *
FROM have2 LEFT JOIN want2
ON (have2.unique=want2.unique);
QUIT;

proc sort data=merged (drop=unique) nodupkey;
by eq hours;
run;

Solution
‎04-06-2012 12:36 AM
Super User
Posts: 9,662

Re: Counting cumulatively

data have;
input hours eq ill $;
retain n 1;
datalines;
0 10 B
25 10 A
86 10 A
86 10 B
86 10 D
225 10 A
225 10 B 
225 10 D
1264 10 A
1264 10 B
1264 10 D
1264 10 E
1264 10 C
0 20 A
16 20 A
57 20 A
57 20 B
57 20 D
57 20 C
161 20 A
161 20 B
161 20 C
161 20 E
887 20 A
887 20 C
887 20 E
;
run;

proc sort data=have ;by eq hours ;run;
proc transpose data=have out=x(drop=_:);
by eq hours ;
id ill;
var n;
run;
proc sql noprint;
 select distinct cats('_', ill,'+',ill) into : count separated by ';' from have ;
 select distinct cats('_', ill,'=0') into : zero separated by ';' from have ;
quit;
proc stdize data=x reponly missing=0 out=xx;run; 
data want(keep=eq hours _:);
 set xx;
 by eq;
 if first.eq then do; &zero; end;
 &count ;
run;



Ksharp

Occasional Contributor
Posts: 12

Re: Counting cumulatively

Thank you Ksharp. Just what I'm looking for. I'm constantly surprised at the large number of ways of doing the same thing.

But I'm struggling to follow the part of creating count and zero :

proc sql noprint;

select distinct cats('_', ill,'+',ill) into : count separated by ';' from have ;

select distinct cats('_', ill,'=0') into : zero separated by ';' from have ;

quit;


What is the need for the '_' ? I can't seem to visualize this. It also leaves a "_" plus the variable name in the final table. I've tried playing around with this to understand it and remove it, but no success...

Super Contributor
Posts: 1,636

Re: Counting cumulatively

Hi,

two macro variables &count and &zero were created by

"proc sql noprint;

select distinct cats('_', ill,'+',ill) into : count separated by ';' from have ;

select distinct cats('_', ill,'=0') into : zero separated by ';' from have ;

quit;"

&count has the value :  _a+a ; _b+b; _c+c; _d+d; _e+e;

&zero has the value:  _a=0;  _b=0;  _c=0;  _d=0;  _e=0;

so Ksharp's code is the same as

proc sort data=have ;by eq hours ;run;

proc transpose data=have out=x(drop=_Smiley Happy;

by eq hours ;

id ill;

var n;

run;

proc stdize data=x reponly missing=0 out=xx;run;

data want(keep=eq hours _Smiley Happy;

set xx;

by eq;

if first.eq then do;

             _a=0; _b=0; _c=0; _d=0; _e=0;

            end;

    _a+a ;_b+b;_c+c;_d+d;_e+e;

run;

Super User
Posts: 9,662

Re: Counting cumulatively

That is the charm of SAS.

Linlin has already expain it too.

Also You can use

%put &zero &count ;

to check what the macro variable's value are  .

Ksharp

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 506 views
  • 10 likes
  • 6 in conversation