BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Longduration
Calcite | Level 5

Hello,

I would like some help using the "COUNT" function or similar.

My problem:

I have data (Table 1) that I would like to convert into Table 2 by: Creating a new column for each "illness", cumulatively counting each illness by equipment no., the hours are "summarized"

Table 1:

HoursEquipment No.Illness
010A
1010A
1010C
1010D
10010B
10010E
100010A
100010B
100010C
10020A
100020B
100020C

Table 2:

HoursEquipment No.ABCDE
01010000
101020100
1001021101
10001032201
02000000
102000000
1002010000
10002011100

Any thoughts?

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
input hours eq ill $;
retain n 1;
datalines;
0 10 B
25 10 A
86 10 A
86 10 B
86 10 D
225 10 A
225 10 B 
225 10 D
1264 10 A
1264 10 B
1264 10 D
1264 10 E
1264 10 C
0 20 A
16 20 A
57 20 A
57 20 B
57 20 D
57 20 C
161 20 A
161 20 B
161 20 C
161 20 E
887 20 A
887 20 C
887 20 E
;
run;

proc sort data=have ;by eq hours ;run;
proc transpose data=have out=x(drop=_:);
by eq hours ;
id ill;
var n;
run;
proc sql noprint;
 select distinct cats('_', ill,'+',ill) into : count separated by ';' from have ;
 select distinct cats('_', ill,'=0') into : zero separated by ';' from have ;
quit;
proc stdize data=x reponly missing=0 out=xx;run; 
data want(keep=eq hours _:);
 set xx;
 by eq;
 if first.eq then do; &zero; end;
 &count ;
run;



Ksharp

View solution in original post

10 REPLIES 10
art297
Opal | Level 21

Not sure why you want to do this, and there is probably a more direct route, but here is one of many possibilities:

proc format;

  value $ill

      'A'='A'

      'B'='B'

      'C'='C'

      'D'='D'

      'E'='E'

  ;

  value hrs

     0=0

    10=10

    100=100

    1000=1000

  ;

run;

data have;

  input Hours Equipment_No Illness $;

  format Hours hrs.;

  format Illness $Ill.;

  cards;

0          10          A

10          10          A

10          10          C

10          10          D

100          10          B

100          10          E

1000          10          A

1000          10          B

1000          10          C

100          20          A

1000          20          B

1000          20          C

;

proc summary completetypes data=have nway;

  class equipment_no;

  class hours / preloadfmt;

  class illness / preloadfmt;

  output out=temp(drop=_type_ rename=_freq_=count);

run;

proc transpose data=temp

  out=want (drop=_:);

  var count;

  by Equipment_no hours;

run;

data want (drop=col:);

  set want;

  retain a b c d e;

  array columns col:;

  array ncol a--e;

  by Equipment_No;

  if first.Equipment_No then do i=1 to 5;

    ncol(i)=columns(i);

  end;

  else do i=1 to 5;

    ncol(i)+columns(i);

  end;

run;

Haikuo
Onyx | Level 15

Here is another mainly data step approach, I don't feel slick about this approach either.

data have1;

input Hours $ equipmentno$     Illness $;

cards;

0     10     A

10     10     A

10     10     C

10     10     D

100     10     B

100     10     E

1000     10     A

1000     10     B

1000     10     C

100     20     A

1000     20     B

1000     20     C

;

proc sql;

select distinct illness into :var separated by ' ' from have1;

select distinct hours into :hours separated by ',' from have1;

select distinct equipmentno into :eq separated by ',' from have1;

quit;

data have2;

length equipmentno hours $8;

  do equipmentno=&eq;

     do hours=&hours;

output;

end;

  end;

run;

data have2;

set have2;

equipmentno=left(equipmentno);

hours=left(hours);

run;

data have3;

merge have1 have2;

by equipmentno hours;

run;

     data want(drop=ill:);

retain &var 0;

do until (last.equipmentno);

          set have3;

by equipmentno hours notsorted;

          array v &var;

do over v;

              if illness=vname(v) then v+1;

end;

if last.hours then output;

end;

        do over v;

          v=0;

        end;

run;

 

proc print;run;

Regards,

Haikuo

PGStats
Opal | Level 21

My proposal:

data have;
input hours eq ill$;
datalines;
0 10 A
10 10 A
10 10 C
10 10 D
100 10 B
100 10 E
1000 10 A
1000 10 B
1000 10 C
100 20 A
1000 20 B
1000 20 C
;

proc freq data=have noprint;
tables eq*ill*hours / sparse out=have0(drop=PERCENT);
run;

data have1;
set have0;
by eq ill;
if first.ill then i = 0;
i + COUNT;
run;

proc sort data=have1; by eq hours ill;run;

proc transpose data=have1 out=want(drop=_:);
by eq hours;
id ill;
var i;
run;

proc print data=want; run;

PG

PG
Haikuo
Onyx | Level 15

Very good, PG! a lot slicker than mine.

Haikuo

Longduration
Calcite | Level 5

Thank you, all 3 proposals work!

Though I think i simplified the input data set too much to put here. Each equipment is checked at different hours so the intervals are different for each. A sample of an actual set of checks is below (Though I have about 250 different equipment No.s)

HoursEquipment no.Illness
010B
2510A
8610A
8610B
8610D
22510A
22510B
22510D
126410A
126410B
126410D
126410E
126410C
020A
1620A
5720A
5720B
5720D
5720C
16120A
16120B
16120C
16120E
88720A
88720C
88720E


So I would like to produce a summary table with only the hours relevant to each equipment.

Longduration
Calcite | Level 5

So I've extended PG's proposal and I get what I'm looking for but it seems a roundabout way of doing it, any ideas of something "shorter" ?

data have;
input hours eq ill$;
datalines;
0 10 B
25 10 A
86 10 A
86 10 B
86 10 D
225 10 A
225 10 B 
225 10 D
1264 10 A
1264 10 B
1264 10 D
1264 10 E
1264 10 C
0 20 A
16 20 A
57 20 A
57 20 B
57 20 D
57 20 C
161 20 A
161 20 B
161 20 C
161 20 E
887 20 A
887 20 C
887 20 E

;

proc freq data=have noprint;
tables eq*ill*hours / sparse out=have0(drop=PERCENT);
run;

data have1;
set have0;
by eq ill;
if first.ill then i = 0;
i + COUNT;
run;

proc sort data=have1; by eq hours ill ;run;

proc transpose data=have1 out=want(drop=_:);
by eq hours;
id ill;
var i;
run;

data want2;
set want;
unique=eq||hours;
run;

data have2 (drop=ill);
set have;
UNIQUE=eq||hours;
proc sort nodupkey;
by eq hours;
run;

PROC SQL;
CREATE TABLE MERGED AS
SELECT *
FROM have2 LEFT JOIN want2
ON (have2.unique=want2.unique);
QUIT;

proc sort data=merged (drop=unique) nodupkey;
by eq hours;
run;

Ksharp
Super User
data have;
input hours eq ill $;
retain n 1;
datalines;
0 10 B
25 10 A
86 10 A
86 10 B
86 10 D
225 10 A
225 10 B 
225 10 D
1264 10 A
1264 10 B
1264 10 D
1264 10 E
1264 10 C
0 20 A
16 20 A
57 20 A
57 20 B
57 20 D
57 20 C
161 20 A
161 20 B
161 20 C
161 20 E
887 20 A
887 20 C
887 20 E
;
run;

proc sort data=have ;by eq hours ;run;
proc transpose data=have out=x(drop=_:);
by eq hours ;
id ill;
var n;
run;
proc sql noprint;
 select distinct cats('_', ill,'+',ill) into : count separated by ';' from have ;
 select distinct cats('_', ill,'=0') into : zero separated by ';' from have ;
quit;
proc stdize data=x reponly missing=0 out=xx;run; 
data want(keep=eq hours _:);
 set xx;
 by eq;
 if first.eq then do; &zero; end;
 &count ;
run;



Ksharp

Longduration
Calcite | Level 5

Thank you Ksharp. Just what I'm looking for. I'm constantly surprised at the large number of ways of doing the same thing.

But I'm struggling to follow the part of creating count and zero :

proc sql noprint;

select distinct cats('_', ill,'+',ill) into : count separated by ';' from have ;

select distinct cats('_', ill,'=0') into : zero separated by ';' from have ;

quit;


What is the need for the '_' ? I can't seem to visualize this. It also leaves a "_" plus the variable name in the final table. I've tried playing around with this to understand it and remove it, but no success...

Linlin
Lapis Lazuli | Level 10

Hi,

two macro variables &count and &zero were created by

"proc sql noprint;

select distinct cats('_', ill,'+',ill) into : count separated by ';' from have ;

select distinct cats('_', ill,'=0') into : zero separated by ';' from have ;

quit;"

&count has the value :  _a+a ; _b+b; _c+c; _d+d; _e+e;

&zero has the value:  _a=0;  _b=0;  _c=0;  _d=0;  _e=0;

so Ksharp's code is the same as

proc sort data=have ;by eq hours ;run;

proc transpose data=have out=x(drop=_:);

by eq hours ;

id ill;

var n;

run;

proc stdize data=x reponly missing=0 out=xx;run;

data want(keep=eq hours _:);

set xx;

by eq;

if first.eq then do;

             _a=0; _b=0; _c=0; _d=0; _e=0;

            end;

    _a+a ;_b+b;_c+c;_d+d;_e+e;

run;

Ksharp
Super User

That is the charm of SAS.

Linlin has already expain it too.

Also You can use

%put &zero &count ;

to check what the macro variable's value are  .

Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1282 views
  • 10 likes
  • 6 in conversation