BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
WesBarris
Obsidian | Level 7

I have a data set that includes severl "defect" columns.  Up to three defects can be recorded in each row.  I'm struggling to write code that will give me a summary of occurrances of all "defects".  For example, take the following data:

0,0,0

1,0,0

1,2,0

5,0,1

0,0,0

0,1,0

0,0,0

5,0,0

0,0,9

0,0,0

There are 10 rows.  Defect number 1 (one) occurs 4 times (40%), defect 2 occurs 1 time (10%), defect 5 occurs 2 times (20%).  I've put together some SAS code that will summarize one column.  I'm having trouble figuring out how to do this for defects in multiple columns.  I also want to add a textual discription of each defect to the output table but I want to get the numbers correct first.  Here is some working sample code.  Can anyone steer me in the right direction?

%let nanimals=10;

data check3;

   infile datalines dlm=',';

   input defect1 defect2 defect3;

   datalines;

0,0,0

1,0,0

1,2,0

5,0,1

0,0,0

0,1,0

0,0,0

5,0,0

0,0,9

0,0,0

;

proc sort; by defect1;

proc summary data=check3; by defect1;  where ((defect1>0 and defect1<20) or (defect2>0 and defect2<20) or (defect3>0 and defect3<20));

   var defect2 defect3;

   output out=check1 sum=defect2 defect3 n=total;

data check2; set check1;

   percent=100*total/&nanimals;

   format percent 5.1;

proc print noobs;

   var total percent;

title Defect Summary;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
Linlin
Lapis Lazuli | Level 10

another approach:

data check3;

   infile datalines dlm=',';

   input defect1 defect2 defect3;

   datalines;

0,0,0

1,0,0

1,2,0

5,0,1

0,0,0

0,1,0

0,0,0

5,0,0

0,0,9

0,0,0

;

data temp(keep=n defect);

  set check3 nobs=nobs;

  array df(*) def:;

  do _n_=1 to dim(df);

  defect=df(_n_);

  if defect>0 then  do;

  n=nobs;

  output; end;

  end;

  run;

proc sql;

   create table want as

     select distinct defect,count(*) as total,calculated total/n as ratio

        from temp

         group by defect;

quit;

proc print;run;

obs    defect    total    ratio

1        1        4       0.4

2        2        1       0.1

3        5        2       0.2

4        9        1       0.1

Linlin

View solution in original post

5 REPLIES 5
Haikuo
Onyx | Level 15

You could try using Array() to first get the number:

data check3;

   infile datalines dlm=',';

   input defect1 defect2 defect3;

   datalines;

0,0,0

1,0,0

1,2,0

5,0,1

0,0,0

0,1,0

0,0,0

5,0,0

0,0,9

0,0,0

;

data want (drop=defect:);

set check3 end=last nobs=nobs;

array df (*) _numeric_;

   do _n_=1 to dim(df);

     d1+ifn(df(_n_)=1,1,0);

d2+ifn(df(_n_)=2,1,0);

d5+ifn(df(_n_)=5,1,0);

end;

if last then do;

    d1_rate=d1/nobs;

d2_rate=d2/nobs;

d5_rate=d5/nobs;

    output;

end;

run;

proc print;run;

Regards,

Haikuo

Edit to add the ratio.

HB
Barite | Level 11 HB
Barite | Level 11

If that was an SQL data set, I would run this:

select defect, count(defect)/counter as defect_percent

from (

SELECT one as defect, (SELECT count(one) from defects) as counter from defects

union all

SELECT two as defect,  (SELECT count(two) from defects) from defects

union all

SELECT three as defect, (SELECT count(three) from defects) from defects

)

where defect>0

group by defect, counter

order by defect

which yeilds this:

defectdefect_percent
10.4
20.1
50.2
90.1
Query14

I'm not good enough at SAS yet to integrate that but maybe you are.  Just an idea.

Linlin
Lapis Lazuli | Level 10

another approach:

data check3;

   infile datalines dlm=',';

   input defect1 defect2 defect3;

   datalines;

0,0,0

1,0,0

1,2,0

5,0,1

0,0,0

0,1,0

0,0,0

5,0,0

0,0,9

0,0,0

;

data temp(keep=n defect);

  set check3 nobs=nobs;

  array df(*) def:;

  do _n_=1 to dim(df);

  defect=df(_n_);

  if defect>0 then  do;

  n=nobs;

  output; end;

  end;

  run;

proc sql;

   create table want as

     select distinct defect,count(*) as total,calculated total/n as ratio

        from temp

         group by defect;

quit;

proc print;run;

obs    defect    total    ratio

1        1        4       0.4

2        2        1       0.1

3        5        2       0.2

4        9        1       0.1

Linlin

ballardw
Super User

The text for descriptions is the easy part. Proc Format to the rescue!

Proc format;

value defect

0 = 'No defect'

1 = 'Text describing defect 1'

2 = 'Text describing defect 2'

(repeat as needed)

;

run;

Then in your final print or other report associate the defect variable(s) with the format.

format defect defect.;

WesBarris
Obsidian | Level 7

Thanks Ballardw.  I did not know how to do this using proc format.  Instead, I was merging tables of descriptions.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1000 views
  • 3 likes
  • 5 in conversation