Re: How to "ghost" points in calculating baseline limits & centerline ...

Rodcjones · Posted 07-23-2017 11:43 AM

I am seeking to implement and automate (to the extent possible) the guidelines for calculating and extending baseline centerlines in SPC charts as described in The Health Care Data Guide: Learning from Data for Improvement by Lloyd P. Provost & Sandra Murray, pp. 123-125. The problematic paragraphs are:

<<If it is desirable to extend the limits after they are calculated, any points affected by special causes should be removed from the baseline and the limits recalculated. The limits should only be extended when they are calculated using data without special causes . . . When recalculating the limits so that they can be extended, the special cause data should not be used, but it is usually desirable to leave the data points affected by special cause on the chart. One way to do this is to "cause" or "ghost" the data points affected by the special cause . . . . "Ghosting" here means leaving the data visible on the graph but excluding it from the calculation of the mean and limits. The limits revised by this technique will reflect only common cause variation in the process and may now be extended into the future.>>

I am comfortable generating limits, extending baselines, and displaying the charts with SAS. I can also think of ways to "ghost" special cause points when calculating a mean (for example) and then use that value to overwrite what is in a Limits data set. But right now it is a clunky and manual process.

I wanted to ask whether any fellow users have come up with efficient ways to take baseline calculations and remove special cause points from the centerline and limit calculations while retaining them in the graphic output. I'm pasting some of the relevant code (in this case for a p chart), covering just the step of generating the baseline statistics, in case it helps with context.

Thank you!

/* Monthly cross-tabulation, captured from ODS as a data set */
ods output crosslist=MONTH&run;
PROC FREQ DATA=&ds
   (where=(DATEPART(DISCH_DATE_TIME) GE "&start"D AND DATEPART(DISCH_DATE_TIME) LE "&end"D));
   TABLES _YEAR_MONTH*&v / CROSSLIST;
   &subset;
run;

/* Numerator rows: the category of interest */
data month&run.V;
   set month&run;
   if &v=" &value";
   if _year_month ne '';
run;

/* Denominator rows: the monthly totals */
data month&run.p;
   set month&run;
   if F_&v="Total";
   if _year_month ne '';
run;

/* One row per month with count, population, and percentage */
proc sql;
   create table month&run.vp1 as
   select month&run.v._year_month,
          month&run.v.frequency as &v label="&label (N)",
          month&run.p.frequency as Population label="Population",
          month&run.v.rowpercent as Percentage label="&label (%)"
   from month&run.v right join month&run.p on
        month&run.v._year_month=month&run.p._year_month;
QUIT;

goptions reset=goptions;
ods graphics on /
   imagefmt=png reset iMAGENAME="&v &period" height=4.5in
   border=off;
title;
symbol v=dot width=2;
ods listing gpath="\\Data";

/* Baseline p chart with tests 1 to 4 for special causes */
proc shewhart data=month&run.vp1;
   pchart &v*_YEAR_MONTH / subgroupn = population
      ODSTITLE= "&chart of &label, &period by &unit"
      totpanels=1 DISCRETE skiphlabels=1 markers
      yscale=percent
      tests=1 to 4
      testnmethod=standardize
      CTEXT=BLACK
      CTESTS = (1 BLACK 2 blue 3 ORANGE 4 GREY)
      TESTLABEL1="*"
      TESTLABEL2="9 pts on 1 side"
      TESTLABEL3="6 point trend"
      TESTLABEL4="alternating pattern";
   label &v ="&LABEL (%)";
RUN;

BuckyRansdell · Posted 08-08-2017 04:01 PM

PROC SHEWHART recognizes four switch variables that you can use to control which subgroups are included in computations and which are displayed on control charts: http://go.documentation.sas.com/?docsetId=qcug&docsetTarget=qcug_shewhart_sect492.htm&docsetVersion=....

This code runs PROC SHEWHART and produces OUTHISTORY= and OUTTABLE= data sets. Subgroups that are outside the control limits are identified by the _EXLIM_ variable in the OUTTABLE= data set. The DATA steps set the _COMP_ flag to eliminate those subgroups from the computation, then merge the result with the OUTHISTORY= data set. The resulting data set is used as a HISTORY= input data set to a second run of PROC SHEWHART, which computes control limits with subgroups outside the original limits eliminated from the computation.

data foo;
   do group = 1 to 15;
      do i = 1 to 5;
         x = rannor(123);
         if group eq 5 then x + 2;
         output;
      end;
   end;
   drop i;
run;

ods html file='switch.htm';
ods graphics / imagemap;

proc shewhart data=foo;
   xchart x * group /
      markers cout
      outhistory=foohist outtable=footab
      odstitle='Control Limits Computed Using All Subgroups';
run;

data footab;
   set footab;
   if _EXLIM_ ne '' then
      _COMP_='N';
   else 
      _COMP_='Y';
   keep group _COMP_;
run;

data foohist2;
   merge foohist footab;
run;

proc shewhart history=foohist2;
   xchart x * group /
      markers cout
      odstitle='Control Limits Computed With Special Causes Omitted';
run;

ods html close;

Here are the two charts. Note that the displayed subgroups are the same but the control limits differ.

I should point out that when you request tests for special causes the OUTTABLE= data set will contain a variable called _TESTS_ that flags the subgroups for which the various tests are positive. You can write DATA step code to set the _COMP_ flag based on the _TESTS_ value instead of the _EXLIM_ value.
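
A minimal sketch of that _TESTS_-based variant, reusing the foo data set created above. It assumes TESTS=1 TO 4 is requested on the first pass so that _TESTS_ is written to the OUTTABLE= data set, and that _TESTS_ is blank for subgroups where no test is positive. The data set names footab2, foocomp, and foohist3 are illustrative only.

```sas
/* First pass: request tests so the OUTTABLE= data set includes _TESTS_. */
/* NOCHART suppresses the chart; only the output data sets are needed.   */
proc shewhart data=foo;
   xchart x * group /
      tests=1 to 4
      nochart
      outhistory=foohist outtable=footab2;
run;

/* Exclude from the limit computation any subgroup with a positive test. */
data foocomp;
   set footab2;
   if missing(_TESTS_) then _COMP_ = 'Y';
   else _COMP_ = 'N';
   keep group _COMP_;
run;

/* Match on the subgroup variable rather than relying on row order. */
data foohist3;
   merge foohist foocomp;
   by group;
run;

/* Second pass: limits are computed only from subgroups with _COMP_='Y'. */
proc shewhart history=foohist3;
   xchart x * group /
      markers
      odstitle='Control Limits Computed With Flagged Subgroups Omitted';
run;
```

Because Test 1 flags points beyond the 3-sigma limits, this should exclude everything the _EXLIM_ version excludes, plus any subgroups flagged by the run and trend tests.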

Rodcjones · Posted 08-14-2017 10:20 AM

Thank you, Bucky. This is very helpful and opened up a lot of new info for me.
One follow-up: this method doesn’t seem to work for p charts or X-bar S charts. I assume that’s because those are driven off the record-level values rather than values summarized by group, but I’m not sure. I’m pasting an example below.
Is there an alternative that will work for p and X-bar S charts? Or maybe I’m mistaken and can get pointed in the right direction.

proc sql;
   create table work.month4vp1
       (_year_month char(7),
        defects num(8),population num(8),percentage num(8));

insert into work.month4vp1
values('2013-10',87,115,75.65)
values('2013-11',59,97,60.82)
values('2013-12',56,102,54.9)
values('2014-01',15,39,38.46)
values('2014-02',24,40,60)
values('2014-03',34,62,54.84)
values('2014-04',41,72,56.94)
values('2014-05',55,76,72.37)
values('2014-06',21,35,60)
values('2014-07',29,52,55.77)
values('2014-08',75,123,60.98)
values('2014-09',144,288,50)
values('2014-10',76,129,58.91)
values('2014-11',38,83,45.78)
values('2014-12',46,70,65.71)
values('2015-01',36,50,72)
values('2015-02',31,53,58.49)
values('2015-03',51,83,61.45)
values('2015-04',32,70,45.71)
values('2015-05',77,117,65.81);
quit;

***run it first without ghosting the 2013-10 group;

;proc shewhart data=month4vp1
;
   pchart defects*_YEAR_MONTH / subgroupn = population
       totpanels=1 DISCRETE skiphlabels=1 markers
   yscale=percent
   tests=1 to 4
   testnmethod=standardize
CTEXT=BLACK
CTESTS = (1 BLACK 2 blue 3 ORANGE 4 GREY)
test2run=8
TESTLABEL1="*"
TESTLABEL2="8 pts on 1 side"
TESTLABEL3="6 point trend"
TESTLABEL4="alternating pattern" ;
      RUN;

*now ghost out using _exlim_ and _comp_;
******************************;
proc shewhart data=month4vp1 ;
pchart defects*_YEAR_MONTH / subgroupn = population
nochart outhistory=hist4 outtable=tab4
;
run;

data tab4;
   set tab4;
   if _EXLIM_ ne '' then
      _COMP_='N';
   else
      _COMP_='Y';
   keep _YEAR_MONTH _COMP_;
run;

data history4;
merge hist4 tab4;
run;

;proc shewhart data=month4vp1
history=history4;
   pchart defects*_YEAR_MONTH / subgroupn = population
       totpanels=1 DISCRETE skiphlabels=1 markers
   yscale=percent
   tests=1 to 4
   testnmethod=standardize
CTEXT=BLACK
CTESTS = (1 BLACK 2 blue 3 ORANGE 4 GREY)
test2run=8
TESTLABEL1="*"
TESTLABEL2="8 pts on 1 side"
TESTLABEL3="6 point trend"
TESTLABEL4="alternating pattern" ;
      RUN;

Rodcjones · Posted 09-07-2017 10:20 AM

Liz Edwards at SAS Technical Support worked through this with me and determined that naming the original data set with data= in the last PROC SHEWHART step was the fatal flaw. Deleting it and using only history= gives the desired results. That is, the last step should begin with

proc shewhart history=history4;

rather than

proc shewhart data=month4vp1
history=history4;
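
Putting the fix together, a sketch of the full two-pass p chart workflow on the month4vp1 data might look like the following. It assumes the switch data set is merged by _YEAR_MONTH, and that both output data sets are already in _YEAR_MONTH order, as they are for this example. It also leaves SUBGROUPN= off the second pass, on the assumption that the HISTORY= data set already carries the subgroup sizes in defectsN. The data set name comp4 is illustrative only.

```sas
/* First pass: compute limits from all subgroups and save the results. */
proc shewhart data=month4vp1;
   pchart defects*_YEAR_MONTH / subgroupn = population
      nochart outhistory=hist4 outtable=tab4;
run;

/* Flag subgroups outside the first-pass limits for exclusion. */
data comp4;
   set tab4;
   if missing(_EXLIM_) then _COMP_ = 'Y';
   else _COMP_ = 'N';
   keep _YEAR_MONTH _COMP_;
run;

/* Attach the _COMP_ switch variable to the summary (history) data. */
data history4;
   merge hist4 comp4;
   by _YEAR_MONTH;
run;

/* Second pass: HISTORY= only, with no DATA= option. */
proc shewhart history=history4;
   pchart defects*_YEAR_MONTH /
      totpanels=1 skiphlabels=1 markers
      yscale=percent
      tests=1 to 4
      testnmethod=standardize
      test2run=8;
run;
```

Working it out by hand for these data, only 2013-10 falls outside the first-pass limits. The recomputed centerline should therefore be 940/1641, about 57.3%, compared with 1027/1756, about 58.5%, when all months are included.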
