<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic REFLINE by variable with a large dataset in Graphics Programming</title>
    <link>https://communities.sas.com/t5/Graphics-Programming/REFLINE-by-variable-with-a-large-dataset/m-p/607797#M19161</link>
    <description>&lt;P&gt;I am trying to draw a number of histograms, using SGPANEL and PANELBY (SAS 9.4M4 within EG). I would also like to add, to each histogram, a vertical REFLINE showing the location of the 80th percentile for each of the BY groups. I have these values stored in my dataset in a variable called P80.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SGPANEL DATA=example;
	PANELBY a_type / COLUMNS=2 ROWS=3 SPACING=10;
	HISTOGRAM a_quantity;
	REFLINE P80 / AXIS = X LEGENDLABEL="P80" NAME="pline";
	KEYLEGEND "pline" / POSITION=TOP;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;My problem is that my code to draw the graphs and save them to a PDF, which runs in a couple minutes without the REFLINE or with the REFLINE set to a constant value, takes upwards of half an hour with the REFLINE referring to a variable. I think it is related to the size of my dataset: around 25 BY groups, with around 100k observations in each group. My guess is that by calling REFLINE with reference to a variable, SAS is checking the value for each observation, even though I only want it to draw one line. It doesn't seem to make any difference if I only store the P80 value once per BY group.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is there a way I can tell SAS I only need one line per panel? Or some other way to speed this up? I would really like to be able to keep multiple panels per page if possible.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
    <pubDate>Wed, 27 Nov 2019 18:00:25 GMT</pubDate>
    <dc:creator>astermi</dc:creator>
    <dc:date>2019-11-27T18:00:25Z</dc:date>
    <item>
      <title>REFLINE by variable with a large dataset</title>
      <link>https://communities.sas.com/t5/Graphics-Programming/REFLINE-by-variable-with-a-large-dataset/m-p/607797#M19161</link>
      <description>&lt;P&gt;I am trying to draw a number of histograms, using SGPANEL and PANELBY (SAS 9.4M4 within EG). I would also like to add, to each histogram, a vertical REFLINE showing the location of the 80th percentile for each of the BY groups. I have these values stored in my dataset in a variable called P80.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SGPANEL DATA=example;
	PANELBY a_type / COLUMNS=2 ROWS=3 SPACING=10;
	HISTOGRAM a_quantity;
	REFLINE P80 / AXIS = X LEGENDLABEL="P80" NAME="pline";
	KEYLEGEND "pline" / POSITION=TOP;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;My problem is that my code to draw the graphs and save them to a PDF, which runs in a couple minutes without the REFLINE or with the REFLINE set to a constant value, takes upwards of half an hour with the REFLINE referring to a variable. I think it is related to the size of my dataset: around 25 BY groups, with around 100k observations in each group. My guess is that by calling REFLINE with reference to a variable, SAS is checking the value for each observation, even though I only want it to draw one line. It doesn't seem to make any difference if I only store the P80 value once per BY group.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is there a way I can tell SAS I only need one line per panel? Or some other way to speed this up? I would really like to be able to keep multiple panels per page if possible.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Wed, 27 Nov 2019 18:00:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Graphics-Programming/REFLINE-by-variable-with-a-large-dataset/m-p/607797#M19161</guid>
      <dc:creator>astermi</dc:creator>
      <dc:date>2019-11-27T18:00:25Z</dc:date>
    </item>
    <item>
      <title>Re: REFLINE by variable with a large dataset</title>
      <link>https://communities.sas.com/t5/Graphics-Programming/REFLINE-by-variable-with-a-large-dataset/m-p/608719#M19173</link>
      <description>&lt;P&gt;I don't know how to make the procedure itself run faster, but here is one suggestion. You are probably plotting 100k reference lines in every cell of the panel, because you probably merged the percentiles with the data like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Simulate 6 groups of 100k observations each (already sorted by type) */
data Have;
call streaminit(1);
do type = 1 to 6;
   do i = 1 to 100000;
      x = rand("Lognormal");
      output;
   end;
end;
run;

/* Compute the 80th percentile of x within each group */
proc means data=Have noprint;
   by type;
   var x;
   output out=Pctl p80=p80;
run;

/* 100k reflines drawn for each cell */
data example;
merge Have Pctl;
by Type;
run;

PROC SGPANEL DATA=example;
	PANELBY type / COLUMNS=2 ROWS=3 SPACING=10;
	HISTOGRAM x;
	REFLINE P80 / AXIS = X LEGENDLABEL="P80" NAME="pline";
	KEYLEGEND "pline" / POSITION=TOP;
RUN;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Instead, set the P80 variable to missing except for one observation. That will cause only one reference line to be drawn, like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* 1 refline drawn for each cell */
data example;
merge Have Pctl;
by Type;
if NOT first.Type then P80=.;
run;

PROC SGPANEL DATA=example;
	PANELBY type / COLUMNS=2 ROWS=3 SPACING=10;
	HISTOGRAM x;
	REFLINE P80 / AXIS = X LEGENDLABEL="P80" NAME="pline";
	KEYLEGEND "pline" / POSITION=TOP;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The program still has to look at every observation to see whether it holds a valid refline value, but only one line per cell is actually drawn.&lt;/P&gt;
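&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If that scan is still the bottleneck, one possible workaround is to drop PANELBY and draw one SGPLOT per group, passing the percentile to REFLINE as a constant. The following is a minimal, untested sketch. It assumes the Have and Pctl data sets from above and a numeric grouping variable, it generates the PROC SGPLOT steps with CALL EXECUTE, and it produces one graph per group rather than a panel:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: one SGPLOT per group, with the refline given as a constant. */
/* Assumes type is numeric; a character group variable would need quotes    */
/* around the value in the WHERE= condition.                                */
data _null_;
set Pctl end=last;   /* one row per group, holding that group's P80 */
/* Each CALL EXECUTE queues a statement that runs after this step ends */
call execute(cats('title "type=', type, '";'));
call execute(cats('proc sgplot data=Have(where=(type=', type, '));'));
call execute('histogram x;');
call execute(catx(' ', 'refline', p80, '/ axis=x legendlabel="P80" name="pline";'));
call execute('keylegend "pline" / position=top;');
call execute('run;');
if last then call execute('title;');   /* clear the title afterwards */
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Because each generated step receives a single number instead of a variable, the REFLINE statement has no column to scan. The trade-off is that the panel layout is lost, so arranging several graphs per page would have to be handled separately.&lt;/P&gt;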
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 02 Dec 2019 15:02:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Graphics-Programming/REFLINE-by-variable-with-a-large-dataset/m-p/608719#M19173</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2019-12-02T15:02:43Z</dc:date>
    </item>
    <item>
      <title>Re: REFLINE by variable with a large dataset</title>
      <link>https://communities.sas.com/t5/Graphics-Programming/REFLINE-by-variable-with-a-large-dataset/m-p/609186#M19189</link>
      <description>&lt;P&gt;Yeah, I had tried that but it didn't seem to make much of a difference. What did eventually seem to work was removing the PANELBY statement and using an explicit loop through each of my groups. Then I could grab the appropriate value for each group and set the REFLINE to that value rather than a variable. This brought my time back down to a couple minutes per run.&lt;/P&gt;</description>
      <pubDate>Tue, 03 Dec 2019 21:30:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Graphics-Programming/REFLINE-by-variable-with-a-large-dataset/m-p/609186#M19189</guid>
      <dc:creator>astermi</dc:creator>
      <dc:date>2019-12-03T21:30:37Z</dc:date>
    </item>
  </channel>
</rss>

