<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Scoring clusters in SAS Studio</title>
    <link>https://communities.sas.com/t5/SAS-Studio/Scoring-clusters/m-p/250708#M217</link>
    <description>&lt;P&gt;I don't get any error message when I run your code, after I remove the missing=0 option.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;33
34   proc distance data=xl.'roaddata1$'n out=distmatrix method=Euclid;
35   var interval (FACT_AADT/ std=std);
36   id SEGMENT_ID;
37   run;

NOTE: The data set WORK.DISTMATRIX has 7999 observations and 8000 variables.
NOTE: PROCEDURE DISTANCE used (Total process time):
      real time           10.11 seconds
      cpu time            3.93 seconds


38
39   proc cluster data=distmatrix outtree=trees
40   method=ward noprint;
41   id SEGMENT_ID;
42   run;

NOTE: The input data set is a TYPE=DISTANCE data set. For such a data set, the
      procedure requires that the order of the rows match the order of the
      variables.
NOTE: Input distances have been squared.
WARNING: Ties for minimum distance between clusters have been detected at 6852
         level(s) in the cluster history.
NOTE: The data set WORK.TREES has 15997 observations and 11 variables.
NOTE: PROCEDURE CLUSTER used (Total process time):
      real time           10.18 seconds
      cpu time            10.09 seconds


43
44   proc tree data=trees noprint out=Final n=10;
45   id SEGMENT_ID;
46   run;

NOTE: The data set WORK.FINAL has 7999 observations and 3 variables.
NOTE: PROCEDURE TREE used (Total process time):
      real time           0.26 seconds
      cpu time            0.09 seconds
&lt;/PRE&gt;</description>
    <pubDate>Wed, 17 Feb 2016 20:44:09 GMT</pubDate>
    <dc:creator>PGStats</dc:creator>
    <dc:date>2016-02-17T20:44:09Z</dc:date>
    <item>
      <title>Scoring clusters</title>
      <link>https://communities.sas.com/t5/SAS-Studio/Scoring-clusters/m-p/250615#M214</link>
      <description>&lt;P&gt;Hello, I&amp;nbsp;am currently using SAS University Edition and I keep getting the error&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;"ERROR: Missing values are not allowed in the lower triangle of a distance data set."&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;I am trying to create a ranking system for a series of roads based on the number of daily cars. Basically, I want to cluster the data to find more natural breakpoints and assign each road a score of 1-10.&amp;nbsp;I've tried the "missing=" option with no luck. Could this be a SAS University Edition issue?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;FILENAME REFFILE "/folders/myfolders/roaddata1.csv" TERMSTR=CR;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;PROC IMPORT DATAFILE=REFFILE&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;DBMS=CSV&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;OUT=roaddata1;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;GETNAMES=YES;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;RUN;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;PROC CONTENTS DATA=roaddata1; RUN;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;EM&gt;%web_open_table(WORK.IMPORT);&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;proc distance data=roaddata1 out=distmatrix method=Euclid;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;var interval (FACT_AADT/ std=std) missing=0;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;id SEGMENT_ID;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;run;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;EM&gt;proc cluster data=distmatrix outtree=trees&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;method=ward noprint;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;id SEGMENT_ID;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;run;&lt;/EM&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;EM&gt;proc tree data=trees noprint out=Final n=10;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;id SEGMENT_ID;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;THANK YOU!&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;(I had to convert the file from .csv to .xlsx for upload. How is .csv not a valid upload for a data community!?)&lt;/P&gt;</description>
      <pubDate>Wed, 17 Feb 2016 13:49:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/Scoring-clusters/m-p/250615#M214</guid>
      <dc:creator>Brendankjansdkj</dc:creator>
      <dc:date>2016-02-17T13:49:58Z</dc:date>
    </item>
    <item>
      <title>Re: Scoring clusters</title>
      <link>https://communities.sas.com/t5/SAS-Studio/Scoring-clusters/m-p/250688#M215</link>
      <description>&lt;P&gt;I can see only two modes in your data. Why not take deciles?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
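&lt;P&gt;A minimal sketch of the decile idea, assuming the imported data set is named roaddata1 and holds FACT_AADT as in the original post. The names ranked, scored, decile and score are hypothetical. PROC RANK with GROUPS=10 numbers its groups 0 to 9, so the DATA step adds 1 to produce a score from 1 to 10.&lt;/P&gt;
&lt;PRE&gt;/* Assumption: WORK.roaddata1 contains FACT_AADT, as in the original post */
proc rank data=roaddata1 out=ranked groups=10;
   var FACT_AADT;
   ranks decile;   /* decile takes values 0 through 9 */
run;

/* Shift the 0-9 group number to a 1-10 score */
data scored;
   set ranked;
   score = decile + 1;
run;&lt;/PRE&gt;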
&lt;P&gt;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/1918i2122BF5F5C1099B9/image-size/medium?v=mpbl-1&amp;amp;px=-1" border="0" alt="SGPlot.png" title="SGPlot.png" /&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 17 Feb 2016 19:54:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/Scoring-clusters/m-p/250688#M215</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2016-02-17T19:54:16Z</dc:date>
    </item>
    <item>
      <title>Re: Scoring clusters</title>
      <link>https://communities.sas.com/t5/SAS-Studio/Scoring-clusters/m-p/250698#M216</link>
      <description>&lt;P&gt;Thank you, PG. Deciles will work fine for these data as a backup. I guess I just like&amp;nbsp;clustering, and I'm a believer that the resulting numbers are maybe a little more meaningful than just decile cuts.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm absolutely confused about what's wrong with these data to produce that error&amp;nbsp;message, as nothing is missing. For my own future reference, I'll wait to see if somebody can explain what's wrong with the script.&amp;nbsp;Otherwise I may have to figure out how to script this in R for similar scoring projects...&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Best&lt;/P&gt;</description>
      <pubDate>Wed, 17 Feb 2016 20:29:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/Scoring-clusters/m-p/250698#M216</guid>
      <dc:creator>Brendankjansdkj</dc:creator>
      <dc:date>2016-02-17T20:29:59Z</dc:date>
    </item>
    <item>
      <title>Re: Scoring clusters</title>
      <link>https://communities.sas.com/t5/SAS-Studio/Scoring-clusters/m-p/250708#M217</link>
      <description>&lt;P&gt;I don't get any error message when I run your code, after I remove the missing=0 option.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;33
34   proc distance data=xl.'roaddata1$'n out=distmatrix method=Euclid;
35   var interval (FACT_AADT/ std=std);
36   id SEGMENT_ID;
37   run;

NOTE: The data set WORK.DISTMATRIX has 7999 observations and 8000 variables.
NOTE: PROCEDURE DISTANCE used (Total process time):
      real time           10.11 seconds
      cpu time            3.93 seconds


38
39   proc cluster data=distmatrix outtree=trees
40   method=ward noprint;
41   id SEGMENT_ID;
42   run;

NOTE: The input data set is a TYPE=DISTANCE data set. For such a data set, the
      procedure requires that the order of the rows match the order of the
      variables.
NOTE: Input distances have been squared.
WARNING: Ties for minimum distance between clusters have been detected at 6852
         level(s) in the cluster history.
NOTE: The data set WORK.TREES has 15997 observations and 11 variables.
NOTE: PROCEDURE CLUSTER used (Total process time):
      real time           10.18 seconds
      cpu time            10.09 seconds


43
44   proc tree data=trees noprint out=Final n=10;
45   id SEGMENT_ID;
46   run;

NOTE: The data set WORK.FINAL has 7999 observations and 3 variables.
NOTE: PROCEDURE TREE used (Total process time):
      real time           0.26 seconds
      cpu time            0.09 seconds
&lt;/PRE&gt;</description>
      <pubDate>Wed, 17 Feb 2016 20:44:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/Scoring-clusters/m-p/250708#M217</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2016-02-17T20:44:09Z</dc:date>
    </item>
    <item>
      <title>Re: Scoring clusters</title>
      <link>https://communities.sas.com/t5/SAS-Studio/Scoring-clusters/m-p/250719#M218</link>
      <description>&lt;P&gt;Interesting. Did you run that on the desktop version or in SAS Studio? Using SAS Studio, I ran this:&lt;/P&gt;&lt;PRE&gt;FILENAME REFFILE "/folders/myfolders/roaddata1.csv" TERMSTR=CR;
PROC IMPORT DATAFILE=REFFILE
DBMS=CSV
OUT=roaddata1;
GETNAMES=YES;
RUN;
PROC CONTENTS DATA=roaddata1; RUN;

%web_open_table(WORK.IMPORT);
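
/* Optional check, an assumption and not part of the code as submitted:
   count missing FACT_AADT values and list any incomplete rows before
   building the distance matrix. A record with a missing value could
   lead to missing distances later on. */
proc means data=roaddata1 n nmiss;
var FACT_AADT;
run;

proc print data=roaddata1;
where missing(FACT_AADT) or missing(SEGMENT_ID);
run;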
 
proc distance data=roaddata1 out=distmatrix method=Euclid;
var interval (FACT_AADT/ std=std);
id SEGMENT_ID;
run;

proc cluster data=distmatrix outtree=trees
method=ward noprint;
id SEGMENT_ID;
run;

proc tree data=trees noprint out=Final n=10;
id SEGMENT_ID;&lt;/PRE&gt;&lt;P&gt;and got this (I'm thinking it is a SAS Studio problem):&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;NOTE: The data set WORK.DISTMATRIX has 8000 observations and 8001 variables.
 NOTE: PROCEDURE DISTANCE used (Total process time):
       real time           17.06 seconds
       cpu time            0.80 seconds
       
 
 91         
 92         proc cluster data=distmatrix outtree=trees
 93         method=ward noprint;
 94         id SEGMENT_ID;
 95         run;
 
 NOTE: The input data set is a TYPE=DISTANCE data set. For such a data set, the procedure requires that the order of the rows match 
       the order of the variables.
 ERROR: Missing values are not allowed in the lower triangle of a distance data set.
 NOTE: The SAS System stopped processing this step because of errors.
 WARNING: The data set WORK.TREES may be incomplete.  When this step was stopped there were 0 observations and 11 variables.
 NOTE: PROCEDURE CLUSTER used (Total process time):
       real time           58.22 seconds
       cpu time            1.48 seconds
       
 96         
 
 
 97         proc tree data=trees noprint out=Final n=10;
 98         id SEGMENT_ID;
 99         
 100        OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 WARNING: The data set WORK.FINAL may be incomplete.  When this step was stopped there were 0 observations and 3 variables.
 112        &lt;/PRE&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;</description>
      <pubDate>Wed, 17 Feb 2016 21:03:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/Scoring-clusters/m-p/250719#M218</guid>
      <dc:creator>Brendankjansdkj</dc:creator>
      <dc:date>2016-02-17T21:03:35Z</dc:date>
    </item>
    <item>
      <title>Re: Scoring clusters</title>
      <link>https://communities.sas.com/t5/SAS-Studio/Scoring-clusters/m-p/250721#M219</link>
      <description>&lt;P&gt;I think you may have an empty line at the end of your CSV file that is causing the problem.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Feb 2016 21:06:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/Scoring-clusters/m-p/250721#M219</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2016-02-17T21:06:09Z</dc:date>
    </item>
    <item>
      <title>Re: Scoring clusters</title>
      <link>https://communities.sas.com/t5/SAS-Studio/Scoring-clusters/m-p/250741#M220</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you! Because of your posts I looked at it again and found that it was not a blank row, but rather I had to specify the type/shape of the matrix. It works now!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;proc distance data=roaddata2 out=distmatrix method=Euclid &lt;STRONG&gt;shape=square&lt;/STRONG&gt;;&lt;BR /&gt;var interval (FACT_AADT/ std=std);&lt;BR /&gt;id SEGMENT_ID;&lt;BR /&gt;run;&lt;/P&gt;</description>
      <pubDate>Wed, 17 Feb 2016 23:33:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/Scoring-clusters/m-p/250741#M220</guid>
      <dc:creator>Brendankjansdkj</dc:creator>
      <dc:date>2016-02-17T23:33:46Z</dc:date>
    </item>
  </channel>
</rss>

