I tried to do a cluster analysis. However, it didn't create a dendrogram. The data set and the SAS code I used are attached. Could anyone tell me what's wrong with it?
Many of us won't open attachments, so you will get a better response if you post your code.
When I run your data step, this is the log:
15 data chili;
16 length Acc $6;
17 input Acc &$ StC NA SP PGH BH LC LS LP NFA FP CC AC SE MS CP CM AS FCI
17 ! FCM FS FSPA NBF FSBE FBEA FC FSr SC NSF GU PH MLL MLW PHF CWF FL FW
17 ! FWt FT FPP DFt DFl DFF TSWt FPK Yd;
18 datalines;

NOTE: Invalid data for StC in line 46 1-2.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----
46 C2 2 7 5 7 5 3 2 3 1 7 4 3 7 0 0 2 0 3 9 4 4 0 4 1 7 3 1 2 2 38.
65 0 6.4 2.1 30.3 27.6 4.4 2.2 94.2 0.1 35.2 55.0 59.0 79.0 3.2 373
129 .0 35.0
Acc=C1 2 7 StC=. NA=2 SP=7 PGH=5 BH=7 LC=5 LS=3 LP=2 NFA=3 FP=1 CC=7 AC=4 SE=3 MS=7 CP=0 CM=0 AS=2 FCI=0 FCM=3 FS=9 FSPA=4 NBF=4 FSBE=0 FBEA=4 FC=1 FSr=7 SC=3 NSF=1 GU=2 PH=2 MLL=38 MLW=6.4 PHF=2.1 CWF=30.3 FL=27.6 FW=4.4 FWt=2.2 FT=94.2 FPP=0.1 DFt=35.2 DFl=55 DFF=59 TSWt=79 FPK=3.2 Yd=373 _ERROR_=1 _N_=14

NOTE: Invalid data for StC in line 48 1-2.
48 C4 2 7 7 7 3 7 1 7 1 5 8 5 7 0 0 2 1 6 9 5 3 0 4 1 3 2 1 3 2 70.
65 7 7.5 2.8 35.7 19.2 6.4 2.1 85.0 0.2 19.3 39.0 45.0 72.0 2.9 227
129 .0 19.0
Acc=C3 1 1 StC=. NA=2 SP=7 PGH=7 BH=7 LC=3 LS=7 LP=1 NFA=7 FP=1 CC=5 AC=8 SE=5 MS=7 CP=0 CM=0 AS=2 FCI=1 FCM=6 FS=9 FSPA=5 NBF=3 FSBE=0 FBEA=4 FC=1 FSr=3 SC=2 NSF=1 GU=3 PH=2 MLL=70.7 MLW=7.5 PHF=2.8 CWF=35.7 FL=19.2 FW=6.4 FWt=2.1 FT=85 FPP=0.2 DFt=19.3 DFl=39 DFF=45 TSWt=72 FPK=2.9 Yd=227 _ERROR_=1 _N_=15

NOTE: Invalid data for StC in line 50 1-2.
50 C6 2 7 7 7 5 7 3 7 1 7 5 5 7 0 0 3 1 6 9 1 2 0 1 1 3 2 1 3 2 60.
65 4 7.2 2.6 32.3 18.2 7.6 1.3 169.0 0.1 70.3 34.0 39.0 72.0 2.9 41
129 5.0 70.0
Acc=C5 2 7 StC=. NA=2 SP=7 PGH=7 BH=7 LC=5 LS=7 LP=3 NFA=7 FP=1 CC=7 AC=5 SE=5 MS=7 CP=0 CM=0 AS=3 FCI=1 FCM=6 FS=9 FSPA=1 NBF=2 FSBE=0 FBEA=1 FC=1 FSr=3 SC=2 NSF=1 GU=3 PH=2 MLL=60.4 MLW=7.2 PHF=2.6 CWF=32.3 FL=18.2 FW=7.6 FWt=1.3 FT=169 FPP=0.1 DFt=70.3 DFl=34 DFF=39 TSWt=72 FPK=2.9 Yd=415 _ERROR_=1 _N_=16

NOTE: Invalid data for StC in line 52 1-2.
52 C8 2 1 3 7 3 3 1 3 1 5 4 5 7 0 0 2 0 3 8 4 3 0 3 1 7 3 1 2 2 38.
65 9 10.0 4.0 26.3 25.4 3.9 1.7 87.3 0.1 39.1 36.0 60.0 114.0 5.2 4
129 00.0 39.0
Acc=C7 2 5 StC=. NA=2 SP=1 PGH=3 BH=7 LC=3 LS=3 LP=1 NFA=3 FP=1 CC=5 AC=4 SE=5 MS=7 CP=0 CM=0 AS=2 FCI=0 FCM=3 FS=8 FSPA=4 NBF=3 FSBE=0 FBEA=3 FC=1 FSr=7 SC=3 NSF=1 GU=2 PH=2 MLL=38.9 MLW=10 PHF=4 CWF=26.3 FL=25.4 FW=3.9 FWt=1.7 FT=87.3 FPP=0.1 DFt=39.1 DFl=36 DFF=60 TSWt=114 FPK=5.2 Yd=400 _ERROR_=1 _N_=17

NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.CHILI has 19 observations and 46 variables.
NOTE: DATA statement used (Total process time):
      real time 0.00 seconds
      cpu time 0.00 seconds
57 ;
With that many data errors I would not be surprised that you don't get good output.
One question would be why do you have an & on the input statement after reading the Acc variable?
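For anyone unsure what the & does here: in list input, the & modifier lets a character value contain single embedded blanks, so the value only ends at two or more consecutive blanks. With single-spaced data, that would make Acc swallow the start of each line, which matches the Acc=C1 2 7 shown in the log. Here is a minimal sketch using made-up variables x and y and made-up data lines (not your data set), on the assumption that the fix is simply to drop the &:

```sas
/* Hypothetical illustration only: x, y and the data lines are invented. */

/* With &: Acc keeps reading across single blanks, so the numeric   */
/* variables have to be read from the following line.               */
data demo_amp;
  length Acc $6;
  input Acc &$ x y;
  datalines;
C1 2 7
C2 3 5
;
run;

/* Without &: Acc stops at the first blank and x, y are read from   */
/* the same line.                                                    */
data demo_fixed;
  length Acc $6;
  input Acc $ x y;
  datalines;
C1 2 7
C2 3 5
;
run;
```

If that diagnosis is right, the log for demo_amp should show the same "Invalid data" and "SAS went to a new line" notes as above, while demo_fixed should read one observation per data line.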
However, the real problem is in the PROC CLUSTER code: its ID statement names the variable COUNTRY, which does not exist in the data set created by PROC DISTANCE (nor in the data set chili).
107 proc distance data=chili out=dist method=euclid;
108 var interval (StC NA SP PGH BH LC LS LP NFA FP CC AC SE MS CP CM AS
108! FCI FCM FS FSPA NBF FSBE FBEA FC FSr SC NSF GU PH MLL MLW PHF CWF FL
108! FW FWt FT FPP DFt DFl DFF TSWt FPK Yd);
109 id Acc;
110 run;

NOTE: The data set WORK.DIST has 38 observations and 39 variables.
NOTE: PROCEDURE DISTANCE used (Total process time):
      real time 0.02 seconds
      cpu time 0.00 seconds

111 ods graphics on;
112 proc cluster data=Dist method=ward plots=dendrogram (height=rsq);
113 id country;
ERROR: Variable COUNTRY not found.
114 run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.DATA1 may be incomplete. When this step was stopped there were 0 observations and 0 variables.
NOTE: PROCEDURE CLUSTER used (Total process time):
      real time 0.00 seconds
      cpu time 0.00 seconds
If you use ID ACC; in the PROC CLUSTER code, as in the PROC DISTANCE code, it should work:
121 ods graphics on;
122 proc cluster data=Dist method=ward plots=dendrogram (height=rsq);
123 id acc;
124 run;

NOTE: The input data set is a TYPE=DISTANCE data set. For such a data set, the procedure requires that the order of the rows match the order of the variables.
NOTE: Writing HTML Body file: sashtml.htm
NOTE: Input distances have been squared.
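Stripped of the log line numbers, the two steps might look like the sketch below. It assumes the chili data set has already been read without the errors shown earlier, and the only change from the posted code is the ID variable in PROC CLUSTER:

```sas
/* Sketch: assumes WORK.CHILI was read cleanly (one row per accession). */

/* Euclidean distances between accessions, labelled by Acc */
proc distance data=chili out=dist method=euclid;
  var interval (StC NA SP PGH BH LC LS LP NFA FP CC AC SE MS CP CM AS
                FCI FCM FS FSPA NBF FSBE FBEA FC FSr SC NSF GU PH MLL
                MLW PHF CWF FL FW FWt FT FPP DFt DFl DFF TSWt FPK Yd);
  id Acc;
run;

ods graphics on;

/* Ward clustering on the distance matrix; the ID variable must     */
/* exist in the input data set, so use Acc rather than country.     */
proc cluster data=dist method=ward plots=dendrogram(height=rsq);
  id Acc;
run;
```

The ID statement in each step has to name a variable that is actually present in that step's input data set, which is why Acc works here and country does not.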