summy

Hi!

I have a data set with two variables: one is the ID, the other is the amount. Can PROC CLUSTER or PROC FASTCLUS cluster on the amount alone?

And what is the difference between quantiles and clusters?

Maybe I am being overly careful, but I want to be sure I understand.

Thanks in advance!

PGStats

Yes, you can do clustering on a single variable. Suppose you know that there are two groups in your data and you want to separate them automatically: clustering can do that. Run the following example:

/* Generate example data with 2 clusters in variable x */

data test;
    do x = 1,2,3,4,5,12,13,14;
        id = put(x,2.);
        output;
    end;
run;

/* Form all clusters */

proc cluster data=test outtree=tree method=centroid noprint;
var x;
id id;
run;

/* Isolate top 2 clusters */

proc tree data=tree out=clusters nclusters=2 noprint;

run;

/* Get quantiles as fractional ranks (rank divided by n) */

proc rank data=test fraction out=quantiles;
var x;
ranks quantile;
run;

/* Assemble clusters and quantiles */

proc sql;
select Q.id, Q.x, Q.quantile label="Quantile", C.cluster
from clusters as C inner join quantiles as Q on C._NAME_=Q.id
order by cluster, x;
quit;

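Each observation is assigned to the expected cluster: x = 1 to 5 in one, x = 12 to 14 in the other. The quantiles, by contrast, are just rescaled ranks (rank divided by the number of observations). They depend only on the ordering of the values, not on the gaps between them.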
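To make the quantile versus cluster difference concrete, and to cover the PROC FASTCLUS part of the question, here is a minimal sketch under a few assumptions. It rebuilds the same example data so it runs on its own. The data set names fc and halves and the variable name half are made up for illustration. MAXCLUSTERS=2 reflects the assumption that you know there are two groups.

```sas
/* Same example data as in the reply, rebuilt so this sketch is self-contained */
data test;
    do x = 1,2,3,4,5,12,13,14;
        id = put(x,2.);
        output;
    end;
run;

/* k-means style clustering on the single variable x, asking for 2 clusters. */
/* OUT= keeps the original variables and adds CLUSTER and DISTANCE.          */
proc fastclus data=test maxclusters=2 out=fc noprint;
    var x;
run;

/* Equal-count grouping: GROUPS=2 splits the ranks into two halves (0 and 1) */
proc rank data=test groups=2 out=halves;
    var x;
    ranks half;
run;

/* Show cluster membership next to the quantile group */
proc sql;
    select F.id, F.x, F.cluster, H.half label="Half"
    from fc as F inner join halves as H on F.id = H.id
    order by F.x;
quit;
```

On this data the two groupings should disagree about x = 5. Clustering follows the gap between 5 and 12, so 5 should stay with 1 to 4. The quantile split only counts observations, so 5 should land in the upper half with 12 to 14 to make two groups of four. The cluster numbers themselves are arbitrary labels and may come out in either order.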

PG

