Hi,
I ran a proc cluster on the proc corresp output. I obtain a tree (see picture below), and the output table (see table attached).
In the table (Ward), each of my variables has next to it the name of (what I believe to be) the smallest cluster I can get. What I'd like, one of my most cherished dreams, is to have several columns in my Ward table, each column representing one level of clustering.
For instance Var_Level_1 : Cluster_1 ; Cluster_2
Var_Level_2 : Cluster_1_1 ; Cluster_1_2 ; Cluster_2_1 ; Cluster_2_2 etc...
Thank you in advance for your help 🙂
A set of _NAME_ / _PARENT_ data nodes defines a hierarchical tree. Following the parent linkage up from node X to the root gives the ancestor path of X. You want data columns 1..N containing the names from the root down to X. The nodes can be stored in a DATA step HASH object, and the ancestor path can be traversed with a series of FIND() operations.
Example:
* pass 1 - compute number of columns needed;
* find longest path;
data _null_;
  if 0 then set download.ward; * prep pdv;

  declare hash links (dataset:'download.ward');
  links.defineKey('_NAME_');
  links.defineData('_PARENT_');
  links.defineDone();

  declare hiter iter('links');

  do index=1 by 1 while (iter.next() = 0);
    do step = 1 by 1 while (links.find(key:_parent_) eq 0);
      if _parent_ = '' then leave;
    end;
    max_step = max (max_step, step);
  end;

  call symput ('MAX_DEPTH', cats(max_step));
run;

%put NOTE: &=MAX_DEPTH;
* final step;
data have;
  length depth 8;
  length level1-level&MAX_DEPTH $25;
  array level level1-level&MAX_DEPTH;

  set download.ward;

  if _n_ = 1 then do;
    declare hash links (dataset:'download.ward');
    links.defineKey('_NAME_');
    links.defineData('_PARENT_');
    links.defineDone();
  end;

  * determine depth (number of steps to root) and capture tiers;
  * in this loop 'level' actually means 'ancestor';
  orig_parent = _parent_;
  level(1) = _parent_;
  do step = 2 by 1 while (links.find(key:_parent_) eq 0);
    if _parent_ = '' then leave;
    level(step) = _parent_;
  end;
  _parent_ = orig_parent;
  depth = step - 1;

  * reverse the captured tiers so that level1 is the root;
  do step = 1 to depth/2;
    opstep = depth - step + 1;
    hold = level(step);
    level(step) = level(opstep);
    level(opstep) = hold;
  end;

  drop step opstep hold orig_parent;
run;
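To try the two steps above without the real data, a small hand-made tree can stand in for download.ward. This is only a sketch: the data set name WORK.WARD_TOY and the node names (CL1, CL2, CL3, VarA, VarB, VarC) are made up for illustration, and it assumes, as the code above does, that the root node has a blank _PARENT_.

```sas
* hypothetical toy tree with the same _NAME_ / _PARENT_ layout;
* CL1 is the root, so its _PARENT_ is left blank;
data work.ward_toy;
  length _NAME_ _PARENT_ $25;
  infile datalines dsd truncover;
  input _NAME_ _PARENT_;
  datalines;
CL1,
CL2,CL1
CL3,CL1
VarA,CL2
VarB,CL2
VarC,CL3
;
run;
```

With download.ward replaced by work.ward_toy in both steps, tracing the logic by hand suggests MAX_DEPTH resolves to 2, and that the row for VarA ends up with level1=CL1, level2=CL2 and depth=2, while the row for CL2 has level1=CL1 and a blank level2.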