Quartz | Level 8

## How to keep a certain level of the clusters from PROC CLUSTER

Hi,

I ran PROC CLUSTER on the PROC CORRESP output. I obtained a tree (see picture below) and an output table (see attached table).

In the table (Ward), next to each of my variables, I have the name of (what I believe to be) the smallest cluster it belongs to. What I'd like, and it is one of my most cherished dreams, is to have several columns in my Ward table, each column representing one level of clustering.

For instance:

Var_Level_1: Cluster_1; Cluster_2

Var_Level_2: Cluster_1_1; Cluster_1_2; Cluster_2_1; Cluster_2_2; etc.

Thank you in advance for your help 🙂

1 ACCEPTED SOLUTION
Barite | Level 11

## Re: How to keep a certain level of the clusters from PROC CLUSTER

A set of `_NAME_` / `_PARENT_` rows defines a hierarchical tree. Starting from a node X and following the parent links up to the root gives that node's ancestor path. You want data columns 1..N containing the names from the root down to X. The nodes can be stored in a DATA step HASH object, and the ancestor path can be traversed with a series of FIND() calls.

• The number of steps it takes to traverse from a given node to the root is the depth of the node.
• Suppose the longest path is P > Q > R > S > T > U > V > W, which has depth 7 (seven steps from W up to P).
• You then know you will need 7 columns to capture the pieces of the longest path.
• This computation needs to be done before the final DATA step.
• Suppose you have another path A > B > C > D of depth 3.
• Traversing up the parent links from D you have:
    • step 1: C
    • step 2: B
    • step 3: A
    • step 4: done
• Captured in an array, the traversal is in reverse order (C B A), but you want the data as A B C.
• The data captured during traversal therefore needs to be reversed.

Example:

```
* pass 1 - compute number of columns needed;
* find longest path;
data _null_;
  if 0 then set download.ward; * prep pdv;

  * load every node, keyed by name, with its parent as data;
  declare hash links (dataset:'download.ward');
  links.defineKey('_NAME_');
  links.defineData('_PARENT_');
  links.defineDone();

  declare hiter iter('links');

  * for each node, count the steps needed to reach the root;
  do index = 1 by 1 while (iter.next() = 0);
    do step = 1 by 1 while (links.find(key:_parent_) eq 0);
      if _parent_ = '' then leave;
    end;
    max_step = max (max_step, step);
  end;

  call symput ('MAX_DEPTH', cats(max_step));
  stop; * nothing is read with SET, so end the step explicitly;
run;

%put NOTE: &=MAX_DEPTH;

* final step;
data have;
  length depth 8;

  length level1-level&MAX_DEPTH $25;
  array level level1-level&MAX_DEPTH;

  set download.ward;

  if _n_ = 1 then do;
    declare hash links (dataset:'download.ward');
    links.defineKey('_NAME_');
    links.defineData('_PARENT_');
    links.defineDone();
  end;

  * determine depth (number of steps to root) and capture tiers;
  * in this loop 'level' actually means 'ancestor';
  orig_parent = _parent_; * FIND overwrites _parent_, so save it;
  level(1) = _parent_;
  do step = 2 by 1 while (links.find(key:_parent_) eq 0);
    if _parent_ = '' then leave;
    level(step) = _parent_;
  end;
  _parent_ = orig_parent;
  depth = step - 1;

  * reverse the captured tiers so they run from root to nearest parent;
  do step = 1 to depth/2;
    opstep = depth - step + 1;

    hold = level(step);
    level(step) = level(opstep);
    level(opstep) = hold;
  end;

  drop step opstep hold orig_parent;
run;
```
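
To check the logic on something small, the A > B > C > D path from the list above can be rebuilt as a toy table. This is only a sketch: the dataset names `work.toy` and `work.toy_levels` are made up for illustration, the three level columns are hard-coded because the toy tree is known to have depth 3, and the root is assumed to have a blank `_PARENT_`.

```sas
* toy tree for the path A > B > C > D (hypothetical data, not the Ward table);
data work.toy;
  length _NAME_ _PARENT_ $25;
  input _NAME_ $ _PARENT_ $;   * a lone period is read as a blank parent;
  datalines;
A .
B A
C B
D C
;
run;

* same walk-and-reverse logic as the final step, with 3 columns hard-coded;
data work.toy_levels;
  length depth 8;
  length level1-level3 $25;
  array level level1-level3;

  set work.toy;

  if _n_ = 1 then do;
    declare hash links (dataset:'work.toy');
    links.defineKey('_NAME_');
    links.defineData('_PARENT_');
    links.defineDone();
  end;

  * walk up the parent links, storing nearest ancestor first;
  orig_parent = _parent_;
  level(1) = _parent_;
  do step = 2 by 1 while (links.find(key:_parent_) eq 0);
    if _parent_ = '' then leave;
    level(step) = _parent_;
  end;
  _parent_ = orig_parent;
  depth = step - 1;

  * swap ends so the columns run from the root downwards;
  do step = 1 to depth/2;
    opstep = depth - step + 1;
    hold = level(step);
    level(step) = level(opstep);
    level(opstep) = hold;
  end;

  drop step opstep hold orig_parent;
run;

proc print data=work.toy_levels;
run;
```

Tracing the row for D by hand, the walk stores C, B, A and the swap turns that into level1=A, level2=B, level3=C with depth 3. The root row A keeps a blank level1, since it has no parent to record.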

