How to keep a certain level of the clusters from PROC CLUSTER

Posted 05-18-2020 06:24 AM
(498 views)

Hi,

I ran PROC CLUSTER on the PROC CORRESP output. I obtained a tree (see picture below) and the output table (see table attached).

In the table (Ward), next to each of my variables I have the name of (what I believe to be) the smallest cluster I can get. What I'd like, one of my most cherished dreams, is to have several columns in my Ward table, each column representing one level of clustering.

For instance Var_Level_1 : Cluster_1 ; Cluster_2

Var_Level_2 : Cluster_1_1 ; Cluster_1_2 ; Cluster_2_1 ; Cluster_2_2 etc...

Thank you in advance for your help 🙂

1 ACCEPTED SOLUTION


A set of `_NAME_` / `_PARENT_` data nodes defines a hierarchical tree. Starting at node X and following the parent links up to the root gives the ancestor path of X. You want data columns 1..N containing the names from the root down to X. The nodes can be stored in a DATA step HASH object, and the ancestor path can be traversed with a series of FIND() operations.

- The number of steps it takes to traverse from a given node to the root is the depth of the node.
- Suppose the longest path is P > Q > R > S > T > U > V > W, which has depth 7.
  - You will need 7 columns to capture the pieces of the longest path.
  - This computation needs to be done before the final DATA step.
- Suppose you have another path A > B > C > D of depth 3.
  - Traversing up the parent links from D you have:
    - step 1: C
    - step 2: B
    - step 3: A
    - step 4: done
  - Arrayed, the traversal is in reverse order C B A; you want the data as A B C.
  - The data captured during traversal needs to be reversed.
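To make the walk and the reversal concrete, here is a minimal sketch on the toy path A > B > C > D. The data set `work.toy_tree` and the array `anc` are made up for illustration and are not part of the real Ward table; the sketch only assumes, as above, that the root has a blank `_PARENT_`.

```
* hypothetical toy tree: A is the root, D is the deepest node;
data work.toy_tree;
  length _NAME_ _PARENT_ $8;
  input _NAME_ $ _PARENT_ $;
  datalines;
A .
B A
C B
D C
;
run;

data _null_;
  if 0 then set work.toy_tree;           * prep pdv;
  declare hash links (dataset:'work.toy_tree');
  links.defineKey('_NAME_');
  links.defineData('_PARENT_');
  links.defineDone();

  array anc[3] $8;                       * 3 = depth of D in this toy tree;
  length hold $8;

  * look up D, then follow the parent links until the root is passed;
  rc = links.find(key:'D');              * _PARENT_ now holds C;
  do step = 1 by 1 while (_parent_ ne '');
    anc[step] = _parent_;                * captured bottom-up: C, B, A;
    rc = links.find(key:_parent_);       * move one link closer to the root;
    if rc ne 0 then leave;               * guard against a missing parent row;
  end;
  depth = step - 1;

  * reverse in place so the columns read root first;
  do step = 1 to floor(depth / 2);
    opstep = depth - step + 1;
    hold = anc[step];
    anc[step] = anc[opstep];
    anc[opstep] = hold;
  end;

  put depth= anc1= anc2= anc3=;
  stop;
run;
```

The PUT statement writes the depth and the reversed ancestors to the log, so you can check that the path comes out root first before running the same idea against the real table.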

Example:

```
* pass 1 - compute number of columns needed;
* find longest path;
data _null_;
  if 0 then set download.ward; * prep pdv;

  * load the child -> parent links, keyed by node name;
  declare hash links (dataset:'download.ward');
  links.defineKey('_NAME_');
  links.defineData('_PARENT_');
  links.defineDone();
  declare hiter iter('links');

  * for every node, count the steps needed to reach the root;
  do index = 1 by 1 while (iter.next() = 0);
    do step = 1 by 1 while (links.find(key:_parent_) eq 0);
      if _parent_ = '' then leave;
    end;
    max_step = max (max_step, step);
  end;

  call symputx ('MAX_DEPTH', max_step);
  stop; * all work happens in one iteration, the SET never executes;
run;
%put NOTE: &=MAX_DEPTH;

* final step;
data have;
  length depth 8;
  length level1-level&MAX_DEPTH $25; * $25 must be wide enough for _PARENT_;
  array level level1-level&MAX_DEPTH;
  set download.ward;

  if _n_ = 1 then do;
    declare hash links (dataset:'download.ward');
    links.defineKey('_NAME_');
    links.defineData('_PARENT_');
    links.defineDone();
  end;

  * determine depth (number of steps to root) and capture tiers;
  * in this loop 'level' actually means 'ancestor';
  orig_parent = _parent_;
  level(1) = _parent_;
  do step = 2 by 1 while (links.find(key:_parent_) eq 0);
    if _parent_ = '' then leave;
    level(step) = _parent_;
  end;
  _parent_ = orig_parent; * FIND() overwrote _parent_, so restore it;
  depth = step - 1;
  if orig_parent = '' then depth = 0; * the root has no ancestors;

  * reverse the captured tiers so level1 is the root;
  do step = 1 to depth/2;
    opstep = depth - step + 1;
    hold = level(step);
    level(step) = level(opstep);
    level(opstep) = hold;
  end;

  drop step opstep hold orig_parent;
run;
```

