<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Sankey Diagram, Decision Tree etc. in Graphics Programming</title>
    <link>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719468#M21047</link>
    <description>&lt;P&gt;If I were to automate a Sankey diagram (which I will admit I am now intrigued to try) I would probably follow the same methods I used when I made a CIRCOS graph macro: &lt;A href="https://communities.sas.com/t5/SAS-Communities-Library/CIRCOS-A-SAS-Macro-to-Create-CIRCOS-Plots/ta-p/457952" target="_blank"&gt;https://communities.sas.com/t5/SAS-Communities-Library/CIRCOS-A-SAS-Macro-to-Create-CIRCOS-Plots/ta-p/457952&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The concepts are quite similar, but instead of going from one end of a circle to another they flow through different horizontal sections.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Mon, 15 Feb 2021 20:07:45 GMT</pubDate>
    <dc:creator>JeffMeyers</dc:creator>
    <dc:date>2021-02-15T20:07:45Z</dc:date>
    <item>
      <title>Sankey Diagram, Decision Tree etc.</title>
      <link>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719050#M21038</link>
      <description>&lt;P&gt;Dear all,&lt;/P&gt;
&lt;P&gt;I am about to start a new project where I will need to create diagrams such as Sankey diagrams, decision trees and so on.&lt;/P&gt;
&lt;P&gt;I am trying to gather some information about what I can accomplish with SAS. I read about using Visualization for Sankey diagrams, decision trees, etc.&lt;/P&gt;
&lt;P&gt;Since I am not very familiar with visual analysis, I would like to ask whether it is possible to use SAS graphics to achieve this goal as well.&lt;/P&gt;
&lt;P&gt;I also saw something about using PROC JSON.&lt;/P&gt;
&lt;P&gt;One of the tasks will be a Sankey diagram showing which areas the patients come from, or comparing therapies.&lt;/P&gt;
&lt;P&gt;I would appreciate any hints about where to start.&lt;/P&gt;
&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Fri, 12 Feb 2021 23:12:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719050#M21038</guid>
      <dc:creator>Anita_n</dc:creator>
      <dc:date>2021-02-12T23:12:53Z</dc:date>
    </item>
    <item>
      <title>Re: Sankey Diagram, Decision Tree etc.</title>
      <link>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719052#M21039</link>
      <description>Using VA or Base SAS?&lt;BR /&gt;&lt;BR /&gt;In general, when searching for resources I recommend searching lexjansen.com for the topic of interest, or robslinks.com, which is Rob Allison's page of graphs. &lt;BR /&gt;This paper popped up, but there are others:&lt;BR /&gt;&lt;A href="https://www.lexjansen.com/pharmasug/2018/DV/PharmaSUG-2018-DV16.pdf" target="_blank"&gt;https://www.lexjansen.com/pharmasug/2018/DV/PharmaSUG-2018-DV16.pdf&lt;/A&gt;</description>
      <pubDate>Fri, 12 Feb 2021 23:24:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719052#M21039</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2021-02-12T23:24:03Z</dc:date>
    </item>
    <item>
      <title>Re: Sankey Diagram, Decision Tree etc.</title>
      <link>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719082#M21040</link>
      <description>Hi Reeza, &lt;BR /&gt;thanks for the quick reply and the recommended links. &lt;BR /&gt;I'm planning to use Base SAS. Are there special procedures&lt;BR /&gt;for these cases?</description>
      <pubDate>Sat, 13 Feb 2021 08:20:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719082#M21040</guid>
      <dc:creator>Anita_n</dc:creator>
      <dc:date>2021-02-13T08:20:24Z</dc:date>
    </item>
    <item>
      <title>Re: Sankey Diagram, Decision Tree etc.</title>
      <link>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719200#M21043</link>
      <description>&lt;P&gt;&lt;A href="https://blogs.sas.com/content/graphicallyspeaking/2015/03/21/sankey-diagrams/" target="_self"&gt;This blog post discusses creating a Sankey diagram by using PROC SGPLOT&lt;/A&gt; (Base SAS), but it is not automatic. Some advance thinking and planning is required.&lt;/P&gt;
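&lt;P&gt;As a rough sketch of why the planning is needed, assuming filled shapes are drawn with the SGPLOT POLYGON statement (one possible approach, not necessarily the one the blog post uses): every rectangle and connecting band has to be supplied as explicit coordinates. The dataset name and all coordinate values below are made up for illustration.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
/* Hypothetical coordinates: two node rectangles (A, B) and one straight-edged band (C) */
data shapes;
    input id $ x y;
    datalines;
A 0 0
A 5 0
A 5 40
A 0 40
B 95 10
B 100 10
B 100 50
B 95 50
C 5 0
C 95 10
C 95 50
C 5 40
;
run;

/* Each ID value is drawn as one closed, filled polygon */
proc sgplot data=shapes noautolegend;
    polygon x=x y=y id=id / fill group=id transparency=0.3;
    xaxis display=none;
    yaxis display=none reverse;
run;
&lt;/CODE&gt;&lt;/PRE&gt;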
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The diagram that results will be static. If you have ever seen a demo of Visual Analytics, you might have seen &lt;A href="https://blogs.sas.com/content/customeranalytics/2018/05/23/sas-customer-intelligence-360-path-analysis-for-re-engagement/" target="_self"&gt;an interactive version of a Sankey diagram.&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 14 Feb 2021 11:59:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719200#M21043</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2021-02-14T11:59:12Z</dc:date>
    </item>
    <item>
      <title>Re: Sankey Diagram, Decision Tree etc.</title>
      <link>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719279#M21046</link>
      <description>Thank you for the link. It sounds promising.</description>
      <pubDate>Mon, 15 Feb 2021 07:22:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719279#M21046</guid>
      <dc:creator>Anita_n</dc:creator>
      <dc:date>2021-02-15T07:22:05Z</dc:date>
    </item>
    <item>
      <title>Re: Sankey Diagram, Decision Tree etc.</title>
      <link>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719468#M21047</link>
      <description>&lt;P&gt;If I were to automate a Sankey diagram (which I will admit I am now intrigued to try) I would probably follow the same methods I used when I made a CIRCOS graph macro: &lt;A href="https://communities.sas.com/t5/SAS-Communities-Library/CIRCOS-A-SAS-Macro-to-Create-CIRCOS-Plots/ta-p/457952" target="_blank"&gt;https://communities.sas.com/t5/SAS-Communities-Library/CIRCOS-A-SAS-Macro-to-Create-CIRCOS-Plots/ta-p/457952&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The concepts are quite similar, but instead of going from one end of a circle to another they flow through different horizontal sections.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 15 Feb 2021 20:07:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719468#M21047</guid>
      <dc:creator>JeffMeyers</dc:creator>
      <dc:date>2021-02-15T20:07:45Z</dc:date>
    </item>
    <item>
      <title>Re: Sankey Diagram, Decision Tree etc.</title>
      <link>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719554#M21051</link>
      <description>That looks interesting, thank you</description>
      <pubDate>Tue, 16 Feb 2021 08:53:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719554#M21051</guid>
      <dc:creator>Anita_n</dc:creator>
      <dc:date>2021-02-16T08:53:43Z</dc:date>
    </item>
    <item>
      <title>Re: Sankey Diagram, Decision Tree etc.</title>
      <link>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719812#M21056</link>
      <description>&lt;P&gt;Well, as I said, it sounded like an interesting challenge, so I made a short macro to create them.&amp;nbsp; I haven't polished this macro with error checking or anything similar, but it could serve as an example if you would like.&amp;nbsp; The data should have multiple rows per "patient", where each row has a value that corresponds to one of the horizontal nodes in the Sankey graph.&amp;nbsp; A second value on each row identifies the subgroup within that node.&lt;/P&gt;
&lt;P&gt;Here is the final product, built from a made-up dataset (see code).&amp;nbsp; It shows the maximum grade of patients from the cycle 1 assessment through the cycle 5 assessment.&amp;nbsp; Grade 5 means death, so patients cannot continue on from that group.&amp;nbsp; I tried to leave notes in the code to help.&amp;nbsp; I can explain more if interested.&lt;/P&gt;
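&lt;P&gt;A minimal sketch of that input layout, assuming a hypothetical dataset named example with made-up values: one row per patient per node, with id as the patient, cycle as the node and grade as the subgroup within the node.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
/* Hypothetical input data: the dataset name and all values are illustrative only */
data example;
    input id cycle grade;
    label cycle='Cycle' grade='Maximum Grade AE';
    datalines;
1 1 0
1 2 2
1 3 2
2 1 1
2 2 5
3 1 3
3 2 1
3 3 0
;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With the macro from the code below compiled, a dataset shaped like this would be passed as %sankey(data=example, id=id, nodes=cycle, group=grade).&lt;/P&gt;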
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-center" image-alt="_sankey.png" style="width: 999px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/54828i73AF126B44019966/image-size/large?v=v2&amp;amp;px=999" role="button" title="_sankey.png" alt="_sankey.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;

data random;
    call streaminit(123);
    do id = 1 to 100;
        do cycle=1 to 5;
            u = rand("Uniform");
            grade = floor(6*u);    
            output;
            if grade=5 then cycle=5;
        end;
    end;  
    drop u:;
    label cycle='Cycle' grade='Maximum Grade AE';
run;


option mprint;
%macro sankey(
    /*Required*/
    data= /*Dataset to input*/,
    id= /*Patient/traveler ID*/,
    nodes= /*Nodes are the different points or time-points that the curves connect between.  Needs to be numeric*/,
    group= /*Different subgroups for the node sections*/,
    /*Optional for tweaking the graph*/
    barwidth=5 /*Controls width of the Nodes*/,  
    bargap=5 /*Allocates a percentage of the y-space for gaps between GROUPs*/, 
    points=20 /*Number of points used to draw Bezier curves.  More=smoother lines but more memory*/, 
    curve_rectangle_gap=0 /*Gap as a percentage between the ends of the connecting curves and the Nodes*/,
    antialias=200000 /*When a lot of points are used antialias will need to be increased to get smooth curves*/,
    width=16in /*Determines the width of the image*/,
    height=8in /*Determines the height of the image*/,
    plotname=_sankey /*Determines the name of the image. ODS LISTING should be turned on to save image*/,
    font_size=12pt /*Determines the font size*/);

    /**Rename Variables and create a temporary dataset**/
    data _temp;
        merge &amp;amp;data (keep=&amp;amp;id rename=(&amp;amp;id=id))
            &amp;amp;data (keep=&amp;amp;nodes rename=(&amp;amp;nodes=nodes))
            &amp;amp;data (keep=&amp;amp;group rename=(&amp;amp;group=group));
    run;
    
    /**Grab Labels for Group and Nodes variables**/
    data _null_;
        set _temp (obs=1);
        %local group_label nodes_label;
        call symput('nodes_label',strip(vlabel(nodes)));
        call symput('group_label',strip(vlabel(group)));
    run;
    
    /**Grab cross tab frequency of groups and nodes for rectangles**/
    proc freq data=_temp noprint;
        table nodes*group / outpct out=_frq (keep=nodes group count pct_row);
    run;
    
    /**Assign a group level value for use in arrays later**/
    data _levels;
        set _frq;
        by nodes group;
        if first.nodes then group_lvl=0;
        group_lvl+1;
    run;
    
    /**Grab number and values of unique Node values**/
    proc sql noprint;
        %local n_nodes i null;
        select count(distinct nodes) into :n_nodes separated by ''
            from _frq;
        select distinct nodes format=12. into :node1- 
            from _frq;

        %do i = 1 %to &amp;amp;n_nodes;
            %local n_group&amp;amp;i ;
        %end;
        /**Count how many groups are in each Node**/
        select nodes,count(distinct group) into :null,:n_group1- 
            from _frq group by nodes;
    quit;
    /*Create coordinates for rectangles**/
    /*BARWIDTH controls the width of rectangles, BARGAP assigns a percentage for white space*/
    data _rectangles;
        set _frq;
        by nodes group;
        
        array _node_n {&amp;amp;n_nodes} 
            (%do i = 1 %to &amp;amp;n_nodes;
                 %if &amp;amp;i&amp;gt;1 %then %do; , %end;
                 &amp;amp;&amp;amp;n_group&amp;amp;i
             %end;);
        if first.nodes then do;
           last_y=0;nodes_count+1;group_count=0;
        end;
        if first.group then do;
            last_x=100*(nodes_count-1)/(&amp;amp;n_nodes-1);
            group_count+1;
        end;
        retain last_x last_y;
        rectangle_id=catx('-',nodes_count,group_count);
        x=last_x;y=last_y;output;
        x=last_x+&amp;amp;barwidth;y=last_y;output;
        x=last_x+&amp;amp;barwidth;y=last_y+((100-&amp;amp;bargap)/100)*pct_row;output;
        x=last_x;y=last_y+((100-&amp;amp;bargap)/100)*pct_row;output;
        
        last_y=y+&amp;amp;bargap/_node_n(nodes_count);
        
        drop _node:;
    run;
    

    /*Find the unique paths going out of each node between groups*/
    proc sort data=_temp;
        by id;
    data _paths;
        set _temp;
        by id;
        
        array node_ {&amp;amp;n_nodes};
        retain node_;
        if first.id then call missing(of node_(*));
        
        %do i=1 %to &amp;amp;n_nodes;
            %if &amp;amp;i&amp;gt;1 %then %do; else %end;
            if nodes=&amp;amp;&amp;amp;node&amp;amp;i then node_lvl=&amp;amp;i;
        %end;
        node_(node_lvl)=group;
        
        if last.id then do;
            do i=1 to dim(node_)-1;
                if ^missing(node_(i)) then do;
                    start=node_(i);
                    starting_node=i;
                    do j = i+1 to dim(node_);
                        if ^missing(node_(j)) then do;
                            end=node_(j);
                            ending_node=j;
                            output;
                            j=dim(node_);
                        end;
                    end;
                end;
            end;
        end;
        keep id start starting_node end ending_node node_:;
    run;
    /**Get counts for each path**/
    proc sort data=_paths;
        by starting_node ending_node start end;
    run;
    proc freq data=_paths noprint;
        by starting_node ending_node;
        table start*end / list missing out=_frq2;
    run;
        
    /**Grab counts and values needed to create the Connecting Curves**/
    proc sql noprint;
        create table _paths2 as
            select 
                /**Numbers for starting groups**/
                a.starting_node,a.start,c.count as start_group_n,
                /*Grab location and height of rectangles*/
                c2.y_min as start_group_min,c2.y_max as start_group_max,c2.y_max-c2.y_min as start_group_diff,c2.x_max+&amp;amp;curve_rectangle_gap as start_group_x,
                c3.group_lvl as start_index, /*Used for arrays*/
                
                /**Numbers for ending groups**/
                a.ending_node,a.end,e.count as end_group_n,
                /*Grab location and height of rectangles*/
                e2.y_min as end_group_min,e2.y_max as end_group_max,e2.y_max-e2.y_min as end_group_diff,e2.x_min-&amp;amp;curve_rectangle_gap as end_group_x,
                e3.group_lvl as end_index,/*Used for arrays*/
                a.count as n_move /*Number of patients in the current path*/
                from _frq2 a 
                    left join _frq c on a.starting_node=c.nodes and a.start=c.group
                    left join _frq e on a.ending_node=e.nodes and a.end=e.group
                    left join (select nodes,group,min(y) as y_min,max(y) as y_max, max(x) as x_max
                                from _rectangles group by nodes,group) as c2
                        on a.starting_node=c2.nodes and a.start=c2.group
                    left join (select nodes,group,min(y) as y_min,max(y) as y_max,min(x) as x_min
                                from _rectangles group by nodes,group) as e2
                        on a.ending_node=e2.nodes and a.end=e2.group
                    left join _levels c3 on a.starting_node=c3.nodes and a.start=c3.group
                    left join _levels e3 on a.ending_node=e3.nodes and a.end=e3.group
            order by starting_node,ending_node,start,end;
        /**Grab number of distinct group values for array**/
        %local n_grps;
        select count(distinct end) into :n_grps separated by ''
            from _paths2;
        /**Grab x-axis location for node labels**/
        create table _node_labels as
            select nodes, (max(x)+min(x))/2 as x_label from _rectangles group by nodes;
    quit;
    
    data _paths3;
        set _paths2;
        by starting_node ending_node start end;
        
        /**Hold running totals for group counts**/
        array start_n {&amp;amp;n_nodes,&amp;amp;n_grps} (%sysevalf(&amp;amp;n_nodes*&amp;amp;n_grps)*0) ;
        array end_n {&amp;amp;n_nodes,&amp;amp;n_grps} (%sysevalf(&amp;amp;n_nodes*&amp;amp;n_grps)*0) ;
                
        /***Build the Bezier Curve Connectors: Use Cubic Bezier Equation: 
            B(t)=(1-t)^3*P0 + 3(1-t)^2*t*P1 + 3(1-t)*t^2*P2 + t^3*P3, 0 &amp;lt;= t &amp;lt;= 1***/
        length path_index $20.;
        path_index=catx('-',starting_node,start_index,end_index);
        /*Find y-axis values for start/end corners of Bezier curves*/
        start_y1=start_group_min+start_group_diff*(start_n(starting_node,start_index)/start_group_n);
        start_y2=start_y1+start_group_diff*(n_move/start_group_n);
        end_y1=end_group_min+end_group_diff*(end_n(ending_node,end_index)/end_group_n)+end_group_diff*(n_move/end_group_n);
        end_y2=end_y1-end_group_diff*(n_move/end_group_n);
        /**Bezier Curve 1: From Left group to Right Group path**/
        /*Step by an integer index so the POINTS parameter is honored and the last point lands exactly on t=1*/
        do _k = 0 to &amp;points; t = _k/&amp;points;
            x=((1-t)**3)*start_group_x+
                3*((1-t)**2)*t*(start_group_x+(end_group_x-start_group_x)/3)+
                3*(1-t)*(t**2)*(start_group_x+2*(end_group_x-start_group_x)/3)+
                (t**3)*end_group_x;
            y=((1-t)**3)*start_y2+3*((1-t)**2)*t*start_y2+3*(1-t)*(t**2)*end_y1+(t**3)*end_y1;
            output;
        end;
        /**Bezier Curve 2: From Right Group back to Left Group**/
        do _k = 0 to &amp;points; t = _k/&amp;points;
            x=((1-t)**3)*end_group_x+
                3*((1-t)**2)*t*(start_group_x+2*(end_group_x-start_group_x)/3)+
                3*(1-t)*(t**2)*(start_group_x+(end_group_x-start_group_x)/3)+
                (t**3)*start_group_x;
            y=((1-t)**3)*end_y2+3*((1-t)**2)*t*end_y2+3*(1-t)*(t**2)*start_y1+(t**3)*start_y1;
            output;
        end;
        /**Increase running total for each groups N**/
        start_n(starting_node,start_index)+n_move;
        end_n(ending_node,end_index)+n_move;
        
        keep x y path_index start;
    run;
    

    /**Combine Data for Plot**/
    data _plot;
        set _rectangles (keep=x y rectangle_id group rename=(rectangle_id=id))
            _paths3 (keep=x y path_index start rename=(path_index=id start=group))
            _node_labels;
    run;
    ods graphics /reset width=&amp;amp;width height=&amp;amp;height ANTIALIASMAX=&amp;amp;antialias imagename="&amp;amp;plotname";
    proc sgplot data=_plot noborder;
        polygon x=x y=y id=id / fill nooutline group=group transparency=0.3 name='p';
        xaxistable nodes / x=x_label location=inside position=top nolabel title="&amp;amp;nodes_label" valueattrs=(size=&amp;amp;font_size) titleattrs=(size=&amp;amp;font_size) ;
        xaxis min=0 max=%sysevalf(100+&amp;amp;barwidth) values=(0 to 100 by 10) display=none valueshint;
        yaxis min=0 max=100 reverse values=(0 to 100 by 10) display=none;
        keylegend 'p'/ title="&amp;amp;group_label" noborder location=outside position=bottom exclude=('') valueattrs=(size=&amp;amp;font_size) titleattrs=(size=&amp;amp;font_size);
    run;
            
            
%mend;
%sankey(data=random,group=grade,id=id,nodes=cycle);
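
/* Hedged usage sketch, not part of the original run: the same call with a few
   of the optional parameters overridden. The values are arbitrary illustrations
   and the plot name _sankey_small is hypothetical. */
%sankey(data=random, id=id, nodes=cycle, group=grade,
        barwidth=3, bargap=10, width=12in, height=6in, plotname=_sankey_small);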

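/* Hedged sketch, separate from the macro: evaluates the same cubic Bezier
   equation for a single connector so the curve shape can be inspected on its
   own. The dataset name _bezier_demo and the endpoint values are hypothetical. */
data _bezier_demo;
    x0=5;  y0=10;   /* assumed start point (right edge of a left-hand rectangle) */
    x3=50; y3=40;   /* assumed end point (left edge of a right-hand rectangle) */
    do k = 0 to 20;
        t = k/20;
        /* Control points sit at 1/3 and 2/3 of the horizontal distance, with y
           held at the start height and then the end height, as in the macro */
        x = ((1-t)**3)*x0 + 3*((1-t)**2)*t*(x0+(x3-x0)/3)
            + 3*(1-t)*(t**2)*(x0+2*(x3-x0)/3) + (t**3)*x3;
        y = ((1-t)**3)*y0 + 3*((1-t)**2)*t*y0
            + 3*(1-t)*(t**2)*y3 + (t**3)*y3;
        output;
    end;
    keep t x y;
run;

/* Plot the single curve to see the S-shaped connector */
proc sgplot data=_bezier_demo;
    series x=x y=y;
    yaxis reverse;
run;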
 
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 17 Feb 2021 06:29:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/719812#M21056</guid>
      <dc:creator>JeffMeyers</dc:creator>
      <dc:date>2021-02-17T06:29:31Z</dc:date>
    </item>
    <item>
      <title>Re: Sankey Diagram, Decision Tree etc.</title>
      <link>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/833794#M23169</link>
      <description>&lt;P&gt;Dear Jeff,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'd like to know whether it is possible to add numerical values at each node, and how this can be done.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 09:45:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Graphics-Programming/Sankey-Diagram-Decision-Tree-etc/m-p/833794#M23169</guid>
      <dc:creator>fcassanelli</dc:creator>
      <dc:date>2022-09-16T09:45:38Z</dc:date>
    </item>
  </channel>
</rss>

