Hi All,
Can we create the following kind of patient flow sankey diagram in SAS?
https://bl.ocks.org/micahstubbs/ed0ae1c70256849dab3e35a0241389c9
Thank you in advance.
Best,
C
EDIT TO SOLUTION: Since this thread has gone in a few directions after the initial question, I’m including a brief primer on path analysis (or Sankey diagrams) in SAS Visual Analytics for those landing here for the first time. This blog post goes into much more detail and includes more screenshots.
In a nutshell, a path analysis helps you determine a sequence of events in a particular time window. Path analysis can be applied to a number of scenarios from understanding your customers’ behavior online, campaign analysis to help drive your next email campaign, prospect or existing customer touch point paths, and more.
The blog post presents a very basic example of path analysis that shows shared paths and other commonalities among data streams. The simple example allows for aggregation and colorization, and gives you the ability to weight paths by a given measure.
With a more advanced data set, you see a sharp increase in the number of paths, and ranking and segmentation become even more important. Segmentation is an effective grouping method that reduces the overall number of events, and you can create custom categories in SAS Visual Analytics to achieve this. This aggregated view produces a much simpler Sankey diagram.
You also have the option of path filtering, where you can select one or more events and include or exclude the items by various conditions.
In addition, SAS Visual Analytics provides a number of options to filter and rank paths.
ORIGINAL SOLUTION:
Hi,
Well, SAS Visual Analytics supports Sankey diagrams (named Path Analysis). These won't be as animated as in your example though. There is a blog with some information to get started at https://blogs.sas.com/content/sascom/2014/08/19/path-analysis-with-sas-visual-analytics.
Hope this helps!
Regards, Falko
Hi Falko, First of all, thank you for your article. It describes very well how to develop a Sankey diagram in SAS VA.
I do not have SAS VA at the moment, and my director asked me to develop a couple of sample VA reports using a trial version.
My research topic is process improvement in hospitals.
A patient has to go through multiple units and processes. We have a timestamp for each process.
Sample variables are: registration time, triage time, nurse seen time, physician seen time, lab order time, lab result time, physician decision to admit time, bed request time, bed assign time, patient transportation to assigned bed time, inpatient time ... discharge order time and patient leave time.
So this happens in sequence. My question is: through the Sankey diagram, is it possible to tell, for example, that about 35% of patients spent 20% of their time waiting for a lab result (the time between lab order and lab result)?
I would like to gather as much information as possible so that I can develop something like this during the trial days and convince my director to buy SAS VA/Statistics.
Thank you
c
Hi Chuie,
Hopefully the following will be helpful. I am not a great expert on this, but I believe it is correct.
As I understand it, the Path Analysis in VA uses a single time (or numeric) variable to identify the order of events.
It sounds like your data looks like this:
Patient | Registration | Triage | Nurse Seen | Physician Seen |
10011 | 2018/19/11 22:34:00 | 2018/19/11 22:49:00 | 2018/19/11 23:01:00 | 2018/19/11 23:32:00 |
But Path Analysis expects data like this:
Patient | Time | Status |
10011 | 2018/19/11 22:34:00 | Registration |
10011 | 2018/19/11 22:49:00 | Triage |
10011 | 2018/19/11 23:01:00 | Nurse Seen |
10011 | 2018/19/11 23:32:00 | Physician Seen |
This would give you flows that would identify how many patients are actually admitted, how many get labs drawn, and so on, and how often these events happen in each particular order. It will not automatically tell you how much time elapsed between each event.
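For anyone preparing the data outside VA, here is a minimal Python sketch of that wide-to-long reshaping. It is only an illustration: the record layout, the `wide_to_long` helper, and the reading of the sample timestamps as 19 November 2018 are assumptions, not something taken from the VA documentation.

```python
from datetime import datetime

# Hypothetical wide record: one row per patient, one column per timestamp.
# The sample dates are assumed to mean 19 November 2018.
wide_rows = [
    {
        "patient": 10011,
        "Registration": datetime(2018, 11, 19, 22, 34),
        "Triage": datetime(2018, 11, 19, 22, 49),
        "Nurse Seen": datetime(2018, 11, 19, 23, 1),
        "Physician Seen": datetime(2018, 11, 19, 23, 32),
    },
]


def wide_to_long(rows, id_key="patient"):
    """Turn one-row-per-patient data into one-row-per-event data."""
    long_rows = []
    for row in rows:
        for status, time in row.items():
            if status == id_key or time is None:
                continue  # skip the ID column and events that never happened
            long_rows.append({"patient": row[id_key], "time": time, "status": status})
    # Sort so that each patient's events appear in chronological order.
    long_rows.sort(key=lambda r: (r["patient"], r["time"]))
    return long_rows


long_rows = wide_to_long(wide_rows)
for r in long_rows:
    print(r["patient"], r["time"], r["status"])
```

Events that a patient never reached (for example, no lab order) can be left as `None` in the wide record and are simply dropped from the long table.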
If you want to analyze how much time was spent at each status, you will need to calculate the differences between the timestamps.
In VA, performing calculations on datetime values can be tricky, but you can use the TreatAs operator to enable you to add and subtract datetimes.
For example:
( TreatAs(_Number_, 'Triage Time'n) - TreatAs(_Number_, 'Registration Time'n) )
will provide the difference between registration and triage times, in seconds. You can then apply a Duration format to this value to display it as hours, minutes, and seconds (1:00:35 instead of 3635).
If you calculate these differences for all of your timestamps (or perhaps, between registration and each of the other timestamps), then you should be able to display the average durations for all the statuses between events. (Time to triage, time to see a nurse, etc.)
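The same arithmetic can be checked outside VA. The Python sketch below is not VA expression syntax; `format_duration` is a hypothetical helper that mimics an h:mm:ss display for non-negative durations, and the two timestamps are taken from the sample row above (date assumed to be 19 November 2018).

```python
from datetime import datetime


def format_duration(total_seconds):
    """Render a non-negative number of seconds as h:mm:ss."""
    total_seconds = int(total_seconds)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


registration = datetime(2018, 11, 19, 22, 34, 0)
triage = datetime(2018, 11, 19, 22, 49, 0)

# Subtracting two datetimes gives a timedelta; total_seconds() is the
# counterpart of the difference in seconds described above.
elapsed_seconds = (triage - registration).total_seconds()

print(elapsed_seconds, format_duration(elapsed_seconds))
print(format_duration(3635))
```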
You could create a lot of visualizations and calculations for these time values, but I am not sure you can show them in a Sankey diagram in the way you are asking.
I hope that helps,
Sam
Hi Sam
Thank you so so much for your detailed explanation.
I will create a data set like the one you described. Could you please walk me through how to develop the diagram?
Is there any diagram other than a Sankey to display this kind of information and answer the question I had in my earlier post?
You mentioned an earlier post. Do you mean the question about decision trees?
If you calculate the time elapsed between patient registration and discharge, this variable would be the response in your decision tree. You could leave the time as a measure, or create a custom category to break the time into ranges (for example: less than 90 minutes, 90-120 minutes, longer than 120 minutes).
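As a rough illustration of that binning (in VA itself this would be done with a custom category), the three example ranges could be expressed as follows. The function name and the choice to treat both 90 and 120 minutes as part of the middle range are assumptions.

```python
def length_of_stay_category(minutes):
    """Map minutes from registration to discharge onto the three example ranges."""
    if minutes < 90:
        return "Less than 90 minutes"
    if minutes <= 120:
        return "90-120 minutes"
    return "Longer than 120 minutes"


for stay in (45, 90, 120, 180):
    print(stay, length_of_stay_category(stay))
```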
If you just want to display how long it took patients to reach each stage of processing, then simple graphs like a bar chart might be fine. But a decision tree will more intelligently identify patterns (perhaps elderly patients with cardiac symptoms are more likely to spend 120+ minutes in hospital.)
There are two versions of the decision tree in VA, a basic version which is included by default and an advanced version that requires SAS Visual Statistics. I believe either should be fine for what you want to do, but the advanced version will give you greater control and additional features.
In the USA this is a holiday week, so we are shorthanded at the Cary campus. I will see if I can get someone with more expertise to help you.
Thanks,
Sam
Thank you Sam
What I mean is: I need to create a visualization of the patient flow in terms of the time spent in each section (registration, triage, lab, radiology, waiting room, etc.) and show visually that, for example, about 50% of patients wait an average of 2 hours for lab results.
Do we have something like that in VA?
As I mentioned, I do not have VA yet, but I have an understanding of VA and SAS Miner from my previous job. In my current job I do not have either of these yet, and I am planning to do a presentation on what we can do in SAS VA/Statistics to showcase its abilities so that my director will consider buying it.
Would you be able to guide me please?
Thank you
C
Sam,
In this picture, can the width of the patient path (blue arrow) reflect the number of patients following the sequence (thicker = more patients, thinner = fewer patients), and can the path be color-coded based on the average time between two events?
For example, 1000 patients follow B to D, so that path is thicker than A to F (20 patients, hence thinner).
In addition, can we assign the color based on some number? For example, it takes an average of 150 minutes from D to E, so that path is red.
Hi, here is an example using the data from your original post in VA 8.3 (latest production version):
The data structure to support this would look like:
And yes, the width of the path segment is driven by the number of paths (event occurrences) at this level. In your example, that would be the number of patients, so thinner paths indicate fewer patients and thicker paths indicate more. You can also replace frequency with another weight variable - for instance to represent a percentage or dollar value behind a given path.
Path analysis does not support dynamic display rules to drive color assignments yet - but I believe that's something we are looking into.
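Until something like that is available, the two numbers asked about (patients per segment and average time per segment) can be computed ahead of time and shown in another visualization. The Python sketch below only illustrates the calculation; the event log, the 120-minute benchmark, and the helper names are all made up.

```python
from collections import defaultdict

# Hypothetical event log: (patient_id, minutes_since_arrival, event).
events = [
    (1, 0, "Registration"), (1, 15, "Triage"), (1, 60, "Lab Order"), (1, 230, "Lab Result"),
    (2, 0, "Registration"), (2, 10, "Triage"), (2, 40, "Lab Order"), (2, 170, "Lab Result"),
    (3, 0, "Registration"), (3, 20, "Triage"),
]


def transition_stats(events):
    """Return {(source, target): (patient_count, average_minutes)}."""
    by_patient = defaultdict(list)
    for patient, minute, event in events:
        by_patient[patient].append((minute, event))

    totals = defaultdict(lambda: [0, 0.0])  # [count, summed minutes]
    for steps in by_patient.values():
        steps.sort()  # chronological order within one patient
        for (t0, src), (t1, dst) in zip(steps, steps[1:]):
            totals[(src, dst)][0] += 1
            totals[(src, dst)][1] += t1 - t0
    return {link: (count, total / count) for link, (count, total) in totals.items()}


def color_for(average_minutes, benchmark_minutes=120):
    """Flag a segment red when its average time exceeds the assumed benchmark."""
    return "red" if average_minutes > benchmark_minutes else "green"


stats = transition_stats(events)
for (src, dst), (count, avg) in stats.items():
    print(f"{src} -> {dst}: {count} patients, {avg:.1f} min, {color_for(avg)}")
```

The count would drive the width of a segment and the color flag would mark the segments that exceed the benchmark.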
Hope this helps! Cheers, Falko
Hi Falko.
Thank you.
Is the time variable a sequence variable? Could you please explain the weight variables in this example?
Is there a way to incorporate the time between two events in the sequence and display it in this diagram? I completely understand that the thicker band represents the weight variable, but what if I would also like to represent some consequence of moving from point A to B (lab drawn to lab result), for example that the time in between is higher than a benchmark?
My dream result 🙂 : all referred patients have a first consultation and about x% have received an OR receipt. The average time from first consult to OR receipt is Y, which is 55% more than a national benchmark (hence red)... something like this.
On a side note: would you please list your other favorite visualizations in SAS VA/Statistics besides Sankey diagrams and decision trees?
Thank you
c
Hi!
Yes, the sequence is typically a time/date variable - but doesn't have to be. Any measure indicating the 'order' of events would work here.
A weight variable specifies a measure which determines the weight for each transaction and event. It replaces the default frequency (number of events for a given path segment) and is typically used to represent the path width using an aggregated measure given the context of the paths. In your example - something like % of patients with/without receipt could be used. Or something like the actual cost behind consultations may be interesting ($ value). More information about path analysis and supported roles are in the user doc: https://go.documentation.sas.com/?cdcId=vacdc&cdcVersion=8.3&docsetId=vaobj&docsetTarget=p0iff44qtc0...
If you want to explore specific consequences (or event segments) - you would typically right-click on the segment of interest and either include/exclude paths or derive another visualization based on your selection. The following screenshot shows detailed path information for patients having an 'OR-receipt' event in their paths:
Once selected - another visualization is created with just these specific patients. You can change the viz type to something else if required from here on:
Does that help? It may make sense for you to get hands-on time in a real environment so you can get a feel for what the system provides and what analyses you can run. Visual Analytics is a very (very :-)) flexible and interactive data exploration and analysis tool - so often you find results by just navigating through your data.
You could also merge other information into these derived visualizations, including time information (e.g. avg time of first consultation) or other patient details (e.g. name/city). You can also apply dynamic display rules to color code bars given a specific threshold value.
I'm not sure there is such a thing as a 'favorite' visualization. It really depends on what you want to find out ;-). Things like path analysis are great for transactional data and for understanding how identities (e.g. patients or customers) flow through your workflow. On the other hand - a visualization such as a network plot is great for visualizing relationships, for example to explore a doctor-patient relationship.
So it really depends on what your requirements are. A good overview of the supported visualizations is in the user doc: https://go.documentation.sas.com/?cdcId=vacdc&cdcVersion=8.3&docsetId=vaobj&docsetTarget=titlepage.h... . If you have a local SAS representative - it may also make sense to get in touch. Often we have prepared industry-specific examples and demonstrations which may be useful for you to get started.
Hope this helps.
Regards, Falko
Thank you for the detailed explanation, Falko and Sam.
You are both amazing gems of SAS VA.
Kudos to you both.
Best,
C
Hi Falko,
Please help.
I would like to create a path analysis for the 5 ESI (Emergency Severity Index) levels. Through this diagram I would like to show what % of each ESI level goes to which events.
I expect this diagram to show, for example, that about x% of patients with ESI 1 go straight to a physician and 80% of them are admitted, whereas only 5% of ESI 5 patients are admitted.
I am not sure if the data format is correct for this. I tried it with the weight, and all of the weights for each ESI are summed up for some reason.
What I mean by this data is that for ESI 1, 25% had a triage, 20% were seen by a nurse, 61% saw a doctor, 55% had a lab, and for 85% a decision to admit was made.
Sequence | EVENT | WEIGHT | ID |
1 | 1-ESI | 1 | 1 |
2 | Triage | 25 | 1 |
3 | RN | 20 | 1 |
4 | Doctor | 61 | 1 |
5 | LAB/RADIOLOGY | 55 | 1 |
6 | DECIDED TO ADMIT | 85 | 1 |
1 | 2-ESI | 1 | 2 |
2 | Triage | 22 | 2 |
3 | RN | 80 | 2 |
4 | Doctor | 65 | 2 |
5 | LAB/RADIOLOGY | 59 | 2 |
6 | DECIDED TO ADMIT | 78 | 2 |
1 | 3-ESI | 1 | 3 |
2 | Triage | 18 | 3 |
3 | RN | 18 | 3 |
4 | Doctor | 63 | 3 |
5 | LAB/RADIOLOGY | 25 | 3 |
6 | DECIDED TO ADMIT | 41 | 3 |
1 | 4-ESI | 1 | 4 |
2 | Triage | 5 | 4 |
3 | RN | 29 | 4 |
4 | Doctor | 45 | 4 |
5 | LAB/RADIOLOGY | 56 | 4 |
6 | DECIDED TO ADMIT | 20 | 4 |
1 | 5-ESI | 1 | 5 |
2 | Triage | 95 | 5 |
3 | RN | 15 | 5 |
4 | Doctor | 19 | 5 |
5 | LAB/RADIOLOGY | 65 | 5 |
6 | DECIDED TO ADMIT | 12 | 5 |
Hi!
I don't think this will work the way you want unfortunately. The weight variable is aggregated on path level and not on event level (see doc for some more details). This means - all of your weight values (across all events) for 1-ESI are aggregated together and represent the path's weight. Given you want to aggregate on individual event level - this wouldn't produce the desired outcome.
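To make the difference concrete, here is a small Python sketch using the ESI 1 and ESI 5 rows from the table above. It assumes the path-level aggregation is a plain sum, which matches what was observed in the question but is not confirmed here for every configuration.

```python
from collections import defaultdict

# (sequence, event, weight, id) rows for ESI 1 and ESI 5, copied from the table above.
rows = [
    (1, "1-ESI", 1, 1), (2, "Triage", 25, 1), (3, "RN", 20, 1),
    (4, "Doctor", 61, 1), (5, "LAB/RADIOLOGY", 55, 1), (6, "DECIDED TO ADMIT", 85, 1),
    (1, "5-ESI", 1, 5), (2, "Triage", 95, 5), (3, "RN", 15, 5),
    (4, "Doctor", 19, 5), (5, "LAB/RADIOLOGY", 65, 5), (6, "DECIDED TO ADMIT", 12, 5),
]

# Path-level aggregation (assumed to be a sum): one number per ID,
# so the per-event percentages are lost.
path_weights = defaultdict(int)
for _, _, weight, path_id in rows:
    path_weights[path_id] += weight

# Event-level view: what the question was actually after.
event_weights = {(path_id, event): weight for _, event, weight, path_id in rows}

print(dict(path_weights))
print(event_weights[(1, "Triage")], event_weights[(5, "DECIDED TO ADMIT")])
```

With one path per ID, every segment of that path carries the single summed value rather than the 25, 20, 61, ... entered per event.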
Alternatively - have you tried a network visualization yet? I'm attaching a sample data set with source and target roles using EVENT_SOURCE/EVENT_TARGET and the WEIGHT variable as link width.
[Screenshot: single ESI level selected via top button bar]
[Screenshot: multi-level with EVENT_TYPE as link color]
Not sure it's exactly what you need but I believe values are aggregated the way you need it. Note, that I applied the PERCENT format to the WEIGHT variable.
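The attached data set is not reproduced in this thread, so the following Python sketch is only a guess at how such a source/target table could be derived from the sequence data in the question. The column names EVENT_SOURCE, EVENT_TARGET, WEIGHT and EVENT_TYPE come from this post; attaching each link's weight to its target event and storing it as a fraction are assumptions.

```python
# (sequence, event, weight) rows for ESI 1, copied from the question.
esi1_rows = [
    (1, "1-ESI", 1), (2, "Triage", 25), (3, "RN", 20),
    (4, "Doctor", 61), (5, "LAB/RADIOLOGY", 55), (6, "DECIDED TO ADMIT", 85),
]


def to_links(rows, event_type):
    """Pair each event with the next one to form source/target links."""
    ordered = sorted(rows)  # order by the sequence number
    links = []
    for (_, source, _), (_, target, weight) in zip(ordered, ordered[1:]):
        links.append({
            "EVENT_TYPE": event_type,
            "EVENT_SOURCE": source,
            "EVENT_TARGET": target,
            # Assumption: stored as a fraction so that a percent-style
            # format would display 0.25 as 25%.
            "WEIGHT": weight / 100,
        })
    return links


links = to_links(esi1_rows, "1-ESI")
for link in links:
    print(link)
```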
Hope this helps. Falko
Thank you Falko.
This network diagram works great as an alternative.
I assume you are a builder/creator of this awesome "path analysis" diagram in SAS VA.
Thank you !!
Just a thought: it would be great if you could add a way to pick the color of each event, beyond the general choices of 1. entire path 2. event 3. drop off in the link color drop-down.
So for my example I would show the patient flow, where the width of the path is the frequency (the thicker the path, the more patients in it), and then I would pick the color of each segment (event A to event B), changing the color (red-orange-yellow-green) based on some metric. The diagram would then be a three-dimensional view combining frequency and a status metric, and just by looking at it one could tell that this section (event A to event B) has more patients and is also red, so we need to focus here....
Does that make sense?:)
Thank you once again for your prompt reply.
Your attention to detail is very much appreciated.
C