<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Enterprise Miner Association Node with TS Similarity in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Association-Node-with-TS-Similarity/m-p/800723#M9105</link>
    <description>&lt;P&gt;Thanks Koen,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So you think Viya could handle&amp;nbsp;&lt;SPAN&gt;ICD-10 codes as factors and do time warping on them?&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Mon, 07 Mar 2022 20:07:10 GMT</pubDate>
    <dc:creator>Timg</dc:creator>
    <dc:date>2022-03-07T20:07:10Z</dc:date>
    <item>
      <title>Enterprise Miner Association Node with TS Similarity</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Association-Node-with-TS-Similarity/m-p/799937#M9102</link>
      <description>&lt;P&gt;I have been working with ICD-10 health care diagnosis codes arranged in sequence to uncover patterns in patient utilization. I am modeling my process on the market basket analysis often done in retail. My data is arranged as follows:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Timg_0-1646330370029.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/69116i2F931FDE377DDF76/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Timg_0-1646330370029.png" alt="Timg_0-1646330370029.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Where:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;“ITEM” is the target and contains ICD-10 codes, e.g., “ABNORMAL GLUCOSE COMP PREGNANCY”&lt;/LI&gt;&lt;LI&gt;“MCID” is my ID, e.g., “1,2,3”&lt;/LI&gt;&lt;LI&gt;“SEQ” is my sequence variable, e.g., “1,2,3”, which is converted from a date format in Base SAS. The data are sorted by ID and SEQ. SEQ can be as high as 50&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Sorry, I can’t share the data as it contains personal information.
Below is an approximation&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;MCID&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;ITEM&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;SEQ&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;POISN METHYLPHENIDATE UNDET SEQUELA&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;SKELETAL FLUOROSIS THIGH&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;2&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;ABNORMAL GLUCOSE COMP PREGNANCY&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;3&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;LOW-TENSION GLAUCOMA&amp;nbsp; RIGHT EYE&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;4&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;2&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;NDSPLC FX PROX PHAL RT RF INIT OP&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;2&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;OTH INJ RAD ART WRST HND LT ARM SEQ&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;2&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;2&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;INJ ABDUCENT NERVE RT SIDE INITIAL&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;3&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;2&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;OTH INJ LT QUAD MUSC FASC TEND SEQ&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;4&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;3&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;INF INFLM RXN PROS DEV GFT URIN SYS&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;3&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;NDSPL FX CAPITATE BN RT WRST SB 
RTN&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;2&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;3&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;LAC M&amp;amp;T LNG EXT TOE ANK FT UNS SIDE&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;3&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My analysis runs fine, but the results are shallow. Even with thousands of patients and 7 years of data, the rules I get, including those with low confidence and support, are very basic and have chain lengths of only 3-4.&lt;/P&gt;&lt;P&gt;Example:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;Chain Length&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Transaction Count&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Support(%)&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Confidence(%)&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;PseudoLift&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Rule&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;3&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;2896&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;2.005082&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;36.6768&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;1.587408&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&amp;nbsp;LOW BACK PAIN ==&amp;gt; CONTCT EXPS OTH VIRL COMMUNICABL DZ ==&amp;gt; CONTACT W/AND (SUSP) EXPOS COVID-19&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I think the problem lies in the alignment of each subject’s diagnosis sequences: there are groups of subjects with similar patterns, but they are not aligned in a way that SAS can process. I experimented with the Chain Count and Consolidate Time settings in the Association node, but “0” seemed to work best.
My next step is to investigate TS Data Prep and TS Similarity / Time Warping, but I am not finding any good learning resources, and TS Data Prep needs my target variable to be interval, which I cannot provide (ICD-10 codes are factors / nominal).&lt;/P&gt;&lt;P&gt;Thanks for reading this long post. I would appreciate any learning documents or comments on how I might get a better analysis.&lt;/P&gt;</description>
      <pubDate>Thu, 03 Mar 2022 18:02:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Association-Node-with-TS-Similarity/m-p/799937#M9102</guid>
      <dc:creator>Timg</dc:creator>
      <dc:date>2022-03-03T18:02:42Z</dc:date>
    </item>
    <item>
      <title>Re: Enterprise Miner Association Node with TS Similarity</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Association-Node-with-TS-Similarity/m-p/800271#M9103</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The six nodes on the Time Series tab of your Enterprise Miner diagram will not get you any further, as they are designed for numerical time series.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Recurrent Neural Networks (RNNs) are specifically designed to handle sequence data, such as speech, text, time series, and so on. RNNs are called recurrent because they perform the same task for every element of a sequence. The output for each element depends on the computations of its preceding elements.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Unfortunately, RNNs are NOT in Enterprise Miner (but they are&amp;nbsp;in SAS VIYA Model&amp;nbsp;Studio).&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;You could call Python RNNs from Enterprise Miner though.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;What you can also do:&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Turn each ICD-10 diagnosis code into its own variable.&lt;BR /&gt;Give every patient a 1 / 0 (Y / N) value for each diagnosis.&lt;BR /&gt;You can then calculate the distance between patients (for example, with PROC DISTANCE and the&amp;nbsp;Jaccard coefficient).&lt;BR /&gt;Using the distance matrix, you can then cluster the patients.&lt;BR /&gt;However, this approach disregards the sequence of events, so it may not be what you want.&lt;/SPAN&gt;&lt;/P&gt;
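&lt;P&gt;&lt;SPAN&gt;As a rough illustration of the 1 / 0 coding and Jaccard distance described above, here is a minimal sketch in plain Python (not PROC DISTANCE). The toy rows and the function names are made up for illustration; holding each patient's set of diagnoses is treated here as equivalent to holding the 1 / 0 indicators.&lt;/SPAN&gt;&lt;/P&gt;

```python
from itertools import combinations

# Hypothetical toy data: (patient_id, diagnosis) rows, made up for illustration.
rows = [
    (1, "LOW BACK PAIN"),
    (1, "LOW-TENSION GLAUCOMA RIGHT EYE"),
    (2, "LOW BACK PAIN"),
    (2, "SKELETAL FLUOROSIS THIGH"),
    (3, "LOW BACK PAIN"),
    (3, "LOW-TENSION GLAUCOMA RIGHT EYE"),
]


def diagnosis_sets(rows):
    """Collect the set of diagnoses per patient (the 1/0 indicators as a set)."""
    sets = {}
    for patient_id, item in rows:
        sets.setdefault(patient_id, set()).add(item)
    return sets


def jaccard_distance(a, b):
    """Return 1 - (size of intersection / size of union); 0.0 if both are empty."""
    union = a.union(b)
    if not union:
        return 0.0
    return 1.0 - len(a.intersection(b)) / len(union)


sets = diagnosis_sets(rows)

# Pairwise distances between patients, keyed by (patient, patient).
distances = {
    (p, q): jaccard_distance(sets[p], sets[q])
    for p, q in combinations(sorted(sets), 2)
}
print(distances)
```

&lt;P&gt;&lt;SPAN&gt;The resulting pairwise distances could then be fed into a clustering step. As noted, the order of the diagnoses plays no role in this calculation.&lt;/SPAN&gt;&lt;/P&gt;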
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Good luck,&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Koen&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 04 Mar 2022 19:08:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Association-Node-with-TS-Similarity/m-p/800271#M9103</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2022-03-04T19:08:45Z</dc:date>
    </item>
    <item>
      <title>Re: Enterprise Miner Association Node with TS Similarity</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Association-Node-with-TS-Similarity/m-p/800723#M9105</link>
      <description>&lt;P&gt;Thanks Koen,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So you think Viya could handle&amp;nbsp;&lt;SPAN&gt;ICD-10 codes as factors and do time warping on them?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 07 Mar 2022 20:07:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Association-Node-with-TS-Similarity/m-p/800723#M9105</guid>
      <dc:creator>Timg</dc:creator>
      <dc:date>2022-03-07T20:07:10Z</dc:date>
    </item>
    <item>
      <title>Re: Enterprise Miner Association Node with TS Similarity</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Association-Node-with-TS-Similarity/m-p/800729#M9106</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I do not think you can use&amp;nbsp;recurrent neural networks (RNNs) in SAS VIYA to do dynamic time warping.&lt;/P&gt;
&lt;P&gt;But your question is an interesting one (time warping on time-stamped sequences of&amp;nbsp;&lt;SPAN&gt;ICD-10 codes).&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;Let me investigate.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Kind regards,&lt;/P&gt;
&lt;P&gt;Koen&lt;/P&gt;</description>
      <pubDate>Mon, 07 Mar 2022 21:02:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Association-Node-with-TS-Similarity/m-p/800729#M9106</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2022-03-07T21:02:36Z</dc:date>
    </item>
    <item>
      <title>Re: Enterprise Miner Association Node with TS Similarity</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Association-Node-with-TS-Similarity/m-p/800903#M9108</link>
      <description>&lt;P&gt;Hello &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/48417"&gt;@Timg&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I asked two colleagues for information.&lt;/P&gt;
&lt;P&gt;Here is what I found out (thanks to them).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;SAX (&lt;U&gt;&lt;STRONG&gt;S&lt;/STRONG&gt;&lt;/U&gt;&lt;SPAN&gt;ymbolic &lt;STRONG&gt;&lt;U&gt;A&lt;/U&gt;&lt;/STRONG&gt;ggregate appro&lt;STRONG&gt;&lt;U&gt;X&lt;/U&gt;&lt;/STRONG&gt;imation&lt;/SPAN&gt;) provides a way to measure the distance between two strings that represent time series. See:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://go.documentation.sas.com/doc/en/pgmsascdc/v_016/castsp/castsp_tsd_sect047.htm" target="_blank"&gt;https://go.documentation.sas.com/doc/en/pgmsascdc/v_016/castsp/castsp_tsd_sect047.htm&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fjmotif.github.io%2Fsax-vsm_site%2Fmorea%2Falgorithm%2FSAX.html&amp;amp;data=04%7C01%7Ckoen.knapen%40sas.com%7C6c86d024073c49e8b18a08da0106b221%7Cb1c14d5c362545b3a4309552373a0c2f%7C0%7C0%7C637823425562946501%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;amp;sdata=vjs3%2FXARXGIPqQkJ4xCp2FbOeul87yNXPcMOEwn53sk%3D&amp;amp;reserved=0" target="_blank"&gt;https://jmotif.github.io/sax-vsm_site/morea/algorithm/SAX.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Since the TSD (&lt;STRONG&gt;&lt;U&gt;T&lt;/U&gt;&lt;/STRONG&gt;ime &lt;STRONG&gt;&lt;U&gt;S&lt;/U&gt;&lt;/STRONG&gt;eries &lt;STRONG&gt;&lt;U&gt;D&lt;/U&gt;&lt;/STRONG&gt;istance) package does not accept text sequences as input data, you cannot use TSD/DTW (&lt;U&gt;&lt;STRONG&gt;D&lt;/STRONG&gt;&lt;/U&gt;&lt;SPAN&gt;ynamic &lt;STRONG&gt;&lt;U&gt;T&lt;/U&gt;&lt;/STRONG&gt;ime &lt;STRONG&gt;&lt;U&gt;W&lt;/U&gt;&lt;/STRONG&gt;arping)&amp;nbsp;&lt;/SPAN&gt;directly.&lt;/P&gt;
&lt;P&gt;However, if you map the text items to numbers, assuming you know all the words that occur in all the text sequences, you can use DTW in the TSD package with timeid = obs.&lt;/P&gt;
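&lt;P&gt;To make the idea of mapping items to numbers and then warping more concrete, here is a minimal sketch in plain Python (not the TSD package). The function names and the toy sequences are hypothetical. One assumption to flag: because the integer IDs of nominal codes have no meaningful numeric distance, the sketch uses a 0/1 mismatch cost inside the DTW recursion rather than an absolute difference.&lt;/P&gt;

```python
def encode(sequences):
    """Map each distinct item to an integer ID, in order of first appearance."""
    ids = {}
    encoded = []
    for seq in sequences:
        encoded.append([ids.setdefault(item, len(ids)) for item in seq])
    return encoded, ids


def dtw_distance(a, b):
    """Dynamic-programming DTW with a 0/1 mismatch cost (an assumption here)."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Nominal codes: a step costs 0 on a match and 1 on a mismatch.
            step = 0.0 if a[i - 1] == b[j - 1] else 1.0
            cost[i][j] = step + min(
                cost[i - 1][j],      # repeat the current element of b
                cost[i][j - 1],      # repeat the current element of a
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


# Hypothetical diagnosis sequences for three patients, ordered by SEQ.
sequences = [
    ["A", "B", "B", "C"],
    ["A", "B", "C"],
    ["C", "B", "A"],
]
encoded, ids = encode(sequences)

print(dtw_distance(encoded[0], encoded[1]))
print(dtw_distance(encoded[1], encoded[2]))
```

&lt;P&gt;In this toy case the first two patients differ only by a repeated item, which warping absorbs, while the reversed sequence stays far away.&lt;/P&gt;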
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The longest common subsequence example in the TSD package uses PROC FORMAT for a similar problem.&lt;/P&gt;
&lt;P&gt;&lt;A href="https://go.documentation.sas.com/doc/en/pgmsascdc/v_016/castsp/castsp_tsd_sect084.htm" target="_blank"&gt;https://go.documentation.sas.com/doc/en/pgmsascdc/v_016/castsp/castsp_tsd_sect084.htm&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note: For the TSD (Time Series Distance Measure) Package, you need a Visual Forecasting license in SAS VIYA 3.5+.&lt;/P&gt;
&lt;P&gt;TSD contains SAX and DTW.&lt;/P&gt;
&lt;P&gt;Visual Forecasting also offers PROC TSMODEL.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Good luck,&lt;/P&gt;
&lt;P&gt;Koen&lt;/P&gt;</description>
      <pubDate>Tue, 08 Mar 2022 16:51:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Enterprise-Miner-Association-Node-with-TS-Similarity/m-p/800903#M9108</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2022-03-08T16:51:09Z</dc:date>
    </item>
  </channel>
</rss>

