<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Medical data: matching diagnosis codes with multiple variables/codes with arrays and partial mat in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/829235#M327606</link>
    <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;wrote:&lt;/BLOCKQUOTE&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;I would replace this step with a similar process to what you had before. Import the data and create the macro lists using a SQL step.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;You can simplify your code a bit as follows, declaring arrays only once, two loops rather than 4&lt;/SPAN&gt;&lt;SPAN&gt;. You could add back a more complex UNTIL statement checking both or just let it loop through the full list - I'd probably let it loop through all. Untested as no data.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;This looks promising!&amp;nbsp; It would eliminate the most tedious part of the job -- manually copying the codes and formatting them into lists.&amp;nbsp; I use a macro to import the tabs from the Excel file, so I would only have to add the list-creating proc sql once. Something like this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql noprint;
select quote(dx) into :&amp;amp;sheet._list separated by ", "
from &amp;amp;sheet.;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;And simplifying the do loops will save a lot of lines of code.&amp;nbsp; I also like that it's similar to the existing syntax that I already understand.&amp;nbsp; I will start implementing these changes and let u know how it works.&amp;nbsp; Thanks!&lt;span class="lia-unicode-emoji" title=":beaming_face_with_smiling_eyes:"&gt;😁&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 18 Aug 2022 16:34:59 GMT</pubDate>
    <dc:creator>Wolverine</dc:creator>
    <dc:date>2022-08-18T16:34:59Z</dc:date>
    <item>
      <title>Medical data: matching diagnosis codes with multiple variables/codes with arrays and partial matches</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/829087#M327541</link>
      <description>&lt;P&gt;I was given an Excel file with 23 tabs.&amp;nbsp; Each of these tabs includes a list of diagnosis and/or medical procedure codes that represent a given medical condition.&amp;nbsp; Many of these codes are stems, so a code such as 0WJG may in fact represent 0WJG1,&amp;nbsp;0WJG2,&amp;nbsp;0WJGA, etc. I need to match these codes to patient data files to flag patients who have these conditions.&amp;nbsp; These data files have 37 diagnosis variables and 26 procedure variables.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I've done similar work in the past, but with 2 important differences -- all the codes were completely listed (ie, no code stems), and there were only a few DX/proc variables in the patient data.&amp;nbsp; So I would import the Excel tabs, create a flag variable for each list of codes from each tab, and then combine those tabs into a single SAS file. Then I used PROC SQL to merge the data files with the flag file on DX_code OR proc_code.&amp;nbsp; Then I took the max of each flag variable, grouped by patient, to identify all patients who had at least 1 record with that flag.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The current situation is more complicated.&amp;nbsp; I have syntax that works, but it's tedious to program.&amp;nbsp; I will be using this data quite a bit in the future, so I'd like to find a better way.&amp;nbsp; I manually copied the codes from each tab and created list variables.&amp;nbsp; I then used arrays to match across multiple variables, using the in: operator to match codes to code stems.&amp;nbsp; Below is a portion of this syntax for 2 of the code lists.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%let Surg_mgmt_other_list = ("49320","49000","58578","0WJG","0WJG","0UQ9");/*DX and proc*/	
%let Cervical_inj_list = ("57720","0UQC");/*DX and proc*/

DATA want; SET have;
 array dx414 PRINCIPAL_DIAGNOSIS_CODE DIAGNOSIS_CODE_1-DIAGNOSIS_CODE_36;
  do index=1 to dim(dx414) until(Surg_mgmt_other_indc);
    Surg_mgmt_other_indc= dx414[index] in: &amp;amp;Surg_mgmt_other_list. ; 
end;
 array pr414 PROCEDURE_CODE ICD_PROCEDURE_CODE_1-ICD_PROCEDURE_CODE_25;
  do index=1 to dim(pr414) until(Surg_mgmt_other_indc);
    Surg_mgmt_other_indc= pr414[index] in: &amp;amp;Surg_mgmt_other_list. ; 
end;
IF Surg_mgmt_other_indc ^= 1 THEN Surg_mgmt_other_indc = 0;

 array dx415 PRINCIPAL_DIAGNOSIS_CODE DIAGNOSIS_CODE_1-DIAGNOSIS_CODE_36;
  do index=1 to dim(dx415) until(Cervical_inj_indc);
    Cervical_inj_indc= dx415[index] in: &amp;amp;Cervical_inj_list. ; 
end;
 array pr415 PROCEDURE_CODE ICD_PROCEDURE_CODE_1-ICD_PROCEDURE_CODE_25;
  do index=1 to dim(pr415) until(Cervical_inj_indc);
    Cervical_inj_indc= pr415[index] in: &amp;amp;Cervical_inj_list. ; 
end;
IF Cervical_inj_indc ^= 1 THEN Cervical_inj_indc = 0;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 17 Aug 2022 17:20:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/829087#M327541</guid>
      <dc:creator>Wolverine</dc:creator>
      <dc:date>2022-08-17T17:20:48Z</dc:date>
    </item>
    <item>
      <title>Re: Medical data: matching diagnosis codes with multiple variables/codes with arrays and partial mat</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/829096#M327547</link>
      <description>&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;The current situation is more complicated.&amp;nbsp; I have syntax that works, but it's tedious to program.&amp;nbsp; I will be using this data quite a bit in the future, so I'd like to find a better way.&amp;nbsp; I manually copied the codes from each tab and created list variables.&amp;nbsp; I then used arrays to match across multiple variables, using the in: operator to match codes to code stems.&amp;nbsp; Below is a portion of this syntax for 2 of the code lists.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;I would replace this step with a similar process to what you had before. Import the data and create the macro lists using a SQL step.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;I don't think this will scale well, but another option is to flip your diag list to a long format and merge with the other lists, identify the flags and merge back the flags with the main data set. &lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;You would need to replicate the process with the proc list. Since both of these lists usually have a lot of empty fields this isn't too bad of a process.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt; It's a somewhat easier process but I can understand the desire to keep it all in a single data step using the codes above.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;You can simplify your code a bit as follows, declaring arrays only once, two loops rather than 4&lt;/SPAN&gt;&lt;SPAN&gt;. You could add back a more complex UNTIL statement checking both or just let it loop through the full list - I'd probably let it loop through all. Untested as no data.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql noprint;
select quote(dx) into :surg_mgmt_other_list separated by ", "
from excel_import1;
quit;

proc sql noprint;
select quote(dx) into :cervical_inj_list separated by ", "
from excel_import2;
quit;


DATA want; 
SET have;
 
array dx_codes PRINCIPAL_DIAGNOSIS_CODE DIAGNOSIS_CODE_1-DIAGNOSIS_CODE_36;
 array proc_codes PROCEDURE_CODE ICD_PROCEDURE_CODE_1-ICD_PROCEDURE_CODE_25;

Surg_mgmt_other_indc=0;
Cervical_inj_indc=0;

  do index=1 to dim(dx_codes );
    Surg_mgmt_other_indc= dx_codes[index] in: (&amp;amp;Surg_mgmt_other_list. ); 
    Cervical_inj_indc= dx_codes[index] in: (&amp;amp;Cervical_inj_list.) ; 
end;



  do index=1 to dim(proc_codes) until(Surg_mgmt_other_indc=1 and Cervical_inj_indc=1);
    Surg_mgmt_other_indc= proc_codes[index] in: (&amp;amp;Surg_mgmt_other_list.) ; 
    Cervical_inj_indc= proc_codes[index] in: (&amp;amp;Cervical_inj_list.); 
end;

&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN&gt;HTH&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 17 Aug 2022 18:11:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/829096#M327547</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2022-08-17T18:11:29Z</dc:date>
    </item>
    <item>
      <title>Re: Medical data: matching diagnosis codes with multiple variables/codes with arrays and partial mat</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/829140#M327568</link>
      <description>This makes me think of hash objects. I think this can be done nicely by loading the xcel worksheets via datasets into hash lookup tables. The only hurdle is stem matching. I think it can be overcome by creating a stem variable for the patients that is a substring of the original longer ICD code. This stem on the subjects data set can be matched to diagnosis using hashes. How is a hash better than merging? It is an alternative that seems more fun and flexible: &lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lrcon/n1b4cbtmb049xtn1vh9x4waiioz4.htm" target="_blank"&gt;https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lrcon/n1b4cbtmb049xtn1vh9x4waiioz4.htm&lt;/A&gt;</description>
      <pubDate>Thu, 18 Aug 2022 03:16:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/829140#M327568</guid>
      <dc:creator>pink_poodle</dc:creator>
      <dc:date>2022-08-18T03:16:18Z</dc:date>
    </item>
    <item>
      <title>Re: Medical data: matching diagnosis codes with multiple variables/codes with arrays and partial mat</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/829235#M327606</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;wrote:&lt;/BLOCKQUOTE&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;I would replace this step with a similar process to what you had before. Import the data and create the macro lists using a SQL step.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;You can simplify your code a bit as follows, declaring arrays only once, two loops rather than 4&lt;/SPAN&gt;&lt;SPAN&gt;. You could add back a more complex UNTIL statement checking both or just let it loop through the full list - I'd probably let it loop through all. Untested as no data.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;This looks promising!&amp;nbsp; It would eliminate the most tedious part of the job -- manually copying the codes and formatting them into lists.&amp;nbsp; I use a macro to import the tabs from the Excel file, so I would only have to add the list-creating proc sql once. Something like this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql noprint;
select quote(dx) into :&amp;amp;sheet._list separated by ", "
from &amp;amp;sheet.;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;And simplifying the do loops will save a lot of lines of code.&amp;nbsp; I also like that it's similar to the existing syntax that I already understand.&amp;nbsp; I will start implementing these changes and let u know how it works.&amp;nbsp; Thanks!&lt;span class="lia-unicode-emoji" title=":beaming_face_with_smiling_eyes:"&gt;😁&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 18 Aug 2022 16:34:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/829235#M327606</guid>
      <dc:creator>Wolverine</dc:creator>
      <dc:date>2022-08-18T16:34:59Z</dc:date>
    </item>
    <item>
      <title>Re: Medical data: matching diagnosis codes with multiple variables/codes with arrays and partial mat</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/830084#M327989</link>
      <description>&lt;P&gt;Some other tasks came up, but I finally got a chance to get back to this, and unfortunately, the array part of the syntax doesn't work properly.&amp;nbsp; All but one of the _indc variables = 0 for every record in the data file.&amp;nbsp; The only exception came up with 54 records that = 1 (out of ~3.5M records).&amp;nbsp; In any event, the frequencies don't match the frequencies from my original approach.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Could the proc code array be overwriting the DX array?&amp;nbsp; I just don't understand arrays well enough to know what it's doing at each step.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 24 Aug 2022 14:24:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/830084#M327989</guid>
      <dc:creator>Wolverine</dc:creator>
      <dc:date>2022-08-24T14:24:48Z</dc:date>
    </item>
    <item>
      <title>Re: Medical data: matching diagnosis codes with multiple variables/codes with arrays and partial mat</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/830097#M327996</link>
      <description>&lt;P&gt;Check that the macro variables are being created the same. The arrays have no change except you're not declaring them multiple points of time.&lt;/P&gt;</description>
      <pubDate>Wed, 24 Aug 2022 15:07:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/830097#M327996</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2022-08-24T15:07:58Z</dc:date>
    </item>
    <item>
      <title>Re: Medical data: matching diagnosis codes with multiple variables/codes with arrays and partial mat</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/830100#M327999</link>
      <description>&lt;P&gt;I haven't updated that part of the syntax yet, so the _list variables are exactly the same.&amp;nbsp; I'm still using the manually-created hardcoded lists&lt;/P&gt;</description>
      <pubDate>Wed, 24 Aug 2022 15:24:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/830100#M327999</guid>
      <dc:creator>Wolverine</dc:creator>
      <dc:date>2022-08-24T15:24:59Z</dc:date>
    </item>
    <item>
      <title>Re: Medical data: matching diagnosis codes with multiple variables/codes with arrays and partial mat</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/830102#M328001</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql noprint;
select quote(dx) into :surg_mgmt_other_list separated by ", "
from excel_import1;
quit;

proc sql noprint;
select quote(dx) into :cervical_inj_list separated by ", "
from excel_import2;
quit;


DATA want; 
SET have;
 
array dx_codes PRINCIPAL_DIAGNOSIS_CODE DIAGNOSIS_CODE_1-DIAGNOSIS_CODE_36;
 array proc_codes PROCEDURE_CODE ICD_PROCEDURE_CODE_1-ICD_PROCEDURE_CODE_25;

Surg_mgmt_other_indc=0;
Cervical_inj_indc=0;

  do index=1 to dim(dx_codes );
    if dx_codes[index] in: (&amp;amp;Surg_mgmt_other_list. ) then Surg_mgmt_other_indc=1;
    if dx_codes[index] in: (&amp;amp;Cervical_inj_list.) then Cervical_inj_indc=1;
end;



  do index=1 to dim(proc_codes);
    if proc_codes[index] in: (&amp;amp;Surg_mgmt_other_list.) then Surg_mgmt_other_indc= 1 ; 
    if  proc_codes[index] in: (&amp;amp;Cervical_inj_list.) then Cervical_inj_indc=1; 
end;

run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You're correct that would have coded the variables incorrectly as it's an assignment, try this approach instead.&lt;/P&gt;</description>
      <pubDate>Wed, 24 Aug 2022 15:31:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/830102#M328001</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2022-08-24T15:31:49Z</dc:date>
    </item>
    <item>
      <title>Re: Medical data: matching diagnosis codes with multiple variables/codes with arrays and partial mat</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/830122#M328013</link>
      <description>&lt;P&gt;There is no need to define multiple arrays that contain the exact same set of variables.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;array dx414 PRINCIPAL_DIAGNOSIS_CODE DIAGNOSIS_CODE_1-DIAGNOSIS_CODE_36;
array dx415 PRINCIPAL_DIAGNOSIS_CODE DIAGNOSIS_CODE_1-DIAGNOSIS_CODE_36;

array pr414 PROCEDURE_CODE ICD_PROCEDURE_CODE_1-ICD_PROCEDURE_CODE_25;
array pr415 PROCEDURE_CODE ICD_PROCEDURE_CODE_1-ICD_PROCEDURE_CODE_25;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Just make one array for each type and then re-use those same arrays when searching for other indications.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;array dx_codes PRINCIPAL_DIAGNOSIS_CODE DIAGNOSIS_CODE_1-DIAGNOSIS_CODE_36;
array pr_codes PROCEDURE_CODE ICD_PROCEDURE_CODE_1-ICD_PROCEDURE_CODE_25;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 24 Aug 2022 16:00:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/830122#M328013</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2022-08-24T16:00:31Z</dc:date>
    </item>
    <item>
      <title>Re: Medical data: matching diagnosis codes with multiple variables/codes with arrays and partial mat</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/830130#M328019</link>
      <description>This works and provides the same exact frequencies as the original programming.  Thanks!</description>
      <pubDate>Wed, 24 Aug 2022 16:35:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Medical-data-matching-diagnosis-codes-with-multiple-variables/m-p/830130#M328019</guid>
      <dc:creator>Wolverine</dc:creator>
      <dc:date>2022-08-24T16:35:38Z</dc:date>
    </item>
  </channel>
</rss>

