<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How do I parse out a text field that contains multiple fields in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/517294#M139824</link>
    <description>&lt;P&gt;Parsing comes second.&amp;nbsp; First step is making decisions about what the goal is.&amp;nbsp; For example:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Do you want to create three variables named Category, Age, and License?&amp;nbsp; Will that always be true, or will it depend on what is inside Unique_Fields?&lt;/LI&gt;
&lt;LI&gt;Which new variables should be character and which should be numeric?&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 29 Nov 2018 23:03:35 GMT</pubDate>
    <dc:creator>Astounding</dc:creator>
    <dc:date>2018-11-29T23:03:35Z</dc:date>
    <item>
      <title>How do I parse out a text field that contains multiple fields</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/517292#M139822</link>
      <description>&lt;P&gt;For example, the data looks like this (CSV):&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;"Client_NO", "Client_Name", "Unique_Fields"&lt;/P&gt;
&lt;P&gt;123,"Muffin","{"Category":"DOG","Age":3,"License":1111}"&lt;/P&gt;
&lt;P&gt;222,"Sprinkles",&lt;SPAN&gt;"{"Category":"CAT","Age":12,"License":1122}"&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;423,"Larry",&lt;SPAN&gt;"{"Category":"CHICKEN","Age":2,"License":1133}"&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;The Unique_Fields contains 3 additional fields.&amp;nbsp; I&amp;nbsp;can read them into a data set, but&amp;nbsp;how do I parse out Unique_Fields in a data step?&amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 29 Nov 2018 22:54:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/517292#M139822</guid>
      <dc:creator>Areksoo</dc:creator>
      <dc:date>2018-11-29T22:54:02Z</dc:date>
    </item>
    <item>
      <title>Re: How do I parse out a text field that contains multiple fields</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/517294#M139824</link>
      <description>&lt;P&gt;Parsing comes second.&amp;nbsp; First step is making decisions about what the goal is.&amp;nbsp; For example:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Do you want to create three variables named Category, Age, and License?&amp;nbsp; Will that always be true, or will it depend on what is inside Unique_Fields?&lt;/LI&gt;
&lt;LI&gt;Which new variables should be character and which should be numeric?&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 29 Nov 2018 23:03:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/517294#M139824</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2018-11-29T23:03:35Z</dc:date>
    </item>
    <item>
      <title>Re: How do I parse out a text field that contains multiple fields</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/517297#M139827</link>
      <description>&lt;P&gt;I didn't want to complicate it, but sometimes the values in&amp;nbsp;&lt;SPAN&gt;Unique_Fields is completely different with different variable names and values.&amp;nbsp; For example:&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;"{"Color":"RED","Height":25}"&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;There's only 3 different sets fields in the Unique_Field, so if I can do one of them, I would just make&amp;nbsp;IF statements with different logic depending on the first few characters.&amp;nbsp; The variable names can be kept the same, but if the text has quotes, then it remains text, if it has no quotes it should become a number.&amp;nbsp; Between field name and value, there's a colon and a comma between each set of field name and value.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;There might an elegant&amp;nbsp;solution to read multiple different variables, but I would just be happy with a solution that works for 1 case.&amp;nbsp; The trouble I'm having is how do I read from the colon to the comma and then jump to the next variable and read from the colon to the comma.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 29 Nov 2018 23:20:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/517297#M139827</guid>
      <dc:creator>Areksoo</dc:creator>
      <dc:date>2018-11-29T23:20:41Z</dc:date>
    </item>
    <item>
      <title>Re: How do I parse out a text field that contains multiple fields</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/517301#M139830</link>
      <description>Is the source JSON or XML originally?</description>
      <pubDate>Thu, 29 Nov 2018 23:41:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/517301#M139830</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2018-11-29T23:41:23Z</dc:date>
    </item>
    <item>
      <title>Re: How do I parse out a text field that contains multiple fields</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/517305#M139831</link>
      <description>&lt;P&gt;For your limited example and note about changing fields this works:&lt;/P&gt;
&lt;PRE&gt;data junk;
   infile datalines dlm=',{}' dsd;
   informat Client_NO $5. Client_Name $15. category $10. Age best. license $5.;
   informat color $10. height best.;
   input  Client_NO  Client_Name @;
   if index(_infile_,'"Category":')&amp;gt; 0 then 
      input    @'"Category":' category    @'"Age":' age @'"License":' license;
   ;                             
   if index(_infile_,'"Color"')&amp;gt; 0 then 
      input    @'"Color":' color    @'"Height":' height;
   ;                             
datalines;
123,"Muffin","{"Category":"DOG","Age":3,"License":1111}"
222,"Sprinkles","{"Category":"CAT","Age":12,"License":1122}"
423,"Larry","{"Category":"CHICKEN","Age":2,"License":1133}"
423,"Larry","{"Color":"RED","Height":25}""
;
run;
&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The first @ on the input holds the line for different readings. _INFILE_ is an automatic variable containing the contents of the INFILE buffer that can be searched like any other string.&lt;/P&gt;
&lt;P&gt;The @'string' tells SAS to read from the position that value is found. the ugly code of mixed single and double quotes is because that source actually has unmatched quotes for most purposes.&lt;/P&gt;
&lt;P&gt;The index function searches for a hopefully key value that identifies which record format you need to read&lt;/P&gt;</description>
      <pubDate>Fri, 30 Nov 2018 00:17:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/517305#M139831</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2018-11-30T00:17:43Z</dc:date>
    </item>
    <item>
      <title>Re: How do I parse out a text field that contains multiple fields</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/517323#M139835</link>
      <description>&lt;P&gt;Here's the approach I would take. It doesn't really matter if the terms are different, they'll be set as missing in the data set if it doesn't exist. I've used this approach to parse XML files before and it's pretty robust.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;*Sample data - assuming you already have this read in;
data have;
    informat ID $3. Type $12. Random $50.;
    infile cards dlm="|" dsd;
    input ID $ Type $ Random $;
    cards;
123|"Muffin"|"{"Category":"DOG","Age":3,"License":1111}"
222|"Sprinkles"|"{"Category":"CAT","Age":12,"License":1122}"
423|"Larry"|"{"Category":"CHICKEN","Age":2,"License":1133}"
;
    ;
    ;;
run;

*flip structure to long to account for differing types;
data long;
    set have;
    
    *count the number of : whichis number of terms;
    nwords=countc(Random, ':');
    *remove {} from variable;
    random=compress(random, "{}");

    *loop over each number of term;
    do i=1 to nwords;
        *isolate each term alone;
        term=scan(Random, i, ",");
        *get name;
        VarName=scan(term, 1, '":');
        *get value;
        VarValue=scan(term, 2, '":');
        *output to data set;
        output;
    end;
    
    *Keep only variables needed;
    keep ID Type VarName VarValue;
run;

*flip to wide format for standard data set;
proc transpose data=long out=want (drop=_name_);
    by ID Type;
    ID VarName;
    Var VarValue;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12031"&gt;@Areksoo&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;For example, the data looks like this (CSV):&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;"Client_NO", "Client_Name", "Unique_Fields"&lt;/P&gt;
&lt;P&gt;123,"Muffin","{"Category":"DOG","Age":3,"License":1111}"&lt;/P&gt;
&lt;P&gt;222,"Sprinkles",&lt;SPAN&gt;"{"Category":"CAT","Age":12,"License":1122}"&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;423,"Larry",&lt;SPAN&gt;"{"Category":"CHICKEN","Age":2,"License":1133}"&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;The Unique_Fields contains 3 additional fields.&amp;nbsp; I&amp;nbsp;can read them into a data set, but&amp;nbsp;how do I parse out Unique_Fields in a data step?&amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 30 Nov 2018 01:32:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/517323#M139835</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2018-11-30T01:32:31Z</dc:date>
    </item>
    <item>
      <title>Re: How do I parse out a text field that contains multiple fields</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/517487#M139923</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
infile cards dsd truncover;
input id name :$20. temp $100.;
n+1;
do i=1 to countw(temp,',');
 t=scan(temp,i,',');
 x=scan(t,1,'"{}:');
 y=scan(t,2,'"{}:');
 output;
end;
drop i t temp;
cards;
123,"Muffin","{"Category":"DOG","Age":3,"License":1111}"
222,"Sprinkles","{"Category":"CAT","Age":12,"License":1122}"
423,"Larry","{"Category":"CHICKEN","Age":2,"License":1133}"
;
run;

proc transpose data=have out=want;
by n id name;
var y;
id x;
run;
proc print;run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 30 Nov 2018 14:29:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/517487#M139923</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2018-11-30T14:29:52Z</dc:date>
    </item>
    <item>
      <title>Re: How do I parse out a text field that contains multiple fields</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/518180#M140205</link>
      <description>&lt;P&gt;Apologies for the late reply... flu hit me hard on Friday.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is fantastic and worked perfectly!&lt;/P&gt;</description>
      <pubDate>Mon, 03 Dec 2018 18:40:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-parse-out-a-text-field-that-contains-multiple-fields/m-p/518180#M140205</guid>
      <dc:creator>Areksoo</dc:creator>
      <dc:date>2018-12-03T18:40:53Z</dc:date>
    </item>
  </channel>
</rss>

