BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Priyanka2019
Calcite | Level 5

I am trying to parse below json string from parameters column in table 1:-

"{""creditUnionValue"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""3.25*"",""type"":""Single Rate"",""needVerification"":""true""},
{""creditUnionValue"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""2.87*"",""type"":""Joint Rate"",""needVerification"":""true""},{""creditUnionValue"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":"""",""type"":""Joint Dev"",""needVerification"":""false""},{""crediting"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""40000"",""type"":""Benefit Maximize"",""needVerification"":""false""},{""crediting"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""11"",""type"":""Termination "",""needVerification"":""false""},{""crediting"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""22"",""type"":""Eligibility Age"",""needVerification"":""false""},{""crediting"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""UNL"",""type"":""Term Of Insurance"",""needVerification"":""false""},{""crediting"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""7501"",""type"":""Benefit Amount"",""needVerification"":""false""},{""crediting"":null,""intergroupVisibility"":""DEFAULT"",""notes"":null,""intergroupValue"":"""",""type"":""Delinquent Payment"",""needVerification"":""false""}"

Tried Coding as below:-

 

select max(count(parameters,'}'))+2 into :maxelements from table1;

 

data want;

set have;

array parsed_vars $ 150 new_var1-new_var%eval(&maxelements);

i=1;

do i=1 to &maxelements;

parsed_vars(i) =scan(parameters, i, "}");

i+1;

end;

run;

 

In the result am getting alternate row of data in new_var1 then new_var3, new_var5 all even ones are blank which is a problem. So I am getting only 5 of 9 rows.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

It seems that this JSON is not too complicated.

 

options noquotelenmax;
data have;
x='{""creditUnionValue"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""3.25*"",""type"":""Single Rate"",""needVerification"":""true""},
{""creditUnionValue"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""2.87*"",""type"":""Joint Rate"",""needVerification"":""true""}';
run;
data temp;
 set have;
 do i=1 to countw(x,',{}');
   temp=scan(x,i,',{}');output;
 end;
 drop x;
run;
data want;
 set temp;
 name=scan(temp,1,':');
 value=scan(temp,-1,':');
run;

View solution in original post

11 REPLIES 11
Ksharp
Super User

It seems that this JSON is not too complicated.

 

options noquotelenmax;
data have;
x='{""creditUnionValue"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""3.25*"",""type"":""Single Rate"",""needVerification"":""true""},
{""creditUnionValue"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""2.87*"",""type"":""Joint Rate"",""needVerification"":""true""}';
run;
data temp;
 set have;
 do i=1 to countw(x,',{}');
   temp=scan(x,i,',{}');output;
 end;
 drop x;
run;
data want;
 set temp;
 name=scan(temp,1,':');
 value=scan(temp,-1,':');
run;

Priyanka2019
Calcite | Level 5

One additional query to this I will need to capture all these tokens as an array so I can search the ones I need. Also these lines are just part of more than 15K row of data.

Suppose I have id=101, parameter{<datalines shared earlier>} to parse now if I get count of words and want to create array where size is dynamically captured based on the tokens present in each dataline. Each row will have different count of sub rows in parameter. 

I was able to get count as a table column in proc sql :-

(count(parameters,'}')*12 as ratecnt

But problem is I want to define array defined on ratecnt for each row of data. But only if I take max of above and capture using into macro variable it allows me to create dynamic array.

Please help me define proc sql or sas dynamic array defined based on ratcnt field captured in every row of data.

Below code doesn't run but on these lines.

 

array parsed_vars $ 150 new_var1-new_var%eval(ratecnt);
i=1;
do i=1 to ratecnt;
parsed_vars(i) =scan(parameters, i, "}");
i+1;*/

Tom
Super User Tom
Super User

An array of variables in a SAS data step always refers to the same list of variables on every observation.

 

So just need to know the MAX() of the number needed for each observation across the whole dataset.

Try this:

proc sql noprint;
select max(ratecnt) into :array_length trimmed 
from have 
;
quit;

data want;
   set have;
   array parsed_vars [&array_length] $150 ;
   ....
run;

The first step will create the macro variable ARRAY_LENGTH with a value like 45.

Then in the data step the ARRAY statement will create 45 variables named PARSED_VARS1 to PARSED_VARS150, each of which is defined as character of length 150.

 

Priyanka2019
Calcite | Level 5
This i thought of but then for each array i will need to calculate number of actual populated elements for do while loop because max works for overall table for eg comes 229 but some rows have as small as 98 name/ value elements.
Tom
Super User Tom
Super User
I don't understand your point. Didn't you have to calculate the count for each observation in order to find the max over all the observations?
Priyanka2019
Calcite | Level 5
Ratecnt defined in above post was giving count for each row. But array was to be defined for each row based on Number of observations each row has. How to use ratecnt defined in proc sql for each row in do while. If i take max in proc sql n pass to macro it is constant for all rows when each row has different number of obs. The *into* clause uses only 1 value from whole table for assigning.
Tom
Super User Tom
Super User

@Priyanka2019 wrote:
Ratecnt defined in above post was giving count for each row. But array was to be defined for each row based on Number of observations each row has. How to use ratecnt defined in proc sql for each row in do while. If i take max in proc sql n pass to macro it is constant for all rows when each row has different number of obs. The *into* clause uses only 1 value from whole table for assigning.

That doesn't make any sense.  You cannot define an array differently on different observations. Anymore than you could change the name of a variable on different observations.  You can choose how many of the variables in the array you actually USE.

array X [2000] ;
do i=1 to ratecnt;
  x[i]= ....
end;
Priyanka2019
Calcite | Level 5
I removed array only and added another step for parsing thanks for your solution
Priyanka2019
Calcite | Level 5
Tom this makes sense system shouldnt allow it i didnt think through, ratecnt wasnt working in do while loop but will just recheck this once and confirm back. Have one more query regarding survey parsing is there where Request response:value1,obs2, obs 3 etc, ans:value5 List, ques no - 1.2 , Cu response =bcx, repeated for multiple qs are there. Was wondering same above logic will work without array n using wordcount and scan. Any Thoughts...many curly braces in between for a single q n ans in obs as json
Tom
Super User Tom
Super User

If the JSON is too complex you might have trouble parsing.

 

One way that might work is to parse it in parts.  Use one delimiter to scan across and then apply the same method to the result but using the second delimiter there.   For quotes you probably need to look at the Q modifier on the SCAN() and COUNTW() functions. You might want to use the DEQUOTE() function to remove them once you are down to a single value.  Note that DEQUOTE() does nothing to the value if it is not enclosed in quotes, so need to check first, just apply.

 

But the nesting of [ ] and { } in the JSON is at all complex then it will probably make using the simple SCAN() function not enough.  You will either need to make a much more complex logic to parse it.  Or just figure out how to write the strings to a file and read it using the JSON libname engine instead.

 

filename sample temp;
data _null_;
 file sample;
 put '['
"{""creditUnionValue"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""3.25*"",""type"":""Single Rate"",""needVerification"":""true""},
{""creditUnionValue"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""2.87*"",""type"":""Joint Rate"",""needVerification"":""true""},{""creditUnionValue"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":"""",""type"":""Joint Dev"",""needVerification"":""false""},{""crediting"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""40000"",""type"":""Benefit Maximize"",""needVerification"":""false""},{""crediting"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""11"",""type"":""Termination "",""needVerification"":""false""},{""crediting"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""22"",""type"":""Eligibility Age"",""needVerification"":""false""},{""crediting"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""UNL"",""type"":""Term Of Insurance"",""needVerification"":""false""},{""crediting"":null,""intergroupVisibility"":""READONLY"",""notes"":null,""intergroupValue"":""7501"",""type"":""Benefit Amount"",""needVerification"":""false""},{""crediting"":null,""intergroupVisibility"":""DEFAULT"",""notes"":null,""intergroupValue"":"""",""type"":""Delinquent Payment"",""needVerification"":""false""}"
']'
;
run;


libname sample json;

proc print data=sample.alldata (obs=10);
run;

    Obs    P    P1                      V    Value

       1    1    creditUnionValue        1    null
       2    1    intergroupVisibility    1    READONLY
       3    1    notes                   1    null
       4    1    intergroupValue         1    3.25*
       5    1    type                    1    Single Rate
       6    1    needVerification        1    true
       7    1    creditUnionValue        1    null
       8    1    intergroupVisibility    1    READONLY
       9    1    notes                   1    null
      10    1    intergroupValue         1    2.87*
Priyanka2019
Calcite | Level 5
Also some obs are strings in this

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to choose a machine learning algorithm

Use this tutorial as a handy guide to weigh the pros and cons of these commonly used machine learning algorithms.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 11 replies
  • 3207 views
  • 1 like
  • 3 in conversation