BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Shuail_Ibrahim
Calcite | Level 5

I have a sample string I need to convert the text to JSON format. I tried n number of ways but not able to achieve it, It would really sound great if anyone helps me to construct it.  

sample ="The following are the graphical (non-control) characters defined by
ISO 8859-1 (1987).  DESCRIPTION :  in words aren't all that helpful,
but they're the best we can do in text.  A graphics file illustrating
the character set should be available from the same archive as this
file.RESULT :success INTERPRETATION : ISO 8859-1 (1987).CREATED_BY:Questy.CREATED_ON:29/07/1963"

below is the required JSON output 

{
   "DESCRIPTION":" in words aren't all that helpful but they're the best we can do in text.   A graphics file illustrating the character set should be available from the same archive as thisfile",
   "RESULT":"success",
   "INTERPRETATION":" ISO 8859-1 (1987)",
   "CREATED_BY":"Questy",
   "CREATED_ON":"29/07/1963"
}
1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Below stitched together from the discussion and contributions here.

data have;
  infile datalines truncover;
  input _str $256.;
  length sampleString $ 1000;
  retain sampleString;
  sampleString=catx(' ', sampleString, _str);
  keep sampleString;
  if _n_=6 then output;
datalines;
The following are the graphical (non-control) characters defined by
ISO 8859-1 (1987).  DESCRIPTION :  in words aren't all that helpful,
but they're the best we can do in text.  A graphics file illustrating
the character set should be available from the same archive as this
file.RESULT :success INTERPRETATION : ISO 8859-1 (1987).CREATED_BY:
Questy.CREATED_ON:29/07/1963
;

options ps=max ls=max;
data parsed(keep=key value);
  set have;

  length found $2000 key $100 value $2000;
  start=1;
  stop=lengthn(sampleString);

  prxid=prxparse('/\b\w+\b\s*:.*?(?=((\b\w+\b\s*:)|$))/');
  call prxnext(prxid, start, stop, trim(sampleString), pos, len);
  do while(pos>0);
    found=substr(sampleString,pos,len);
    key=scan(found,1,':');
    value=scan(found,2,':');
    output;
    call prxnext(prxid, start, stop, trim(sampleString), pos, len);
  end;
run;

proc transpose data=parsed out=transposed(drop=_name_);
  id key;
  var value;
run;

filename myjson temp;
proc json out=myjson nosastags pretty;
  export transposed;
run;

data _null_;
  file print;
  infile myjson;
  input;
  put _infile_;
run;

Patrick_0-1592627467729.png

 

View solution in original post

8 REPLIES 8
Ksharp
Super User
data have;
sample ="The following are the graphical (non-control) characters defined by
ISO 8859-1 (1987).  DESCRIPTION :  in words aren't all that helpful,
but they're the best we can do in text.  A graphics file illustrating
the character set should be available from the same archive as this
file.RESULT :success INTERPRETATION : ISO 8859-1 (1987).CREATED_BY:Questy.CREATED_ON:29/07/1963";
run;

filename x 'c:\temp\x.json';
data _null_;
set have;
a1=find(sample,'DESCRIPTION');
a2=find(sample,'RESULT');
a3=find(sample,'INTERPRETATION');
a4=find(sample,'CREATED_BY');
a5=find(sample,'CREATED_ON');

s1=quote(substr(sample,a1+14,a2-a1-14));
s2=quote(substr(sample,a2+8,a3-a2-8));
s3=quote(substr(sample,a3+16,a4-a3-16));
s4=quote(substr(sample,a4+11,a5-a4-11));
s5=quote(substr(sample,a5+11));

file x lrecl=32767;
put '{';
put '"DESCRIPTION":'  s1 $ +(-1)  ',';
put '"RESULT":'  s2 $ +(-1) ',';
put '"INTERPRETATION":'  s3 $ +(-1)  ',';
put '"CREATED_BY":'  s4 $ +(-1) ',';
put '"CREATED_ON":'  s5 $ ;
put '}';

run;
Shuail_Ibrahim
Calcite | Level 5
thanks a lot, Ksharp, the sample string format will not be the same every time, it will change every time based on the key present in JSON we need to pick the value. sorry for mentioning this bit late
smantha
Lapis Lazuli | Level 10
data _null_;
length sample $32767. Phrase Key Value $32767.;
sample ="The following are the graphical (non-control) characters defined by
ISO 8859-1 (1987).  DESCRIPTION :  in words aren't all that helpful,
but they're the best we can do in text.  A graphics file illustrating
the character set should be available from the same archive as this
file.RESULT :success. INTERPRETATION : ISO 8859-1 (1987).CREATED_BY:Questy.CREATED_ON:29/07/1963";
/*
Assumptions: 
One did not know what keys would be except the fact that they precede a colon.
periods are the separators between differnt sections.
*/
put '{';
count_periods=countc(sample,'.');
do count=1 to count_periods+1;
	Phrase = scan(sample,count,'.');
	if indexc(Phrase,':') > 0 then do;
		Key=strip(scan(Phrase,1,':'));
		Value = strip(scan(Phrase,2,':'));
		put '"'key+(-1)'":"'Value+(-1)'"';
	end;
end;
put '}';
run;

I modified the input by putting a period after success

 

BillM_SAS
SAS Employee

This solution produces valid JSON but not exactly in the format you specified. I am not sure how strict your requirements are for the output. The advantage is that the code is simpler since once the data is in a SAS data set, PROC JSON produces the valid JSON output. I borrowed and slightly modified the DATA step that @Ksharp wrote to put the data into a SAS data set.

 

data have;
sample ="The following are the graphical (non-control) characters defined by
ISO 8859-1 (1987).  DESCRIPTION :  in words aren't all that helpful,
but they're the best we can do in text.  A graphics file illustrating
the character set should be available from the same archive as this
file.RESULT :success INTERPRETATION : ISO 8859-1 (1987).CREATED_BY:Questy.CREATED_ON:29/07/1963";
run;

data work.test(drop=sample a1-a5);
set have;
a1=find(sample,'DESCRIPTION');
a2=find(sample,'RESULT');
a3=find(sample,'INTERPRETATION');
a4=find(sample,'CREATED_BY');
a5=find(sample,'CREATED_ON');

DESCRIPTION=substr(sample,a1+14,a2-a1-14);
RESULT=substr(sample,a2+8,a3-a2-8);
INTERPRETATION=substr(sample,a3+16,a4-a3-16);
CREATED_BY=substr(sample,a4+11,a5-a4-11);
CREATED_ON=substr(sample,a5+11);
run;

proc json out='c:\temp\x.json' nosastags pretty;
export work.test;
run;

This is the resultant JSON file:
[
  {
    "DESCRIPTION": " in words aren't all that helpful,but they're the best we can do in text. A graphics file illustratingthe character set should be available from the same archive as thisfile.",
    "RESULT": "success",
    "INTERPRETATION": " ISO 8859-1 (1987).",
    "CREATED_BY": "Questy.",
    "CREATED_ON": "29/07/1963"
  }
]

Shuail_Ibrahim
Calcite | Level 5
thanks a lot, BillM_SAS, the sample string format will not be the same every time, it will change every time based on the key present in JSON we need to pick the value. I cant hard code the position sorry for mentioning this bit late
Shuail_Ibrahim
Calcite | Level 5
thank u all for the replies, the sample string format will not be the same every time, it will change every time based on the key present in JSON we need to pick the value. sorry for mentioning this bit late
Patrick
Opal | Level 21

Below stitched together from the discussion and contributions here.

data have;
  infile datalines truncover;
  input _str $256.;
  length sampleString $ 1000;
  retain sampleString;
  sampleString=catx(' ', sampleString, _str);
  keep sampleString;
  if _n_=6 then output;
datalines;
The following are the graphical (non-control) characters defined by
ISO 8859-1 (1987).  DESCRIPTION :  in words aren't all that helpful,
but they're the best we can do in text.  A graphics file illustrating
the character set should be available from the same archive as this
file.RESULT :success INTERPRETATION : ISO 8859-1 (1987).CREATED_BY:
Questy.CREATED_ON:29/07/1963
;

options ps=max ls=max;
data parsed(keep=key value);
  set have;

  length found $2000 key $100 value $2000;
  start=1;
  stop=lengthn(sampleString);

  prxid=prxparse('/\b\w+\b\s*:.*?(?=((\b\w+\b\s*:)|$))/');
  call prxnext(prxid, start, stop, trim(sampleString), pos, len);
  do while(pos>0);
    found=substr(sampleString,pos,len);
    key=scan(found,1,':');
    value=scan(found,2,':');
    output;
    call prxnext(prxid, start, stop, trim(sampleString), pos, len);
  end;
run;

proc transpose data=parsed out=transposed(drop=_name_);
  id key;
  var value;
run;

filename myjson temp;
proc json out=myjson nosastags pretty;
  export transposed;
run;

data _null_;
  file print;
  infile myjson;
  input;
  put _infile_;
run;

Patrick_0-1592627467729.png

 

Ksharp
Super User

OK. It is a little bit complicated.

 

data have;
sample ="The following are the graphical (non-control) characters defined by
ISO 8859-1 (1987).  DESCRIPTION :  in words aren't all that helpful,
but they're the best we can do in text.  A graphics file illustrating
the character set should be available from the same archive as this
file.RESULT :success INTERPRETATION : ISO 8859-1 (1987).CREATED_BY:Questy.CREATED_ON:29/07/1963";
run;
data temp;
 set have;
 n+1;
 do i=1 to countw(sample,':');
  temp=scan(sample,i,':');
  output;
 end;
drop i sample;
run;
data temp;
 set temp;
 by n;
 if not last.n then do;
   call scan(temp,-1,p,l,'_','ka');
   name=cats('"',substr(temp,p),'"');
   value=cats('"',substr(temp,1,p-1),'",') ;
 end;
 else do;name=' ';value=cats('"',temp,'"');end;
 drop p l;
run; 
data want;
 merge temp temp(keep=n value rename=(n=_n value=_value) firstobs=2);
 if n=_n;
 keep n name _value;
run;

filename x 'c:\temp\x.json';
data _null_;
set want;
by n;
file x lrecl=32767;
if first.n then put '{';
put name $ +(-1) ':' _value $;
if last.n then put '}';
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1077 views
  • 0 likes
  • 5 in conversation