BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tekish
Quartz | Level 8

Hi expert,

I have data similar with sample data. How do I convert the row values with their respective values in columns. I have attached the sample output. I really appreciate your ideas or suggestions.

sample data:

NAMESVALUES
id1
kindde
serH2
xid604
nid103
ceid3
nrid624448f2-8263
redf1
nansnull
updateddt11/10/2018
  
id2
kindde
serH2
xid604
nid115
ceid2
nrid6244-8263-44
redf0
nansnull
updateddt11/10/2018
  
id3
kindde
serJ2
xid604
nid125
ceid4
nrid62f3ff48-8263
redf1
nansnull
updateddt11/10/2018
  
id4
kindde
serJ4
xid605
nid211
ceid3
nrid9aea-31cddfd7
redf1
nansnull
updateddt11/10/2018
  
id5
kindde
serJ4
xid605
nid217
ceid1
nrid9aea-310c
redf1
nansnull
updateddt11/10/2018
  
id6
kindde
serJ4
xid605
nid217
ceid3
nrid562f3ff48-8263
redf0
nansattacked
updateddt11/10/2018

 

Sample Out put.

ID

Kind

ser

xid

nid

ceid

nrid

redf

nans

updatedt

1

de

H2

604

103

3

424448f2-8263

1

null

11/10/2018

 

2

de

H2

604

115

2

4244-8263-44

0

null

11/10/2018

 

3

de

J2

604

125

4

562f3ff48-8263

1

null

 

4

de

J4

605

211

3

9aea-31cddfd7

1

null

11/10/2018

5

de

J4

605

217

1

9aea-310c

1

null

11/10/2018

6

de

J4

605

217

3

562f3ff48-8263

0

attacked

11/10/2018

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

So that file is valid JSON.  But it is really a JSON sandwich.  The value of the BODY field is another independent JSON object.

So first read that using the JSON engine. Then write it back out into a valid JSON file and then read it back in.

So let me make a copy of your text into a file on my system.  Let's just reference it with the fileref JSON.

filename json temp;
options parmcards=json;
parmcards4;
{
  "data": {"json": {"suc": true,"body": "[{\"id\":1,\"kind\":\"de\",\"ser\":\"H2\",\"xid\":\"604-efg\",\"nid\":103,\"ceid\":3,\"rid\":\"a930ca5a994c2ef8\",\"redf\":1,\"nans\":null,\"updateddt\":\"2020-11-10T21\"},{\"id\":2,\"kind\":\"de\",\"ser\":\"H2\",\"xid\":\"604-efg\",\"nid\":115,\"ceid\":2,\"rid\":\"a930-ca5a994c2ef8\",\"redf\":0,\"nans\":null,\"updateddt\":\"2020-11-10T21\"},{\"id\":3,\"kind\":\"de\",\"ser\":\"H2\",\"xid\":\"604-efg\",\"nid\":125,\"ceid\":4,\"rid\":\"a930-ca5a994c2ef8\",\"redf\":1,\"nans\":null,\"updateddt\":\"2020-11-10T21\"},{\"id\":4,\"kind\":\"de\",\"ser\":\"J2\",\"xid\":\"0605-efg\",\"nid\":211,\"ceid\":3,\"nrid\":\"9aea-310c395d271c\",\"redf\":1,\"nans\":null,\"updateddt\":\"2020-11-11T15\"}]"
      }
  }
}
;;;;

On your system you would just need to write a FILENAME statement pointing to where ever that file existed.

filename json '.....txt';

Now let's point a JSON library at it.  Just use the same value for the libref as used for the fileref.

libname json json;

So looking at that there is a table named DATA_JSON ( because of the first two names in the original JSON file ) with a field named BODY that has the actual JSON string. 

proc print data=json.data_json; 
run;

image.png

Notice how the confusing escape character syntax has been cleaned up.

Now write it back out without those escape characters so the JSON object becomes actually part of the new JSON object. Let's just use BODY as the name in this new object and that will be the name the JSON engine will use for the dataset.

filename json2 temp;
data _null_;
  set json.data_json end=eof;
  file json2 ;
  if _n_=1 then put '{' @;
  else put ',' @;
  put '"body":' body ;
  if eof then put '}';
run;
libname json2 json;

Now it looks like something that might actually be usable.

libname json2 json;
proc print data=json2.body;
run;

Results:

image.png

View solution in original post

14 REPLIES 14
ballardw
Super User

Is that a SAS data set or an example of an external file that needs to be read? If an external file, what type of file is it (plain text, spreadsheet, delimited)?

 

Which values are supposed to be numeric?

How long do the character variables have to be?

By "null" do you actually want a character value that reads "null" or the SAS missing value?<99% of the time + you want missing>

tekish
Quartz | Level 8

Hi ballardw,

 

This is a JSON file save as josn.txt. I have attached the sample JSON file.

here is the my SAS codes:

data Have;
infile '......\Jsonexample\devjson1.txt' recfm=n dlm='{}[],';
input value : $200. @@;

if value in: ('count' 'calls') then
delete;
run;

data want;
set Have;
length name value $ 100;
name=dequote(scan(value,1,'=','m'));
values=dequote(scan(value,2,'=','m'));
drop value;
run;

 

Kurt_Bremser
Super User

None of your posts has an attachment, and none of your data examples looks like JSON. Do NOT use anything other than a simple editor to open such files and copy/paste them into posts.

If you want to post textual data by copy/pasting, always use the </> button to open a "code box" for this. The main window will destroy important information.

ballardw
Super User

@tekish wrote:

Hi ballardw,

 

This is a JSON file save as josn.txt. I have attached the sample JSON file.

here is the my SAS codes:

data Have;
infile '......\Jsonexample\devjson1.txt' recfm=n dlm='{}[],';
input value : $200. @@;

if value in: ('count' 'calls') then
delete;
run;

data want;
set Have;
length name value $ 100;
name=dequote(scan(value,1,'=','m'));
values=dequote(scan(value,2,'=','m'));
drop value;
run;

 


And may want to look into the JSON libname to access the data. May have a better result and the question of manipulating poor data structure goes away completely.

tekish
Quartz | Level 8

Hi Ballardw,

I tried the JSON engine and it did not work and then I save the JSON file into a text file. Is there a way I can parse text files using the data step?

 

here I the sample save json.txt file.

[{id=1, kind=de, ser=H2,xid=604,nid=103, ceid=3,nrid=894448f2, redf=1, nans=null,updateddt=2018-11-10},
{id=2, kind=de, ser=H2,xid=604,nid=115, ceid=2,nrid=724448f2, redfl=0, nans=some data available,updateddt=2018-11-10},
{id=3, kind=de, ser=H2,xid=604, nid=125, ceid=4,nrid=854448f2, redf=1, nans=some issues,updateddt=2018-11-10},
{id=4, kind=de, ser=J2,xid=605, nid=211, ceid=3,nrid=7881dd3d, redf=1, nans=there is data,updateddt=2018-11-11},
{id=5, kind=de, ser=J2,xid=605, nid=217, ceid=1,nrid=ffcb48f2, redf=1, nans=test,updateddt=2018-11-11},
{id=6, kind=de, ser=J2,xid=605, nid=217, ceid=3,nrid=810c395d, redf=0,nans=attacked, updateddt=2018-11-11},
{id=7, kind=de, ser=J3,xid=605, nid=187, ceid=1,nrid=540c395d, redf=0, nans=null,updateddt=2018-11-11},
{id=8, kind=de, ser=J2,xid=605, nid=193, ceid=3,nrid=48f2310c, redf=1, nans=the bies,updateddt=2018-11-11},
{id=9, kind=de, ser=J2,xid=605, nid=199, ceid=4,nrid=5610c395, red=1, nans=null,updateddt=2018-11-11}]

ballardw
Super User

@tekish wrote:

Hi Ballardw,

I tried the JSON engine and it did not work and then I save the JSON file into a text file. Is there a way I can parse text files using the data step?

 

here I the sample save json.txt file.

[{id=1, kind=de, ser=H2,xid=604,nid=103, ceid=3,nrid=894448f2, redf=1, nans=null,updateddt=2018-11-10},
{id=2, kind=de, ser=H2,xid=604,nid=115, ceid=2,nrid=724448f2, redfl=0, nans=some data available,updateddt=2018-11-10},
{id=3, kind=de, ser=H2,xid=604, nid=125, ceid=4,nrid=854448f2, redf=1, nans=some issues,updateddt=2018-11-10},
{id=4, kind=de, ser=J2,xid=605, nid=211, ceid=3,nrid=7881dd3d, redf=1, nans=there is data,updateddt=2018-11-11},
{id=5, kind=de, ser=J2,xid=605, nid=217, ceid=1,nrid=ffcb48f2, redf=1, nans=test,updateddt=2018-11-11},
{id=6, kind=de, ser=J2,xid=605, nid=217, ceid=3,nrid=810c395d, redf=0,nans=attacked, updateddt=2018-11-11},
{id=7, kind=de, ser=J3,xid=605, nid=187, ceid=1,nrid=540c395d, redf=0, nans=null,updateddt=2018-11-11},
{id=8, kind=de, ser=J2,xid=605, nid=193, ceid=3,nrid=48f2310c, redf=1, nans=the bies,updateddt=2018-11-11},
{id=9, kind=de, ser=J2,xid=605, nid=199, ceid=4,nrid=5610c395, red=1, nans=null,updateddt=2018-11-11}]


Doesn't work is awful vague.

Are there errors in the log?: Post the code and log in a code box opened with the <> to maintain formatting of error messages.

No output? Post any log in a code box.

Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the <> icon or attached as text to show exactly what you have and that we can test code against.

 

Actual text, as in posted into the code box opened with </> icon is better, especially as other representations of JSON files posted here have LOTS more line breaks then you show.

 

That example can be read BUT if you have other variables that have random names like the RED, REDF, REDFL then you have a LOT of coding ahead. And since I think JSON do not play well with line breaks no promises that this will work with your actual data since actually you have not posted it.

data example;
   informat id $5. kind $5. ser $5. xid $6. nid ceid best8. nrid $10.
            red redf redfl best. nans $25. updateddt yymmdd10.;
   format updateddt yymmdd10.;
   infile datalines dlm='[]{},' truncover;
   input   @"id=" id @"kind=" kind @"ser=" ser @"xid=" xid @"nid=" nid @"ceid=" ceid 
           @"nrid=" nrid @;
   if index(_infile_,"red="  ) > 0 then input @"red=" red @;
   if index(_infile_,"redf=" ) > 0 then input @"redf=" redf @;
   if index(_infile_,"redfl=") > 0 then input @"redfl=" redfl @;
   input @"nans=" nans   @"updateddt=" updateddt ;
datalines;
[{id=1, kind=de, ser=H2, xid=604, nid=103, ceid=3, nrid=894448f2, redf=1, nans=null,updateddt=2018-11-10},
{id=2, kind=de, ser=H2, xid=604, nid=115, ceid=2, nrid=724448f2, redfl=0, nans=some data available,updateddt=2018-11-10},
{id=3, kind=de, ser=H2, xid=604, nid=125, ceid=4, nrid=854448f2, redf=1, nans=some issues,updateddt=2018-11-10},
{id=4, kind=de, ser=J2, xid=605, nid=211, ceid=3, nrid=7881dd3d, redf=1, nans=there is data,updateddt=2018-11-11},
{id=5, kind=de, ser=J2, xid=605, nid=217, ceid=1, nrid=ffcb48f2, redf=1, nans=test,updateddt=2018-11-11},
{id=6, kind=de, ser=J2, xid=605, nid=217, ceid=3, nrid=810c395d, redf=0, nans=attacked, updateddt=2018-11-11},
{id=7, kind=de, ser=J3, xid=605, nid=187, ceid=1, nrid=540c395d, redf=0, nans=null,updateddt=2018-11-11},
{id=8, kind=de, ser=J2, xid=605, nid=193, ceid=3, nrid=48f2310c, redf=1, nans=the bies,updateddt=2018-11-11},
{id=9, kind=de, ser=J2, xid=605, nid=199, ceid=4, nrid=5610c395, red=1, nans=null,updateddt=2018-11-11}]
;

 

 

tekish
Quartz | Level 8

Hi ballardw;

I am getting some data using your code. Is there a way I can improve your code to handle the one line data? The issue is when I save the JSON data into text into saved as one long line(one row). I get also a SAS dataset with all the JSON files in one column. 

 

Sample data for text file:

data example;
informat id $5. kind $5. ser $5. xid $6. nid ceid best8. nrid $10.
redf best. nans $25. updateddt yymmdd10.;
format updateddt yymmdd10.;
infile datalines dlm='[]{},' truncover;
input @"id=" id @"kind=" kind @"ser=" ser @"xid=" xid @"nid=" nid @"ceid=" ceid
@"nrid=" nrid @;
if index(_infile_,"redf=" ) > 0 then input @"redf=" redf @;
input @"nans=" nans @"updateddt=" updateddt ;
datalines;
[{id=1, kind=de, ser=H2, xid=604, nid=103, ceid=3, nrid=894448f2, redf=1, nans=null,updateddt=2018-11-10},{id=2, kind=de, ser=H2, xid=604, nid=115, ceid=2, nrid=724448f2, redf=0, nans=some data available,updateddt=2018-11-10},{id=3, kind=de, ser=H2, xid=604, nid=125, ceid=4, nrid=854448f2, redf=1, nans=some issues,updateddt=2018-11-10},{id=4, kind=de, ser=J2, xid=605, nid=211, ceid=3, nrid=7881dd3d, redf=1, nans=there is data,updateddt=2018-11-11},{id=5, kind=de, ser=J2, xid=605, nid=217, ceid=1, nrid=ffcb48f2, redf=1, nans=test,updateddt=2018-11-11},{id=6, kind=de, ser=J2, xid=605, nid=217, ceid=3, nrid=810c395d, redf=0, nans=attacked, updateddt=2018-11-11},{id=7, kind=de, ser=J3, xid=605, nid=187, ceid=1, nrid=540c395d, redf=0, nans=null,updateddt=2018-11-11},{id=8, kind=de, ser=J2, xid=605, nid=193, ceid=3, nrid=48f2310c, redf=1, nans=the bies,updateddt=2018-11-11},{id=9, kind=de, ser=J2, xid=605, nid=199, ceid=4, nrid=5610c395, red=1, nans=null,updateddt=2018-11-11}]
;
run;

 

SAS dataset. is there a way we can use Array?

Values

[{id=1, kind=de, ser=H2,xid=604,nid=103, ceid=3,nrid=894448f2, redf=1, nans=null,updateddt=2018-11-10},

{id=2, kind=dev, ser=H2,xid=604,nid=115, ceid=2,nrid=724448f2, redf=0, nans=some data available,updateddt=2018-11-10},

{id=3, kind=de, ser=H2,xid=604, nid=125, ceid=4,nrid=854448f2, redf=1, nans=some issues,updateddt=2018-11-10},

{id=4, kind=de, ser=J2,xid=605, nid=211, ceid=3,nrid=7881dd3d, redf=1, nans=there is data,updateddt=2018-11-11},

{id=5, kind=de, ser=J2,xid=605, nid=217, ceid=1,nrid=ffcb48f2, redf=1, nans=test,updateddt=2018-11-11},

{id=6, kind=de, ser=J2,xid=605, nid=217, ceid=3,nrid=810c395d, redf=0,nans=attacked, updateddt=2018-11-11},

{id=7, kind=de, ser=J3,xid=605, nid=187, ceid=1,nrid=540c395d, redf=0, nans=null,updateddt=2018-11-11},

{id=8, kind=de, ser=J2,xid=605, nid=193, ceid=3,nrid=48f2310c, redf=1, nans=the bies,updateddt=2018-11-11},

{id=9, kind=de, ser=J2,xid=605, nid=199, ceid=4,nrid=5610c395, redf=1, nans=null,updateddt=2018-11-11}]

Tom
Super User Tom
Super User

If you had valid JSON it would be easy, but your example file is NOT using JSON syntax.  JSON uses : and not =. JSON puts quotes around names and strings. The main problem with your existing format is how to detect the lines.  If it was one record per line then your trick of including the brackets in the list of delimiters would work. 

You could try to convert it to a regular comma delimited file so it would be easier to read. 

 

So let's make a version of your file without line breaks.  I will use ORIGINAL as the fileref for this file.

filename original temp;
data _null_;
  input ;
  len=length(_infile_);
  file original recfm=n;
  put _infile_ $varying200. len ;
cards4;
[{id=1, kind=de, ser=H2,xid=604,nid=103, ceid=3,nrid=894448f2, redf=1, nans=null,updateddt=2018-11-10},
{id=2, kind=dev, ser=H2,xid=604,nid=115, ceid=2,nrid=724448f2, redf=0, nans=some data available,updateddt=2018-11-10},
{id=3, kind=de, ser=H2,xid=604, nid=125, ceid=4,nrid=854448f2, redf=1, nans=some issues,updateddt=2018-11-10},
{id=4, kind=de, ser=J2,xid=605, nid=211, ceid=3,nrid=7881dd3d, redf=1, nans=there is data,updateddt=2018-11-11},
{id=5, kind=de, ser=J2,xid=605, nid=217, ceid=1,nrid=ffcb48f2, redf=1, nans=test,updateddt=2018-11-11},
{id=6, kind=de, ser=J2,xid=605, nid=217, ceid=3,nrid=810c395d, redf=0,nans=attacked, updateddt=2018-11-11},
{id=7, kind=de, ser=J3,xid=605, nid=187, ceid=1,nrid=540c395d, redf=0, nans=null,updateddt=2018-11-11},
{id=8, kind=de, ser=J2,xid=605, nid=193, ceid=3,nrid=48f2310c, redf=1, nans=the bies,updateddt=2018-11-11},
{id=9, kind=de, ser=J2,xid=605, nid=199, ceid=4,nrid=5610c395, redf=1, nans=null,updateddt=2018-11-11}]
;;;;

Now let's write a simple data step to read it and convert the square brackets to spaces and the curly brackets to quotes. While we are at is let's double up any existing quotes.

filename fixed temp;
data _null_;
  infile original recfm=n ;
  file fixed recfm=n ;
  length string $300;
  input string $char100. ;
  string=tranwrd(string,'"','""');
  string=translate(string,' "" ','[{}]');
  len = length(string);
  put string $varying300. len ;
run;

So we get something like this:

590   data _null_;
591    infile fixed;
592    input;
593    list;
594   run;

NOTE: The infile FIXED is: ....

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+--
1          "id=1, kind=de, ser=H2,xid=604,nid=103, ceid=3,nrid=894448f2, redf=1, nans=null,updateddt=2018-1
      98  1-10","id=2, kind=dev, ser=H2,xid=604,nid=115, ceid=2,nrid=724448f2, redf=0, nans=some data avail
     195  able,updateddt=2018-11-10","id=3, kind=de, ser=H2,xid=604, nid=125, ceid=4,nrid=854448f2, redf=1,
     292   nans=some issues,updateddt=2018-11-10","id=4, kind=de, ser=J2,xid=605, nid=211, ceid=3,nrid=7881
     389  dd3d, redf=1, nans=there is data,updateddt=2018-11-11","id=5, kind=de, ser=J2,xid=605, nid=217, c
     486  eid=1,nrid=ffcb48f2, redf=1, nans=test,updateddt=2018-11-11","id=6, kind=de, ser=J2,xid=605, nid=
     583  217, ceid=3,nrid=810c395d, redf=0,nans=attacked, updateddt=2018-11-11","id=7, kind=de, ser=J3,xid
     680  =605, nid=187, ceid=1,nrid=540c395d, redf=0, nans=null,updateddt=2018-11-11","id=8, kind=de, ser=
     777  J2,xid=605, nid=193, ceid=3,nrid=48f2310c, redf=1, nans=the bies,updateddt=2018-11-11","id=9, kin
     874  d=de, ser=J2,xid=605, nid=199, ceid=4,nrid=5610c395, redf=1, nans=null,updateddt=2018-11-11" 965
NOTE: 1 record was read from the infile FIXED.
      The minimum record length was 965.
      The maximum record length was 965.

Now we can read it and parse it (as long as the lines are not longer then 32K bytes).

data tall;
  infile fixed dsd recfm=n ;
  length row col 8 line $500 word name value $100 ;
  row+1;
  input line :$500. ;
  do col=1 to countw(line,',','mq');
    word = left(scan(line,col,',','mq'));
    name=scan(word,1,'=','mq');
    value=scan(word,2,'=','mq');
    output;
  end;
  drop line word;
run;

Now you have a structure that is good for PROC TRANSPOSE.

proc transpose data=tall out=want(drop=_name_);
  by row;
  id name ;
  var value ;
run;

Results:

Obs   row   id   kind   ser   xid   nid   ceid     nrid     redf   nans                  updateddt

 1     1    1    de     H2    604   103    3     894448f2    1     null                  2018-11-10
 2     2    2    dev    H2    604   115    2     724448f2    0     some data available   2018-11-10
 3     3    3    de     H2    604   125    4     854448f2    1     some issues           2018-11-10
 4     4    4    de     J2    605   211    3     7881dd3d    1     there is data         2018-11-11
 5     5    5    de     J2    605   217    1     ffcb48f2    1     test                  2018-11-11
 6     6    6    de     J2    605   217    3     810c395d    0     attacked              2018-11-11
 7     7    7    de     J3    605   187    1     540c395d    0     null                  2018-11-11
 8     8    8    de     J2    605   193    3     48f2310c    1     the bies              2018-11-11
 9     9    9    de     J2    605   199    4     5610c395    1     null                  2018-11-11

If you want to decide that some of the variables should be numbers (or even dates) instead of strings then you will need to add more logic somewhere.  You could modify the variables after the transpose.  Or when reading the file original you could generate a numeric variable in the TALL structure.  With that you could run two separate PROC TRANSPOSE steps, one just for the values of NAME that you want to make as numeric and one for character strings.  You could then merge the two datasets back together based on the ROW variable.

 

BillM_SAS
SAS Employee

The first problem I see is that the data is not in JSON format. Your text file shows lines in the form of:

{id=1, kind=de, ser=H2,xid=604,nid=103, ceid=3,nrid=894448f2, redf=1, nans=null,updateddt=2018-11-10},

To be in JSON format, it needs to look like:

{"id":1, "kind":"de", "ser":"H2","xid":604, "nid":103, "ceid":3, "nrid":"894448f2", "redf":1, "nans":"null", "updateddt":"2018-11-10"},

Here is your posted data in valid JSON format:

[{"id":1, "kind":"de", "ser":"H2","xid":604, "nid":103, "ceid":3,"nrid":"894448f2", "redf":1, "nans":"null","updateddt":"2018-11-10"},
 {"id":2, "kind":"de", "ser":"H2","xid":604, "nid":115, "ceid":2,"nrid":"724448f2", "redf":0, "nans":"some data available","updateddt":"2018-11-10"},
 {"id":3, "kind":"de", "ser":"H2","xid":604, "nid":125, "ceid":4,"nrid":"854448f2", "redf":1, "nans":"some issues","updateddt":"2018-11-10"},
 {"id":4, "kind":"de", "ser":"J2","xid":605, "nid":211, "ceid":3,"nrid":"7881dd3d", "redf":1, "nans":"there is data","updateddt":"2018-11-11"},
 {"id":5, "kind":"de", "ser":"J2","xid":605, "nid":217, "ceid":1,"nrid":"ffcb48f2", "redf":1, "nans":"test","updateddt":"2018-11-11"},
 {"id":6, "kind":"de", "ser":"J2","xid":605, "nid":217, "ceid":3,"nrid":"810c395d", "redf":0,"nans":"attacked", "updateddt":"2018-11-11"},
 {"id":7, "kind":"de", "ser":"J3","xid":605, "nid":187, "ceid":1,"nrid":"540c395d", "redf":0, "nans":"null","updateddt":"2018-11-11"},
 {"id":8, "kind":"de", "ser":"J2","xid":605, "nid":193, "ceid":3,"nrid":"48f2310c", "redf":1, "nans":"the bies","updateddt":"2018-11-11"},
 {"id":9, "kind":"de", "ser":"J2","xid":605, "nid":199, "ceid":4,"nrid":"5610c395", "redf":1, "nans":"null","updateddt":"2018-11-11"}]

As noted previously, you have columns RED, REDF, and REDFL. Since your output example shows only the REDF field, I am going to assume that all three of these column names are actually the same column.


If you are able to get the data into valid JSON form, I found that the JSON libname engine worked fine:

libname x json './sasuser/sasComm700502.json';
proc print data=x.root(drop=ordinal_root); 
run;

Here are the results:

Obs id kind ser xid nid ceid   nrid   redf nans                updateddt 
1    1  de  H2  604 103   3  894448f2  1   null                2018-11-10 
2    2  de  H2  604 115   2  724448f2  0   some data available 2018-11-10 
3    3  de  H2  604 125   4  854448f2  1   some issues         2018-11-10 
4    4  de  J2  605 211   3  7881dd3d  1   there is data       2018-11-11 
5    5  de  J2  605 217   1  ffcb48f2  1   test                2018-11-11 
6    6  de  J2  605 217   3  810c395d  0   attacked            2018-11-11 
7    7  de  J3  605 187   1  540c395d  0   null                2018-11-11 
8    8  de  J2  605 193   3  48f2310c  1   the bies            2018-11-11 
9    9  de  J2  605 199   4  5610c395  1   null                2018-11-11 
tekish
Quartz | Level 8

Hi Bill,

The problem is the data is being fetched from an external website using APIKEY we don't have control over the data structure. We want to create an automated data extraction.

Kurt_Bremser
Super User

If this file was described as JSON, its creators must have been smoking some pretty strong stuff, as it sure ain't valid JSON code, and that's why the SAS JSON engine coughs up on it.

 

See if @ballardw 's suggestion works for you, but it will require a lot of coding in the future.

tekish
Quartz | Level 8

BillM_SAS;

Now, i got valid JSON format with more parameter. However, i didn't get the indented output.

I have attached "Testfile.txt". I can not attached the jSON file. here the below json format.

1.Is there a way to parse the backslash using SAS for JSON file.

2.How can I get rid of  this part  so that SAS can read as JSON file.""data": {"json": {"suc": true,"body":,I manually removed the backslash and the first part unilt "[" and JSON engine works for that.

</>{
"data": {"json": {"suc": true,"body": "[{\"id\":1,\"kind\":\"de\",\"ser\":\"H2\",\"xid\":\"604-efg\",\"nid\":103,\"ceid\":3,\"rid\":\"a930ca5a994c2ef8\",\"redf\":1,\"nans\":null,\"updateddt\":\"2020-11-10T21\"},{\"id\":2,\"kind\":\"de\",\"ser\":\"H2\",\"xid\":\"604-efg\",\"nid\":115,\"ceid\":2,\"rid\":\"a930-ca5a994c2ef8\",\"redf\":0,\"nans\":null,\"updateddt\":\"2020-11-10T21\"},{\"id\":3,\"kind\":\"de\",\"ser\":\"H2\",\"xid\":\"604-efg\",\"nid\":125,\"ceid\":4,\"rid\":\"a930-ca5a994c2ef8\",\"redf\":1,\"nans\":null,\"updateddt\":\"2020-11-10T21\"},{\"id\":4,\"kind\":\"de\",\"ser\":\"J2\",\"xid\":\"0605-efg\",\"nid\":211,\"ceid\":3,\"nrid\":\"9aea-310c395d271c\",\"redf\":1,\"nans\":null,\"updateddt\":\"2020-11-11T15\"}]"
}
}
}</>

how can get I this part only using SAS:

[{\"id\":1,\"kind\":\"de\",\"ser\":\"H2\",\"xid\":\"604-efg\",\"nid\":103,\"ceid\":3,\"rid\":\"a930ca5a994c2ef8\",\"redf\":1,\"nans\":null,\"updateddt\":\"2020-11-10T21\"},{\"id\":2,\"kind\":\"de\",\"ser\":\"H2\",\"xid\":\"604-efg\",\"nid\":115,\"ceid\":2,\"rid\":\"a930-ca5a994c2ef8\",\"redf\":0,\"nans\":null,\"updateddt\":\"2020-11-10T21\"},{\"id\":3,\"kind\":\"de\",\"ser\":\"H2\",\"xid\":\"604-efg\",\"nid\":125,\"ceid\":4,\"rid\":\"a930-ca5a994c2ef8\",\"redf\":1,\"nans\":null,\"updateddt\":\"2020-11-10T21\"},{\"id\":4,\"kind\":\"de\",\"ser\":\"J2\",\"xid\":\"0605-efg\",\"nid\":211,\"ceid\":3,\"nrid\":\"9aea-310c395d271c\",\"redf\":1,\"nans\":null,\"updateddt\":\"2020-11-11T15\"}]"
}
}
}

 

libname x json "&Dir.\Testfile.json";
proc print data=x.data_json;run;

 

Tom
Super User Tom
Super User

So that file is valid JSON.  But it is really a JSON sandwich.  The value of the BODY field is another independent JSON object.

So first read that using the JSON engine. Then write it back out into a valid JSON file and then read it back in.

So let me make a copy of your text into a file on my system.  Let's just reference it with the fileref JSON.

filename json temp;
options parmcards=json;
parmcards4;
{
  "data": {"json": {"suc": true,"body": "[{\"id\":1,\"kind\":\"de\",\"ser\":\"H2\",\"xid\":\"604-efg\",\"nid\":103,\"ceid\":3,\"rid\":\"a930ca5a994c2ef8\",\"redf\":1,\"nans\":null,\"updateddt\":\"2020-11-10T21\"},{\"id\":2,\"kind\":\"de\",\"ser\":\"H2\",\"xid\":\"604-efg\",\"nid\":115,\"ceid\":2,\"rid\":\"a930-ca5a994c2ef8\",\"redf\":0,\"nans\":null,\"updateddt\":\"2020-11-10T21\"},{\"id\":3,\"kind\":\"de\",\"ser\":\"H2\",\"xid\":\"604-efg\",\"nid\":125,\"ceid\":4,\"rid\":\"a930-ca5a994c2ef8\",\"redf\":1,\"nans\":null,\"updateddt\":\"2020-11-10T21\"},{\"id\":4,\"kind\":\"de\",\"ser\":\"J2\",\"xid\":\"0605-efg\",\"nid\":211,\"ceid\":3,\"nrid\":\"9aea-310c395d271c\",\"redf\":1,\"nans\":null,\"updateddt\":\"2020-11-11T15\"}]"
      }
  }
}
;;;;

On your system you would just need to write a FILENAME statement pointing to where ever that file existed.

filename json '.....txt';

Now let's point a JSON library at it.  Just use the same value for the libref as used for the fileref.

libname json json;

So looking at that there is a table named DATA_JSON ( because of the first two names in the original JSON file ) with a field named BODY that has the actual JSON string. 

proc print data=json.data_json; 
run;

image.png

Notice how the confusing escape character syntax has been cleaned up.

Now write it back out without those escape characters so the JSON object becomes actually part of the new JSON object. Let's just use BODY as the name in this new object and that will be the name the JSON engine will use for the dataset.

filename json2 temp;
data _null_;
  set json.data_json end=eof;
  file json2 ;
  if _n_=1 then put '{' @;
  else put ',' @;
  put '"body":' body ;
  if eof then put '}';
run;
libname json2 json;

Now it looks like something that might actually be usable.

libname json2 json;
proc print data=json2.body;
run;

Results:

image.png

tekish
Quartz | Level 8
Thanks Tom it works perfectly.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 6708 views
  • 2 likes
  • 5 in conversation