We’re smarter together. Learn from this collection of community knowledge and add your expertise.

Using DS2 to parse JSON from a REST API

by Community Manager ‎10-14-2015 09:15 AM - edited ‎04-07-2016 10:48 AM (1,926 Views)

SAS 9.4 introduced PROC JSON, which allows you to create JSON output from a data set.  But it wasn't until SAS 9.4 Maint 3 that we have a built-in method to parse JSON content.  That method was added as a DS2 package: the JSON package.

 

About JSON

JSON (JavaScript Object Notation) is the most common response type from REST-based web services.  It's a simple plain text format that structures data into name-value pairs, along with some metadata, to communicate "answers" to the questions that you ask via HTTP calls.  But the data structures can be nested, and the text is delimited with special characters such as braces, quotes, commas, and colons.  Here is an example snippet of JSON that describes part of a message object from this community, including details about the author and the time stamp of the message:

 

"response": {
"status": "success",
"messages": {
"message": [
{
"last_edit_author": {
"type": "user",
"href": "\/users\/id\/1184",
"login": {
"type": "string",
"$": "Kanyange"
}
},
"post_time": {
"type": "date_time",
"$": "2015-10-03T15:44:28+00:00"
},
}
]
}
}

 

 

About DS2

PROC DS2 is part of Base SAS.  DS2 is a procedural programming language that is similar to DATA step and that has access to your favorite DATA step functions, but it's much more.  With DS2, you can define program variables that aren't part of your data, you can define methods, and use control flow logic.  DS2 also has packages: predefined collections of capabilities for special purposes.  Some built-in packages include HTTP (for web calls), MATRIX for matrix programming, and now JSON. Learn more about DS2 from the SAS online documentation.

 

The JSON parser in DS2 frees you from the worry of parsing the syntax of a JSON object (with its various delimiters) and provides simple access to the data portion of JSON tokens: those name-value combinations that represent the data you want.

 

To use the JSON package, you first declare an object of type json.  From that object you can use the createParser() method to create an instance of the parser that will do the work for you.  After you initialize the parser with the JSON content, you can use getNextToken to navigate through the JSON structure to get to the values you want.  (There are many more methods in the JSON package, but these are the basic building blocks.

 

Here's a simple DS2 segment that shows how you would use the http package to call an API, then navigate the API response with the JSON parser:

 

dcl package json j();
dcl package http webQuery();
webQuery.createGetMethod('http://webservice.com/?response_style=json');
webQuery.executeMethod();
webQuery.getResponseBodyAsString(response, rc);
rc = j.createParser( response );
do while (rc=0);
j.getNextToken( rc, token, tokenType, parseFlags);
* subject line;
if (token eq 'subject') then
do;
j.getNextToken( rc, token, tokenType, parseFlags);
subject=token;
end;
end;

 

A complete example of the JSON parser with a REST API

Here's an example of the method working -- using an API that powers this very community!  (That's "meta" for you.)  

 

The program requires SAS 9.4 Maint 3.  It also works fine in the most recent version of SAS University Edition (using SAS Studio 3.4).  All of the code runs using just Base SAS procedures.

 

/* DS2 program that uses a REST-based API */
/* Uses http package for API calls       */
/* and the JSON package (new in 9.4m3)   */
/* to parse the result.                  */
proc ds2; 
  data messages (overwrite=yes);
    /* Global package references */
    dcl package json j();

    /* Keeping these variables for output */
    dcl double post_date having format datetime20.;
    dcl int views;
    dcl nvarchar(128) subject author board;
    
    /* these are temp variables */
    dcl varchar(65534) character set utf8 response;
    dcl int rc;
    drop response rc;

    method parseMessages();
      dcl int tokenType parseFlags;
      dcl nvarchar(128) token;
      rc=0;
      * iterate over all message entries;
      do while (rc=0);
        j.getNextToken( rc, token, tokenType, parseFlags);

        * subject line;
        if (token eq 'subject') then
          do;
            j.getNextToken( rc, token, tokenType, parseFlags);
            subject=token;
          end;

        * board URL, nested in an href label;
        if (token eq 'board') then
          do;
            do while (token ne 'href');
               j.getNextToken( rc, token, tokenType, parseFlags );
            end;
            j.getNextToken( rc, token, tokenType, parseFlags );
            board=token;
          end;

        * number of views (int), nested in a count label ;
        if (token eq 'views') then
          do;
            do while (token ne 'count');
               j.getNextToken( rc, token, tokenType, parseFlags );
            end;
            j.getNextToken( rc, token, tokenType, parseFlags );
            views=inputn(token,'5.');
          end;

        * date-time of message (input/convert to SAS date) ;
        * format from API: 2015-09-28T10:16:01+00:00 ;
        if (token eq 'post_time') then
          do;
            j.getNextToken( rc, token, tokenType, parseFlags );
            post_date=inputn(token,'anydtdtm26.');
          end;

        * user name of author, nested in a login label;
        if (token eq 'author') then
          do; 
            do while (token ne 'login');
               j.getNextToken( rc, token, tokenType, parseFlags );
            end;
            * get the author login (username) value;
            j.getNextToken( rc, token, tokenType, parseFlags );
            author=token;
            output;
          end;
      end;
      return;
    end;

    method init();
      dcl package http webQuery();
      dcl int rc tokenType parseFlags;
      dcl nvarchar(128) token;
      dcl integer i rc;

      /* create a GET call to the API                                         */
      /* 'sas_programming' covers all SAS programming topics from communities */
      webQuery.createGetMethod(
         'http://communities.sas.com/kntur85557/' || 
         'restapi/vc/categories/id/sas_programming/posts/recent' ||
         '?restapi.response_format=json' ||
         '&restapi.response_style=-types,-null&page_size=100');
      /* execute the GET */
      webQuery.executeMethod();
      /* retrieve the response body as a string */
      webQuery.getResponseBodyAsString(response, rc);
      rc = j.createParser( response );
      do while (rc = 0);
        j.getNextToken( rc, token, tokenType, parseFlags);
        if (token = 'message') then
          parseMessages();
      end;
    end;

  method term();
    rc = j.destroyParser();
  end;

  enddata;
run;
quit;

/* Add some basic reporting */
proc freq data=messages noprint;
    format post_date datetime11.;
    table post_date / out=message_times;
run;

ods graphics / width=2000 height=600;
title '100 recent message contributions in SAS Programming';
title2 'Time in GMT';
proc sgplot data=message_times;
    series x=post_date y=count;
    xaxis minor label='Time created';
    yaxis label='Messages' grid;
run;

title 'Board frequency for recent 100 messages';
proc freq data=messages order=freq;
    table board;
run;

title 'Detailed listing of messages';
proc print data=messages;
run;

title;

Here is some partial output:

 

msglist.png

 

 A note about "destructors"... You might notice that the example includes a call to destroyParser().

 

  method term();
    rc = j.destroyParser();
  end;

It's not strictly necessary in this example, as the parser object will be cleaned up when the object goes out of scope at the end of the DS2 program.  But as some DS2 programs can be long-running and memory intensive, cleaning up allocated objects is a best practice.

Comments
by Super User
on ‎10-15-2015 08:56 AM

I think it's great that there are means to access json files, since they are pretty common nowdays.

But, one thing that is great about SAS programing is the powerful languge - you can achieve a lot with few lines of code.

What I see here (I haven't studied the details, I must admit) is 75 lines of code (blanks and comments uncounted). Is there still a benifit compared with other tools/languages?

 

Please, prove me wrong :smileyhappy:

by Community Manager
on ‎10-16-2015 08:11 PM

@LinusH, there are JSON parsers in other languages that present more of an object model and require less code to get to particular values.  You're correct that the SAS JSON parser does just the basics right now: you can navigate the values in the JSON data without worrying about the delimiter tokens.  That is, it's a parser for the syntax layer, but does not present a semantic layer.

 

For small JSON content, DATA step and constructs like the SCAN function might be more elegant than the DS2 JSON package.  For larger JSON content, especially with many nesting levels and types, the DS2 JSON parser allows you to build more robust applications.

by SAS Super FREQ
on ‎04-07-2016 10:44 AM

Chris

 

Looking at the graph seems like the label for the axis is just the other way around

 

Here is the code in question:

proc sgplot data=message_times;
    series x=post_date y=count;
    xaxis minor label='Messages';
    yaxis label='Time created' grid;
run;

Bruno

by Community Manager
on ‎04-07-2016 10:49 AM

@Bruno_SAS - you're right! I've fixed it.  Thanks for noticing.

by Regular Contributor
on ‎05-25-2016 08:41 AM

Hi Chris :-) 

 

I'm running Windows 7 32-bit with SAS 9.4 TS1M3 locally installed.

 

I'm trying multiple versions of the PROC DS2 HTTP / JSON packages and getting errors that I don't understand.  When I run your example (straight copy / paste from the article) I get "getNextToken encountered a failure", "Encountered an illegal character", "JSON Parser located at column 0 of line 1 within the text". 

 

I've tried other HTTP / JSON examples and get different errors, probably because of the different ways to use the packages.  In one case, I'm running the code below and getting "DATA STEP component object failure.  Aborted during compilation phase.  Statement is not valid or it is used out of proper order." and then " PACKAGE is not a known class name".


data _null_;
    method init();
        dcl package json j();
        dcl int rc tokenType parseFlags;
        dcl bigint lineNum colNum;
        dcl nvarchar(128) token abc t1;
        abc = 'xyz';
        t1 = '{"abc" : 1 }';
        rc = j.createParser( t1 );
             if (rc ne 0) then goto TestError;

        * obj open;
        j.getNextToken( rc, token, tokenType, parseFlags, lineNum, colNum );
        if ( rc ne 0 ) then goto TestError;

        * obj label; 
        j.getNextToken( rc, token, tokenType, parseFlags, lineNum, colNum );
        if ( rc ne 0 ) then goto TestError;

        * obj value; 
        j.getNextToken( rc, token, tokenType, parseFlags, lineNum, colNum );
        if ( rc  ne 0) then goto TestError;

        * obj close; 
        j.getNextToken( rc, token, tokenType, parseFlags, lineNum, colNum );
        if ( rc ne 0 ) then goto TestError;

     Exit:
        rc = j.destroyParser();
        return;

    TestError:
        put 'Test ended abnormally.';
        goto Exit;

    end;
enddata;
run;

 

Any thoughts?  I've started fiddling around with this and so probably missing something.

Thanks and don't work too hard!

Chris

by Community Manager
on ‎05-26-2016 08:47 AM

Hi @DarthPathos - not sure what's going on with your errors when you copy/paste, but the example you shared has a problem -- no PROC DS2 statement.  Add proc ds2; on th line before the data _null_ statement, and your test should work.

 

As far as the copy/paste, it's possible there is transcoding issue if any of the topics have Unicode chars.  Can you start your SAS session with -encoding utf8 to see if that helps? 

by Regular Contributor
on ‎05-27-2016 08:02 AM

Hmmm...I've now tried copying from the article to Notepad, then to SAS; same error message. 

 

So i typed out all the code (hey, it's Friday, what else am I going to do :P).  The code works (almost)!  I now get the data I'm expecting but for some reason the "Subject" column is showing "Subject" all the way down instead of the actual subject.  Can't see a difference between your code and mine, but who knows.

 

Thanks for your time and happy Weekend!!

Chris

by Community Manager
on ‎05-27-2016 08:12 AM

@DarthPathos I suspect you might be missing a "getNextToken" call, which would set the "subject" var to the value of the subject field.

 

        * subject line;
        if (token eq 'subject') then
          do;
            j.getNextToken( rc, token, tokenType, parseFlags);
            subject=token;
          end;
by Regular Contributor
on ‎05-27-2016 08:49 AM

Hey Chris - turns out I was missing the "token" in the brackets. <hangs head in shame...>

 

MOAR COFFEE.....

 

Thanks :-)

Chris

by New Contributor faisalbasra
4 weeks ago

This is a great article, thanks for sharing. Do anyone here, have experience for DS2 with SOAP web service invoking and processing? 

Your turn
Sign In!

Want to write an article? Sign in with your profile.