BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
pronabesh
Fluorite | Level 6

Hi SAS Community,

I am looking to parse a string variable and extract several variables from the information contained in it. Listed below is an example of the string data:

Variable_data

{"calories":"153","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"}

I want the output data set with additional variables parsed from the string (shown below with variable names on top):

Calories  Steps  Distance  Floors  Elevation  activeScore

153  87  0.0578725741524  0  0  -1

There is a second layer of data management to this. In some cases there are two data points in one row for the string variable (example shown below)

[{"calories":"153","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"}, {"calories":"159","steps":"77","distance":"0.0498723741524","floors":"0","elevation":"0","activeScore":"-1"}]

Is it possible to parse the data as two separate rows in this case?

Any help will be much appreciated.

Best,

Pronabesh

1 ACCEPTED SOLUTION

Accepted Solutions
pronabesh
Fluorite | Level 6

Thank All,

I was actually trying to avoid the datalines option because the file was very large. The following infile and input statement code worked.

filename a "C:\Users\Pronabesh\Desktop\PLM_Stats\Biogen Devices\Data analysis\test.txt"; *This is any file, you just need something with a row or two in it;

data want;

set have;

infile a dlm=', "';

input @@;

_infile_=data;

input

@'"calories":' calories $

@'"steps":' steps $

@'"distance":' distance $;

put calories= steps= distance=;

run;

View solution in original post

13 REPLIES 13
Patrick
Opal | Level 21

This looks like JSON to me. If so then a Google search using something like "site:support.sas.com JSON" will return quite a few relevant links like:

http://support.sas.com/resources/papers/proceedings13/296-2013.pdf

http://support.sas.com/resources/papers/proceedings13/342-2013.pdf

Haikuo
Onyx | Level 15

This may get you started,

data have;

     infile cards dlm='}]';

     input var : $ 150. @@;

     if not missing (var);

     array v  calories steps distance floors elevation activeScore;

     format distance best32.20;

     do over v;

           v=prxchange('s/.+'||vname(v)||'":"(-\d+|\d+|\d\.\d+)".*/$1/i',-1,var);

     end;

     drop var;

     cards;

{"calories":"152","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"}

[{"calories":"153","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"}, {"calories":"159","steps":"77","distance":"0.0498723741524","floors":"0","elevation":"0","activeScore":"-1"}]

;


Haikuo

slchen
Lapis Lazuli | Level 10

data have;

     infile cards truncover dlm=',';

     input @;

  _infile_=compress(_infile_,'-.,','kd');

  input calories steps distance floors elevation activeScore;

     cards;

[{"calories":"153","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"}

{"calories":"159","steps":"77","distance":"0.0498723741524","floors":"0","elevation":"0","activeScore":"-1"}]

;

run;

Ksharp
Super User

Your JSON file contains JSON ARRAY, that would make problem more complicated . My suggest is like Patrick to use proc json .

Patrick
Opal | Level 21

I believe PROC JSON is only for writing a SAS data set to a JSON file. You will need a SAS data step for reading a JSON file.

Ksharp
Super User

Oh,Patrick That is disappoint about SAS for not adding such function in proc json .

If there were no JSON Object ,that would be easy, but considering about json object that would make problem complicated .

Patrick
Opal | Level 21

Just stumbled over this link How to import Twitter tweets in SAS DATA Step using OAuth 2 authentication style - SAS Voices and thought this might interest you.

Ksharp
Super User

Hi Patrick,

Sorry . Firstly, I can't enter Twitter url , Chinese Government have already block them all . Secondly, I totally have no idea about it , OAuth 2 is a completely different thing other than general url , Maybe you should check SAS documentation detail like proc http or something else , or talk some SAS guys to see if SAS has already support this kind of connection .

OH, I know what do you mean. proc groovy can import JSON file . That will be nice .

Regards

Xia Keshan

pronabesh
Fluorite | Level 6

Thank All,

I was actually trying to avoid the datalines option because the file was very large. The following infile and input statement code worked.

filename a "C:\Users\Pronabesh\Desktop\PLM_Stats\Biogen Devices\Data analysis\test.txt"; *This is any file, you just need something with a row or two in it;

data want;

set have;

infile a dlm=', "';

input @@;

_infile_=data;

input

@'"calories":' calories $

@'"steps":' steps $

@'"distance":' distance $;

put calories= steps= distance=;

run;

Patrick
Opal | Level 21

If the data structure in the JSON file is orthogonal and you will always get all key:value pairs then you could try something like below:

data testdata;

  infile datalines truncover dsd dlm=',}' COLUMN=point_pos;

  point_pos=1;

  input @;

  row_id=_n_;

  do while(point_pos+10 < lengthn(_infile_));

    input @point_pos

      @'"calories":' calories :32.

      @'"steps":' steps :32.

      @'"distance":' distance :32.

      @'"floors":' floors :32.

      @'"elevation":' elevation :32.

      @'"activeScore":' activeScore :32.

      @

    ;

    output;

  end;

  datalines;

{"calories":"153","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"}

[{"calories":"153","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"}, {"calories":"159","steps":"77","distance":"0.0498723741524","floors":"0","elevation":"0","activeScore":"-1"}]

;

run;

pronabesh
Fluorite | Level 6

Thanks Patrick and Ksharp. As you guys pointed out, proc JSON is only for exporting SAS files to JSON.

@Patrick: Your code works if I use the datalines options. However, I have a very large datafile in which the json object is just one variable. There are other variables linked to the json object via unique identifier (example shown below). Is there a way to parse it in this case?

49999{"calories":"1736","steps":"4024","distance":"1.79278637687032","floors":"0","elevation":"0","activeScore":"-1"}activity
51111{"calories":"1583","steps":"85","distance":"0.0378725741524","floors":"0","elevation":"0","activeScore":"-1"}activity
pronabesh
Fluorite | Level 6

the variables are

obs id data activity

49999{"calories":"1736","steps":"4024","distance":"1.79278637687032","floors":"0","elevation":"0","activeScore":"-1"}activity
51111{"calories":"1583","steps":"85","distance":"0.0378725741524","floors":"0","elevation":"0","activeScore":"-1"}activity
Patrick
Opal | Level 21

One way to go:

data testdata(drop=_:);

  infile datalines truncover dlm='"[{,}]' COLUMN=point_pos;

  point_pos=1;

  attrib

    id data informat=32.

    activity informat=$20.

    ;

  input _lead_str :$30. @;

  if not missing(_lead_str) then

    do;

      id=input(scan(_lead_str,1,' '),32.);

      data=input(scan(_lead_str,2,' '),32.);

    end;

  activity=scan(_infile_,-1,'}] ');

  row_id=_n_;

  do while(point_pos+20 < lengthn(_infile_));

    input @point_pos

      @'"calories":' calories :32.

      @'"steps":' steps :32.

      @'"distance":' distance :32.

      @'"floors":' floors :32.

      @'"elevation":' elevation :32.

      @'"activeScore":' activeScore :32.

      @

    ;

    output;

  end;

  datalines;

4 999 {"calories":"153","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"} activity

5 1111 [{"calories":"153","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"}, {"calories":"159","steps":"77","distance":"0.0498723741524","floors":"0","elevation":"0","activeScore":"-1"}] activity

;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 13 replies
  • 4374 views
  • 1 like
  • 5 in conversation