BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
pronabesh
Fluorite | Level 6

Hi SAS Community,

I am looking to parse a string variable and extract several variables from the information contained in it. Listed below is an example of the string data:

Variable_data

{"calories":"153","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"}

I want the output data set with additional variables parsed from the string (shown below with variable names on top):

Calories  Steps  Distance  Floors  Elevation  activeScore

153  87  0.0578725741524  0  0  -1

There is a second layer of data management to this. In some cases there are two data points in one row for the string variable (example shown below)

[{"calories":"153","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"}, {"calories":"159","steps":"77","distance":"0.0498723741524","floors":"0","elevation":"0","activeScore":"-1"}]

Is it possible to parse the data as two separate rows in this case?

Any help will be much appreciated.

Best,

Pronabesh

1 ACCEPTED SOLUTION

Accepted Solutions
pronabesh
Fluorite | Level 6

Thank All,

I was actually trying to avoid the datalines option because the file was very large. The following infile and input statement code worked.

filename a "C:\Users\Pronabesh\Desktop\PLM_Stats\Biogen Devices\Data analysis\test.txt"; *This is any file, you just need something with a row or two in it;

data want;

set have;

infile a dlm=', "';

input @@;

_infile_=data;

input

@'"calories":' calories $

@'"steps":' steps $

@'"distance":' distance $;

put calories= steps= distance=;

run;

View solution in original post

13 REPLIES 13
Patrick
Opal | Level 21

This looks like JSON to me. If so then a Google search using something like "site:support.sas.com JSON" will return quite a few relevant links like:

http://support.sas.com/resources/papers/proceedings13/296-2013.pdf

http://support.sas.com/resources/papers/proceedings13/342-2013.pdf

Haikuo
Onyx | Level 15

This may get you started,

data have;

     infile cards dlm='}]';

     input var : $ 150. @@;

     if not missing (var);

     array v  calories steps distance floors elevation activeScore;

     format distance best32.20;

     do over v;

           v=prxchange('s/.+'||vname(v)||'":"(-\d+|\d+|\d\.\d+)".*/$1/i',-1,var);

     end;

     drop var;

     cards;

{"calories":"152","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"}

[{"calories":"153","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"}, {"calories":"159","steps":"77","distance":"0.0498723741524","floors":"0","elevation":"0","activeScore":"-1"}]

;


Haikuo

slchen
Lapis Lazuli | Level 10

data have;

     infile cards truncover dlm=',';

     input @;

  _infile_=compress(_infile_,'-.,','kd');

  input calories steps distance floors elevation activeScore;

     cards;

[{"calories":"153","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"}

{"calories":"159","steps":"77","distance":"0.0498723741524","floors":"0","elevation":"0","activeScore":"-1"}]

;

run;

Ksharp
Super User

Your JSON file contains JSON ARRAY, that would make problem more complicated . My suggest is like Patrick to use proc json .

Patrick
Opal | Level 21

I believe PROC JSON is only for writing a SAS data set to a JSON file. You will need a SAS data step for reading a JSON file.

Ksharp
Super User

Oh,Patrick That is disappoint about SAS for not adding such function in proc json .

If there were no JSON Object ,that would be easy, but considering about json object that would make problem complicated .

Patrick
Opal | Level 21

Just stumbled over this link How to import Twitter tweets in SAS DATA Step using OAuth 2 authentication style - SAS Voices and thought this might interest you.

Ksharp
Super User

Hi Patrick,

Sorry . Firstly, I can't enter Twitter url , Chinese Government have already block them all . Secondly, I totally have no idea about it , OAuth 2 is a completely different thing other than general url , Maybe you should check SAS documentation detail like proc http or something else , or talk some SAS guys to see if SAS has already support this kind of connection .

OH, I know what do you mean. proc groovy can import JSON file . That will be nice .

Regards

Xia Keshan

pronabesh
Fluorite | Level 6

Thank All,

I was actually trying to avoid the datalines option because the file was very large. The following infile and input statement code worked.

filename a "C:\Users\Pronabesh\Desktop\PLM_Stats\Biogen Devices\Data analysis\test.txt"; *This is any file, you just need something with a row or two in it;

data want;

set have;

infile a dlm=', "';

input @@;

_infile_=data;

input

@'"calories":' calories $

@'"steps":' steps $

@'"distance":' distance $;

put calories= steps= distance=;

run;

Patrick
Opal | Level 21

If the data structure in the JSON file is orthogonal and you will always get all key:value pairs then you could try something like below:

data testdata;

  infile datalines truncover dsd dlm=',}' COLUMN=point_pos;

  point_pos=1;

  input @;

  row_id=_n_;

  do while(point_pos+10 < lengthn(_infile_));

    input @point_pos

      @'"calories":' calories :32.

      @'"steps":' steps :32.

      @'"distance":' distance :32.

      @'"floors":' floors :32.

      @'"elevation":' elevation :32.

      @'"activeScore":' activeScore :32.

      @

    ;

    output;

  end;

  datalines;

{"calories":"153","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"}

[{"calories":"153","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"}, {"calories":"159","steps":"77","distance":"0.0498723741524","floors":"0","elevation":"0","activeScore":"-1"}]

;

run;

pronabesh
Fluorite | Level 6

Thanks Patrick and Ksharp. As you guys pointed out, proc JSON is only for exporting SAS files to JSON.

@Patrick: Your code works if I use the datalines options. However, I have a very large datafile in which the json object is just one variable. There are other variables linked to the json object via unique identifier (example shown below). Is there a way to parse it in this case?

49999{"calories":"1736","steps":"4024","distance":"1.79278637687032","floors":"0","elevation":"0","activeScore":"-1"}activity
51111{"calories":"1583","steps":"85","distance":"0.0378725741524","floors":"0","elevation":"0","activeScore":"-1"}activity
pronabesh
Fluorite | Level 6

the variables are

obs id data activity

49999{"calories":"1736","steps":"4024","distance":"1.79278637687032","floors":"0","elevation":"0","activeScore":"-1"}activity
51111{"calories":"1583","steps":"85","distance":"0.0378725741524","floors":"0","elevation":"0","activeScore":"-1"}activity
Patrick
Opal | Level 21

One way to go:

data testdata(drop=_:);

  infile datalines truncover dlm='"[{,}]' COLUMN=point_pos;

  point_pos=1;

  attrib

    id data informat=32.

    activity informat=$20.

    ;

  input _lead_str :$30. @;

  if not missing(_lead_str) then

    do;

      id=input(scan(_lead_str,1,' '),32.);

      data=input(scan(_lead_str,2,' '),32.);

    end;

  activity=scan(_infile_,-1,'}] ');

  row_id=_n_;

  do while(point_pos+20 < lengthn(_infile_));

    input @point_pos

      @'"calories":' calories :32.

      @'"steps":' steps :32.

      @'"distance":' distance :32.

      @'"floors":' floors :32.

      @'"elevation":' elevation :32.

      @'"activeScore":' activeScore :32.

      @

    ;

    output;

  end;

  datalines;

4 999 {"calories":"153","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"} activity

5 1111 [{"calories":"153","steps":"87","distance":"0.0578725741524","floors":"0","elevation":"0","activeScore":"-1"}, {"calories":"159","steps":"77","distance":"0.0498723741524","floors":"0","elevation":"0","activeScore":"-1"}] activity

;

run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 13 replies
  • 5583 views
  • 1 like
  • 5 in conversation