Help using Base SAS procedures

Removing null and zero values in proc JSON

Reply
New Contributor
Posts: 4

Removing null and zero values in proc JSON

I am trying to export a SAS data set as a .JSON file. The data set has many empty cells, and cells that are zero's. In an attempt to reduce the size of the .JSON file (it is presently more than 1GB in size), I wish to avoid printing these values. However, it does not seem like PROC JSON has this feature included. Therefore, can anyone help me with a workaround?

 

Thanks in advance for any tips or help!

Regular Contributor
Posts: 178

Re: Removing null and zero values in proc JSON

Can you give us some sample data - that'll make it easier to give you a comprehensive response?

New Contributor
Posts: 4

Re: Removing null and zero values in proc JSON

[ Edited ]

Hi Chris

 

Sorry for not adding any programs or datasets. I created a small dataset that hopefully illustrates the issue. Let's say I have a dataset defined as follows

 

data testdata;
input class $ var1 var2 var3;
datalines;
class1 3 0 1
class2 2 2 .
class3 1 1 1
;

 

 

 

I have a number of class variables, each with a set of (in this case numeric) values. Now, I wish to export the data as a .JSON file that is to be read elsewhere, where it doesn't matter if a given value is empty, 0 or missing. Therefore, I wish to remove these values from the exported .JSON file. Using a normal proc json,

 

proc json out = 'Y:\jsonout.json' pretty nosastags ;
export testdata;
run;

 

 

I get the following:

 

[
{
"class": "class1",
"var1": 3,
"var2": 0,
"var3": 1
},
{
"class": "class2",
"var1": 2,
"var2": 2,
"var3": null
},
{
"class": "class3",
"var1": 1,
"var2": 1,
"var3": 1
}
]

 

So basically, I wish to create a .JSON file that does not include "var2" for "class1" and "var3" for "class2", but every other variable. What I want is the following:

 

[
{
"class": "class1",
"var1": 3,
"var3": 1
},
{
"class": "class2",
"var1": 2,
"var2": 2,
},
{
"class": "class3",
"var1": 1,
"var2": 1,
"var3": 1
}
]

 

I hope this example helps.

Community Manager
Posts: 2,757

Re: Removing null and zero values in proc JSON

I think you'll have to post-process the JSON file that you create.  It's simple with DATA step.  Read the file lines, and output only those that have a value other than 0 or null.

 

data testdata;
input class $ var1 var2 var3;
datalines;
class1 3 0 1
class2 2 2 .
class3 1 1 1
;

proc json out = 'c:\temp\jsonout.json' pretty nosastags ;
 export testdata;
run;

data _null_;
  infile 'c:\temp\jsonout.json';
  file 'c:\temp\jsonout_light.json';
  input;
  length val $ 10;
  /* get value token from the JSON line */
  val = compress( scan(_infile_,-1,':'), ',','s'); 
  if (val ^= '0' and val ^= 'null') then
    put _infile_;
run;

The one problem with this (well, maybe there are more problems that I haven't thought of) -- this could leave some lines with a terminating comma before the end brace, when JSON convention is to leave the comma off for the last value in a list.  So you'll probably need to add in some more conditions with a RETAIN to strip the comma if the next token is an end brace.

New Contributor
Posts: 4

Re: Removing null and zero values in proc JSON

Hi Chris

 

Thank you for your answer. I will try to experiment a bit with your scan solution. Do you know if any plans for implementing such a feature directly to proc JSON is on its way?

SAS Employee
Posts: 31

Re: Removing null and zero values in proc JSON

Since you are worried about the size of the file, another possibility is to turn off the field keys. This will probably cut the size of the file by at least 50%. Without the keys, the file is slightly harder for a human to read. I do not know how the JSON file is to be used. But, if the JSON file is going to be used as input to another computer program are the keys really needed? The first value in the array is always the "class" variable, the second variable, "var1", etc. Note that without keys, the JSON containers holding the data change from objects to arrays.

proc json out='smallJson.json' pretty nosastags nokeys;
export testdata;
run;
[
  [
    "class1",
    3,
    0,
    1
  ],
  [
    "class2",
    2,
    2,
    null
  ],
  [
    "class3",
    1,
    1,
    1
  ]
]
New Contributor
Posts: 4

Re: Removing null and zero values in proc JSON

Hi BIIM_SAS

 

Thank you for your answer! 

 

While your solution does decreases the size of my file significantly (it just about cuts the size in half), I'm afraid your solution isn't viable as I see it. Yes, the JSON file is to be read in a different program, but the data set has many columns (right now it is 168 columns), and some are added and removed continuously as we still are in a development phase. Therefore, simply removing the keys from the JSON file is not something I am comfortable with.

Ask a Question
Discussion stats
  • 6 replies
  • 199 views
  • 4 likes
  • 4 in conversation