BookmarkSubscribeRSS Feed
TBDA
Calcite | Level 5

Hello Everyone,

 

I have a csv file with 4 columns in which data within the first column is stored as json text whose length exceeds 32767.  Can you please help me with the code to import this column into sas.  Please find below the code I have used.  This splits the column with json text abruptly only where the rows have length more than 32767, the other rows are imported correctly.

 

data Test;
    %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
    infile "\test.csv" delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=1;
 length var1-var4 $10000.;
 input var1-var4 $;
run;

2 REPLIES 2
Tom
Super User Tom
Super User

So it sounds like you have a data where the content of the first field is a JSON object.  So perhaps something like this:

Obs                        json                        height    weight

 1     { "Name": "Alfred", "Sex": "M", "Age": 14 }      69.0      112.5
 2     { "Name": "Alice", "Sex": "F", "Age": 13 }       56.5       84.0
 3     { "Name": "Barbara", "Sex": "F", "Age": 13 }     65.3       98.0
 4     { "Name": "Carol", "Sex": "F", "Age": 14 }       62.8      102.5
 5     { "Name": "Henry", "Sex": "M", "Age": 14 }       63.5      102.5

And then someone wrote that into a CSV file and gave it to you.

Do you know if the "CSV" file has been created as a proper CSV file with the JSON quoted and embedded quotes doubled? Like this:

json,height,weight
"{ ""Name"": ""Alfred"", ""Sex"": ""M"", ""Age"": 14 }",69,112.5
"{ ""Name"": ""Alice"", ""Sex"": ""F"", ""Age"": 13 }",56.5,84
"{ ""Name"": ""Barbara"", ""Sex"": ""F"", ""Age"": 13 }",65.3,98
"{ ""Name"": ""Carol"", ""Sex"": ""F"", ""Age"": 14 }",62.8,102.5
"{ ""Name"": ""Henry"", ""Sex"": ""M"", ""Age"": 14 }",63.5,102.5

Or without the extra protections around the quotes in the JSON text? Like this:

{ "Name": "Alfred", "Sex": "M", "Age": 14 },69,112.5
{ "Name": "Alice", "Sex": "F", "Age": 13 },56.5,84
{ "Name": "Barbara", "Sex": "F", "Age": 13 },65.3,98
{ "Name": "Carol", "Sex": "F", "Age": 14 },62.8,102.5
{ "Name": "Henry", "Sex": "M", "Age": 14 },63.5,102.5

If you have the later then you can read the first column character by character and write it to a JSON text file. Then read the other fields on the line.  So something like this will work with my little example.

filename json temp;
data csv_fields ;
  infile csv dsd truncover lrecl=1000000 firstobs=1 column=cc length=ll end=eof;
  file json lrecl=1000000;
  if _n_=1 then put '[' @;
  else if eof then put ']';
  else put ',' @;
  found=0;
  do until(found or cc>ll);
    input ch $char1. @;
    q=sum(q,ch='{',-(ch='}'));
    if ch=',' and q=0 then found=1;
    else put ch $char1. @;
  end;
  put ;
  input height weight ;
  keep height weight ;
run;

And generate a JSON file that looks like this:

[{ "Name": "Alfred", "Sex": "M", "Age": 14 }
,{ "Name": "Alice", "Sex": "F", "Age": 13 }
,{ "Name": "Barbara", "Sex": "F", "Age": 13 }
,{ "Name": "Carol", "Sex": "F", "Age": 14 }
,{ "Name": "Henry", "Sex": "M", "Age": 14 }
]

Which you could then easily read with the JSON libname engine.  Then you can combine the data from the JSON file with the fields you read from the CSV file.  So something like this:

libname json json ;
data want;
  set json.root;
  set csv_fields;
run;

Results:

       ordinal_
Obs      root       Name      Sex    Age    height    weight

 1         1       Alfred      M      14     69.0      112.5
 2         2       Alice       F      13     56.5       84.0
 3         3       Barbara     F      13     65.3       98.0
 4         4       Carol       F      14     62.8      102.5
 5         5       Henry       M      14     63.5      102.5

 If the CSV file has the extra quotes you will want to do something to remove those. You might pre-process the file first to remove the extra quotes.  Or doing during the step that writes the JSON file.  Or post process the JSON file to remove the extra quotes.

 

If the JSON object text is more complicated than my example the step I wrote to detect the end of the JSON part of the line might not work right. So you might have to adjust that.

 

Also if the JSON object is more complex then the process of matching it back up to the other fields in the CSV file might be harder than my simple example.

Ksharp
Super User
Could make it vertically , and process it later.

data test;
infile "\test.csv" recfm=n ;
input temp $1. @@ ;
run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 672 views
  • 4 likes
  • 3 in conversation