BookmarkSubscribeRSS Feed
TBDA
Calcite | Level 5

Hello Everyone,

 

I have a csv file with 4 columns in which data within the first column is stored as json text whose length exceeds 32767.  Can you please help me with the code to import this column into sas.  Please find below the code I have used.  This splits the column with json text abruptly only where the rows have length more than 32767, the other rows are imported correctly.

 

data Test;
    %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
    infile "\test.csv" delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=1;
 length var1-var4 $10000.;
 input var1-var4 $;
run;

2 REPLIES 2
Tom
Super User Tom
Super User

So it sounds like you have a data where the content of the first field is a JSON object.  So perhaps something like this:

Obs                        json                        height    weight

 1     { "Name": "Alfred", "Sex": "M", "Age": 14 }      69.0      112.5
 2     { "Name": "Alice", "Sex": "F", "Age": 13 }       56.5       84.0
 3     { "Name": "Barbara", "Sex": "F", "Age": 13 }     65.3       98.0
 4     { "Name": "Carol", "Sex": "F", "Age": 14 }       62.8      102.5
 5     { "Name": "Henry", "Sex": "M", "Age": 14 }       63.5      102.5

And then someone wrote that into a CSV file and gave it to you.

Do you know if the "CSV" file has been created as a proper CSV file with the JSON quoted and embedded quotes doubled? Like this:

json,height,weight
"{ ""Name"": ""Alfred"", ""Sex"": ""M"", ""Age"": 14 }",69,112.5
"{ ""Name"": ""Alice"", ""Sex"": ""F"", ""Age"": 13 }",56.5,84
"{ ""Name"": ""Barbara"", ""Sex"": ""F"", ""Age"": 13 }",65.3,98
"{ ""Name"": ""Carol"", ""Sex"": ""F"", ""Age"": 14 }",62.8,102.5
"{ ""Name"": ""Henry"", ""Sex"": ""M"", ""Age"": 14 }",63.5,102.5

Or without the extra protections around the quotes in the JSON text? Like this:

{ "Name": "Alfred", "Sex": "M", "Age": 14 },69,112.5
{ "Name": "Alice", "Sex": "F", "Age": 13 },56.5,84
{ "Name": "Barbara", "Sex": "F", "Age": 13 },65.3,98
{ "Name": "Carol", "Sex": "F", "Age": 14 },62.8,102.5
{ "Name": "Henry", "Sex": "M", "Age": 14 },63.5,102.5

If you have the later then you can read the first column character by character and write it to a JSON text file. Then read the other fields on the line.  So something like this will work with my little example.

filename json temp;
data csv_fields ;
  infile csv dsd truncover lrecl=1000000 firstobs=1 column=cc length=ll end=eof;
  file json lrecl=1000000;
  if _n_=1 then put '[' @;
  else if eof then put ']';
  else put ',' @;
  found=0;
  do until(found or cc>ll);
    input ch $char1. @;
    q=sum(q,ch='{',-(ch='}'));
    if ch=',' and q=0 then found=1;
    else put ch $char1. @;
  end;
  put ;
  input height weight ;
  keep height weight ;
run;

And generate a JSON file that looks like this:

[{ "Name": "Alfred", "Sex": "M", "Age": 14 }
,{ "Name": "Alice", "Sex": "F", "Age": 13 }
,{ "Name": "Barbara", "Sex": "F", "Age": 13 }
,{ "Name": "Carol", "Sex": "F", "Age": 14 }
,{ "Name": "Henry", "Sex": "M", "Age": 14 }
]

Which you could then easily read with the JSON libname engine.  Then you can combine the data from the JSON file with the fields you read from the CSV file.  So something like this:

libname json json ;
data want;
  set json.root;
  set csv_fields;
run;

Results:

       ordinal_
Obs      root       Name      Sex    Age    height    weight

 1         1       Alfred      M      14     69.0      112.5
 2         2       Alice       F      13     56.5       84.0
 3         3       Barbara     F      13     65.3       98.0
 4         4       Carol       F      14     62.8      102.5
 5         5       Henry       M      14     63.5      102.5

 If the CSV file has the extra quotes you will want to do something to remove those. You might pre-process the file first to remove the extra quotes.  Or doing during the step that writes the JSON file.  Or post process the JSON file to remove the extra quotes.

 

If the JSON object text is more complicated than my example the step I wrote to detect the end of the JSON part of the line might not work right. So you might have to adjust that.

 

Also if the JSON object is more complex then the process of matching it back up to the other fields in the CSV file might be harder than my simple example.

Ksharp
Super User
Could make it vertically , and process it later.

data test;
infile "\test.csv" recfm=n ;
input temp $1. @@ ;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 992 views
  • 4 likes
  • 3 in conversation