rudfaden
Lapis Lazuli | Level 10

I have an XML file with a variable that runs to more than 100,000 bytes. I would like it to overflow into several variables. Any ideas on how to do that?

7 REPLIES
rhaley1821
Obsidian | Level 7

If you are able to import the variable into SAS, I would suggest using the SCAN function to break it up.

 

Here is a similar post with some great examples: 

https://stackoverflow.com/questions/47693983/spliting-a-variable-into-multiple-in-sas
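
As a rough sketch of the SCAN idea, assuming the value has already been read into a SAS character variable (which can hold at most 32,767 bytes, so a 100,000-byte value would not fit in one piece). The names `pieces`, `long_text`, and `word` are made up for illustration:

```sas
data pieces;
  length long_text $200 word $50;
  long_text = "alpha beta gamma";
  /* COUNTW returns the number of blank-delimited words */
  do i = 1 to countw(long_text, " ");
    word = scan(long_text, i, " ");  /* extract the i-th word */
    output;                          /* one observation per word */
  end;
  drop long_text;
run;
```

This splits on a delimiter rather than at a fixed width, so it only helps once the text is inside SAS.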

rudfaden
Lapis Lazuli | Level 10

The problem is how to import such a long variable in the first place.

Kurt_Bremser
Super User

You will have to dissect that data on your own. How you can do that depends on the layout of the file: whether the XML variable sits on its own single line, in a separate block, or somewhere within an unbroken stream of data.

rudfaden
Lapis Lazuli | Level 10

Hi Kurt.

 

Thanks for your comment. I have given an example of the data structure below. It is this part

 

<TextContent>
very long text...
</TextContent>

 

that I want to import. The text is a resume in plain text.

 

<CandidateList xmlns="http://schemas.hr-manager.net/restful/2.0/" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<DocumentList>
<Document>
<Id>13289823</Id>
<Name>CV name</Name>
<Extension>pdf</Extension>
<ByteCount>186787</ByteCount>
<DownloadUrl>
https://cdn-recruiter.hr-manager.net/Export/Attachments/ViewDocument.aspx?cid=342&q=KQAAWqKhQYsY6i8aT%2byG16dhiBatjCa7wqDpBB%2flEupzmp3BQQ63DZ41fcyCwWN0
</DownloadUrl>
<Bytes i:nil="true"/>
<Type>Cv</Type>
<TextContent>
very long text...
</TextContent>
</Document>
</DocumentList>
</CandidateList>
Kurt_Bremser
Super User

Here is a very simplified example of how I would tackle such an issue:

%let len=20;

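data test;
infile datalines truncover;
retain id; /* carry the id forward to every chunk */
length id 8 text $&len.;
input tag $30.; /* read one line and treat it as a tag */
select (tag);
  when ("<id>") do;
    input id; /* the id value is on the next line */
    input;    /* skip the closing </id> line */
  end;
  when ('<textcontent>') do;
    input text $&len..@; /* read the first chunk and hold the line */
    do until (text = '</textcontent>');
      output; /* one observation per chunk */
      input text $&len..@;
      if lengthn(text) = 0 then do; /* line exhausted: go to the next line */
        input;
        input text $&len..@;
      end; 
    end;
    input; /* release the line holding the closing tag */
  end;
  otherwise; /* ignore any other line instead of raising an error */
end;
drop tag;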
datalines;
<id>
1
</id>
<textcontent>
xxxxxxxxxxxxxxxxxxxxyyyyyyyyyyyyyyyyyyyzzzzzzzzzzzzzz
</textcontent>
;
rudfaden
Lapis Lazuli | Level 10

Hi Kurt.

 

Thanks for your reply. I forgot to add that the data in textContent spans multiple lines.

 

<id>
1
</id>
<textcontent>CV
Hi my name is John.

I like to work with sas data.
You can call me on my phone 445-334-566

See you.

John
</textcontent>
Kurt_Bremser
Super User

I have updated the code so that it deals correctly with data that follows the tag on the same line.

%let len=20;

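data test;
infile datalines truncover;
retain id; /* carry the id forward to every chunk */
length id 8 text $&len.;
input line $30.;
tag = scan(line,1,">") !! ">"; /* the tag itself, e.g. <textcontent> */
line = scan(line,2,">"); /* any text following the tag on the same line */
select (tag);
  when ("<id>") do;
    input id; /* the id value is on the next line */
    input;    /* skip the closing </id> line */
  end;
  when ('<textcontent>') do;
    if lengthn(line) > 0
    then do;
      text = line; /* text on the tag line becomes its own chunk */
      output;
    end;
    input text $&len..@; /* read the first chunk and hold the line */
    do until (text = '</textcontent>');
      output; /* one observation per chunk */
      input text $&len..@;
      if lengthn(text) = 0 then do; /* line exhausted: go to the next line */
        input;
        input text $&len..@;
      end; 
    end;
    input; /* release the line holding the closing tag */
  end;
  otherwise; /* ignore any other line instead of raising an error */
end;
drop tag line;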
datalines;
<id>
1
</id>
<textcontent>CV
Hi my name is John.

I like to work with sas data.
You can call me on my phone 445-334-566

See you.

John
</textcontent>
;
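
The step above writes one observation per chunk. To get back to the several variables asked for in the original question, one option is to number the chunks and transpose them. This is only a sketch: it assumes the TEST data set created above, and `chunks`, `wide`, and `seq` are hypothetical names.

```sas
/* Number the chunks within each id (seq is a hypothetical helper variable) */
data chunks;
  set test;
  by id notsorted;            /* ids arrive in file order, not necessarily sorted */
  if first.id then seq = 0;
  seq + 1;                    /* sum statement: seq is retained automatically */
run;

/* One row per id, with the chunks in text1, text2, ... */
proc transpose data=chunks out=wide(drop=_name_) prefix=text;
  by id notsorted;
  id seq;
  var text;
run;
```

With `len` set to 20 this yields many short variables. In practice `len` could be raised toward the 32,767-byte limit of a character variable so that fewer variables are needed.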
