BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AlexMoreton
Obsidian | Level 7

Hi, I am trying to concatenate 6 variables (all strings) while I am exporting it into xml. So basically it was split down into smaller chunks in order for SAS to read them all (one observation gets split into 6 variables but are just the same string split down).

I was able to export the file into XML with the filename X 'X.xml' step. However, I realized that the text string is now broken randomly with carriage rows. So now I have to add a new code to remove them.

I really just need to know if there is a way I can join the variables up that removes all carriage rows as it made about 300 paragraphs and split many words up into unreadable characters.

 

Original code:

 

data MAYBE;
set Work.ALL_THERE;           *ALL_THERE_ contains the 6 variables that makes up the string;
result=catx('', XMLCol1, XMLCol2, XMLCol3, XMLCol4, XMLCol5, XMLCol6);   *the 6 variables;
put result $char.;
run;

filename alex 'D:\SAS Code\ALEX TESTING\XMLCol_ind.xml';
data ;
file alex;
set ALL_THERE;

put XMLCol1 XMLCol2 XMLCol3 XMLCol4 XMLCol5 XMLCol6;

run;

 

Code now:

 

filename alex 'D:\SAS Code\ALEX TESTING\XMLCol_ind.xml';

data ;
file alex;
*alex= TRANWRD(alex,'0D0A'x,'');      *this is to remove the carriage rows once it is combined.;
set ALL_THERE;                                  *this is the data that contains the 6 variables that I want to concatenate;


alex= cats(XMLCol1, XMLCol2, XMLCol3, XMLCol4, XMLCol5, XMLCol6);   *the 6 variables to concatenate;

result=cats('',of XMLCol1-XMLCol6);                         *to see if it is concatenated;
put result=;

put _all_;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@AlexMoreton wrote:

Hi Ballard,

Thank you! I'm sure the code works but unfortunately, the total number of characters in the alex variable will end up being 80700 characters... therefore even though there was no error, no records were written into alex. I think it's because it is too large.

therefore basically i can only try and write a code for SAS to just combine the variables and not have any carriage rows

because when I copy the variables (texts) by themselves and paste it to word, none of them had carriage rows. So it must be something to do with the export function.

Is there any ways to overcome this? Cos I need to combine them together into a xml file (i managed to do so but have LOTS of carriage rows) but the total number of characters is too large for SAS (33000 or so is the limit).


What are the lengths of the existing variables you are attempting to concatenate? The maximum length of a character variable is 32767 currently. I couldn't really tell which was your final output attempt and I don't attempt to write raw XML from code (no need in my current position) but neither attempt that I see looks to be outputting any XML codes. All I see is plain text written to a text file that happens to have an XML extension.

 

if you want all of that text in a single cell of an XML file you will likely have to write xml generation directly such as:

filename alex 'D:\SAS Code\ALEX TESTING\XMLCol_ind.xml';

data MAYBE;
   set Work.ALL_THERE;        
   file alex lrecl=90000;
   array x  XMLCol1 XMLCol2 XMLCol3 XMLCol4 XMLCol5 XMLCol6;   
   do i=1 to dim(x);
      x[i] = translate(x[i],' ','ODOA'X);
   end;
   put '<Row >';
   put '<Cell <Data ss:Type="String">'XMLCol1 XMLCol2 XMLCol3 XMLCol4 XMLCol5 XMLCol6'</Data></Cell>';
   put '</Row>';
run; 

Note the LRECL to handle longer lines, otherwise the system likely defaults to 32K and won't hold a line as long as you want.

 

If you are actually wanting to place each of those variables as a separate column in the file then perhaps:

data MAYBE;
   set Work.ALL_THERE;        
   file alex lrecl=90000;
   array x  XMLCol1 XMLCol2 XMLCol3 XMLCol4 XMLCol5 XMLCol6;   
   do i=1 to dim(x);
      x[i] = translate(x[i],' ','ODOA'X);
   end;
run; 

ods tagsets.excelxp file="D:\SAS Code\ALEX TESTING\XMLCol_ind.xml";
proc print data=MAYBE;
var x:;
run;
ods tagsets.excelxp close;

 

View solution in original post

7 REPLIES 7
ballardw
Super User

TRANWRD would not be my choice of the function as the carriage return is unlikely to be treated as a "word" consistently if at all. TRANSLATE would be a better function to use on the concatenated variable as it will replace every occurrence of the character.

 

Perhaps:

 alex=translate( cats(XMLCol1, XMLCol2, XMLCol3, XMLCol4, XMLCol5, XMLCol6) ,' ','ODOA'X);

Though you may also need to set a length for the variable alex prior to the concatenation to ensure it is long enough to hold the result.

 

AlexMoreton
Obsidian | Level 7

Hi Ballard,

Thank you! I'm sure the code works but unfortunately, the total number of characters in the alex variable will end up being 80700 characters... therefore even though there was no error, no records were written into alex. I think it's because it is too large.

therefore basically i can only try and write a code for SAS to just combine the variables and not have any carriage rows

because when I copy the variables (texts) by themselves and paste it to word, none of them had carriage rows. So it must be something to do with the export function.

Is there any ways to overcome this? Cos I need to combine them together into a xml file (i managed to do so but have LOTS of carriage rows) but the total number of characters is too large for SAS (33000 or so is the limit).

ballardw
Super User

@AlexMoreton wrote:

Hi Ballard,

Thank you! I'm sure the code works but unfortunately, the total number of characters in the alex variable will end up being 80700 characters... therefore even though there was no error, no records were written into alex. I think it's because it is too large.

therefore basically i can only try and write a code for SAS to just combine the variables and not have any carriage rows

because when I copy the variables (texts) by themselves and paste it to word, none of them had carriage rows. So it must be something to do with the export function.

Is there any ways to overcome this? Cos I need to combine them together into a xml file (i managed to do so but have LOTS of carriage rows) but the total number of characters is too large for SAS (33000 or so is the limit).


What are the lengths of the existing variables you are attempting to concatenate? The maximum length of a character variable is 32767 currently. I couldn't really tell which was your final output attempt and I don't attempt to write raw XML from code (no need in my current position) but neither attempt that I see looks to be outputting any XML codes. All I see is plain text written to a text file that happens to have an XML extension.

 

if you want all of that text in a single cell of an XML file you will likely have to write xml generation directly such as:

filename alex 'D:\SAS Code\ALEX TESTING\XMLCol_ind.xml';

data MAYBE;
   set Work.ALL_THERE;        
   file alex lrecl=90000;
   array x  XMLCol1 XMLCol2 XMLCol3 XMLCol4 XMLCol5 XMLCol6;   
   do i=1 to dim(x);
      x[i] = translate(x[i],' ','ODOA'X);
   end;
   put '<Row >';
   put '<Cell <Data ss:Type="String">'XMLCol1 XMLCol2 XMLCol3 XMLCol4 XMLCol5 XMLCol6'</Data></Cell>';
   put '</Row>';
run; 

Note the LRECL to handle longer lines, otherwise the system likely defaults to 32K and won't hold a line as long as you want.

 

If you are actually wanting to place each of those variables as a separate column in the file then perhaps:

data MAYBE;
   set Work.ALL_THERE;        
   file alex lrecl=90000;
   array x  XMLCol1 XMLCol2 XMLCol3 XMLCol4 XMLCol5 XMLCol6;   
   do i=1 to dim(x);
      x[i] = translate(x[i],' ','ODOA'X);
   end;
run; 

ods tagsets.excelxp file="D:\SAS Code\ALEX TESTING\XMLCol_ind.xml";
proc print data=MAYBE;
var x:;
run;
ods tagsets.excelxp close;

 

AlexMoreton
Obsidian | Level 7

Ballard thank you so much!

The first code works perfectly!

Thank you very much!

One quick question, after viewing the new xml file, I realized that the code also generate some extra stuff.

 

<Row >
<Cell <Data ss:Type="String">

 

</Data></Cell>
</Row>

 

Is there any way to remove them within the code you have provided or should I just remove them through another code or manually?

Thank you for everything you've done so far!!

ballardw
Super User

@AlexMoreton wrote:

Ballard thank you so much!

The first code works perfectly!

Thank you very much!

One quick question, after viewing the new xml file, I realized that the code also generate some extra stuff.

 

<Row >
<Cell <Data ss:Type="String">

 

</Data></Cell>
</Row>

 

Is there any way to remove them within the code you have provided or should I just remove them through another code or manually?

Thank you for everything you've done so far!!


I think you have to decide if you are creating XML or text. The bits inside the <> tell XML parsers how to display values. The "String" says the value is text, not numeric. Which likely isn't a problem in your case but if your value were an account number like "001233344111" you need to know that it is string to prevent things like the leading zeroes not displaying.

 

XML has layout rules. To some extent the position of things like CELL within a row whether

<row>

<cell> </cell>

<cell> </cell>

<cell> </cell>

</row>

or

<row> <cell> </cell><cell> </cell><cell> </cell></row>

 are style choices but the <row> indicates a new row, <cell> is a new cell (or column in a row) and <data> is the start of actual data. the </ are end of row, cell or data.

If you do not want any of those then you are creating a text file, not xml.

Tom
Super User Tom
Super User

Sounds like you already have text split into multiple variable and you just want to write it a text file.

data _null_;
   file 'D:\SAS Code\ALEX TESTING\XMLCol_ind.xml' lrecl=100000 ;
   set ALL_THERE ; 
   put XMLCol1 - XMLCol6;
run;

If you want to append the new lines to the end of an existing text file then add the MOD option to the FILE statement.

 

Now the simple PUT statement above will probably remove any leading/trailing blanks from your character strings.  If you want to include them you will need to add formats to your PUT statement.  If each variable is defined as length 10000 then you could use something like:

   put (XMLCol1 - XMLCol6) ($char10000.) ;

and SAS will write exactly 60,000 bytes.

 

I suspect this is related to earlier question about pulling long strings from an external database.  If you also included the length of the original string then you could avoid adding extra spaces to the ends of the lines.  So if had a variable called LENGTH with the length of the original string and each of the 6 variables was defined with length 10,000 then your data step might look more like this.

data _null_;
   file 'D:\SAS Code\ALEX TESTING\XMLCol_ind.xml' lrecl=100000 ;
   set ALL_THERE ; 
   array x XMLCol1 - XMLCol6 ;
   do i=1 to dim(x);
    nchar = max(0,min(length-(i-1)*10000,10000)) ;
    put x(i) $varying10000. nchar @;
  end;
  put;
run;
AlexMoreton
Obsidian | Level 7

Hi Tom,

Thank you very much! Did the trick!

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1850 views
  • 1 like
  • 3 in conversation