learn_SAS_23
Pyrite | Level 9

Hello Team,

I am trying to read the Out_test.txt file below and extract the highlighted text from it with the code below.

Out_test.txt : 

 

<?xml version="1.0" encoding="utf-8"?>
<entry xml:base="http://Test.sas.api"
xmlns="http://www.w3.org/2005/Atom"
xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
xmlns:georss="http://www.georss.org/georss"
xmlns:gml="http://www.opengis.net/gml">
<id>//Test.sas.api('/iapps/SAR/files/163... new test file €.log')</id>
<category term="SP.File" scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" />
<link rel="edit" href="Web/GetFileByServerRelativeUrl('/iapps/SAR/files/mynewfile%20%20new%20test%20file%20%E2%82%AC.log')" />
<link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/Author" type="application/atom+xml;type=entry" title="Author" href="Web/GetFileByServerRelativeUrl('/iapps/SAR/files/1632849929_mynewfile%20%20new%20test%20file%20%E2%82%AC.log')/Auth
<link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/CheckedOutByUser" type="application/atom+xml;type=entry" title="CheckedOutByUser" href="Web/GetFileByServerRelativeUrl('/iapps/SAR/files/mynewfile%20%20new%20test%20file%20%E2%82%AC.l
<title />
<updated>2021-09-28T17:32:50Z</updated>
<author>
<name />
</author>
<content type="application/xml">
<m:properties>
<d:CheckInComment></d:CheckInComment>
<d:CheckOutType >2</d:CheckOutType>
<d:UIVersionLabel>1.0</d:UIVersionLabel>
</m:properties>
</content>
</entry>

 

code : 

data Test_out_href;
length text $32767;
infile "/home/out_test.txt";
input;
text=scan(substr(_infile_, index(_infile_,"href")+37),1,"')");
call symputx("ServerRelativeUrl_new", text);
run;
%put this is to test &ServerRelativeUrl_new.

Log : 

 

NOTE: 1 record was read from the infile "/home/out_test.txt".
The minimum record length was 3511.
The maximum record length was 3511.
NOTE: The data set WORK.TEST_OUT_HREF has 1 observations and 1 variables.
NOTE: Compressing data set WORK.TEST_OUT_HREF increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.01 seconds
system cpu time 0.00 seconds
memory 1071.40k
OS Memory 24492.00k
Timestamp 09/28/2021 10:22:44 PM
Step Count 55 Switch Count 2
Page Faults 0
Page Reclaims 165
Page Swaps 0
2 The SAS System Tuesday, September 28, 2021 07:05:00 PM

Voluntary Context Switches 23
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 528

33 %put this is to test &ServerRelativeUrl_new.
34
35
36 GOPTIONS NOACCESSIBLE;
WARNING: Apparent invocation of macro E2 not resolved.
WARNING: Apparent invocation of macro AC not resolved.
this is to test /iapps/SAR/files/mynewfile%20%20new%20test%20file%20%E2%82%AC.log.log

 

An extra ".log" is appended to the value when the macro variable is printed. What is the reason for this?

In my code, I am trying to find the first occurrence of href in the file and take the text that starts 37 characters after it, up to the closing quote and parenthesis.


text=scan(substr(_infile_, index(_infile_,"href")+37),1,"')");

 

ACCEPTED SOLUTION

Tom
Super User

I think the extra extension is a bug in the error message.

But you need to add some macro quoting because of the strange text you are including in the value. Otherwise things like %AC are going to look like a macro call to the macro processor.

%let ServerRelativeUrl_new=%superq(ServerRelativeUrl_new);

Example:

188   data _null_;
189     text='/iapps/SAR/files/Mytestfile_%20%20new%20test%20file%20%E2%82%AC.txt';
190     call symputx("ServerRelativeUrl_new",text);
191   run;

NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


192
WARNING: Apparent invocation of macro E2 not resolved.
193   %put &=ServerRelativeUrl_new;
WARNING: Apparent invocation of macro AC not resolved.
SERVERRELATIVEURL_NEW=/iapps/SAR/files/Mytestfile_%20%20new%20test%20file%20%E2%82%AC.txt.txt
194   %let ServerRelativeUrl_new=%superq(ServerRelativeUrl_new);
195   %put &=ServerRelativeUrl_new;
SERVERRELATIVEURL_NEW=/iapps/SAR/files/Mytestfile_%20%20new%20test%20file%20%E2%82%AC.txt

 

 

Do you need to convert those %20 things back into the characters they mean?

196   data _null_;
197     text='/iapps/SAR/files/Mytestfile_%20%20new%20test%20file%20%E2%82%AC.txt';
198     call symputx("ServerRelativeUrl_new",urldecode(text));
199   run;

NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds


200
201   %put &=ServerRelativeUrl_new;
SERVERRELATIVEURL_NEW=/iapps/SAR/files/Mytestfile_  new test file €.txt
202   %let ServerRelativeUrl_new=%superq(ServerRelativeUrl_new);
203   %put &=ServerRelativeUrl_new;
SERVERRELATIVEURL_NEW=/iapps/SAR/files/Mytestfile_  new test file €.txt

Are you perhaps running using UTF-8?

 

ballardw
Super User

Hint: if you want us to experiment with reading the text, please post it in a text box opened on the forum with the </> icon that appears above the message window. The main message window on this forum reformats pasted text, for example by inserting HTML tags and removing spaces. End-of-line characters may also be altered.

So there is almost no chance that what you posted in your question is actually the same as your source text file.

 

First, I would make sure that HREF is actually in the string before using it for anything. Also, if the input file is treated as having more than one line, you would overwrite your macro variable with a bad value of TEXT: you call SYMPUTX unconditionally, so the value is replaced on every line of input, and what remains is whatever came from the last line, whether or not href is on it. And href does appear more than once in that data.

 

I would troubleshoot this by reading the data in one step and assigning the value of _infile_ to a variable, then using a second data step to parse and extract the desired text. That way I can see what the value of _infile_ actually is when diagnosing issues.

 

I also suggest finding some way to identify the line you want so that the search for "href" only happens where you actually want it to.
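
A minimal sketch of the checks suggested above, assuming the target is the first link whose href starts with Web/GetFileByServerRelativeUrl( as in the posted file. The search string and the LRECL value are assumptions, not something confirmed against the real source file. It tests that the marker is present before parsing, and stops after the first match so that later lines cannot overwrite the macro variable.

```sas
data _null_;
  /* LRECL and TRUNCOVER are assumptions to cope with one very long record */
  infile "/home/out_test.txt" lrecl=32767 truncover;
  length text $32767;
  input;

  /* Look for the specific href we want, not just any "href" */
  pos = index(_infile_, 'href="Web/GetFileByServerRelativeUrl(');

  if pos > 0 then do;
    /* The path is the second token when splitting on single quotes */
    text = scan(substr(_infile_, pos), 2, "'");
    call symputx("ServerRelativeUrl_new", text);
    stop;  /* keep the first match only */
  end;
run;
```

If no line contains the marker, the macro variable is simply not created, which is easier to detect than a silently wrong value.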

 

 

ChrisNZ
Tourmaline | Level 20

Why is there no semicolon after the %put statement?

What's in the data set variable?

Also this is more robust:

TEXT=scan( substr(_infile_, index(_infile_,"href")), 2, "(')" );

learn_SAS_23
Pyrite | Level 9

Thanks, this works the same way.

 

Code : 

 data Test_out;
 length text2 $32767;
 infile "/home/out_test.txt";
  input;
  TEXT2=scan( substr(_infile_, index(_infile_,"href")), 2, "(')" );
   put    'this is to test inside data step ' TEXT2;
   call symputx("ServerRelativeUrl_new", strip(TEXT2));
   run;

   %put this is to test out side  &ServerRelativeUrl_new. ;

Log : 


this is to test inside data step /iapps/SAR/files/Mytestfile_%20%20new%20test%20file%20%E2%82%AC.txt
NOTE: 1 record was read from the infile "/home/b723166/out_test.txt".
The minimum record length was 3511.
The maximum record length was 3511.
NOTE: The data set WORK.TEST_OUT has 1 observations and 1 variables.
NOTE: Compressing data set WORK.TEST_OUT increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 1092.56k
OS Memory 21164.00k
Timestamp 09/29/2021 11:00:56 AM
Step Count 11 Switch Count 2
2 The SAS System Wednesday, September 29, 2021 10:48:00 AM

Page Faults 0
Page Reclaims 181
Page Swaps 0
Voluntary Context Switches 18
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 528

35
36 %put this is to test out side &ServerRelativeUrl_new.
37
38 GOPTIONS NOACCESSIBLE;
WARNING: Apparent invocation of macro E2 not resolved.
WARNING: Apparent invocation of macro AC not resolved.
this is to test out side /iapps/SAR/files/Mytestfile_%20%20new%20test%20file%20%E2%82%AC.txt.txt

The variable inside the data step is fine, but when the macro variable is used outside the data step, extra text is appended, as in /iapps/SAR/files/Mytestfile_%20%20new%20test%20file%20%E2%82%AC.txt.txt

Please suggest how to fix this.

learn_SAS_23
Pyrite | Level 9
This gives the same result:

 data Test_out;
  %let TEXT2='/iapps/SAR/files/mytestfile_%20%20new%20test%20file%20%E2%82%AC.txt';
   put    'this is to test inside data step ' &TEXT2;
   call symputx("ServerRelativeUrl_new2", strip(&TEXT2));
   run;
   %put this is to test out side  &ServerRelativeUrl_new2.

log : 

NOTE: Compression was disabled for data set WORK.TEST_OUT because compression overhead would increase the size of the data set.
this is to test inside data step /iapps/SAR/files/mytestfile_%20%20new%20test%20file%20%E2%82%AC.txt
NOTE: The data set WORK.TEST_OUT has 1 observations and 0 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              394.18k
      OS Memory           20388.00k
      Timestamp           09/29/2021 11:18:11 AM
      Step Count                        16  Switch Count  2
      Page Faults                       0
      Page Reclaims                     55
      Page Swaps                        0
      Voluntary Context Switches        15
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           136
      

31         
32            %put this is to test out side  &ServerRelativeUrl_new2.
33         
34         GOPTIONS NOACCESSIBLE;
WARNING: Apparent invocation of macro E2 not resolved.
WARNING: Apparent invocation of macro AC not resolved.
2                                                          The SAS System                  Wednesday, September 29, 2021 10:48:00 AM

this is to test out side  /iapps/SAR/files/mytestfile_%20%20new%20test%20file%20%E2%82%AC.txt.txt 
ballardw
Super User

Here's a suggestion to test.

 

%put _user_;

%put text is &ServerRelativeUrl_new.      ;
%put;

The _user_ says show all the user defined macro variables. Hopefully you haven't generated many in the current session. Note that the _user_ line does not generate the warning about the macro invocation and does not show the extra .log.

Then the second put does have the warning and shows the extra .log. Think there might be a connection?

 

I think you are running into one of the macro resolution rules involving the dot character. Run this and you can see that the . immediately following an "apparent invocation of macro" duplicates text.

data _null_;
   x='text %be.abc';
   call symputx('xtest',x);
   y='text %be abc';
   call symputx('ytest',y);
run;

%put Xtest is: &xtest.    Ytest is: &ytest. ;

I don't have a suggestion that will work right now if you really want to keep the %E2 and %AC sequences, since this is happening at macro resolution time, not at assignment or creation.

 

 

 

 

learn_SAS_23
Pyrite | Level 9
Thanks for the tips. I agree: in this example too, the macro variable xtest resolves with duplicated text:

Xtest is: text %be.abc.abc

🙂 How can this duplication be fixed?

learn_SAS_23
Pyrite | Level 9

Thanks everyone for the valuable input. With %superq the value works as expected:

   data _null_;
     text='/Eapps/SAR/files/Mytestfile_%20%20new%20test%20file%20%E2%82%AC.txt';
     call symputx("ServerRelativeUrl_new",text);
   run;
  

   %put &ServerRelativeUrl_new  and the other value is %superq(ServerRelativeUrl_new);

 

Kurt_Bremser
Super User

Welcome to the wonderful world of blanks and special characters in filenames.

(sarcasm intended)

 

This causes the XML to use %xx sequences to encode the special characters (hex E282AC is the Euro symbol in UTF-8), which in turn causes a hiccup in the SAS macro processor. The blanks (encoded as %20) do not cause a problem because 20 can't be a macro name.

Note that this only happens in the %PUT, the macro variable is correct:

data Test_out_href;
length text $32767;
input;
text=scan(substr(_infile_, index(_infile_,"href")+37),1,"')");
call symputx("ServerRelativeUrl_new", text);
datalines;
<link rel="edit" href="Web/GetFileByServerRelativeUrl('/iapps/SAR/files/mynewfile%20%20new%20test%20file%20%E2%82%AC.log')" />
;
%put this is to test &ServerRelativeUrl_new.;

data check;
text = symget("ServerRelativeUrl_new");
run;

"log" is not duplicated when the contents of the macro variable are retrieved with SYMGET and the macro processor itself does not need to intervene.

learn_SAS_23
Pyrite | Level 9

Thanks Kurt. I agree that this issue is caused by special characters such as '%' in the value. SYMGET helps in some situations, but we cannot use a function like SYMGET when passing the value as an input parameter to a macro. We would like to use the string with its special characters in several places in the code.

For example, we can't use SYMGET to pass the value to a macro as below, which results in an error:

%Test_macro (url=%sysfunc(symget(ServerRelativeUrl_new))   );

 

%put is just used to test the value of the variable, but we need to use the URL variable in different places. Is there any way to deal with this issue?
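
One way this could be approached, following the %superq suggestion from the accepted solution, is to mask the value at the point of use so that the macro processor never tries to execute the percent sequences. This is only a sketch: the body of Test_macro below is a hypothetical stand-in, since the real macro is not shown in the thread.

```sas
/* Hypothetical stand-in for the real macro; it only echoes its parameter */
%macro Test_macro(url=);
  %put url=&url;
%mend Test_macro;

/* Create the macro variable from a data step, as in the thread */
data _null_;
  text = '/iapps/SAR/files/mynewfile%20%20new%20test%20file%20%E2%82%AC.log';
  call symputx("ServerRelativeUrl_new", text);
run;

/* Pass the value masked by SUPERQ instead of a bare reference */
%Test_macro(url=%superq(ServerRelativeUrl_new));
```

Alternatively, the variable can be re-quoted once with %let ServerRelativeUrl_new=%superq(ServerRelativeUrl_new); as shown in the accepted solution, after which ordinary &ServerRelativeUrl_new references should carry the masking with them.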
