Why using put _infile_ returns different results


Hello everyone. I have a txt file that I am trying to read in, remove some specific text from, and then write out as a new file.

This works perfectly with the following code.

/*works*/
data _null_;
infile "C:\Desktop\testfile.txt" end=eof lrecl=1000000;
file "C:\Desktop\code1_testfile.txt" lrecl=32000;
input;
newvar=transtrn(_infile_,' string to remove',trimn(' '));
put newvar;
run;

However, if I don't declare a new variable and instead try to overwrite _infile_ and then put it back out, it doesn't work. So this code doesn't work.

/*Doesn't work*/
data chatthelp1;
infile "C:\Desktop\testfile.txt" end=eof lrecl=1000000;
file "C:\Desktop\code2_testfile.txt" lrecl=32000;
input;
_infile_=transtrn(_infile_,' string to remove',trimn(' '));
put _infile_;
run;

Can anyone explain to me why this is, or point me to somewhere I can find out why?

In addition, both pieces of code are adding a new line to the end of the txt file... I am not sure why, but there is now an additional line of data. Can anyone help me figure out why that is?

The txt file is attached.


Accepted Solutions
Solution
‎10-25-2014 09:40 AM

Re: Why using put _infile_ returns different results

I needed to make some changes before I could run it in UE. That long buffer length is a new feature (Windows, Unix).

It is not the same limit as the maximum variable length. It could be that this special case is causing problems for functions that replace data in a variable.

To check that, you could use the modified program below. When I changed the LRECL back to a long one, I got:

ERROR: The LRECL / LINESIZE for infile TEST("testfile.txt") exceeds the maximum allowable length for an _INFILE_ or _INFILE_=
       variable (32,767). The DATA STEP will not be executed.

I ran this in UE (the Unix version) and both runs delivered identical output, so replacing (shortening) with that function works there.

I had some trouble recognizing your error. I see in your result (code2) that the string is removed and what follows is shifted to the right. The error is that the length is not adjusted to the smaller length.

I would avoid that kind of construction, replacing data in a field that is also being read, unless you are sure it is going to work.


filename test "/folders/myfolders/test"; 
/* https://communities.sas.com/message/234512#234512 */

/*works*/
data _null_;
infile test("testfile.txt") end=eof lrecl=32767;
file test("code1_testfile.txt") lrecl=32767;
input;
newvar=transtrn(_infile_,' string to remove',trimn(' '));
put newvar;
run;

/*Doesn't work at anotherdreams, does work at UE*/
data chatthelp1;
infile test("testfile.txt") end=eof lrecl=32767;
file test("code2_testfile.txt") lrecl=32767;
input;
_infile_=transtrn(_infile_,' string to remove',trimn(' '));
put _infile_;
run;

---->-- ja karman --<-----



All Replies

Re: Why using put _infile_ returns different results

newvar1=tranwrd(_infile_,'string to remove"','');

Do you see anything fishy above?

filename FT15F001 temp;
data _null_;
   infile FT15F001;
   file log;
   input;
   _infile_=transtrn(_infile_,' string to remove',trimn(' '));
   put _infile_;
   parmcards4;
<?xml version="1.0" encoding="UTF-8"?>
<fake data>this is the string where string to remove is located</fake data>
;;;;
   run;

Re: Why using put _infile_ returns different results

Hello Data Null. Sorry, that was a typo in my code copy; there are no double quotes in my tranwrd. I have rewritten my programs to match yours more closely, and the results still do not match. Hopefully the rewrite will clear up the double-quote issue and the use of "tranwrd" instead of "transtrn".

For your reference I have included the starting txt file, the Code 1 result, and the Code 2 result so you can see what it's doing.

Please also notice that both code 1 and code 2 add an extra "line" of empty space to the end of the txt file (it looks like they add an extra CRLF). I would also like to do away with that, but am not quite sure how.

/*works*/
data _null_;
infile "C:\Desktop\testfile.txt" end=eof lrecl=1000000;
file "C:\Desktop\code1_testfile.txt" lrecl=32000;
input;
newvar=transtrn(_infile_,' string to remove',trimn(' '));
put newvar;
run;

/*Doesn't work*/
data chatthelp1;
infile "C:\Desktop\testfile.txt" end=eof lrecl=1000000;
file "C:\Desktop\code2_testfile.txt" lrecl=32000;
input;
_infile_=transtrn(_infile_,' string to remove',trimn(' '));
put _infile_;
run;

Thanks again!


Re: Why using put _infile_ returns different results

It looks to me like neither should work as the length of the input lines is larger than the maximum size for a character variable.

How is it NOT working?  Perhaps SAS does not know the new length of the _INFILE_ buffer?

Also in the first one the PUT statement is going to "eat" any leading spaces unless you use a format like $VARYING.
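The point about leading spaces can be sketched as follows. This is only a minimal sketch, assuming every record fits in 32,767 bytes (so that _INFILE_ can be used as a variable); the helper variable len is hypothetical and does not appear in the original programs. It writes newvar with the $VARYING format so that list output does not left-align the value.

```sas
data _null_;
  /* LRECL kept at 32767 so that _INFILE_ can be used as a variable */
  infile "C:\Desktop\testfile.txt" lrecl=32767;
  file "C:\Desktop\code1_testfile.txt" lrecl=32767;
  input;
  length newvar $32767;
  newvar = transtrn(_infile_, ' string to remove', trimn(' '));
  /* len (hypothetical name): number of characters to write, */
  /* not counting trailing blanks                             */
  len = lengthn(newvar);
  /* $VARYING writes exactly len characters, keeping leading blanks */
  put newvar $varying32767. len;
run;
```

Because len comes from LENGTHN, any trailing blanks in the original record are dropped, and an empty record is written as an empty line.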


Re: Why using put _infile_ returns different results

Yes, that's right: that INFILE LRECL in the OP's program doesn't work. I forgot to mention that because my mom was calling me for supper.


Re: Why using put _infile_ returns different results

You should be seeing the ERROR below.

52   data _null_;
53       infile FT88F001 end=eof lrecl=1000000;
54       file   FT89F001 lrecl=32000;
55       input;
56       newvar=transtrn(_infile_,' string to remove',trimn(' '));
57       put newvar;
58       run;

NOTE: The infile FT88F001 is:
      (system-specific pathname),
      (system-specific file attributes)

ERROR: The LRECL / LINESIZE for infile FT88F001 exceeds the maximum allowable length for an
       _INFILE_ or _INFILE_= variable (32,767). The DATA STEP will not be executed.

NOTE: The file FT89F001 is:
      (system-specific pathname),
      (system-specific file attributes)

NOTE: 0 records were written to the file (system-specific pathname).
NOTE: The SAS System stopped processing this step because of errors.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds


Re: Why using put _infile_ returns different results

Hey everyone.

Master, as always, thanks for your help. I actually think the LRECL is okay since I work on Windows, where you can have any LRECL up to (1 trillion?). Variables can still only be 32,767 long, but the line length itself is limited almost only by your memory. Correct me if I am wrong, but I do not get that error, and I can actually read in files whose lines are longer than 32,767.

Thanks for your advice, Jaap. I guess changing the variable you are reading in can cause problems, and I actually don't mind just using the first method I posted.

Question Remaining:

Can anyone tell me how to remove the final CRLF that is added when you output to a txt file using WPS? I would like the file to retain its normal structure, but with the last line of the file holding actual output instead of one null line of data. I need this because I am using SAS XML input, and it fails if the xml has anything after the closing tags (including blank space).

Thanks so much
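One possible approach to the trailing CRLF, offered only as a hedged sketch: in SAS, FILE with RECFM=N writes a byte stream without adding line terminators, so the program can write a CRLF before every record except the first and leave the last record unterminated. This assumes Windows-style CRLF terminators and records of at most 32,767 bytes; the helper variable len is hypothetical, and whether WPS handles RECFM=N the same way has not been checked here.

```sas
data _null_;
  infile "C:\Desktop\testfile.txt" lrecl=32767;
  /* RECFM=N: stream output, PUT adds no line terminators itself */
  file "C:\Desktop\code1_testfile.txt" recfm=n lrecl=32767;
  input;
  length newvar $32767;
  newvar = transtrn(_infile_, ' string to remove', trimn(' '));
  /* len (hypothetical name): characters to write, trailing blanks excluded */
  len = lengthn(newvar);
  /* write the CRLF before each record except the first, */
  /* so nothing follows the last record                  */
  if _n_ > 1 then put '0D0A'x;
  put newvar $varying32767. len;
run;
```

The design choice is to put the terminator in front of each record rather than behind it, which avoids needing to know in advance which record is the last one.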


Re: Why using put _infile_ returns different results

I ran that program using SAS 9.4 for Windows. I know that you can have an LRECL much larger than 32767; you just can't use the _INFILE_ automatic variable, as explained in the ERROR message in my post. Not to be confused with the PUT statement directive PUT _INFILE_;

You need to show your SASLOG where you use LRECL GT 32767 and you use _INFILE_ as a variable.


Re: Why using put _infile_ returns different results

Ohhhh I see what you are saying.  Sorry @Data_Null_  I actually wasn't aware that the automatic variable _infile_ wasn't the same thing as the LRECL by definition so I figured that having the value set differently wouldn't matter. (I'm very new to using the _infile_ variable).


My log doesn't actually throw any errors.  I get the following log when running the second piece of code.

NOTE: 2 records were read from file 'C:\Users\bsharbo\Desktop\testfile.txt'
      The minimum record length was 38
      The maximum record length was 75
NOTE: 2 records were written to file 'C:\Users\bsharbo\Desktop\code2_testfile.txt'
      The minimum record length was 38
      The maximum record length was 75
NOTE: Data set "WORK.chatthelp1" has 2 observation(s) and 0 variable(s)
NOTE: The data step took :
      real time : 0.017
      cpu time  : 0.000

In fact, if I change the LRECL to 32000 I get the same error as in my posts above.
