MikeCa
Calcite | Level 5
I'm using the SAS code below to read data from the text file below. When I run the code, the SAS data set "detail" contains only 3 records, yet the log says it read all 7 records from the text file. Where did the other 4 records go? Help is appreciated.

er3detail.txt file contents:
Last Error Message: 23-2: Invalid option name SEX
Last Warning Message: The data set WORK.WORDGENDER may be
Last Dataset Created: WORK.WORDGENDER
Test Line 4
Test Line 5
Test Line 6



SAS Code:
data detail;
length t $ 300;
infile "C:\WUSS14\er3detail.txt";
input t $ 1-300 ;
run;

SAS LOG
91
92 data detail;
93 length t $ 300;
94 infile "C:\WUSS14\er3detail.txt";
95 input t $ 1-300 ;
96 run;

NOTE: The infile "C:\WUSS14\er3detail.txt" is:
Filename=C:\WUSS14\er3detail.txt,
RECFM=V,LRECL=256,File Size (bytes)=295,
Last Modified=12Nov2010:08:05:44,
Create Time=09Nov2010:19:22:20

NOTE: 7 records were read from the infile "C:\WUSS14\er3detail.txt".
The minimum record length was 10.
The maximum record length was 140.
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.DETAIL has 3 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
Cynthia_sas
SAS Super FREQ
Hi:
When I find that I have variable length records (as indicated by this note in the log):
[pre]
The minimum record length was 10.
The maximum record length was 140.
[/pre]

I turn to the $VARYING. informat to read my variable-length text lines. The program below worked for me when I saved the file to c:\temp\er3detail.txt (I removed a blank line from the bottom of the file, so I only have 6 lines).

cynthia
[pre]
1662 data detail;
1663 length t $ 300;
1664 infile "C:\temp\er3detail.txt" length=lg;
1665 input t $varying. lg;
1666 run;

NOTE: The infile "C:\temp\er3detail.txt" is:
Filename=C:\temp\er3detail.txt,
RECFM=V,LRECL=256,File Size (bytes)=189,
Last Modified=12Nov2010:08:39:20,
Create Time=12Nov2010:08:39:18

NOTE: 6 records were read from the infile "C:\temp\er3detail.txt".
The minimum record length was 11.
The maximum record length was 58.
NOTE: The data set WORK.DETAIL has 6 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 1.84 seconds
cpu time 0.00 seconds
[/pre]
MikeCa
Calcite | Level 5
Awesome! You made my day! Thanks
Patrick
Opal | Level 21
Wouldn't a simple "truncover" in the infile statement do the same trick without having to change informats?
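
For reference, a minimal sketch of what that would look like with the code from the original post (same path assumed). With the default FLOWOVER behavior, an INPUT that asks for columns 1-300 on a shorter record moves on to the next record to find the value, so each observation uses up two records. That is consistent with 7 records producing only 3 observations. TRUNCOVER tells SAS to take whatever is on the current record instead:

```sas
data detail;
  length t $ 300;
  /* TRUNCOVER: a short record is read as-is instead of flowing over to the next line */
  infile "C:\WUSS14\er3detail.txt" truncover;
  input t $ 1-300;
run;
```

Note that blank lines in the file would still come through as observations with a blank T.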
Cynthia_sas
SAS Super FREQ
Patrick:
Yes, you're right. I know there are instances where TRUNCOVER would work just as well.

I happen to be a fan of $VARYING. for reading free-format text fields. I use the LG variable to get rid of "empty" lines, and with files in some formats (such as when I'm scraping HTML or scraping the LOG), I can use LG (along with other criteria) to filter out some observations from being read or being written to the output file.
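
A minimal sketch of that idea, assuming the same file as in the earlier reply. The DELETE test on LG is only an illustration of the kind of filter described, not necessarily the exact one used in practice:

```sas
data detail;
  length t $ 300;
  /* LENGTH= stores the length of the current record in LG */
  infile "C:\temp\er3detail.txt" length=lg;
  /* Read exactly LG characters; a zero-length record leaves T blank */
  input t $varying300. lg;
  /* Drop empty lines; other criteria on LG or T could be added here */
  if lg = 0 then delete;
run;
```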

This just looked like text scraping to me and so I picked $VARYING.

cynthia
Patrick
Opal | Level 21
Hi Cynthia

Thanks for your explanation.

Just to insist a bit more:
" there are instances where TRUNCOVER would work just as well"

In which instances wouldn't it work? I can't think of any?

Thanks
Patrick
Cynthia_sas
SAS Super FREQ
Patrick:
I am a "never say never" and "never say always" kind of girl. When I am text scraping I use $VARYING. You may use TRUNCOVER. This is "probably" an instance where either would work for the data that was posted. However, I remain a firm fan of $VARYING.

As for your question...here's an instance:
http://support.sas.com/kb/5/411.html

And I suspect I stopped using TRUNCOVER because of this or similar behavior in earlier versions of SAS:
http://support.sas.com/kb/32/889.html

But as I said, I almost always use $VARYING. for text scraping. TRUNCOVER would not have been my suggestion unless there were more variables being read in the INPUT statement than just a possible 300 character line of text.

Just a preference. Do whatever you prefer. There's usually (note the "usually") more than one way to accomplish the same task with SAS.

cynthia
Patrick
Opal | Level 21
Cynthia
Learnt something new! Thanks!
I just knew that it's worth bothering you a bit more 🙂
Patrick
Abby19
Fluorite | Level 6
Hi Cynthia,

What if my text file contains numeric values? I can't find a numeric counterpart to the $VARYING. informat. So in this case, can I only use the TRUNCOVER option?
Cynthia_sas
SAS Super FREQ

Hi:
As I explained, I use $VARYING. for "screen scraping" or grabbing lines of text, as when someone sends me a mainframe report file and I have to "scrape" data out of the report. I use $VARYING. to read one entire line (numbers, letters, punctuation, and spaces), and then I typically use SUBSTR or SCAN to break the big text string into smaller character and numeric chunks.
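
A rough sketch of that read-then-split pattern. The file name, the two-field layout (a name followed by an amount), and the variable names are all hypothetical, made up only to illustrate the steps:

```sas
data parsed;
  length line $ 300 name $ 20;
  /* Hypothetical report file; LG receives the length of each record */
  infile "C:\temp\report.txt" length=lg;
  input line $varying300. lg;
  if lg = 0 then delete;                 /* skip empty lines */
  /* Split the line into blank-delimited pieces */
  name   = scan(line, 1, ' ');
  /* Convert the second piece to a number; COMMA12. also accepts commas and $ */
  amount = input(scan(line, 2, ' '), comma12.);
  drop line;
run;
```

A real report usually needs extra conditions (header lines, page breaks, totals) before the SCAN or SUBSTR calls, so treat this as a starting point only.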

I suspect you are not doing screen scraping or text scraping, since you have numeric values. I cannot comment on whether TRUNCOVER would be required until you show your data. It really, really depends on what your data looks like. Is it fixed format? Is it delimited by some character? Are the numbers in "standard" numeric format, or do they have currency symbols and thousands separators, like commas? Consider the examples below. I would only use $VARYING. for the report on the right. I would use standard INFILE and INPUT for the other examples on the left.

[Image: when_use_varying.png]


Cynthia

Abby19
Fluorite | Level 6

I was using a variable length, only numeric data file. Below is how I tried - initially, before reading this thread - to read the file into SAS.

 

Code I used:

data test;
infile "/folders/myfolders/Oct26/Runners.txt" ;
input amt 8.2;
run;
proc print;
run;

 

The flat text file I wanted to read with the above code:

 

12345678.90
12500.02
5.11

 

Output:

 

 

123456.78
12500.02

 

Part of SAS Log:

 

NOTE: 3 records were read from the infile "/folders/myfolders/Oct26/Runners.txt".
The minimum record length was 4.
The maximum record length was 11.
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.TEST has 2 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.02 seconds
 
 
77 proc print;
78 run;
 
NOTE: There were 2 observations read from the data set WORK.TEST.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.17 seconds
cpu time 0.09 seconds
 
The log says it read 3 records, but only 2 observations ended up in the SAS data set. I tried various inputs (5 records, different numeric values), but SAS kept returning alternate records, depending on the data I sent. I didn't know where I was going wrong until I read this thread.
 
After I used truncover, the issue was resolved:
 
Code:
 
data test;
infile "/folders/myfolders/Oct26/Runners.txt" truncover;
input amt 8.2;
run;
proc print;
run;
 
output:
 

123456.78
12500.02
5.11

 

So, I was wondering if there was an equivalent of the $VARYING. informat for reading numeric data :). Just curiosity 🙂

But, thanks for the very informative explanation as always 🙂

 

Thanks,

Abi

Tom
Super User

No need for either in that case. If you either remove the INFORMAT specification from the INPUT statement or add the colon modifier in front of it then you are using list mode input style and SAS will automatically read the next word in the file.  

 

Also note that you do NOT want to include a decimal part on your informat specification, unless you need to tell SAS where to place implied decimal point for source text that has been generated without them.  If you read the string '12345' with the 8.2 informat SAS will place the implied decimal point between the 3 and the 4 and the result will be 123.45 instead of 12,345.

 

For normal input there is no need for ANY informat specifications. SAS already knows how to read numbers and character strings without any special instructions. It is only for things like dates and times that SAS needs help in understanding how to translate the human-readable text into the internal value that it should store.
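
A minimal sketch of the list-input version, assuming the same Runners.txt file from the post above. The FORMAT statement in PROC PRINT is an addition for display only and is not part of the advice:

```sas
data test;
  infile "/folders/myfolders/Oct26/Runners.txt";
  /* List input: no informat, SAS reads the next blank-delimited value */
  input amt;
  /* Alternative with the colon modifier and no decimal part: input amt :32.; */
run;

proc print data=test;
  format amt 12.2;   /* display only; does not change the stored value */
run;
```

With the three-line file shown earlier, this should give three observations, with the first value read as 12345678.90 rather than 123456.78.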

Abby19
Fluorite | Level 6
Thx Tom. This helped me a lot!!
