amt11189
Calcite | Level 5

Greetings, I am using SAS University Edition and I am getting some issues with the following script. I expect it to accept the numeric values but it does not.

Input

data MySample1;
input Firstfield secondfield thirdfield;
datalines;
123 34 54
21 35 33
1 32 34
;
run;
Proc Print Data=Sample;Run;

 

Output

 Firstfield secondfield thirdfield1

...

 

Error

NOTE: Invalid data for Firstfield in line 76 1-9.
NOTE: Invalid data for secondfield in line 77 1-8.
NOTE: Invalid data for thirdfield in line 78 1-7.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
 
78 CHAR 1.32.34
ZONE 30330332222222222222222222222222222222222222222222222222222222222222222222222222
NUMR 19329340000000000000000000000000000000000000000000000000000000000000000000000000
NOTE: Invalid data errors for file CARDS occurred outside the printed range.
NOTE: Increase available buffer lines with the INFILE n= option.
Firstfield=. secondfield=. thirdfield=. _ERROR_=1 _N_=1
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.MYSAMPLE11 has 1 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
 
 
79 ;
80 run;
81 Proc Print Data=MySample11;Run;
 
NOTE: There were 1 observations read from the data set WORK.MYSAMPLE11.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
ACCEPTED SOLUTION
mkeintz
PROC Star

The ZONE and NUMR lines provide the answer to your question. Those lines, which line up column by column with the CHAR line above them, show the first 4 bits (ZONE) and second 4 bits (NUMR) of each byte in the line of data.

 

Since 4 bits ranges from 0000 to 1111, the lines actually print the corresponding hexadecimal characters (0000=0,0001=1,..., 1001=9,1010=A,....1111=F).  So, in ascii, if ZONE=3 and NUMR=1  ('31'x in hexadecimal notation), you have the ascii code for character "1".

 

Now, in the datalines you show, it appears that the 3 numbers in each line are separated by blanks, which would be '20'x. But the log shows '09'x between the numeric fields, which is not a blank but a tab character. List input in SAS uses blanks as the default delimiter, not tabs, so each line is read as one long field that is not a valid number.
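To make the ZONE/NUMR reading concrete, here is a minimal sketch that pairs up the first eight columns of the two lines from log line 78 and converts each pair to its character code. The variable names (zone, numr, hexpair, code, meaning) are made up for illustration, and the labels assume an ASCII session.

```sas
data _null_;
  /* First eight columns of the ZONE and NUMR lines shown for log line 78 */
  zone = '30330332';
  numr = '19329340';
  length hexpair $2 meaning $5;
  do i = 1 to length(zone);
    /* High nibble from ZONE plus low nibble from NUMR gives one byte in hex */
    hexpair = cats(substr(zone, i, 1), substr(numr, i, 1));
    code = input(hexpair, hex2.);          /* hex text to numeric code */
    if code = 9 then meaning = 'tab';
    else if code = 32 then meaning = 'blank';
    else meaning = byte(code);             /* printable ASCII character */
    put i= hexpair= code= meaning=;
  end;
run;
```

The second and fifth pairs come out as hex 09, which is how you can tell the separators are tabs rather than blanks.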

 

So, if you don't change all your tab characters to blank, then you can fix this with the DLM option on the INFILE statement, telling SAS to consider tab characters as delimiters, and not to look for blank delimiters:

 

data t;
  /* Treat the tab character ('09'x) as the only delimiter */
  infile datalines dlm='09'x;
  input x y z;
datalines;
1	2	3
4	5	6
;
run;

If you have BOTH tabs and spaces as delimiters, then expand the DLM= parameter:

 

data t;
  /* Treat both tab ('09'x) and blank ('20'x) as delimiters */
  infile datalines dlm='0920'x;
  input x y z;
datalines;
1	2 3
4 5	6
;
run;

 


6 REPLIES
amt11189
Calcite | Level 5

Sorry, the DATA line should read "data MySample11;", but the error is the same.

Tom
Super User

That is one of the differences between SAS/Studio and Display Manager.  In Display Manager the program editor will replace the tabs in your program with spaces and then your code would work as is.  In SAS/Studio somehow the actual tabs end up in the program stream that is passed to SAS.  So the input sees 123<tab>34<tab>54 as one word to read and since it is not a valid number you get that error message and a missing value for Firstfield.  Since that is the last word on the line the input statement moves to the second line to find data for the second variable where the same problem exists.

 

Don't put tabs into your data. In fact don't put tabs into any of your SAS code.
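If the tabs are already in the data and you cannot easily remove them, one more option (not mentioned elsewhere in this thread, so treat it as a suggestion) is the EXPANDTABS option on the INFILE statement, which expands each tab to blanks before INPUT parses the line. A minimal sketch using the original poster's variable names, assuming the datalines below still contain real tab characters:

```sas
data MySample11;
  /* EXPANDTABS converts tabs to blanks (out to the next tab stop),
     so ordinary list input with blank delimiters works again */
  infile datalines expandtabs;
  input Firstfield secondfield thirdfield;
datalines;
123	34	54
21	35	33
1	32	34
;
run;

proc print data=MySample11;
run;
```

The same step also runs unchanged if the editor has already replaced the tabs with spaces.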

andreas_lds
Jade | Level 19

Afaik there is an option in SAS Studio to convert tabs to spaces automatically. Turning it on should also prevent such errors. But the best thing to do is simply not to use tab after the first visible char.

ChrisNZ
Tourmaline | Level 20

This:

78 CHAR 1.32.34
ZONE 30330332222222222222222222222222222222222222222222222222222222222222222222222222
NUMR 19329340000000000000000000000000000000000000000000000000000000000000000000000000

reads as:

hex 31 which is 1

hex 09 which is tab

hex 33 which is 3

hex 32 which is 2

hex 09 which is tab

hex 33 which is 3

hex 34 which is 4

hex 20 which is space

 

So the whole line is read in one swoop as a single field, and the result is not a number. Hence the error.

 

If your data is a mix of space separated and tab separated numbers, you really need the source to give you a better file.

If that's impossible, you can write code to parse the line for values.
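As a rough sketch of what "parse the line" could look like: read the raw record into _INFILE_, turn tabs into blanks with TRANSLATE, then pull the values out with SCAN. The data set name parsed and the helper variable line are made up for illustration, and the example assumes three values per line.

```sas
data parsed;
  infile datalines;
  length line $200;
  input;                                   /* load the raw record into _INFILE_ */
  line = translate(_infile_, ' ', '09'x);  /* replace every tab with a blank */
  /* ?? suppresses the invalid-data notes if a word is not numeric */
  Firstfield  = input(scan(line, 1, ' '), ?? 32.);
  secondfield = input(scan(line, 2, ' '), ?? 32.);
  thirdfield  = input(scan(line, 3, ' '), ?? 32.);
  drop line;
datalines;
123	34 54
21 35	33
1	32	34
;
run;
```

For a simple mix of tabs and blanks, DLM='0920'x on the INFILE statement does the same job with less code; parsing by hand is mainly useful when the lines need more cleanup than a delimiter list can express.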

 

Hint: In EG, you can replace all tabs in the code editor by using regular expression replacement:

 

[The image seems to be gone. Check the regular expression search box, enter \x09 in the find text field, enter a space in the replace with field. ]

 

There is no reason tabs should ever be there.

 

 

 

 

 

 

Kurt_Bremser
Super User

Your code as posted does work, but that may be because you (wrongly) posted the code into the main posting window, and it converted tabs to HTML white space (a single blank).

Always use the "little running man" icon next to the one indicated for posting code:

[Screenshot of the posting toolbar with the code icons marked. The image is not available.]

 

like this:

data MySample1;
input Firstfield secondfield thirdfield;
datalines;
123 34 54
21 35 33
1 32 34
;

Similarly, use the indicated icon (</>) for logs or other structured text data.
