ybz12003
Rhodochrosite | Level 12

Hello,

I have the DATALINES program shown below. Because the test dataset has 300 observations, I list only data lines 81 to 83.

 

data test;  
	length id $10 status 3; 
	infile datalines delimiter=','; 
	input id  $ status;  
	datalines;                     
	.
	.
	.
	EH1U00185,	4,
	EN1R01252, 	4,
	EY1K01251,	2,
	.
	.
	.
;                          
run; 

 

There is an error message shown in the log window:

 

1956 data test;
1957 length id $10 status 3;
1958 infile datalines delimiter=',';
1959 input id $ status;
1960 datalines;
NOTE: Invalid data for status in line 2035 15-17.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+-
2035 EN1R01252,  4,
caseid=EN1R01252 hstatus=. _ERROR_=1 _N_=75
NOTE: The data set WORK.STATUS has 217 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds

 

 

Could anyone let me know why, and how to fix it?  Thank you.

Accepted Solution

Tom
Super User

Your errors do not match your code, but perhaps some simple changes will fix your issue(s).

 

First: DON'T INDENT in-line data.  It will confuse you.  To make this easier, place the DATALINES; (also known as CARDS;) statement in column 1. Also place the semicolon line that ends the data in column 1.  The extra RUN; statement past the end of the data is not needed.

 

Second: If you are going to delimit the data with something other than the default space, then you probably also want to use the DSD option so that adjacent delimiters indicate a null value.  (Without DSD, consecutive delimiters are treated as a single delimiter.)  The default delimiter when using the DSD option is a comma.
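
As a small sketch of what DSD changes, the same two lines are read below once with DLM=',' alone and once with DSD. The dataset names, the extra SCORE variable, and the sample values are made up for illustration and are not from the original post.

```sas
* Hypothetical sample data where the second line has an empty middle field ;
data without_dsd;
  infile datalines dlm=',' truncover;
  input id :$10. status score;
datalines;
EH1U00185,4,10
EN1R01252,,20
;

data with_dsd;
  infile datalines dsd truncover;
  input id :$10. status score;
datalines;
EH1U00185,4,10
EN1R01252,,20
;

proc print data=without_dsd; run;
proc print data=with_dsd; run;
```

Without DSD the two commas on the second line should collapse into one delimiter, so 20 lands in STATUS and SCORE is left missing. With DSD the empty field should be read as a missing STATUS and 20 stays in SCORE.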

 

Third: Normally all of the data for a single observation will be on a single line, so use the TRUNCOVER option on the INFILE statement.  (Avoid the older MISSOVER option unless you really want it to ignore values at the end of the line that are too short for the format width requested.)
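
To illustrate why TRUNCOVER matters, here is a minimal sketch with one short line. The dataset names and data values are hypothetical. The first step relies on the default FLOWOVER behavior, the second adds TRUNCOVER.

```sas
* Hypothetical data where the second line has no status value ;
data flowover_default;
  infile datalines dlm=',';
  input id :$10. status;
datalines;
EH1U00185,4
EN1R01252
EY1K01251,2
;

data with_truncover;
  infile datalines dlm=',' truncover;
  input id :$10. status;
datalines;
EH1U00185,4
EN1R01252
EY1K01251,2
;
```

In the first step, INPUT should run past the end of the second line and try to read the third line's ID as STATUS, which produces an "Invalid data for status" note and loses an observation. In the second step, each line should stay its own observation, with STATUS simply missing on the short line.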

 

Fourth:  Do NOT include physical TAB characters in your data.  If you are using DISPLAY MANAGER to run your code they will be replaced with spaces, but if you are using SAS Studio they will NOT be replaced, which makes your code hard to debug.  I would also recommend not placing physical tab characters in any of your code.  The SAS editors (and most other decent editors) can replace tabs with spaces for you, so you can use the tab key on the keyboard to align your code without having actual tab characters in the code.
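
If you suspect tabs are hiding in the data, a sketch like the following may help confirm it. It writes a line with an explicit tab ('09'x) to a temporary file, so the example does not depend on what your editor does with tabs, then reads it back treating both comma and tab as delimiters. The fileref DEMO, the dataset name, and the HAS_TAB variable are assumptions for illustration only.

```sas
filename demo temp;

* Write two lines where the second one has a tab (hex 09) after the first comma ;
data _null_;
  file demo;
  put 'EH1U00185,4,';
  put 'EN1R01252,' '09'x '4,';
run;

* Treat both comma (hex 2C) and tab (hex 09) as delimiters and flag lines that contain a tab ;
data tab_safe;
  infile demo dlm='2C09'x truncover;
  input id :$10. status;
  has_tab = (index(_infile_, '09'x) > 0);
run;

proc print data=tab_safe; run;
```

With only DLM=',' the field between the commas on the second line would be a tab followed by 4, which is not a valid number. That is one plausible source of an "Invalid data for status" note like the one in the question.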

 

I also wouldn't bother to store numeric variables with fewer than the full 8 bytes SAS uses for the floating-point numbers it uses for all numeric variables.
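
As a rough sketch of the risk with short numeric lengths: the length only takes effect when the value is written to the dataset, so the step below stores a value and a second step reads it back. The dataset and variable names are hypothetical, and the value 8193 is chosen on the assumption of an IEEE platform (Windows or UNIX), where the SAS documentation lists 8,192 as the largest integer that 3 bytes can hold exactly.

```sas
data short_len;
  length small 3 full 8;
  small = 8193;
  full  = 8193;
run;

* Read the stored values back and compare them in the log ;
data _null_;
  set short_len;
  put small= full=;
run;
```

If SMALL comes back different from FULL, that is the truncation at work. A status code such as 2 or 4 fits easily in 3 bytes, but the space saved is rarely worth having to remember the limit.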

data test;  
  length id $10 status 8; 
  infile datalines dsd truncover; 
  input id status;  
datalines;                     
EH1U00185,4,
EN1R01252,4,
EY1K01251,2,
;         

 


6 REPLIES
Kurt_Bremser
Super User

The code and log do not match. The log lists variables caseid and hstatus, the code reads id and status.

Your problem might arise from missing data. Add the TRUNCOVER option to the INFILE statement.

When posting code, ALWAYS use the </> button. ALWAYS. The main posting window destroys the horizontal formatting, which is essential in logs.

ballardw
Super User

After removing the "." lines from your datalines (which would cause errors of their own), I cannot reproduce the error you show.

 

If you read the log you posted carefully, you will see that it does not match the code shown:

1956 data test;
1957 length id $10 status 3;
1958 infile datalines delimiter=',';
1959 input id $ status;
1960 datalines;
NOTE: Invalid data for status in line 2035 15-17.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+-
2035 EN1R01252, 4,
caseid=EN1R01252 hstatus=. _ERROR_=1 _N_=75
NOTE: The data set WORK.STATUS has 217 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds

Since the code you show does not include the variables named in the invalid data message, something else is going on.

 

Does your actual code have any macro coding involved? Or something that might be interpreted as a macro trigger (%sometext, for example)?

ybz12003
Rhodochrosite | Level 12

I corrected the variable names, but it still did not work.

ballardw
Super User

@ybz12003 wrote:

I corrected the variable names, but it still did not work.


Then share the entire code, pasted into a text box, and the LOG in a text box as well. The position of the text relative to the column ruler in an invalid data message is important. Pasting into the main message window destroys that relationship, because the main message window reformats text and, in particular, removes white space.


ybz12003
Rhodochrosite | Level 12

Thanks, Tom.  I followed your suggestions: I removed the space between ID and Status and the RUN at the end.  All of the above steps helped the program run through.
