Hello,
I have the DATALINES program shown below. Because the test dataset has 300 observations, I list only lines 81 to 83.
data test;
length id $10 status 3;
infile datalines delimiter=',';
input id $ status;
datalines;
. . .
EH1U00185, 4,
EN1R01252, 4,
EY1K01251, 2,
. . .
;
run;
There is an error message shown in the log window:
1956 data test;
1957 length id $10 status 3;
1958 infile datalines delimiter=',';
1959 input id $ status;
1960 datalines;
NOTE: Invalid data for status in line 2035 15-17.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+-
2035 EN1R01252, 4,
caseid=EN1R01252 hstatus=. _ERROR_=1 _N_=75
NOTE: The data set WORK.STATUS has 217 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
Could anyone let me know why, and how to fix it? Thank you.
Your errors do not match your code, but perhaps some simple changes will fix your issue(s).
First: DON'T INDENT in-line data. It will confuse you. To make this easier, place the DATALINES; (also known as CARDS;) statement in column 1. Also place the semicolon line that ends the data in column 1. The extra RUN; statement after the end of the data is not needed.
Second: If you are going to delimit the data with something other than the default space then you probably also want to use the DSD option so that adjacent delimiters indicate a null value. The default delimiter when using the DSD option is a comma.
Third: Normally all of the data for a single observation will be on a single line, so use the TRUNCOVER option on the INFILE statement. (Avoid the older MISSOVER option unless you really want it to ignore values at the end of the line that are too short for the informat width requested.)
Fourth: Do NOT include physical TAB characters in your data. If you are using Display Manager to run your code they will be replaced with spaces, but if you are using SAS Studio they will NOT be replaced, which makes your code hard to debug. I would also recommend not placing physical tab characters in any of your code. The SAS editors (and most other decent editors) will replace tabs with spaces for you, so you can use the tab key on the keyboard to align your code without having the actual tab characters in the code.
I also wouldn't bother storing numeric variables in fewer than the full 8 bytes that SAS uses for the floating-point numbers behind all numeric variables.
data test;
length id $10 status 8;
infile datalines dsd truncover;
input id status;
datalines;
EH1U00185,4,
EN1R01252,4,
EY1K01251,2,
;
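To illustrate the second point, here is a small sketch of how adjacent delimiters are handled with and without DSD. The third variable (score), the data set names, and the data values are made up for the illustration and are not from the original post.

```sas
/* Hypothetical rows: the middle field is empty in the second row. */

/* Without DSD, two adjacent commas count as a single delimiter. */
data without_dsd;
  infile datalines delimiter=',' truncover;
  length id $10;
  input id status score;
datalines;
EH1U00185,4,10
EN1R01252,,20
;

/* With DSD, two adjacent commas mark an empty (missing) field. */
data with_dsd;
  infile datalines dsd truncover;
  length id $10;
  input id status score;
datalines;
EH1U00185,4,10
EN1R01252,,20
;

proc print data=without_dsd; run;
proc print data=with_dsd; run;
```

For the second row I would expect WITHOUT_DSD to put 20 into status and leave score missing, while WITH_DSD leaves status missing and puts 20 into score.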
The code and log do not match: the log lists variables caseid and hstatus, but the code reads id and status.
Your problem might arise from missing data. Add the TRUNCOVER option to the INFILE statement.
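As a rough sketch of why TRUNCOVER matters when a value is missing at the end of a line, compare the two steps below. The data set names and the short second line are invented for the illustration; I am not claiming this is exactly what is in the original 300 lines.

```sas
/* Default behavior (FLOWOVER): when a line runs out of values,       */
/* INPUT moves on to the next line to find the remaining variables.   */
data flowover_demo;
  infile datalines dsd;
  length id $10 status 8;
  input id status;
datalines;
EH1U00185,4
EN1R01252
EY1K01251,2
;

/* TRUNCOVER: when a line runs out of values, the remaining variables */
/* are set to missing and the next line starts a new observation.     */
data truncover_demo;
  infile datalines dsd truncover;
  length id $10 status 8;
  input id status;
datalines;
EH1U00185,4
EN1R01252
EY1K01251,2
;
```

In the first step SAS should try to read EY1K01251 as the status for EN1R01252, which gives an invalid data note, a note that SAS went to a new line, and fewer observations than data lines. The second step should produce three observations, with status missing for EN1R01252.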
When posting code, ALWAYS use the </> button. ALWAYS. The main posting window destroys the horizontal formatting, which is essential in logs.
After removing the ". . ." lines from your datalines, which would cause errors of their own, I cannot reproduce the error you show.
If you read the log you posted carefully, you will see that it does not match the code shown:
1956 data test;
1957 length id $10 status 3;
1958 infile datalines delimiter=',';
1959 input id $ status;
1960 datalines;
NOTE: Invalid data for status in line 2035 15-17.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+-
2035 EN1R01252, 4,
caseid=EN1R01252 hstatus=. _ERROR_=1 _N_=75
NOTE: The data set WORK.STATUS has 217 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
Since the code you posted does not include the variables named in the invalid data message, something else is going on.
Does your actual code involve any macro coding? Or anything that might be interpreted as a macro trigger (%sometext, for example)?
I corrected the variable names, but it still did not work.
@ybz12003 wrote:
I corrected the variable names, but it still did not work.
Then share the entire code, pasted into a text box, and the LOG in a text box as well. In an invalid data message, the position of the text relative to the column ruler is important. Pasting into the main message window destroys that relationship, because the main message window reformats text and removes white space in particular.
Thanks, Tom. I followed your suggestions, removing the space between ID and Status and the RUN at the end. All of the above steps helped the program run through.