I'm importing this data and there's supposed to be 1 error in there, but I'm getting 2.
data CarVROOM; infile "E:\...VROOM.txt" firstobs=9 missover; input Manufacturer$ Model$ Type$ Min_Price Price Max_Price MPG_city MPG_highway DriveTrain$ Cylinders; run;
This is the log report:
NOTE: Invalid data for Cylinders in line 70 45-50. RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9-- 70 Mazda RX-7 Sporty 32.5 32.5 32.5 17 25 Rear rotary 50 Manufacturer=Mazda Model=RX-7 Type=Sporty Min_Price=32.5 Price=32.5 Max_Price=32.5 MPG_city=17 MPG_highway=25 DriveTrain=Rear Cylinders=. _ERROR_=1 _N_=57 NOTE: Invalid data for MPG_highway in line 93 36-38. 93 Subaru Justy Small 7.3 8.4 9.5 33 4WD 3 40 Manufacturer=Subaru Model=Justy Type=Small Min_Price=7.3 Price=8.4 Max_Price=9.5 MPG_city=33 MPG_highway=. DriveTrain=3 Cylinders=. _ERROR_=1 _N_=80
It looks like the errors are to do with the values being "." instead of numbers in the dataset, are there any options to add to the infile statement to reduce one of these errors?
Is Cylinders Numeric or Character? It looks like you have it as a Character in 1 spot ("2WD" /"4WD" I'd assume you want it to be Character) but then reference it missing as if it is numeric:
Cylinders=.
Maybe try changing the missing data to " " (Blanks)?
Edit: I'm a noob with SAS sorry try changing your input statement for Cylinders and declare it as a Character.
data CarVROOM;
infile "E:\...VROOM.txt" firstobs=9 missover;
input Manufacturer$ Model$ Type$ Min_Price Price Max_Price MPG_city MPG_highway DriveTrain$ Cylinders$.;
run;
Is Cylinders Numeric or Character? It looks like you have it as a Character in 1 spot ("2WD" /"4WD" I'd assume you want it to be Character) but then reference it missing as if it is numeric:
Cylinders=.
Maybe try changing the missing data to " " (Blanks)?
Edit: I'm a noob with SAS sorry try changing your input statement for Cylinders and declare it as a Character.
data CarVROOM;
infile "E:\...VROOM.txt" firstobs=9 missover;
input Manufacturer$ Model$ Type$ Min_Price Price Max_Price MPG_city MPG_highway DriveTrain$ Cylinders$.;
run;
Thanks again!
It means your data is not being read in correctly. Specifically, cylinders. In this case it looks like the Drivertrain is actually two things: Rear rotary, which contains a space which throws off your alignment of the data. So it tries to read rotary into the cylinders field which is incorrect.
NOTE: Invalid data for Cylinders in line 70 45-50. RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9-- 70 Mazda RX-7 Sporty 32.5 32.5 32.5 17 25 Rear rotary 50 Manufacturer=Mazda Model=RX-7 Type=Sporty Min_Price=32.5 Price=32.5 Max_Price=32.5 MPG_city=17 MPG_highway=25 DriveTrain=Rear Cylinders=. _ERROR_=1 _N_=57
You have a similar issue with other variables.
@whoisyourmother wrote:
I'm importing this data and there's supposed to be 1 error in there, but I'm getting 2.
data CarVROOM; infile "E:\...VROOM.txt" firstobs=9 missover; input Manufacturer$ Model$ Type$ Min_Price Price Max_Price MPG_city MPG_highway DriveTrain$ Cylinders; run;
This is the log report:
NOTE: Invalid data for Cylinders in line 70 45-50. RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9-- 70 Mazda RX-7 Sporty 32.5 32.5 32.5 17 25 Rear rotary 50 Manufacturer=Mazda Model=RX-7 Type=Sporty Min_Price=32.5 Price=32.5 Max_Price=32.5 MPG_city=17 MPG_highway=25 DriveTrain=Rear Cylinders=. _ERROR_=1 _N_=57 NOTE: Invalid data for MPG_highway in line 93 36-38. 93 Subaru Justy Small 7.3 8.4 9.5 33 4WD 3 40 Manufacturer=Subaru Model=Justy Type=Small Min_Price=7.3 Price=8.4 Max_Price=9.5 MPG_city=33 MPG_highway=. DriveTrain=3 Cylinders=. _ERROR_=1 _N_=80
It looks like the errors are to do with the values being "." instead of numbers in the dataset, are there any options to add to the infile statement to reduce one of these errors?
Show more of your data. It does not look like it is formatted in a way that you can parse it.
Here are the two lines you included in your message.
----+----1----+----2----+----3----+----4----+----5----+----6
Mazda RX-7 Sporty 32.5 32.5 32.5 17 25 Rear rotary
Subaru Justy Small 7.3 8.4 9.5 33 4WD 3
You are trying to read the data using LIST MODE input. So each value is separated by one or more spaces. That first line has 10 values and the second line only has 9 values. Your INPUT statement is trying to read 10 variables. 3 character, 5 numbers, 1 character and 1 number.
Looks like you have two issues.
The first line is having trouble because the value of DRIVE_TRAIN has an embedded space. You need to either use something else as the delimiter instead of space, or add an extra space after the values of DRIVE_TRAIN and use the & modifier on the input to have it keep reading until it sees two or more spaces to mark the end of the string. (You might need to do that for your other character variables also)>
The second appears to be missing a value for one of the 5 numeric values in the middle. Your text file should have a period to mark where a missing value belongs. It is not clear which value is missing.
Mazda RX-7 Sporty 32.5 32.5 32.5 17 25 Rear rotary .
Subaru Justy Small 7.3 8.4 9.5 33 . 4WD 3
I can see you've marked an answer, but given what you've posted that won't actually fix your issue. It will remove the notes from the log (possibly) but it will not read in the data correctly.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.