I have an excel file that I'm importing into SAS using proc import.
proc import datafile="Transactions_Details_Sample_Test.xlsx" out=Transaction_dump dbms=xlsx replace;
run;
Assume the excel file has 1000 rows with below columns
Employee Name(Char Data) | NIN Number (Number Data) | Enrollment Date (Date Data)
The Issue I'm facing
if someone enters the character data even by mistake or typo into the 1000th row of excel of the NIN Number column, the SAS will treat it as a character column, and all the numeric operations will fail. How can I write code in SAS to always treat it as a numeric column and to find that character data?
The same issue is with date, is by typo someone enters Jann-2021 instead of Jan-2021 the SAS will treat it as a character column instead of numeric date column and all the data will be messed up.
Attached is a sample of data, the last row has type errors that can change the data type of column in SAS. with millions of rows, I want to find such errors in my code.
To avoid the guessing of PROC IMPORT, save the spreadsheet to a csv file and read that with a data step written by yourself; add code to detect input errors (automatic _ERROR_ variable).
If I export that into CSV and there is any comma in the name, it's gonna mess up.
Excel uses semicolons in csv exports, and puts quotes around values that contain delimiters. The DSD option in the INFILE statement will honor that.
See the attached Excel and csv file.
I read the csv and print the dataset with
data test;
infile "~/test.csv" dlm=";" dsd truncover;
input name :$20. data $ number;
run;
proc print data=test;
run;
giving this result:
Beob. name data number
1 name,name xxx 1
2 name;name yyy 2
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.