Hi All,
I am creating a data warehouse by imorting daily text files into SAS work. and then combining the individual SAS tables into one large table. A month has 20 work days and hence, 20 text files for JUNE. The code that I have works almost all of the time. However, the 17th of 20 files is not interpreting the "Settlement Date" correctly. All fo the other files use Date MMDDYY10 for the format and Informat. The 17th file, however, is using Character $1 and $1 respectively (and truncating my dates). The source data has settlement dates in it but also many empty fields too. I am guessing that SAS is using, positionally, a field that is empty to get the $1.
If I manually import the file of the 17th, it correctly interprets "Settlemet Date" as
'Settlement Date'n MMDDYY10.
How can I, for all files to use "Date: MMDDYY10" when creating the imported SAS table?
Here is the import code that I am using.
/* IMPORT */
PROC IMPORT OUT= WORK.Loan_Detail_&i.
DATAFILE= "/sas/sasperm3_prod/RiskInfrastructure/XXXXX/LoanDetail/&Name.txt"
DBMS=DLM REPLACE;
DELIMITER='|';
GETNAMES=YES;
DATAROW=2;
RUN;
Many Thanks
Hi,
When you execute a proc import in the log is generated a data step that create your table. If all the tables have the same structure you could copy your data step and force certain formats changing the format and informat statement.
In the log must appear something like
data WORK.MYDATA ;
%let _EFIERR_ = 0;
infile 'C:\My Documents\myfiles\delimiter.txt' delimiter = '&' MISSOVER DSD lrecl=32767 firstobs=2 ;
informat Region $8. ;
informat State $2. ;
informat Month MONYY7. ;
informat Expenses best32. ;
informat Revenue best32. ;
format Region $8. ;
format State $2. ;
format Month MONYY7. ;
format Expenses best12. ;
format Revenue best12. ;
input
Region $
State $
Month
Expenses
Revenue
;
if _ERROR_ then call symputx('_EFIERR_',1);
run;
Whatever you do: proc import will only make a guess and try its best.
If you want control of the process, take the data step generated by proc import (you find it in the log) and modify it to your needs. Sooner or later you won't use proc import any longer, as you will be quicker writing the step from scratch.
And do not name variables like 'Settlement Date'n.
Instead name them settlement_date. Saves a lot of typing and a lot of headaches.
Thanks
I agree about no space headers and headaches!
Hi,
When you execute a proc import in the log is generated a data step that create your table. If all the tables have the same structure you could copy your data step and force certain formats changing the format and informat statement.
In the log must appear something like
data WORK.MYDATA ;
%let _EFIERR_ = 0;
infile 'C:\My Documents\myfiles\delimiter.txt' delimiter = '&' MISSOVER DSD lrecl=32767 firstobs=2 ;
informat Region $8. ;
informat State $2. ;
informat Month MONYY7. ;
informat Expenses best32. ;
informat Revenue best32. ;
format Region $8. ;
format State $2. ;
format Month MONYY7. ;
format Expenses best12. ;
format Revenue best12. ;
input
Region $
State $
Month
Expenses
Revenue
;
if _ERROR_ then call symputx('_EFIERR_',1);
run;
Correct, I have the code generated by the wizard and it works but references an infile and import specification in the workspace location: When I alter the file, I get the correct table structure, but no data.
INFILE '/sas/sasperm3_prod/RiskInfrastructure/xxxx/LoanDetail/LoanDetail 061016.txt'
/*
INFILE '/sas/grid/SAS_workE6A800002BD8_svcksa69901mpm/#LN00082'
*/
'Funding Type'n $CHAR9.
'Fallout Function ID'n BEST3.
Mandatory BEST7. ;
INFILE '/sas/grid/SAS_workE6A800002BD8_svcksa69901mpm/#LN00082'
LRECL=393
ENCODING="LATIN1"
TERMSTR=CRLF
DLM='7F'x
MISSOVER
DSD ;
INPUT
'Loan Number'n : ?? BEST10.
The second statement is comment out.
I included it to show how I changed the original code.
You would be much better off to to move all of the formats to in informat statements before the input instead of using inline format specifications especially with list input. The inline format specification will force reading all characters indicated by the lengths of the formats which will often lead to data issues with varying length values.
I also support the recommendations about not using the 'variable space'n constructs. Do however add Label statements to 1) document your variables and 2) provide nice headers and such in procedure output.
The snippet you posted has two INFILE statements, one of which has a comment that includes part of another infile statement in it.
PROC IMPORT tries to guess what your data is, but it generates really ugly SAS code that includes things like permanently attached $xx. formats that you would not want to have in your real dataset. The general form for reading a delimited txt file is to define the variables, define the input location and then read the variables. Define the variables explictly using either LENGTH or ATTRIB statement. Do not make the data step compiler guess as to what type and length to use for the variable based on how you reference it in the INPUT statement or what FORMAT or INFORMAT you first attach to it. You only need to attach INFORMAT or FORMAT to variables that require them.
data want ;
length var1 $20 var2 8 var3 $40 .... ;
format var2 date9.;
informat var2 mmddyy10.;
infile 'myfile.txt' dsd truncover firstobs=2;
input var1 var2 var3 .... ;
run;
Also if your input flenames follow a nice naming convention such that you can replace the day of the month with a wild card then you can read all of the files in one data step if you want. You can use the EOV= option of the INFILE statement to make a variable that you can use to detect when a new file starts and skip the header row. So for example if your file names look like '2016-06-01.txt', '2016-06-02.txt' etc. then you could do something like this.
infile '/path/2016-06-*.txt' DSD TRUNCOVER FIRSTOBS=2 EOV=EOV;
input @;
if eov then input;
eov=0;
input ...
I was able to TWEAK this line to make it work!
%let _EFIERR_ = 0;
infile 'C:\My Documents\myfiles\delimiter.txt' delimiter = '&' MISSOVER DSD lrecl=32767 firstobs=2 ;
informat Region $8. ;
Change to: PIPE
%let _EFIERR_ = 0;
infile 'C:\My Documents\myfiles\delimiter.txt' delimiter = '|' MISSOVER DSD lrecl=32767 firstobs=2 ;
informat Region $8. ;
Thank you
If all you changed was the delimiter then you can change that in the PROC IMPORT code usinging the DELIMITER= statement.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.