BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Kody_devl
Quartz | Level 8

Hi All,

 

I am creating a data warehouse by imorting daily text files into SAS work.  and then combining the individual SAS tables into one large table.   A month has 20 work days and hence, 20 text files for JUNE.  The code that I have works almost all of the time.  However, the 17th of 20 files is not interpreting the "Settlement Date" correctly.  All fo the other files use  Date MMDDYY10 for the format and Informat.  The 17th file, however, is using Character $1 and $1 respectively (and truncating my dates).  The source data has settlement dates in it but also many empty fields too.  I am guessing that SAS is using, positionally, a field that is empty to get the $1.

 

If I manually import the file of the 17th, it correctly interprets "Settlemet Date" as

'Settlement Date'n MMDDYY10.

 

How can I, for all files to use "Date:  MMDDYY10" when creating the imported SAS table?

Here is the import code that I am using.

 

/* IMPORT */

PROC IMPORT OUT= WORK.Loan_Detail_&i.

DATAFILE= "/sas/sasperm3_prod/RiskInfrastructure/XXXXX/LoanDetail/&Name.txt"

DBMS=DLM REPLACE;

DELIMITER='|';

GETNAMES=YES;

DATAROW=2;

RUN;

 

Many Thanks

 

1 ACCEPTED SOLUTION

Accepted Solutions
arodriguez
Lapis Lazuli | Level 10

Hi,

 

When you execute a proc import in the log is generated a data step that create your table. If all the tables have the same structure you could copy your data step and force certain formats changing the format and informat statement.

 

In the log must appear something like 

data WORK.MYDATA    ;
   %let _EFIERR_ = 0; 
   infile 'C:\My Documents\myfiles\delimiter.txt' delimiter = '&' MISSOVER  DSD lrecl=32767 firstobs=2 ;
   informat Region $8. ;
   informat State $2. ;
   informat Month MONYY7. ;
   informat Expenses best32. ;
   informat Revenue best32. ;
   format Region $8. ;
   format State $2. ;
   format Month MONYY7. ;
   format Expenses best12. ;
   format Revenue best12. ;
input
            Region $
            State $
            Month
            Expenses
            Revenue
;
if _ERROR_ then call symputx('_EFIERR_',1);  
run;

 

View solution in original post

10 REPLIES 10
Kurt_Bremser
Super User

Whatever you do: proc import will only make a guess and try its best.

If you want control of the process, take the data step generated by proc import (you find it in the log) and modify it to your needs. Sooner or later you won't use proc import any longer, as you will be quicker writing the step from scratch.

 

And do not name variables like 'Settlement Date'n.

Instead name them settlement_date. Saves a lot of typing and a lot of headaches.

Kody_devl
Quartz | Level 8

Thanks

 

I agree about no space headers and headaches!

 

arodriguez
Lapis Lazuli | Level 10

Hi,

 

When you execute a proc import in the log is generated a data step that create your table. If all the tables have the same structure you could copy your data step and force certain formats changing the format and informat statement.

 

In the log must appear something like 

data WORK.MYDATA    ;
   %let _EFIERR_ = 0; 
   infile 'C:\My Documents\myfiles\delimiter.txt' delimiter = '&' MISSOVER  DSD lrecl=32767 firstobs=2 ;
   informat Region $8. ;
   informat State $2. ;
   informat Month MONYY7. ;
   informat Expenses best32. ;
   informat Revenue best32. ;
   format Region $8. ;
   format State $2. ;
   format Month MONYY7. ;
   format Expenses best12. ;
   format Revenue best12. ;
input
            Region $
            State $
            Month
            Expenses
            Revenue
;
if _ERROR_ then call symputx('_EFIERR_',1);  
run;

 

Kody_devl
Quartz | Level 8

Correct, I have the code generated by the wizard and it works but references an infile and import specification in the workspace location:  When I alter the file, I get the correct table structure,  but no data.

 

INFILE '/sas/sasperm3_prod/RiskInfrastructure/xxxx/LoanDetail/LoanDetail 061016.txt'

/*

INFILE '/sas/grid/SAS_workE6A800002BD8_svcksa69901mpm/#LN00082'

*/

 

'Funding Type'n $CHAR9.

'Fallout Function ID'n BEST3.

Mandatory BEST7. ;

INFILE '/sas/grid/SAS_workE6A800002BD8_svcksa69901mpm/#LN00082'

LRECL=393

ENCODING="LATIN1"

TERMSTR=CRLF

DLM='7F'x

MISSOVER

DSD ;

INPUT

'Loan Number'n : ?? BEST10.

 

arodriguez
Lapis Lazuli | Level 10
You have two infile statements. I think that only the fist should be used
Kody_devl
Quartz | Level 8

The second statement is comment out.

 

I included it to show how I changed the original code.

 

 

ballardw
Super User

You would be much better off to to move all of the formats to in informat statements before the input instead of using inline format specifications especially with list input. The inline format specification will force reading all characters indicated by the lengths of the formats which will often lead to data issues with varying length values.

 

I also support the recommendations about not using the 'variable space'n constructs. Do however add Label statements to 1) document your variables and 2) provide nice headers and such in procedure output.

Tom
Super User Tom
Super User

The snippet you posted has two INFILE statements, one of which has a comment that includes part of another infile statement in it.

PROC IMPORT tries to guess what your data is, but it generates really ugly SAS code that includes things like permanently attached $xx. formats that you would not want to have in your real dataset.  The general form for reading a delimited txt file is to define the variables, define the input location and then read the variables.  Define the variables explictly using either LENGTH or ATTRIB statement. Do not make the data step compiler guess as to what type and length to use for the variable based on how you reference it in the INPUT statement or what FORMAT or INFORMAT you first attach to it.  You only need to attach INFORMAT or FORMAT to variables that require them.  

data want ;
   length var1 $20 var2 8 var3 $40 .... ;
   format var2 date9.;
   informat var2 mmddyy10.;
   infile 'myfile.txt' dsd truncover firstobs=2;
   input var1 var2 var3 .... ;
run;

Also if your input flenames follow a nice naming convention such that you can replace the day of the month with a wild card then you can read all of the files in one data step if you want.  You can use the EOV= option of the INFILE statement to make a variable that you can use to detect when a new file starts and skip the header row.  So for example if your file names look like '2016-06-01.txt', '2016-06-02.txt' etc. then you could do something like this.

infile '/path/2016-06-*.txt' DSD TRUNCOVER FIRSTOBS=2 EOV=EOV;
input @;
if eov then input;
eov=0;
input ...

 

Kody_devl
Quartz | Level 8

I was able to TWEAK this line to make it work!

 

 

  %let _EFIERR_ = 0;
   infile 'C:\My Documents\myfiles\delimiter.txt' delimiter = '&' MISSOVER  DSD lrecl=32767 firstobs=2 ;
   informat Region $8. ;
 

Change to: PIPE

  %let _EFIERR_ = 0;
   infile 'C:\My Documents\myfiles\delimiter.txt' delimiter = '|' MISSOVER  DSD lrecl=32767 firstobs=2 ;
   informat Region $8. ;
 

Thank you

 

 

 

Tom
Super User Tom
Super User

If all you changed was the delimiter then you can change that in the PROC IMPORT code usinging the DELIMITER= statement.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 18025 views
  • 1 like
  • 5 in conversation