Desktop productivity for business analysts and programmers

Importing Tab Delimited text files using data step - display issues SAS EG 5.1

Reply
Occasional Contributor
Posts: 10

Importing Tab Delimited text files using data step - display issues SAS EG 5.1

I have a quandry.  Recently I have been told that using the Import Wizard to bring in external tab delimited text files was creating a bunch of temporary files on my C Drive, which I am not allowed to access, and as such is causing a plethora of other issues with my computer.  It was suggested that I begin using Data step to import the external files.  When I have run the suggested code, the file imports, but the display is completely in accurate.  Which makes it difficult to go back and troubleshoot any output issues.  I have tried to copy and paste the Import Wizard code into a program, and it doesn't run at all (because it is still looking for a temporary file).   

 

Sample Data

Sample Data 1.png

Import Wizard Code - This is the code that doesn't work outside of using the Import Wizard

 

   Code generated by a SAS task
   
   Generated on Thursday, July 06, 2017 at 4:37:59 PM
   By task:     Import Data Wizard
   
   Source file: G:\Financial Services\Financial Reporting\Monthly
   Production Reports\Wealth_Core
   Reports\Retail_Accounts_for_Dashboards 062017.txt
   Server:      Local File System
   
   Output data: WORK.Retail_Accounts_for_Dashboards_0
   Server:      Analytics
   
   Note: In preparation for running the following code, the Import
   Data wizard has used internal routines to transfer the source data
   file from the local file system to Analytics. There is no SAS code
   available to represent this action.
   -------------------------------------------------------------------- */

DATA WORK.Retail_Accounts_for_Dashboards_0;
    LENGTH
        'Account Number'n $ 16
        Product          $ 3
        'Sub Product'n   $ 3
        'User Field39'n  $ 2
        'Account Number Dim Key'n $ 11 ;
    FORMAT
        'Account Number'n $CHAR16.
        Product          $CHAR3.
        'Sub Product'n   $CHAR3.
        'User Field39'n  $CHAR2.
        'Account Number Dim Key'n $CHAR11. ;
    INFORMAT
        'Account Number'n $CHAR16.
        Product          $CHAR3.
        'Sub Product'n   $CHAR3.
        'User Field39'n  $CHAR2.
        'Account Number Dim Key'n $CHAR11. ;
    INFILE 'E:\saswork\_TD13452_WIL-ENTSASPRD06_\#LN00033'
        LRECL=39
        ENCODING="WLATIN1"
        TERMSTR=CRLF
        DLM='7F'x
        MISSOVER
        DSD ;
    INPUT
        'Account Number'n : $CHAR16.
        Product          : $CHAR3.
        'Sub Product'n   : $CHAR3.
        'User Field39'n  : $CHAR2.
        'Account Number Dim Key'n : $CHAR11. ;
RUN;

 

Suggested Code - the one with the 3-Card monty for column values:

DATA Ret_Accts_for_Dashboards;
    LENGTH
        'Account Number'n $ 16
        Product          $ 3
        'Sub Product'n   $ 3
        'User Field39'n  $ 2
        'Account Number Dim Key'n $ 11 ;
    FORMAT
        'Account Number'n $CHAR16.
        Product          $CHAR3.
        'Sub Product'n   $CHAR3.
        'User Field39'n  $CHAR2.
        'Account Number Dim Key'n $CHAR11. ;
    INFORMAT
        'Account Number'n $CHAR16.
        Product          $CHAR3.
        'Sub Product'n   $CHAR3.
        'User Field39'n  $CHAR2.
        'Account Number Dim Key'n $CHAR11. ;
    INFILE '\\wil-netapp02\rev-cred\Financial Services\Financial Reporting\Monthly Production Reports\Wealth_Core Reports\06 2017 Retail Interchange.txt'
        DLM='	'
		DSD MISSOVER
		FIRSTOBS=2;
    INPUT
        'Account Number'n : $CHAR16.
        Product         :  $CHAR3.
        'Sub Product'n  :  $CHAR4.
        'User Field39'n :  $CHAR2.
        'Account Number Dim Key'n : $CHAR11.;
RUN;

And the resulting outputs with the display issues - all of the data is to come in as character, but using Proc Import chooses the 'best' data type for this, which doesn't necessarily work out great.

 

Sample Output.png

 

Any thoughts on how to correct the display issues?

 

Thanks

 

El

Esteemed Advisor
Esteemed Advisor
Posts: 7,233

Re: Importing Tab Delimited text files using data step - display issues SAS EG 5.1

[ Edited ]

You just need to update that code to reflect the data which you know best, for instance:

data ret_accts_for_dashboards;
    length
        account_number $16
        Product        $3
        sub_product    8
        user_field39   8
        account_number_dim_key 8;
    format
        account_number $16
        Product        $3
        sub_product    5
        user_field39   5
        account_number_dim_key 5;
    informat
        account_number $16
        Product        $3
        sub_product    5
        user_field39   5
        account_number_dim_key 5;
    infile '\\wil-netapp02\rev-cred\Financial Services\Financial Reporting\Monthly Production Reports\Wealth_Core Reports\06 2017 Retail Interchange.txt'
        dlm='09'x dsd missover firstobs=2;
    input
        account_number $
        Product        $
        sub_product    
        user_field39   
        account_number_dim_key;
run;

Reads sub_product and user_field in as numeric.

Super User
Super User
Posts: 6,364

Re: Importing Tab Delimited text files using data step - display issues SAS EG 5.1

The EG guide wizard will upload text files to the SAS server and generate a data step to read the file.  For Excel files it generates a text file on your PC and uploads that.

 

The code the EG writes is much more complicated than what you would write yourself. You do not need to attach formats or informats to 90% of variables. You also do not need to include $ in the INPUT statemen if you have already defined the variable as character. Just define the type and length using the LENGTH statement and SAS will know how to read and display them.  The main exception is date and time fields.  Also for fields with embedded $ and/or commas you should use the COMMA (or DOLLAR) informat instead of default informat (the default informat is just nn. which can also be called Fnn. Some users like to call it a BEST informat, but BEST is really the name for FORMAT and not an INFORMAT).

 

So if you wanted to re-write the code the EG generated you just need the DATA, INFILE, LENGTH and INPUT statement.  You can change the variables from character to numeric by changing the length assigned to them.  Numeric variables should use length 8 as SAS stores numbers using 64bit floating point.

 

If you want to change a variable from character to numeric then change the LENGTH value from something like '$12' to '8'.

If your file is really a tab delimited file then you would want to use '09'x as the delimiter on the INFILE statement and not '7F'x.  That might be the delimiter that EG uses when it creates a file from an Excel spreadsheet.  Your account number DIM Key field values look like numbers with commas in them.  But a KEY should not be a number since no one wants to take the mean of it.  If you do convert it to a number then you will need to add an INFORMAT statement or include a informat for it in the INPUT statement. If you do include the informat in the INPUT statement it is critical to prefix it with ':' or it could confuse how SAS parses the lines at the delimiters. 

 

You also probably do NOT want the LRECL= option on the INFILE statement.  Unless your lines are longer than the default (which is currently 32767).

 

So here is an updated data step that will from a tab delimited file and read 4th column as a number.

 

DATA WORK.Retail_Accounts_for_Dashboards_0;
  LENGTH
        'Account Number'n $ 16
        Product          $ 3
        'Sub Product'n   $ 3
        'User Field39'n  8
        'Account Number Dim Key'n $ 11 
  ;
  INFILE 'E:\saswork\_TD13452_WIL-ENTSASPRD06_\#LN00033'
        ENCODING="WLATIN1"
        TERMSTR=CRLF
        DLM='09'x
        truncover
        DSD 
  ;
  INPUT 'Account Number'n --  'Account Number Dim Key'n ;
RUN;
Community Manager
Posts: 2,697

Re: Importing Tab Delimited text files using data step - display issues SAS EG 5.1

You can get EG to generate a slightly more "reusable" file for you by clicking the Performance button, then "Bypass the data cleansing process" on the Import Data task.

 

This will tell EG to not "fix" the file for you by changing the delimiter and quoting in the fields.  It will instead copy/import the file as-is, which might include a few data processing warts -- but hey, that's life!

 

import.png

Ask a Question
Discussion stats
  • 3 replies
  • 126 views
  • 0 likes
  • 4 in conversation