Solved: Re: Proc Import -- Multiple Spaces Between Columns - Page 2

ballardw · Posted 01-11-2022 06:28 PM

@larryn3 wrote:

Attached is a word file which should show more clearly the way the columns are aligned.

Actually, a word processor document is likely to be worse than plain text. If by any chance your data contains TAB characters, the "alignment" almost certainly will not match what should be read. Also, non-visible formatting characters can sneak into the file, making copying the data out and reading it a headache as well.

If the file is TEXT, then post actual text. Don't retype the data into the forum's message window. Instead, open a text box using the </> icon, copy lines from the file after opening it in a plain text editor such as NOTEPAD or even the SAS Editor, and paste those lines into the text box.
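One way to check whether TAB characters are hiding in a text file is to count them per line in a DATA step. The sketch below is only an illustration: the fileref demo and the two sample lines are made up so the step runs on its own; point the INFILE statement at your real file instead.

```sas
/* Build a small demo file; the second PUT writes no TAB. */
/* The fileref DEMO and its contents are hypothetical.    */
filename demo temp;

data _null_;
  file demo;
  put 'abc' '09'x 'def';   /* '09'x is the TAB character */
  put 'no tabs here';
run;

/* Report every line that contains at least one TAB. */
data _null_;
  infile demo;
  input;                                 /* load the record into _INFILE_ */
  ntabs = countc(_infile_, '09'x);       /* number of TABs in this record */
  if ntabs > 0 then do;
    put 'Line ' _n_ 'contains ' ntabs 'TAB character(s):';
    list;                                /* LIST shows hex for unprintable characters */
  end;
run;
```

If any lines are reported, column positions counted in an editor will not match what SAS reads unless the tabs are dealt with first.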

larryn3 · Posted 01-11-2022 06:32 PM

File 1

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8
1         N22OR44DLICHT, MxxAdK,  b             x26     OWNEdd RY) adLATINUM              
      81                                             01/2002        B         Y         N 
     161   304gg438g97                                                                    
2         134PO5TEAT, PffAfU, AdsfO             x26     CHIEF sf323+R ALfdL 335gr         
      81                                             06/2007        NA        Y         N 
     161   3f5g77r      

File 2

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8
1         M11Af N43 GRsfsgANTfOR                       yy50    OWdaNERd- PLATsd(NsdaY)    
      81                                                        03/2009     D      Y      
     161    N     x-xf3$x$5xxx                                                            
2         PLfsf33ATINUM CRdsfT                         yy50    GENE223RAL Pada2sARTNER - P
      81  LATIsdaNU CRE23                                       09/2005     NA     Y      
     161    N     www

Tom · Posted 01-11-2022 06:40 PM

You are going to have to figure out the pattern yourself, I am afraid.

Is there no documentation provided that explains why the different files use different widths for the fields?

Tom · Posted 01-11-2022 06:35 PM

Read the first line (the one with C1 ... in it) and use it to calculate the starting column of each field for this version of the file.

Let's convert one of your example files into a physical file we can reference in the example code.

options parmcards=txt ;
filename txt temp;
parmcards4;
C1           C2           C3                                                   c4                                      c5                      c6                    c7     c8

aaaa        z8z  23    bbbbbbbbbbbbb                             s;lfjslf;sjl;dkfjs                    y                        ls;dkfjl  ;ds       n       abc xyz
;;;;

Now let's read that first line and pick out the variable names and their start and end columns.


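data start;
  length col start end 8 name next $32;
  /* Read only the header line; COLUMN=cc tracks the input pointer. */
  infile txt obs=1 column=cc truncover;
  start=1;
  input name @;                       /* first name; trailing @ holds the line */
  do col=1 by 1 until(next=' ');      /* stop once no further name is found */
    input next @ ;
    /* cc is now two columns past the last character of NEXT, */
    /* so this is the column just before NEXT begins.          */
    end = cc - lengthn(next) - 2;
    output;
    name=next;
    start = end+1;
  end;
  list;                               /* echo the header line to the log */
  drop next;
run;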

Result:

Obs    col    start    end    name

 1      1        1      13     C1
 2      2       14      26     C2
 3      3       27      79     C3
 4      4       80     119     c4
 5      5      120     143     c5
 6      6      144     165     c6
 7      7      166     171     c7
 8      8      172     173     c8

Note that the end column for the last variable might be a little off, since the NAME in the header is probably shorter than the values in the last column.
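One possible way around that is to widen the last column before generating the INPUT statement. This is a sketch, not part of the original solution: the dataset name start_fixed and the macro variable maxcol are made up, and 400 is only an assumed upper bound on the record width.

```sas
%let maxcol = 400;   /* assumption: no record is wider than this */

data start_fixed;
  set start end=eof;
  /* Stretch only the last variable out to the assumed maximum width. */
  if eof then end = &maxcol;
run;
```

The code-generating step would then read start_fixed instead of start. With TRUNCOVER on the INFILE statement, short records are still read without error, but the last variable is defined with a correspondingly long length.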

We can use that to write the input statement to read your file.

filename code temp;

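data _null_;
  set start end=eof;
  file code;                          /* write to the temporary CODE file */
  if _n_=1 then put 'input ' ;        /* open the INPUT statement */
  put @3 name '$' start '-' end ;     /* one character column-input spec per variable */
  if eof then put ';' ;               /* close the INPUT statement */
run;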

data want;
  infile txt firstobs=2 truncover ;
  %include code / source2;
run;
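For reference, with the start and end values shown in the Result listing, the generated INPUT statement should be equivalent to the hand-written step below. The dataset name want_manual is just for illustration, and the spacing in the generated file will differ slightly.

```sas
data want_manual;
  infile txt firstobs=2 truncover;
  /* Every variable is read as character using column input. */
  input
    C1 $ 1-13
    C2 $ 14-26
    C3 $ 27-79
    c4 $ 80-119
    c5 $ 120-143
    c6 $ 144-165
    c7 $ 166-171
    c8 $ 172-173
  ;
run;
```

Generating the statement from the header instead of typing it means the same program can handle files whose column widths differ.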

larryn3 · Posted 01-12-2022 07:43 AM

Thank you very much for all your help and the time you spent on this. I'm going to study your suggestion in the next couple of days. It is possible I may need to follow up.

larryn3 · Posted 01-18-2022 12:11 PM

Works great!

Now I'm going to try to modify the code, since my input file doesn't have headers. I was thinking of using the first row as a header, and since the values in that row obviously won't be legitimate variable names, I would possibly just pick the first 3 or 4 characters and put a character in front of each in case a value in the first row starts with a digit.

Thank you again for your help. And I'm going to mark this as an accepted solution.
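A rough sketch of that idea, assuming the "data start" step from the accepted solution has been run against the header-less file so that NAME holds the first row's values. The names start_named, varname and code2 are made up for this example, and the column number is added to each name (beyond what was described above) to keep the names unique.

```sas
data start_named;
  set start;
  length varname $32;
  /* Keep letters, digits and underscores, take at most 4 of them, */
  /* and prefix with V<col>_ so the result is a valid, unique name. */
  varname = cats('V', col, '_', substrn(compress(name, , 'kn'), 1, 4));
run;

filename code2 temp;

data _null_;
  set start_named end=eof;
  file code2;
  if _n_=1 then put 'input ';
  put @3 varname '$' start '-' end;
  if eof then put ';';
run;

data want;
  infile txt truncover;        /* no FIRSTOBS=2: the first row is data */
  %include code2 / source2;
run;
```

One caveat: this only finds the right boundaries if every field in the first row is non-blank and has no embedded spaces. A value such as "abc xyz" would be split into two columns, and an empty field would be merged with its neighbor.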
