BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
larryn3
Obsidian | Level 7

Follow-up/related question to :

https://communities.sas.com/t5/SAS-Programming/Proc-Import-Multiple-Spaces-Between-Columns/m-p/78957...

 

Hi,

This works great for text followed by numerics.

 

I have sort of a related question.

 

I need to process many text files so that they can be imported into SAS one by one (I'll do a macro loop and then an append each).

 

They are all formatted "tables", i.e. with the same number of columns (8) which are all aligned and separated by spaces but the number of spaces can vary and each line is ended with [CRLF].

 

Here is an example of two of them:

 

FILE 1

 

N22OR44DLICHT, MxxAdK,  b         x26      OWNEdd   RY)  adLATINUM                                      01/2002                   B       Y        N      304gg438g97

134PO5TEAT, PffAfU, AdsfO            x26      CHIEF sf323+R ALfdL 335gr                                      06/2007                   NA    Y        N      3f5g77r
M11Af N43 GRsfsgANTfOR              yy50    OWdaNERd- PLATsd(NsdaY)                                    03/2009                   D       Y        N      x-xf3$x$5xxx

 

File 2

 

M11Af N43 GRsfsgANTfOR         y50   OWdaNERd- PLATsd(NsdaY)                    03/2009                 D                          Y                    N               x-xf3$x$5xxx
PLfsf33ATINUM CRdsfT              yy50   GENE223RAL Pada2sARTNER -             09/2005                 NA                       Y                    N               226

 

The output should be a dataset with 8 variables: C1....C8. The variables should each be text variables.


Any ideas?

Thank you for your help.

Larry

1 ACCEPTED SOLUTION

Accepted Solutions
larryn3
Obsidian | Level 7

Works great!

 

Now I'm going to try to modify code since my input file doesn't have headers.  I was thinking of using the first row as a header and since that row obviously won't be legitimate variable names, i would possibly just pick the first 3 or 4 characters and append a character before each in case there is a numeric in the first row.

 

Thank you  again for your help.  And I'm going to mark this as an accepted solution.

View solution in original post

20 REPLIES 20
Tom
Super User Tom
Super User

Your file appears to be in fixed columns, but it is hard to tell since you pasted it into the body of your message instead of using the Insert Code button to get a pop-up window to paste the lines.

larryn3
Obsidian | Level 7
Forgive my ignorance, but i don't see the popup.
ballardw
Super User

@larryn3 wrote:
Forgive my ignorance, but i don't see the popup.

Above the main message would in most browsers a number of icons appear for different actions. The </> opens a window for pasting plain text. The small "running man" opens a window for pasting SAS code which will have some code formatting applied such as coloring keywords.

For text data the text window is preferred though the code window will work. Neither changes the content of pasted text. The main message windows on this forum will reformat pasted text, typically reducing "white space" such as removing space characters  and may insert not-visible HTML tags.

Tom
Super User Tom
Super User

The menu bar appears right above the text box.

Screenshot 2022-01-11 154659.jpg

Tom
Super User Tom
Super User

To see how you lines line up try using the LIST statement in a data step.

For example this will dump the first 10 lines of 'file1'.

data _null_;
  infile 'file1' obs=10;
  input;
  list;
run;
larryn3
Obsidian | Level 7

Ah, this shows me the column locations where the variables should begin.  Now how do I put the "column start"s into a substring function to pull out the 8 variables.

 

How do I do that?

 

really appreciate your help,  I think we're on the right track.

 

 

Tom
Super User Tom
Super User

No need for splitting strings. Just tell SAS where to read the values.

 

data want;
  infile 'file1' ;
  input c1 $1-10 c2 11-15 .... ;
run;

https://documentation.sas.com/doc/en/pgmsascdc/v_018/lestmtsref/n13ejk9swz5vrbn0z34iazfrp0wp.htm

 

larryn3
Obsidian | Level 7

Maybe I'm missing something

 

I don't know which position to read each the values from.  Each of the files has can have a different column position for each of the eight variables I want to create.

 

The only thing I do know is that there are 8 columns in each file and that the each of the columns begins at the same position.  I hope I am being clear.  If not I can provide a more detailed explanation.

Tom
Super User Tom
Super User

It is not clear what you mean.  You say you know where the values are on the line and then you turn around and say you don't know. Which is it?

 

Post an example of the file. Show what you want to create from that example.

 

It sounds like your question is that you have 8 different files with slightly different layouts.  

Why not just make 8 different data steps to read each different file?  

larryn3
Obsidian | Level 7

Actually I have about 1000 files each file has 8 columns but the columns are in different positions:

 

File 1:

 

C1           C2           C3                                                   c4                                      c5                      c6                    c7     c8

aaaa        z8z  23    bbbbbbbbbbbbb                             s;lfjslf;sjl;dkfjs                    y                        ls;dkfjl  ;ds       n       abc xyz

aafssb     122          bb111233bbb                                  54332fjs                            n                        l4645643dsj    n       abc xyz2342221

 

File 2

C1                                     C2            C3                                    c4                                   c5                      c6                    c7                           c8

aaaa       zzz  zz                bbbb        s;lfjslf;sjl;dkfjs                   y                                     ls;dkfjl  ;ds         n                      223                         xyz

aafssfaa b zz  zz               b33bbb    54332fjs                           n                                     l46456ds           n                      a b                          qr4dsdfs

 

larryn3
Obsidian | Level 7

Sorry,  Unfortunately the columns didn't line up the way they appeared when i was typing them in.

 

I tried try the "insert code"  but that was not any better.  I'll try to put the example in a file and attach it to the next post.

 

Tom
Super User Tom
Super User

Copy the lines from the SAS log when you ran the data step with the LIST statement.  Include the ruler line it produces.  Open a new reply on the forum and click on the Insert Code button (looks like < / > ) and paste the lines into the window that pops up.  Click the OK button.  Then click the POST button to publish your reply.

 

 

larryn3
Obsidian | Level 7

Attached is a word file which should show more clearly the way the columns are aligned.

larryn3
Obsidian | Level 7
File 2

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8
1         N22OR44DLICHT, MxxAdK,  b             x26     OWNEdd RY) adLATINUM              
RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8
      81                                             01/2002        B         Y         N 
     161   304gg438g97                                                                    
2         134PO5TEAT, PffAfU, AdsfO             x26     CHIEF sf323+R ALfdL 335gr         
      81                                             06/2007        NA        Y         N 
     161   3f5g77r                                                                        
NOTE: 3 records were read from file 'c:\test\file1a.txt'
      The minimum record length was 0
      The maximum record length was 172
NOTE: The data step took :
      real time : 0.001
      cpu time  : 0.000


File2
RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8
1         M11Af N43 GRsfsgANTfOR                       yy50    OWdaNERd- PLATsd(NsdaY)    
      81                                                        03/2009     D      Y      
     161    N     x-xf3$x$5xxx                                                            
2         PLfsf33ATINUM CRdsfT                         yy50    GENE223RAL Pada2sARTNER - P
      81  LATIsdaNU CRE23                                       09/2005     NA     Y      
     161    N     www   

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 20 replies
  • 1747 views
  • 0 likes
  • 3 in conversation