BookmarkSubscribeRSS Feed
Sk1_SAS
Obsidian | Level 7

Hi team,

 

Please, i need help to strucuture this file in SAS, this is the format that i received:

 

DT.TRANS   ERR    HR.TRANS   ERR    AGENC ORIG   ERR    SEQ LOG     ERR    NR.TERM    ERR    NR.AUTENT.   ERR
 22/12/2020        13:47:54          00990               000000320          100               00881
 NR.DOCTO   ERR   CH.TRANS                                             ERR   BANCO ERR   AGENC ERR   CONTA         ERR   TP.CONTA ERR
 0000320          237                    DP21                                999         00999       0000000099999       1
 DT.OPER    ERR  ID.TRAN   ERR  TP.ENQ ERR  TP.TRAN ERR  ID.EST ERR  VALOR            ERR  COD.FUNC  ERR DT.PROCES. ERR HR.PROCES ERR
 22/12/2020      S              090         01           N           000000000050000       009999999     18/02/2021     23:32:59
 CPF/CNPJ  PROPRIETARIO    ERR
 22222222222
 NOME PROPRIETARIO                                                                                                     ERR

 CPF/CNPJ  DESTINATARIO    ERR
 000000000/0000-00
 NOME DESTINATARIO                                                                                                     ERR

 CPF/CNPJ  PORTADOR        ERR
 000000000/0000-00         024
 NOME PORTADOR                                                                                                         ERR

 CPF/CNPJ  SACADOR         ERR
 000000000/0000-00
 NOME SACADOR                                                                                                          ERR

 CPF/CNPJ  SOLICITANTE     ERR
 000000000/0000-00
 NOME SOLICITANTE                                                                                                      ERR

 COMENTARIO                    ERR

Tks!!!!!

 

 

 

8 REPLIES 8
ballardw
Super User

Basic problem with this sort of input: You can only have ONE variable with a given name. So how do you want the names of the multiple ERR fields to be named?

 

I might suggest, since it appears that each of the ERR variables may be comment about the preceding variable that you name them DTERR HRERR and similar. Note, unless you want to cause yourself extra work do not place spaces, periods or other characters than letters, digits and the _ character in variable names.

 

SAS does no do "null". The variables without a value are "missing".

 

When posting example input text please copy the values from a plain text editor and paste into a text box opened on the forum with the </> icon to preserve the format. The message windows will reformat pasted text usually removing white space. From what you pasted not only do the "Err" variables mostly not appear, it is not possible to tell if you have record that appears on one or two rows in your source text.

 

Large economy sized hint: typically an INPUT has one semicolon to end the input.

 

If the file delimited or fixed column?

 

 

Sk1_SAS
Obsidian | Level 7

Hi ballardw,

 

The ERR fields can be ERR1, ERR2, ERR3 and etc, no problem with the name.

 

My file is fixed column, i cant delimited.

 

This is the original format of the .txt file:

 

DT.TRANS   ERR    HR.TRANS   ERR    AGENC ORIG   ERR    SEQ LOG     ERR    NR.TERM    ERR    NR.AUTENT.   ERR
 22/12/2020        13:47:54          00990               000000320          100               00881
 NR.DOCTO   ERR   CH.TRANS                                             ERR   BANCO ERR   AGENC ERR   CONTA         ERR   TP.CONTA ERR
 0000320          237                    DP21                                999         00999       0000000099999       1
 DT.OPER    ERR  ID.TRAN   ERR  TP.ENQ ERR  TP.TRAN ERR  ID.EST ERR  VALOR            ERR  COD.FUNC  ERR DT.PROCES. ERR HR.PROCES ERR
 22/12/2020      S              090         01           N           000000000050000       009999999     18/02/2021     23:32:59
 CPF/CNPJ  PROPRIETARIO    ERR
 22222222222
 NOME PROPRIETARIO                                                                                                     ERR

 CPF/CNPJ  DESTINATARIO    ERR
 000000000/0000-00
 NOME DESTINATARIO                                                                                                     ERR

 CPF/CNPJ  PORTADOR        ERR
 000000000/0000-00         024
 NOME PORTADOR                                                                                                         ERR

 CPF/CNPJ  SACADOR         ERR
 000000000/0000-00
 NOME SACADOR                                                                                                          ERR

 CPF/CNPJ  SOLICITANTE     ERR
 000000000/0000-00
 NOME SOLICITANTE                                                                                                      ERR

 COMENTARIO                    ERR

Tks a lot!

 

Sk1_SAS
Obsidian | Level 7

any suggestion?

Tom
Super User Tom
Super User

The first step is to figure out where the fields are in that file.  You might have to use some trial and error to figure it out.

To start you can run a simple data step to get a look at the file and where on the line the text appears.  The LIST statement is useful for this.  So if you file is name myfile.txt then run this data step to read the file and dump the first 100 lines to the SAS log.

data _null_;
  infile 'myfile.txt' ;
  input;
  if _n_ <= 100 then list;
run;
ballardw
Super User

Where does one record end? Or what do you expect the output to look like?

If data is fixed column then you specify the columns in the data to read from and the informat to read the values.

 

Sort of like this.

data possible;
   infile datalines truncover;
   informat dttrans  ddmmyy10. hrtrans time8. agencyorig $10. seqlog $12.
            nrterm $5.  nrautent $6. nrdocto $8.   chtrans $52. banco $6. AGENC $6.
            CONTA $14. tpconta $5.  dtoper ddmmyy10. idtran $9. tpenq $5.
            tptran $5. idest $5. valor $17. codfunc $10. dtprocess ddmmyy10.
            hrprocess time8.
            err1 - err21 $8. 
   ;
   format  dttrans dtoper dtprocess ddmmyy10. hrtrans hrprocess time8. ;
   input /  /* skip first line as it appears to be headers*/
         dttrans  err1 12-18 hrtrans  err2 30-36  agencyorig 37-46 err3 50-56
         seqlog 57-68 err4 69-75 nrterm 76-80 err5  87-93   nrautent 94-99 err6 107-113
         /  /* skip another line of headers*/
         /nrdocto 1-11  err7 12-17  chtrans 18-70 err8 71-76 banco 77-82 err9 83-88 
          AGENC 89-94 err10 95-100  CONTA 101-114 err11 115-120 tpconta 121-126 err12 127-132
         /  /* skip another line of headers*/
         / dtoper  err13 12-16 idtran 17-25 err14 27-31 tpenq 32-37 err15 39-43  tptran 44-48 err16 52-56
           idest 57-61 err17 64-68 valor 69-85 err18 86-90  codfunc 91-100 err19 101-104
           dtprocess  err20 116-119 hrprocess  err21 129-133
   ;
datalines;
DT.TRANS   ERR    HR.TRANS   ERR    AGENC ORIG   ERR    SEQ LOG     ERR    NR.TERM    ERR    NR.AUTENT.   ERR
22/12/2020        13:47:54          00990               000000320          100               00881                
NR.DOCTO   ERR   CH.TRANS                                             ERR   BANCO ERR   AGENC ERR   CONTA         ERR   TP.CONTA ERR
0000320          237                    DP21                                999         00999       0000000099999       1          
DT.OPER    ERR  ID.TRAN   ERR  TP.ENQ ERR  TP.TRAN ERR  ID.EST ERR  VALOR            ERR  COD.FUNC  ERR DT.PROCES. ERR HR.PROCES ERR
22/12/2020      S              090         01           N           000000000050000       009999999     18/02/2021     23:32:59    
;

This only reads the first three data lines intentionally because I 1) get tired of writing code for garbage layouts 2) this should provide something to get started and 3) how the last rows are to appear makes a difference.

 

The / character on an INPUT statement says "read from next line". So you can read multiple lines into a single record.


I suspect that some of the information in the first three needs to be repeated for some of that other stuff but can't tell which.

 

If the variables are supposed to be numeric then use a number format instead of the $ I used. Often when something shows leading zeroes they are "significant" such as account numbers and should be character and not numbers. So guesses. And since NONE of the ERR values are populated I place them at the end of the data with the INFORMAT statement and guess as to an appropriate length for the values.

 

If you have a document that describes the values then you should be able to set a proper informat and length. If not, good luck.

Sk1_SAS
Obsidian | Level 7
Thank you Ballardw,

i will try.
Tom
Super User Tom
Super User

What is the question you are asking?  Do you need help with reading a file like your example into a SAS dataset (or perhaps a series of SAS datasets)?

Can you provide a list of the variables you expect to create from this file? With names, type and for character variables the length?

Do you only want to read this one file? Or will you get many more files that use a similar structure?

Do all of the files represent separate records in the target SAs dataset?  Or could this block of lines repeat for other observations?  

Does the file really have the first row starting in column 1 and the rest starting in column 2? 

 

 

Sk1_SAS
Obsidian | Level 7

Hi Tom,

 

I need read this external file and import as a table in sas, the problem is that the file has no delimiter. 

 

List of variables and the result that i need according to the file that i posted above.

 

DT.TRANS = 22/12/2020 - date ddmmyy10.

HR.TRANS = 13:47:54 - time

AGENC ORIG = 00990 - number

SEQ LOG = 000000320 - number

NR.TERM = 100 - number

NR.AUTENT. = 00881 - number

NR.DOCTO = 0000320 - number

CH.TRANS = 237 - number

BANCO ERR = 999 - number

AGENC = 00999 - number

CONTA = 99999 - number

TP.CONTA = 1 number

DT.OPER = 22/12/2020 - date ddmmyy10.

ID.TRAN = S - $10.

TP.ENQ = 90 - number

TP.TRAN = 1 - number

ID.EST = N - $10.

VALOR = 50000 - number

COD.FUNC = 9999999 - number

DT.PROCES. = 18/02/2021 - ddmmyy10.

HR.PROCES = 23:32:59 - time

CPF/CNPJ PROPRIETARIO = 22222222222 - $30.

CPF/CNPJ PORTADOR = 000000000/0000-00 - $30.

CPF/CNPJ SACADOR = 000000000/0000-00 -  $30.

CPF/CNPJ SOLICITANTE = 000000000/0000-00 - $30.

 

SAS LOG:

5         GOPTIONS ACCESSIBLE;
26         data _null_;
27           infile 'file.txt' ;
28           input;
29           if _n_ <= 100 then list;
30         run;

NOTE: The infile 'file.txt' is:
      Filename=file.txt,
      Owner Name=xxx,Group Name=xxx,
      Access Permission=-rw-rw-r--,
      Last Modified=22Feb2021:16:04:09,
      File Size (bytes)=1666

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0                      
1         DT.TRANS   ERR    HR.TRANS   ERR    AGENC ORIG   ERR    SEQ LOG     ERR    NR.TERM    ERR    NR.AUTE
     101  NT.   ERR 109
2          22/12/2020        13:47:54          00990               000000320          100               00881 99
3          NR.DOCTO   ERR   CH.TRANS                                             ERR   BANCO ERR   AGENC ERR  
     101   CONTA         ERR   TP.CONTA ERR 133
4          0000320          237                    DP21                                999         00999      
     101   0000000099999       1 122
5          DT.OPER    ERR  ID.TRAN   ERR  TP.ENQ ERR  TP.TRAN ERR  ID.EST ERR  VALOR            ERR  COD.FUNC 
     101   ERR DT.PROCES. ERR HR.PROCES ERR 133
6          22/12/2020      S              090         01           N           000000000050000       009999999
     101       18/02/2021     23:32:59 128
7          CPF/CNPJ  PROPRIETARIO    ERR 30
8          22222222222 12
9          NOME PROPRIETARIO                                                                                  
     101                     ERR 122
10         0
11         CPF/CNPJ  DESTINATARIO    ERR 30
12         000000000/0000-00 18
2                                                          The SAS System                            16:00 Monday, February 22, 2021

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0                      
13         NOME DESTINATARIO                                                                                  
     101                     ERR 122
14         0
15         CPF/CNPJ  PORTADOR        ERR 30
16         000000000/0000-00         024 30
17         NOME PORTADOR                                                                                      
     101                     ERR 122
18         0
19         CPF/CNPJ  SACADOR         ERR 30
20         000000000/0000-00 18
21         NOME SACADOR                                                                                       
     101                     ERR 122
22         0
23         CPF/CNPJ  SOLICITANTE     ERR 30
24         000000000/0000-00 18
25         NOME SOLICITANTE                                                                                   
     101                     ERR 122
26         0
27         COMENTARIO                    ERR 34
NOTE: 27 records were read from the infile 'file.txt'.
      The minimum record length was 0.
      The maximum record length was 133.

 

 

 

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 798 views
  • 0 likes
  • 3 in conversation