Hi team,
Please, i need help to strucuture this file in SAS, this is the format that i received:
DT.TRANS ERR HR.TRANS ERR AGENC ORIG ERR SEQ LOG ERR NR.TERM ERR NR.AUTENT. ERR 22/12/2020 13:47:54 00990 000000320 100 00881 NR.DOCTO ERR CH.TRANS ERR BANCO ERR AGENC ERR CONTA ERR TP.CONTA ERR 0000320 237 DP21 999 00999 0000000099999 1 DT.OPER ERR ID.TRAN ERR TP.ENQ ERR TP.TRAN ERR ID.EST ERR VALOR ERR COD.FUNC ERR DT.PROCES. ERR HR.PROCES ERR 22/12/2020 S 090 01 N 000000000050000 009999999 18/02/2021 23:32:59 CPF/CNPJ PROPRIETARIO ERR 22222222222 NOME PROPRIETARIO ERR CPF/CNPJ DESTINATARIO ERR 000000000/0000-00 NOME DESTINATARIO ERR CPF/CNPJ PORTADOR ERR 000000000/0000-00 024 NOME PORTADOR ERR CPF/CNPJ SACADOR ERR 000000000/0000-00 NOME SACADOR ERR CPF/CNPJ SOLICITANTE ERR 000000000/0000-00 NOME SOLICITANTE ERR COMENTARIO ERR
Tks!!!!!
Basic problem with this sort of input: You can only have ONE variable with a given name. So how do you want the names of the multiple ERR fields to be named?
I might suggest, since it appears that each of the ERR variables may be comment about the preceding variable that you name them DTERR HRERR and similar. Note, unless you want to cause yourself extra work do not place spaces, periods or other characters than letters, digits and the _ character in variable names.
SAS does no do "null". The variables without a value are "missing".
When posting example input text please copy the values from a plain text editor and paste into a text box opened on the forum with the </> icon to preserve the format. The message windows will reformat pasted text usually removing white space. From what you pasted not only do the "Err" variables mostly not appear, it is not possible to tell if you have record that appears on one or two rows in your source text.
Large economy sized hint: typically an INPUT has one semicolon to end the input.
If the file delimited or fixed column?
Hi ballardw,
The ERR fields can be ERR1, ERR2, ERR3 and etc, no problem with the name.
My file is fixed column, i cant delimited.
This is the original format of the .txt file:
DT.TRANS ERR HR.TRANS ERR AGENC ORIG ERR SEQ LOG ERR NR.TERM ERR NR.AUTENT. ERR 22/12/2020 13:47:54 00990 000000320 100 00881 NR.DOCTO ERR CH.TRANS ERR BANCO ERR AGENC ERR CONTA ERR TP.CONTA ERR 0000320 237 DP21 999 00999 0000000099999 1 DT.OPER ERR ID.TRAN ERR TP.ENQ ERR TP.TRAN ERR ID.EST ERR VALOR ERR COD.FUNC ERR DT.PROCES. ERR HR.PROCES ERR 22/12/2020 S 090 01 N 000000000050000 009999999 18/02/2021 23:32:59 CPF/CNPJ PROPRIETARIO ERR 22222222222 NOME PROPRIETARIO ERR CPF/CNPJ DESTINATARIO ERR 000000000/0000-00 NOME DESTINATARIO ERR CPF/CNPJ PORTADOR ERR 000000000/0000-00 024 NOME PORTADOR ERR CPF/CNPJ SACADOR ERR 000000000/0000-00 NOME SACADOR ERR CPF/CNPJ SOLICITANTE ERR 000000000/0000-00 NOME SOLICITANTE ERR COMENTARIO ERR
Tks a lot!
any suggestion?
The first step is to figure out where the fields are in that file. You might have to use some trial and error to figure it out.
To start you can run a simple data step to get a look at the file and where on the line the text appears. The LIST statement is useful for this. So if you file is name myfile.txt then run this data step to read the file and dump the first 100 lines to the SAS log.
data _null_;
infile 'myfile.txt' ;
input;
if _n_ <= 100 then list;
run;
Where does one record end? Or what do you expect the output to look like?
If data is fixed column then you specify the columns in the data to read from and the informat to read the values.
Sort of like this.
data possible; infile datalines truncover; informat dttrans ddmmyy10. hrtrans time8. agencyorig $10. seqlog $12. nrterm $5. nrautent $6. nrdocto $8. chtrans $52. banco $6. AGENC $6. CONTA $14. tpconta $5. dtoper ddmmyy10. idtran $9. tpenq $5. tptran $5. idest $5. valor $17. codfunc $10. dtprocess ddmmyy10. hrprocess time8. err1 - err21 $8. ; format dttrans dtoper dtprocess ddmmyy10. hrtrans hrprocess time8. ; input / /* skip first line as it appears to be headers*/ dttrans err1 12-18 hrtrans err2 30-36 agencyorig 37-46 err3 50-56 seqlog 57-68 err4 69-75 nrterm 76-80 err5 87-93 nrautent 94-99 err6 107-113 / /* skip another line of headers*/ /nrdocto 1-11 err7 12-17 chtrans 18-70 err8 71-76 banco 77-82 err9 83-88 AGENC 89-94 err10 95-100 CONTA 101-114 err11 115-120 tpconta 121-126 err12 127-132 / /* skip another line of headers*/ / dtoper err13 12-16 idtran 17-25 err14 27-31 tpenq 32-37 err15 39-43 tptran 44-48 err16 52-56 idest 57-61 err17 64-68 valor 69-85 err18 86-90 codfunc 91-100 err19 101-104 dtprocess err20 116-119 hrprocess err21 129-133 ; datalines; DT.TRANS ERR HR.TRANS ERR AGENC ORIG ERR SEQ LOG ERR NR.TERM ERR NR.AUTENT. ERR 22/12/2020 13:47:54 00990 000000320 100 00881 NR.DOCTO ERR CH.TRANS ERR BANCO ERR AGENC ERR CONTA ERR TP.CONTA ERR 0000320 237 DP21 999 00999 0000000099999 1 DT.OPER ERR ID.TRAN ERR TP.ENQ ERR TP.TRAN ERR ID.EST ERR VALOR ERR COD.FUNC ERR DT.PROCES. ERR HR.PROCES ERR 22/12/2020 S 090 01 N 000000000050000 009999999 18/02/2021 23:32:59 ;
This only reads the first three data lines intentionally because I 1) get tired of writing code for garbage layouts 2) this should provide something to get started and 3) how the last rows are to appear makes a difference.
The / character on an INPUT statement says "read from next line". So you can read multiple lines into a single record.
I suspect that some of the information in the first three needs to be repeated for some of that other stuff but can't tell which.
If the variables are supposed to be numeric then use a number format instead of the $ I used. Often when something shows leading zeroes they are "significant" such as account numbers and should be character and not numbers. So guesses. And since NONE of the ERR values are populated I place them at the end of the data with the INFORMAT statement and guess as to an appropriate length for the values.
If you have a document that describes the values then you should be able to set a proper informat and length. If not, good luck.
What is the question you are asking? Do you need help with reading a file like your example into a SAS dataset (or perhaps a series of SAS datasets)?
Can you provide a list of the variables you expect to create from this file? With names, type and for character variables the length?
Do you only want to read this one file? Or will you get many more files that use a similar structure?
Do all of the files represent separate records in the target SAs dataset? Or could this block of lines repeat for other observations?
Does the file really have the first row starting in column 1 and the rest starting in column 2?
Hi Tom,
I need read this external file and import as a table in sas, the problem is that the file has no delimiter.
List of variables and the result that i need according to the file that i posted above.
DT.TRANS = 22/12/2020 - date ddmmyy10.
HR.TRANS = 13:47:54 - time
AGENC ORIG = 00990 - number
SEQ LOG = 000000320 - number
NR.TERM = 100 - number
NR.AUTENT. = 00881 - number
NR.DOCTO = 0000320 - number
CH.TRANS = 237 - number
BANCO ERR = 999 - number
AGENC = 00999 - number
CONTA = 99999 - number
TP.CONTA = 1 number
DT.OPER = 22/12/2020 - date ddmmyy10.
ID.TRAN = S - $10.
TP.ENQ = 90 - number
TP.TRAN = 1 - number
ID.EST = N - $10.
VALOR = 50000 - number
COD.FUNC = 9999999 - number
DT.PROCES. = 18/02/2021 - ddmmyy10.
HR.PROCES = 23:32:59 - time
CPF/CNPJ PROPRIETARIO = 22222222222 - $30.
CPF/CNPJ PORTADOR = 000000000/0000-00 - $30.
CPF/CNPJ SACADOR = 000000000/0000-00 - $30.
CPF/CNPJ SOLICITANTE = 000000000/0000-00 - $30.
SAS LOG:
5 GOPTIONS ACCESSIBLE; 26 data _null_; 27 infile 'file.txt' ; 28 input; 29 if _n_ <= 100 then list; 30 run; NOTE: The infile 'file.txt' is: Filename=file.txt, Owner Name=xxx,Group Name=xxx, Access Permission=-rw-rw-r--, Last Modified=22Feb2021:16:04:09, File Size (bytes)=1666 RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 1 DT.TRANS ERR HR.TRANS ERR AGENC ORIG ERR SEQ LOG ERR NR.TERM ERR NR.AUTE 101 NT. ERR 109 2 22/12/2020 13:47:54 00990 000000320 100 00881 99 3 NR.DOCTO ERR CH.TRANS ERR BANCO ERR AGENC ERR 101 CONTA ERR TP.CONTA ERR 133 4 0000320 237 DP21 999 00999 101 0000000099999 1 122 5 DT.OPER ERR ID.TRAN ERR TP.ENQ ERR TP.TRAN ERR ID.EST ERR VALOR ERR COD.FUNC 101 ERR DT.PROCES. ERR HR.PROCES ERR 133 6 22/12/2020 S 090 01 N 000000000050000 009999999 101 18/02/2021 23:32:59 128 7 CPF/CNPJ PROPRIETARIO ERR 30 8 22222222222 12 9 NOME PROPRIETARIO 101 ERR 122 10 0 11 CPF/CNPJ DESTINATARIO ERR 30 12 000000000/0000-00 18 2 The SAS System 16:00 Monday, February 22, 2021 RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 13 NOME DESTINATARIO 101 ERR 122 14 0 15 CPF/CNPJ PORTADOR ERR 30 16 000000000/0000-00 024 30 17 NOME PORTADOR 101 ERR 122 18 0 19 CPF/CNPJ SACADOR ERR 30 20 000000000/0000-00 18 21 NOME SACADOR 101 ERR 122 22 0 23 CPF/CNPJ SOLICITANTE ERR 30 24 000000000/0000-00 18 25 NOME SOLICITANTE 101 ERR 122 26 0 27 COMENTARIO ERR 34 NOTE: 27 records were read from the infile 'file.txt'. The minimum record length was 0. The maximum record length was 133.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.