Dear Members,
I have a big text file and it contains 3 tables. The records in the file are unsorted, the records in the file are in line with the column name separated by space. The tables are repeating themselves until the end. I want to import the data from that notepad file under correct table into SAS. I want to read the records and column names from the text file and put them under correct table In SAS. I tried through INFILE and I was successful in importing the data into SAS, but since Columns are unsorted and tables are repeating themselves therefore it is harder to keep the records under correct table in SAS. I am a beginner to SAS, any help would be greatly appreciated. The sample file is attached for your reference.
Thanks
Ather
Hi
You will need some time to do this, here is a basic program that I hope gets you started. See the comments in the code for an explanation what the codes does.
data want;
infile "c:\temp\incomingTestFile.txt" truncover;
* read a records;
input;
* check for type of table;
if find(_infile_, "Table 1") then do;
*
* depending on the table start reading values
* the @"someText" positions you rigth after the text
* the @28 positions on col 28
* varname informat, reads the value into a variable, carefull about the length
* the slash "/" indicates "read a new line"
*;
input
/
@"Acct Number" acctNumber $30.
@"ZIP Code" ZIPCode $30.
/
@"Floor Limit Indicator" floorLimitInd $8.
@"State/Prov. Code" state_prov_code $20.
/
/
/
/
/
@28 purchaseDate yymmdd8.
;
*
* take out any control chars like tab etc. and take away leading and trailing spaces
*;
array xchar{*} _character_;
drop i;
do i = 1 to dim(xchar);
xchar{i} = strip(compress(xchar{i},,"C"));
end;
* write to the output data sets;
output;
end;
format
purchaseDate date9.
;
run;
Bruno
Thank you for answering the question, we dont have source, we only get this file on daily basis in this format. The data is not stored anywhere except in this file and we want the data to be available with us to perform different analytics. This is the reason of importing the file in SAS. Thanks
Why are you trying to import a report file? Go back to the source data - i.e. the data used to produce that report. Otherwise you have quite a chunk of work on your hands, import each row, process it out etc.
Thank you for answering the question, we dont have source, we only get this file on daily basis in this format. The data is not stored anywhere except in this file and we want the data to be available with us to perform different analytics. This is the reason of importing the file in SAS. Thanks
So you get this file regularly, but the supplier cannot provide either raw data access or a proper data transfer file? I would go and check the contract you have with the data provider - this is a report file, it is not a data transfer. You can try coding round it, but at each run you will likely be re-coding and re-doing the process. It really is in your best interest and the supplier to agree a proper data transfer process. Even if that process it to drop the data to CSV at the same time the text file is created.
I'd do it like that:
one data step, with N tables in the DATA statement; use KEEP to only put the proper variables in the respective datasets
INFILE statement with end=done option
RETAIN all variables that appear in the KEEP lists for the output tables
also RETAIN a flag variable that holds which table is currently being read; set it to 0 initially.
read input lines into sufficently long string variable; discard lines that hold no real data (like those containing PAGE x) with subsetting if
when a "Table X" line is encountered:
- if the flag variable is not zero (indicating that we are not starting the first table), OUTPUT to corresponding dataset (use a SELECT block)
- set the flag variable accordingly
a SELECT(flag); block, where you have parsing statements for each table you are currently reading
when the "done" variable is set, do one final output of the currently read table (as you did when a new table started)
Thanks Kurt for the suggestion, could you please elaborate more through SAS code, would be grateful to you. Thanks
Start easy:
data test (keep=acctno);
infile "your_test_file" truncover;
length input_line $200;
input input_line $200.;
if index(input_line,'Account') > 0 then do;
acctno = input(substr(input_line,X,Y),Z.);
output;
end;
run;
where X,Y,Z denote positions and length of the accountnumber within the report line.
Once you get this working, you already have one of the parsing statements. Then write more and more parsing statements.
The INDEX function will also be used to detect the lines that start a table, which will then help you create the SELECT blocks.
Hi
You will need some time to do this, here is a basic program that I hope gets you started. See the comments in the code for an explanation what the codes does.
data want;
infile "c:\temp\incomingTestFile.txt" truncover;
* read a records;
input;
* check for type of table;
if find(_infile_, "Table 1") then do;
*
* depending on the table start reading values
* the @"someText" positions you rigth after the text
* the @28 positions on col 28
* varname informat, reads the value into a variable, carefull about the length
* the slash "/" indicates "read a new line"
*;
input
/
@"Acct Number" acctNumber $30.
@"ZIP Code" ZIPCode $30.
/
@"Floor Limit Indicator" floorLimitInd $8.
@"State/Prov. Code" state_prov_code $20.
/
/
/
/
/
@28 purchaseDate yymmdd8.
;
*
* take out any control chars like tab etc. and take away leading and trailing spaces
*;
array xchar{*} _character_;
drop i;
do i = 1 to dim(xchar);
xchar{i} = strip(compress(xchar{i},,"C"));
end;
* write to the output data sets;
output;
end;
format
purchaseDate date9.
;
run;
Bruno
It is really not easy. There are too many scenarios you need to consider about : filename xx '/folders/myfolders/incoming test file.txt'; data temp; infile xx lrecl=2000 truncover length=len; input x $varying2000. len; if find(x,'INCOMING INTERCHANGE') or find(x,'SYSTEM DATE') or find(x,'RUN NO') or missing(x) then delete; length tname $ 100; retain tname; if scan(x,-2)='Table' then do;tname=strip(scan(x,-1,'-'));delete;end; run; data temp1; set temp; temp1=strip(tranwrd(substr(x,1,55),'09'x,' ')); p1=find(temp1,' '); name1=substr(temp1,1,p1); value1=strip(substr(temp1,p1)); temp2=strip(tranwrd(substr(x,56),'09'x,' ')); p2=find(temp2,' '); name2=substr(temp2,1,p2); value2=strip(substr(temp2,p2)); keep name1 value1 name2 value2 tname temp1 temp2; run; data want; set temp1; name=name1;value=value1;output; name=name2;value=value2;output; keep name value tname; run; proc print noobs label;run; OUTPUT: tname name value Table 1 Acct Number xxxxxxxxxxxxxxxxxxx Table 1 ZIP Code xxxxx Table 1 Floor Limit Indicator x Table 1 State/Prov. Code AB Table 1 CRB/Exception File Ind Table 1 Requested Payment Service Table 1 PCAS Indicator x Table 1 Number of Payment Forms Table 1 Reference Nbr xxxxxxxxxxxxxxxxxxxxxxx Table 1 Usage Code 1 Table 1 Business ID xxx Table 1 Reason Code 00 Table 1 Purchase Date 20160119 Table 1 Flag 0 Table 1 Destination Amount xxxxxxxxxxxx Table 1 Characteristics Ind N Table 1 Currency Code 840 Table 1 Code 123456 Table 1 Source Amount 000000000100 Table 1 Capability 2 Table 1 Source Currency Code 840 Table 1 ID Method 1 Table 1 Name ABCDEFGHIJKLMNOPQRSTUV Table 1 Flag Table 1 City xxx-xxx-xxxx Table 1 Entry Mode 01 Table 1 Country Code US Table 1 Processing Date 20160121 Table 1 Category Code 1234 Table 1 Attribute 0 Table 2 Format Code Table 2 Trace Number Table 2 Assurance Level Table 2 Terminal Ind Table 2 Reference Nbr 000000 Table 2 Indicator Table 2 Indicator Table 2 Development Field 0 Table 2 Message Text Table 2 Response Code Table 2 Condition Ind Table 2 Source Code 5 Table 2 Program Indicator Table 2 Id Format Table 2 Charge x Table 2 Selection 0 Table 2 ID XXXXXXXXXXXXXXX Table 2 Payment Count Table 2 Terminal ID Table 2 Identifier 123456789 Table 2 Fee 000000000000 Ca Table 2 shback 000000000 Table 2 Pymt Ind 7 Table 2 Condition Code Table 2 Chargeback Ind Table 2 Environment Table 3 Identifier 123456789 Volume Indicator Table 3 Table 3 Amount 000000000000 Table 3 Goods Indicator Table 3 Currency Cd 123 Table 3 Verification Val Table 3 Response Cd Table 3 Fee Amount 000000000000000 Table 3 Code Table 3 Fee Sign Table 3 ID Rsn Table 3 Curr Ex Rate 00000000 Table 3 Processing Code Table 3 Curr Ex Rate 00000000 Table 3 Rights Ind Table 3 ISA Amt 000000000000 Table 3 Clearing Seq Nbr 00 Table 3 Product Id F Table 3 Clearing Seq Cnt 00 Table 3 Program ID Table 3 Auth Data Table 3 Conv Ind Table 3 Total Authorized Amount 000000000000 Table 3 Type Identification Table 3 Information Ind Table 3 Qualified Indicator Table 3 Telephone Number Table 3 Token 0000000000000000 Table 3 Data Indicator Table 3 Result Code Table 1 Acct Number xxxxxxxxxxxxxxxxxxx Table 1 ZIP Code xxxxx Table 1 Floor Limit Indicator x Table 1 State/Prov. Code AB Table 1 CRB/Exception File Ind Table 1 Requested Payment Service Table 1 PCAS Indicator x Table 1 Number of Payment Forms Table 1 Reference Nbr xxxxxxxxxxxxxxxxxxxxxxx Table 1 Usage Code 1 Table 1 Business ID xxx Table 1 Reason Code 00 Table 1 Purchase Date 20160119 Table 1 Flag 0 Table 1 Destination Amount xxxxxxxxxxxx Table 1 Characteristics Ind N Table 1 Currency Code 840 Table 1 Code 123456 Table 1 Source Amount 000000000100 Table 1 Capability 2 Table 1 Source Currency Code 840 Table 1 ID Method 1 Table 1 Name ABCDEFGHIJKLMNOPQRSTUV Table 1 Flag Table 1 City xxx-xxx-xxxx Table 1 Entry Mode 01 Table 1 Country Code US Processing Dat Table 1 e 20160121 Table 1 Category Code 1234 Table 1 Attribute 0 Table 2 Format Code Table 2 Trace Number Table 2 Assurance Level Table 2 Terminal Ind Table 2 Reference Nbr 000000 Table 2 Indicator Table 2 Indicator Table 2 Development Field 0 Table 2 Message Text Table 2 Response Code Table 2 Condition Ind Table 2 Source Code 5 Table 2 Program Indicator Table 2 Id Format Table 2 Charge x Table 2 Selection 0 Table 2 ID XXXXXXXXXXXXXXX Table 2 Payment Count Table 2 Terminal ID Table 2 Identifier 123456789 Table 2 Fee 000000000000 Ca Table 2 shback 000000000 Table 2 Pymt Ind 7 Table 2 Condition Code Table 2 Chargeback Ind Table 2 Environment Table 3 Identifier 123456789 Volume Indicator Table 3 Table 3 Amount 000000000000 Table 3 Goods Indicator Table 3 Currency Cd 123 Table 3 Verification Val Table 3 Response Cd Table 3 Fee Amount 000000000000000 Table 3 Code Table 3 Fee Sign Table 3 ID Rsn Table 3 Curr Ex Rate 00000000 Table 3 Processing Code Table 3 Curr Ex Rate 00000000 Table 3 Rights Ind Table 3 ISA Amt 000000000000 Table 3 Clearing Seq Nbr 00 Table 3 Product Id F Table 3 Clearing Seq Cnt 00 Table 3 Program ID Table 3 Auth Data Table 3 Conv Ind Table 3 Total Authorized Amount 000000000000 Table 3 Type Identification Table 3 Information Ind Table 3 Qualified Indicator Table 3 Telephone Number Table 3 Token Table 3 Additional Data Indicator Table 3 CVV2 Result Code Table 3 Chargeback Ind Table 3 Environment
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.