BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
alam_ather
Calcite | Level 5

 

Dear Members,

 

 

I have a big text file and it contains 3 tables. The records in the file are unsorted, the records in the file are in line with the column name separated by space. The tables are repeating themselves until the end. I want to import the data from that notepad file under correct table into SAS. I want to read the records and column names from the text file and put them under correct table In SAS. I tried through INFILE and I was successful in importing the data into SAS, but since Columns are unsorted and tables are repeating themselves therefore it is harder to keep the records under correct table in SAS. I am a beginner to SAS, any help would be greatly appreciated. The sample file is attached for your reference.

 

Thanks

Ather

1 ACCEPTED SOLUTION

Accepted Solutions
BrunoMueller
SAS Super FREQ

Hi

 

You will need some time to do this, here is a basic program that I hope gets you started. See the comments in the code for an explanation what the codes does.

 

data want;
  infile "c:\temp\incomingTestFile.txt" truncover;

  * read a records;
  input;

  * check for type of table;
  if find(_infile_, "Table 1") then do;

    *
    * depending on the table start reading values
    * the @"someText" positions you rigth after the text
    * the @28 positions on col 28
    * varname informat, reads the value into a variable, carefull about the length
    * the slash "/" indicates "read a new line"
    *;
    input
      /
      @"Acct Number" acctNumber $30.
      @"ZIP Code" ZIPCode $30.
      /
      @"Floor Limit Indicator" floorLimitInd $8.
      @"State/Prov. Code" state_prov_code $20.
      /
      /
      /
      /
      /
      @28 purchaseDate yymmdd8.
    ;

    *
    * take out any control chars like tab etc. and take away leading and trailing spaces
    *;
    array xchar{*} _character_;

    drop i;
    do i = 1 to dim(xchar);
      xchar{i} = strip(compress(xchar{i},,"C"));
    end;

    * write to the output data sets;
    output;
  end;

  format 
    purchaseDate date9.
  ;
run;

Bruno

View solution in original post

10 REPLIES 10
LinusH
Tourmaline | Level 20
You could succeed if you invest quite some amount of time and effort by parsing this report style of file.
It would be substantially easier if you could get a data file instead from the source.
Data never sleeps
alam_ather
Calcite | Level 5

Thank you for answering the question, we dont have source, we only get this file on daily basis in this format. The data is not stored anywhere except in this file and we want the data to be available with us to perform different analytics. This is the reason of importing the file in SAS. Thanks

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Why are you trying to import a report file?  Go back to the source data - i.e. the data used to produce that report.  Otherwise you have quite a chunk of work on your hands, import each row, process it out etc.

alam_ather
Calcite | Level 5

Thank you for answering the question, we dont have source, we only get this file on daily basis in this format. The data is not stored anywhere except in this file and we want the data to be available with us to perform different analytics. This is the reason of importing the file in SAS. Thanks

RW9
Diamond | Level 26 RW9
Diamond | Level 26

So you get this file regularly, but the supplier cannot provide either raw data access or a proper data transfer file?  I would go and check the contract you have with the data provider - this is a report file, it is not a data transfer.  You can try coding round it, but at each run you will likely be re-coding and re-doing the process.  It really is in your best interest and the supplier to agree a proper data transfer process. Even if that process it to drop the data to CSV at the same time the text file is created.

Kurt_Bremser
Super User

I'd do it like that:

one data step, with N tables in the DATA statement; use KEEP to only put the proper variables in the respective datasets

INFILE statement with end=done option

RETAIN all variables that appear in the KEEP lists for the output tables

also RETAIN a flag variable that holds which table is currently being read; set it to 0 initially.

read input lines into sufficently long string variable; discard lines that hold no real data (like those containing PAGE x) with subsetting if

when a "Table X" line is encountered:

 - if the flag variable is not zero (indicating that we are not starting the first table), OUTPUT to corresponding dataset (use a SELECT block)

 - set the flag variable accordingly

a SELECT(flag); block, where you have parsing statements for each table you are currently reading

when the "done" variable is set, do one final output of the currently read table (as you did when a new table started)

alam_ather
Calcite | Level 5

Thanks Kurt for the suggestion, could you please elaborate more through SAS code, would be grateful to you. Thanks

Kurt_Bremser
Super User

Start easy:

data test (keep=acctno);
infile "your_test_file" truncover;
length input_line $200;
input input_line $200.;
if index(input_line,'Account') > 0 then do;
  acctno = input(substr(input_line,X,Y),Z.);
  output;
end;
run;

where X,Y,Z denote positions and length of the accountnumber within the report line.

Once you get this working, you already have one of the parsing statements. Then write more and more parsing statements.

The INDEX function will also be used to detect the lines that start a table, which will then help you create the SELECT blocks.

BrunoMueller
SAS Super FREQ

Hi

 

You will need some time to do this, here is a basic program that I hope gets you started. See the comments in the code for an explanation what the codes does.

 

data want;
  infile "c:\temp\incomingTestFile.txt" truncover;

  * read a records;
  input;

  * check for type of table;
  if find(_infile_, "Table 1") then do;

    *
    * depending on the table start reading values
    * the @"someText" positions you rigth after the text
    * the @28 positions on col 28
    * varname informat, reads the value into a variable, carefull about the length
    * the slash "/" indicates "read a new line"
    *;
    input
      /
      @"Acct Number" acctNumber $30.
      @"ZIP Code" ZIPCode $30.
      /
      @"Floor Limit Indicator" floorLimitInd $8.
      @"State/Prov. Code" state_prov_code $20.
      /
      /
      /
      /
      /
      @28 purchaseDate yymmdd8.
    ;

    *
    * take out any control chars like tab etc. and take away leading and trailing spaces
    *;
    array xchar{*} _character_;

    drop i;
    do i = 1 to dim(xchar);
      xchar{i} = strip(compress(xchar{i},,"C"));
    end;

    * write to the output data sets;
    output;
  end;

  format 
    purchaseDate date9.
  ;
run;

Bruno

Ksharp
Super User
It is really not easy. There are too many scenarios you need to consider about :


filename xx '/folders/myfolders/incoming test file.txt';
data temp;
infile xx lrecl=2000 truncover length=len;
input x $varying2000. len;
if find(x,'INCOMING INTERCHANGE') or
   find(x,'SYSTEM DATE') or
   find(x,'RUN NO') or
   missing(x) then delete;
length tname $ 100;
retain tname;
if scan(x,-2)='Table' then do;tname=strip(scan(x,-1,'-'));delete;end;
run;

data temp1;
 set temp;
 temp1=strip(tranwrd(substr(x,1,55),'09'x,' '));
 p1=find(temp1,'  ');
 name1=substr(temp1,1,p1);
 value1=strip(substr(temp1,p1));
 
 temp2=strip(tranwrd(substr(x,56),'09'x,' '));
 p2=find(temp2,'  ');
 name2=substr(temp2,1,p2);
 value2=strip(substr(temp2,p2));

keep name1 value1 name2 value2 tname temp1 temp2;
run;
data want;
 set temp1;
 name=name1;value=value1;output;
 name=name2;value=value2;output;
keep name value tname;
run;

proc print noobs label;run;





OUTPUT:
tname	name	value
Table 1	Acct Number	xxxxxxxxxxxxxxxxxxx
Table 1	ZIP Code	xxxxx
Table 1	Floor Limit Indicator	x
Table 1	State/Prov. Code	AB
Table 1	CRB/Exception File Ind	 
Table 1	Requested Payment Service	 
Table 1	PCAS Indicator	x
Table 1	Number of Payment Forms	 
Table 1	Reference Nbr	xxxxxxxxxxxxxxxxxxxxxxx
Table 1	Usage Code	1
Table 1	Business ID	xxx
Table 1	Reason Code	00
Table 1	Purchase Date	20160119
Table 1	Flag	0
Table 1	Destination Amount	xxxxxxxxxxxx
Table 1	Characteristics Ind	N
Table 1	Currency Code	840
Table 1	Code	123456
Table 1	Source Amount	000000000100
Table 1	Capability	2
Table 1	Source Currency Code	840
Table 1	ID Method	1
Table 1	Name	ABCDEFGHIJKLMNOPQRSTUV
Table 1	Flag	 
Table 1	City	xxx-xxx-xxxx
Table 1	Entry Mode	01
Table 1	Country Code	US
Table 1	Processing Date	20160121
Table 1	Category Code	1234
Table 1	Attribute	0
Table 2	Format Code	 
Table 2	Trace Number	 
Table 2	Assurance Level	 
Table 2	Terminal Ind	 
Table 2	Reference Nbr	000000
Table 2	Indicator	 
Table 2	Indicator	 
Table 2	Development Field	0
Table 2	Message Text	 
Table 2	Response Code	 
Table 2	Condition Ind	 
Table 2	Source Code 5	 
Table 2	Program Indicator	 
Table 2	Id Format	 
Table 2	Charge	x
Table 2	Selection	0
Table 2	ID	XXXXXXXXXXXXXXX
Table 2	Payment Count	 
Table 2	Terminal ID	 
Table 2	Identifier	123456789
Table 2	Fee	000000000000 Ca
Table 2	shback	000000000
Table 2	Pymt Ind 7	 
Table 2	Condition Code	 
Table 2	Chargeback Ind	 
Table 2	Environment	 
Table 3	Identifier	123456789 Volume Indicator
Table 3	 	 
Table 3	Amount	000000000000
Table 3	Goods Indicator	 
Table 3	Currency Cd	123
Table 3	Verification Val	 
Table 3	Response Cd	 
Table 3	Fee Amount	000000000000000
Table 3	Code	 
Table 3	Fee Sign	 
Table 3	ID Rsn	 
Table 3	Curr Ex Rate	00000000
Table 3	Processing Code	 
Table 3	Curr Ex Rate	00000000
Table 3	Rights Ind	 
Table 3	ISA Amt	000000000000
Table 3	Clearing Seq Nbr 00	 
Table 3	Product Id	F
Table 3	Clearing Seq Cnt 00	 
Table 3	Program ID	 
Table 3	Auth Data	 
Table 3	Conv Ind	 
Table 3	Total Authorized Amount	000000000000
Table 3	Type Identification	 
Table 3	Information Ind	 
Table 3	Qualified Indicator	 
Table 3	Telephone Number	 
Table 3	Token	0000000000000000
Table 3	Data Indicator	 
Table 3	Result Code	 
Table 1	Acct Number	xxxxxxxxxxxxxxxxxxx
Table 1	ZIP Code	xxxxx
Table 1	Floor Limit Indicator	x
Table 1	State/Prov. Code	AB
Table 1	CRB/Exception File Ind	 
Table 1	Requested Payment Service	 
Table 1	PCAS Indicator	x
Table 1	Number of Payment Forms	 
Table 1	Reference Nbr	xxxxxxxxxxxxxxxxxxxxxxx
Table 1	Usage Code	1
Table 1	Business ID	xxx
Table 1	Reason Code	00
Table 1	Purchase Date	20160119
Table 1	Flag	0
Table 1	Destination Amount	xxxxxxxxxxxx
Table 1	Characteristics Ind	N
Table 1	Currency Code	840
Table 1	Code	123456
Table 1	Source Amount	000000000100
Table 1	Capability	2
Table 1	Source Currency Code	840
Table 1	ID Method	1
Table 1	Name	ABCDEFGHIJKLMNOPQRSTUV
Table 1	Flag	 
Table 1	City	xxx-xxx-xxxx
Table 1	Entry Mode	01
Table 1	Country Code	US Processing Dat
Table 1	e	20160121
Table 1	Category Code	1234
Table 1	Attribute	0
Table 2	Format Code	 
Table 2	Trace Number	 
Table 2	Assurance Level	 
Table 2	Terminal Ind	 
Table 2	Reference Nbr	000000
Table 2	Indicator	 
Table 2	Indicator	 
Table 2	Development Field	0
Table 2	Message Text	 
Table 2	Response Code	 
Table 2	Condition Ind	 
Table 2	Source Code 5	 
Table 2	Program Indicator	 
Table 2	Id Format	 
Table 2	Charge	x
Table 2	Selection	0
Table 2	ID	XXXXXXXXXXXXXXX
Table 2	Payment Count	 
Table 2	Terminal ID	 
Table 2	Identifier	123456789
Table 2	Fee	000000000000 Ca
Table 2	shback	000000000
Table 2	Pymt Ind 7	 
Table 2	Condition Code	 
Table 2	Chargeback Ind	 
Table 2	Environment	 
Table 3	Identifier	123456789 Volume Indicator
Table 3	 	 
Table 3	Amount	000000000000
Table 3	Goods Indicator	 
Table 3	Currency Cd	123
Table 3	Verification Val	 
Table 3	Response Cd	 
Table 3	Fee Amount	000000000000000
Table 3	Code	 
Table 3	Fee Sign	 
Table 3	ID Rsn	 
Table 3	Curr Ex Rate	00000000
Table 3	Processing Code	 
Table 3	Curr Ex Rate	00000000
Table 3	Rights Ind	 
Table 3	ISA Amt	000000000000
Table 3	Clearing Seq Nbr 00	 
Table 3	Product Id	F
Table 3	Clearing Seq Cnt 00	 
Table 3	Program ID	 
Table 3	Auth Data	 
Table 3	Conv Ind	 
Table 3	Total Authorized Amount	000000000000
Table 3	Type Identification	 
Table 3	Information Ind	 
Table 3	Qualified Indicator	 
Table 3	Telephone Number	 
Table 3	Token	 
Table 3	Additional Data Indicator	 
Table 3	CVV2 Result Code	 
Table 3	Chargeback Ind	 
Table 3	Environment	 


sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1796 views
  • 0 likes
  • 6 in conversation