Hello,
Im am trying to extract a txt file to output just certain information but are having trouble
The file is in an email format that I have extracted to txt file. It contains information such as
today is july 21, the rates are set as
Start End Days CCY Balance Rate Amount
___________ ___________ ____ ___ __________ _________ ________
31-Dec-2020 31-Dec-2020 1 USD 819,487.95 3.750000% 85.36
information ends
today is july 22, the rates are set as
Start End Days CCY Balance Rate Amount
___________ ___________ ____ ___ __________ _________ ________
10-Aug-2020 31-Dec-2021 1 USD 10000 3.750000% 100
information ends
and i only need the information in italic (after the ------ and before information ends) from this email.
wanting to output so they all come out in different rows like
31-Dec-2020 31-Dec-2020 1 USD 819,487.95 3.750000% 85.36
10-Aug-2020 31-Dec-2021 1 USD 10000 3.750000% 100
any ideas on how to do this?
Thanks in advance
Provide some of the text, enough to generate at least 3 desired output records, in a TEXT box opened on the forum with the </> icon above the message window. The message windows reformat text and what you have pasted may not actually look like what you have in the text file, as minimum it has likely inserted some line breaks. The text box will treat your pasted text as just plain text.
Are you expecting variable such as Start End Days CCY Balance Rate Amount in the result or a single line of text?
Something like this may work. There are some assumptions: 1) only one line of values after the line of ____ or what ever, if this is not the case then you need to provide a different example, 2) that the text I copied matches your actual text. It is very likely that simple underscore characters in a bunch like this were replaced by a different character in the message window, 3) that the data is on the line immediately following the under scores
data example; infile datalines; input @; if _infile_="___________ ___________ ____ ___ __________ _________ ________" then do; /* free up current line read above*/ input; /* read the values*/ input Start :date11. End :Date11. Days CCY :$5. Balance :comma16. Rate :percent10. Amount; /* only write to the data set after reading the values*/ output; end; else input; format start end date9. rate percent12.6; datalines; today is july 21, the rates are set as Start End Days CCY Balance Rate Amount ___________ ___________ ____ ___ __________ _________ ________ 31-Dec-2020 31-Dec-2020 1 USD 819,487.95 3.750000% 85.36 information ends today is july 22, the rates are set as Start End Days CCY Balance Rate Amount ___________ ___________ ____ ___ __________ _________ ________ 10-Aug-2020 31-Dec-2021 1 USD 10000 3.750000% 100 information ends ;
Your code would have an INFILE pointing to your text file.
The Input @; reads and holds the current line so it can be examined. _Infile is a character variable holding the current line of data. If there aren't any other ______ you could likely use a much shorter comparison string such as "if _infile_ =: '______' then do; The =: is a "begins with" comparison.
SAS is a row column based, so even though SAS can read .txt files it will create a row column structure of your source file. You may need to clean up your text file before using the code below. I saved data to a text file that looks like this:
I use the Modified List Input method to read the text file. See documentation SAS Help Center: INPUT Statement: List
data test2;
infile "c:\temp\test.txt" firstobs=3 dlm=' ' dsd missover;
input Start :date11. End :date11. Days :2. CCB :$3.
Balance :comma20. Rate :percent. Amount :10.;
format start end date11. balance amount dollar20.2 rate percent10.2;
if start ne .;/*This cleans up unwanted rows */
run;
Results;
@JOL wrote:
SAS is a row column based, so even though SAS can read .txt files it will create a row column structure of your source file. You may need to clean up your text file before using the code below. I saved data to a text file that looks like this:
I use the Modified List Input method to read the text file. See documentation SAS Help Center: INPUT Statement: List
data test2;
infile "c:\temp\test.txt" firstobs=3 dlm=' ' dsd missover;
input Start :date11. End :date11. Days :2. CCB :$3.Balance :comma20. Rate :percent. Amount :10.;
format start end date11. balance amount dollar20.2 rate percent10.2;
if start ne .;/*This cleans up unwanted rows */
run;
Results;
Really don't like all the "invalid data messages" this is going to generate.
You can modify the ERRORS= option before and after the data step code to eliminate the data errors in the log. Or alternatively clean up the source text file to get rid of the unwanted lines that will generate the data errors.
options errors=0;
data test2;
infile "c:\temp\test.txt" firstobs=3 dlm=' ' dsd missover;
input Start :date11. End :date11. Days :2. CCB :$3.
Balance :comma20. Rate :percent. Amount :10.;
format start end date11. balance amount dollar20.2 rate percent10.2;
if start ne .;/*This cleans up unwanted rows */
run;
options errors=20;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.