Solved: Re: Help in Reading the Text file and output only the email

pp2014 · Posted 08-26-2016 09:52 AM

I have the following data in a text file, delimited by semicolons. I want to output only the email addresses, also separated by semicolons.

Input:

Doe, John <john.doe@abc.com>;Smith, Jeff <jeff.smith@abc.com>; lever, dave <dave.lever@abc.com>;

Output should be:

john.doe@abc.com;jeff.smith@abc.com;dave.lever@abc.com

Any help in this will be greatly appreciated.

Reeza · Posted 08-26-2016 09:59 AM

Look at scan function with <> as delimiters.

Is your data structure exactly this all the time?
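As a rough illustration of this suggestion, the sketch below applies SCAN to a hard-coded copy of the sample record from the question, using "<" and ">" as the only delimiters. The variable names are illustrative, and it assumes every entry has the form name <address>.

```sas
data _null_;
  /* Sample record copied from the question */
  text = "Doe, John <john.doe@abc.com>;Smith, Jeff <jeff.smith@abc.com>; lever, dave <dave.lever@abc.com>;";
  /* Split on "<" and ">": the even-numbered words are the addresses */
  email1 = scan(text, 2, "<>");
  email2 = scan(text, 4, "<>");
  email3 = scan(text, 6, "<>");
  /* Write the extracted values to the log for inspection */
  put email1= email2= email3=;
run;
```

The odd-numbered words hold the names plus the semicolon separators, so they can simply be ignored.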

pp2014 · Posted 08-26-2016 10:01 AM

Yes, the structure is exactly same all the time

Reeza · Posted 08-26-2016 10:05 AM

Are you already reading in the file with three separate fields, or as one long text field? Where in the process of reading the file are you, and what does your code look like so far?

Reeza · Posted 08-26-2016 10:08 AM

Something like the following should get you started.

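data want;
  /* Informats set the lengths used when list input reads each field */
  informat name1-name3 $30. email1-email3 $50.;
  /* filepath is a placeholder for your fileref or quoted path.
     DSD is left off so the adjacent ">;" delimiters count as one
     instead of producing an empty field between them. */
  infile filepath dlm=';<>' truncover;
  input name1 $ email1 $ name2 $ email2 $ ...;
run;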

pp2014 · Posted 08-26-2016 10:08 AM

It can be considered 3 separate fields.

Final output should have only email addresses separated by ; delimiter

Reeza · Posted 08-26-2016 10:10 AM

You'll need to read them separately. You can concatenate them back together with a CATX function and drop fields you don't need.
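A minimal sketch of that last step, assuming the three name and email fields have already been read. The values are hard-coded here for illustration, and the output variable name emails is hypothetical.

```sas
data want;
  /* Hypothetical values standing in for fields already read from the file */
  length name1-name3 $30 email1-email3 $50 emails $200;
  name1 = "Doe, John";   email1 = "john.doe@abc.com";
  name2 = "Smith, Jeff"; email2 = "jeff.smith@abc.com";
  name3 = "lever, dave"; email3 = "dave.lever@abc.com";
  /* CATX strips blanks and joins the non-missing values with ";" */
  emails = catx(";", of email1-email3);
  /* Keep only the combined field; the names and separate emails are dropped */
  keep emails;
run;
```

Because CATX skips missing arguments, a record with fewer than three addresses would not leave stray semicolons in the result.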

pp2014 · Posted 08-26-2016 10:27 AM

Thanks Reeza for all the help...

pp2014 · Posted 08-26-2016 01:51 PM

I am trying to read the email_list file and output only the email addresses, separated by semicolons, but my code goes into an infinite loop.

Can anybody help me with this?

email_file.txt has the following records:

Doe, John <john.doe@abc.com>;Smith, Jeff <jeff.smith@abc.com>; lever, dave <dave.lever@abc.com>

Below is my code:

data test;
length text $32767;

infile 'c\email_file.txt' lrecl=32767 dsd dlm='09'x truncover;

input text $;

run;

data t1;
set test;
length tx3 $500;
tx3="";
tx1=text;
do until (length(cats(tx1))=0);
space_position = INDEX(tx1, '<');
slash_position = INDEX(tx1, '>');
space_to_slash = slash_position - space_position;
tx2 = substr(tx1, space_position+1, space_to_slash-1);
tx3=cats(tx3,tx2,";");
tx4=substr(tx1,find(tx1,";")+1);
tx1=tx4;
end;
run;

Reeza · Posted 08-26-2016 03:33 PM

Why didn't you use the SCAN function as suggested?

If your data is as indicated, with only 3 emails per line, you don't need all of that. Assuming that the text string is being read in correctly (which you should test first), this works for me.


data test;
length text $32767;

infile 'c\email_file.txt' lrecl=32767 dsd dlm='09'x truncover;

input text $;
email1 = scan(text, 2, "<>;");
email2 = scan(text, 4, "<>;"); 
email3 = scan(text, 6, "<>;");
want = catx(";", of email1-email3);
keep want;
run;

pp2014 · Posted 08-26-2016 03:45 PM

I have almost 170 email addresses. Anyway, I was able to fix my code. Thanks, Reeza, for the help.
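The corrected code was not posted, so the following is only one possible approach for a record with an arbitrary number of addresses: count the words with COUNTW and pull out every second one with SCAN. The variable names email and i are illustrative, and the file path is copied as written earlier in the thread. For reference, the DO UNTIL loop above cannot end as written: LENGTH returns 1, not 0, for a blank string, and once no ";" remains FIND returns 0, so SUBSTR returns the whole remaining string again.

```sas
data want;
  length text want $32767 email $100;
  /* Path copied as written in the thread; adjust to your own file */
  infile 'c\email_file.txt' lrecl=32767 truncover;
  /* Read the whole record, keeping embedded blanks and commas */
  input text $char32767.;
  want = "";
  /* Split on "<" and ">": every even-numbered word is an address */
  do i = 2 to countw(text, "<>") by 2;
    email = scan(text, i, "<>");
    /* CATX skips the empty starting value, so no leading ";" appears */
    want = catx(";", want, email);
  end;
  keep want;
run;
```

A counted DO loop always terminates, which sidesteps the end-of-string test entirely. If an end-of-string test is preferred, LENGTHN returns 0 for a blank string.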

Reeza · Posted 08-26-2016 03:50 PM

@pp2014 wrote:

Yes, the structure is exactly same all the time

We can only respond to what you say....
