BookmarkSubscribeRSS Feed
sas_Forum
Calcite | Level 5

Hi

 

i am having data with DLm as @ and data is also having @ and few have email id and few data is not having email ids

 

Ex:

 

Name |age| email |loc

 

A@10@a@.gm@vij

B@12@ @ kl

 

11 REPLIES 11
Shmuel
Garnet | Level 18

Read the email as two cosequtive variable and concatenate to to create one email variable

the you can drop the two origin varaibles.

 

sas_Forum
Calcite | Level 5

Name @ age @ email  @ loc

a@10@ test@gmail.com@cul

b@12@  @mum

art297
Opal | Level 21

Your second example record doesn't contain an email. Do you have missing values for any of the variables but, more importantly, are you missing any emails in your data?

 

Art, CEO, AnalystFinder.com

 

sas_Forum
Calcite | Level 5

Yes in some data there is no email id

Shmuel
Garnet | Level 18

When there is no email id then probably you may have two or more cosequents of @ (thhat is: @@  or @@@ )

to deal with definne DSD option in your INFILE statement, then you will get value to each variable or missing value.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

I am sorry, but who uses @ as a delimiter in anything other than an email?  Thats mad.  There are any number of characters, tab, comma, pipe etc.  Its not surising your having issues.

ChrisBrooks
Ammonite | Level 13

When it's upto me and I don't want to use a comma I tend to use a tilde ~ as I've never seen that used for anything else in a genuine string...

art297
Opal | Level 21

@sas_Forum: Not sure if you ever got a useable solution. I'd try something like the following:

data have (drop=_:);
  infile cards dlm='~' dsd truncover;
  input @;
  if count(_infile_,'@') eq 3 then _infile_=translate(_infile_,'~','@');
  else do _i=4 to 1 by -1;
    if _i ne 3 then do;
      call scan(_infile_ , _i, _position, _length, '@');
      substr(_infile_,_position+_length,1)='~';
    end;
  end;
  input Name $ age email $ loc $;
  cards;
A@10@a@.gm@vij
B@12@ @ kl
;

Art, CEO, AnalystFinder.com

 

Ksharp
Super User

It is like a sandwich. first eat the top bread,then eat the bottom bread, the left is hotdog.

 

data have;
input;
name=scan(_infile_,1,'@');
age=scan(_infile_,2,'@');
call scan(_infile_,2,p,l,'@');
call scan(_infile_,-1,p1,l1,'@');
loc=substr(_infile_,p+l+1,p1-p-l-2);
drop p l p1 l1;
cards;
a@10@ test@gmail.com@cul
b@12@  @mum
;
run;
Tom
Super User Tom
Super User

First tell whoever is making the file that they need to add quotes around any value that contains the delimited. If they cannot generate a file that can be parsed then why generate the file at all?

 

If you have only one field that could contain the delimiter and every line has every field then you can read from the beginning and the end and whatever is left over is what to use for the middle field.  You can let the INPUT statement handle reading the beginning, but to read the ending field(s) you will need to use programming.

 

So for your example with the possible delimiters appearing in the third of four fields you can read the first two with INPUT, the fourth with SCAN() and then using SUBSTRN() to pull out the value for the third.  Use the COLUMN= option on the INFILE statement to find where the third field starts. The length of the field is the total length of the input line minus the length of the last word and minus the starting location. If you values have leading spaces after the delimiter like in some of the examples in this thread then you probably wnat to use the LEFT() function to remove those.

 

data have;
  infile cards dsd dlm='@' column=cc truncover ;
  length name $20 age 8 email loc $100;
  input name age @ ;
  loc = scan(_infile_,-1,'@');
  email=left(substrn(_infile_,cc,lengthn(_infile_)-cc-lengthn(loc)));
cards;
a@10@ test@gmail.com@cul
b@12@@mum
;
SuryaKiran
Meteorite | Level 14

Hi,

 

Create two variables for holding the mail and use if condition to concnate if there is actually mail id and if not send the values to next variable. I'm trying to make the code as simple as possible by understanding how the data pattern appears.

 

data test(drop=mail_ extra);
infile datalines dlm="@" dsd truncover;
input name :$8. age :8. mail_: $20. extra: $10. loc: $8. @;
if find(extra,".com","i") then mail=COMPRESS(CAT(mail_,"@",extra));
else loc=extra;
datalines;
user1@10@test1@gmail.com@value1
user2@20@@value2
user3@13@test3@yahoo.com@value3
;
run;

Thanks,

Suryakiran

Thanks,
Suryakiran

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1297 views
  • 0 likes
  • 8 in conversation