08-03-2017 10:14 AM
i am having data with DLm as @ and data is also having @ and few have email id and few data is not having email ids
Name |age| email |loc
B@12@ @ kl
08-03-2017 10:30 AM
Your second example record doesn't contain an email. Do you have missing values for any of the variables but, more importantly, are you missing any emails in your data?
Art, CEO, AnalystFinder.com
08-03-2017 11:48 AM
When there is no email id then probably you may have two or more cosequents of @ (thhat is: @@ or @@@ )
to deal with definne DSD option in your INFILE statement, then you will get value to each variable or missing value.
08-03-2017 10:47 AM
I am sorry, but who uses @ as a delimiter in anything other than an email? Thats mad. There are any number of characters, tab, comma, pipe etc. Its not surising your having issues.
08-03-2017 10:56 AM
When it's upto me and I don't want to use a comma I tend to use a tilde ~ as I've never seen that used for anything else in a genuine string...
08-03-2017 05:36 PM
@sas_Forum: Not sure if you ever got a useable solution. I'd try something like the following:
data have (drop=_:); infile cards dlm='~' dsd truncover; input @; if count(_infile_,'@') eq 3 then _infile_=translate(_infile_,'~','@'); else do _i=4 to 1 by -1; if _i ne 3 then do; call scan(_infile_ , _i, _position, _length, '@'); substr(_infile_,_position+_length,1)='~'; end; end; input Name $ age email $ loc $; cards; A@10@a@.gm@vij B@12@ @ kl ;
Art, CEO, AnalystFinder.com
08-05-2017 09:06 AM
It is like a sandwich. first eat the top bread,then eat the bottom bread, the left is hotdog.
data have; input; name=scan(_infile_,1,'@'); age=scan(_infile_,2,'@'); call scan(_infile_,2,p,l,'@'); call scan(_infile_,-1,p1,l1,'@'); loc=substr(_infile_,p+l+1,p1-p-l-2); drop p l p1 l1; cards; a@10@ firstname.lastname@example.org@cul b@12@ @mum ; run;
08-05-2017 11:38 AM
First tell whoever is making the file that they need to add quotes around any value that contains the delimited. If they cannot generate a file that can be parsed then why generate the file at all?
If you have only one field that could contain the delimiter and every line has every field then you can read from the beginning and the end and whatever is left over is what to use for the middle field. You can let the INPUT statement handle reading the beginning, but to read the ending field(s) you will need to use programming.
So for your example with the possible delimiters appearing in the third of four fields you can read the first two with INPUT, the fourth with SCAN() and then using SUBSTRN() to pull out the value for the third. Use the COLUMN= option on the INFILE statement to find where the third field starts. The length of the field is the total length of the input line minus the length of the last word and minus the starting location. If you values have leading spaces after the delimiter like in some of the examples in this thread then you probably wnat to use the LEFT() function to remove those.
data have; infile cards dsd dlm='@' column=cc truncover ; length name $20 age 8 email loc $100; input name age @ ; loc = scan(_infile_,-1,'@'); email=left(substrn(_infile_,cc,lengthn(_infile_)-cc-lengthn(loc))); cards; a@10@ email@example.com@cul b@12@@mum ;
08-05-2017 04:29 PM
Create two variables for holding the mail and use if condition to concnate if there is actually mail id and if not send the values to next variable. I'm trying to make the code as simple as possible by understanding how the data pattern appears.
data test(drop=mail_ extra); infile datalines dlm="@" dsd truncover; input name :$8. age :8. mail_: $20. extra: $10. loc: $8. @; if find(extra,".com","i") then mail=COMPRESS(CAT(mail_,"@",extra)); else loc=extra; datalines; user1@10@firstname.lastname@example.org@value1 user2@20@@value2 user3@13@email@example.com@value3 ; run;