08-12-2016 09:06 AM
I have 10million records, all that look like this: 2005370806 20151201 2
The length of charcters between spaces are NOT the same across al records.
Ive tried compbl and then scan function and nothing works!! i cant seem to separate this string. Please help!!
08-12-2016 09:17 AM
I am afraid "Ive tried compbl and then scan function and nothing works!!" isn't really descriptive of the problem. Post some test data in the form of a datastep and what code you have tried. Scan is the perfect function for such a thing, as this will demonostrate:
data want; original="2005370806 20151201 2"; want1=scan(original,1," "); want2=scan(original,2," "); want3=scan(original,3," "); run;
08-12-2016 09:24 AM
POID=SCAN (new,1,' ');
call_date =SCAN (new,2,' ');
calls =SCAN (new,3,' ');
The above returns blanks for call_date and calls and puts the entre string again for poid
08-12-2016 09:40 AM
I can't really tell you as I can't see your data, follow this post and give us a couple of lines of test data so we can see it:
At a guess, I would say your delimiter - ' ' does not match the delimiter in your data, could it be that it is a secret tab or something else?
08-12-2016 09:52 AM
I think the use of compbl() is not to be done because index() looks for embedded space to break the string.
08-12-2016 09:57 AM
Are you able to provide how to use index in this example? I know Ive used Index to look for certain pieces in a string but never spaces
08-12-2016 09:55 AM
The function COMBL:
"The COMPBL function removes multiple blanks in a character string by translating each occurrence of two or more consecutive blanks into a single blank."
It does not remove special characters such as tabs. For example if you run this:
data want; text="abcdef"||'09'x||"defe"; run; ods rtf file="c:\test.rtf"; proc print data=want; run; ods rtf close;
The data will not show any blanks, however when printed the tab comes into effect - these are special characters.
08-12-2016 10:05 AM
You are right, I did index to look for spaces and there are not any although when printed i see spaces.
08-12-2016 11:49 AM
You can try @Reeza's suggestion by using 's' modifier in the SCAN(),
data want; text="abcdef"||'09'x||"defe"; do i=1 to 2; _s1=scan(text,i,'','s'); output; end; run;
If this does not work out, try another suggestion also provided by @Reeza, to print out all chars in HEX, then you will be able to pinpoint your delimiter, of course, only try it on serveral obs to figure it out.
data want; text="abcdef"||'09'x||"defe"; len=length(text); do i=1 to len; char=char(text,i); hex=put(char,$hex2.); output; end; run;
As you can see from above example, after running, 7th(i=7) is a tab, hex=09, so next is easy:
data want; text="abcdef"||'09'x||"defe"; do i=1 to 2; _s1=scan(text,i,'09'x); output; end; run;
08-13-2016 01:51 AM
You could use more rigorous version of scan() with 'kd' option: data _null_; length t1 t2 x $ 40; x='111'||'09'x||'222'||'0A'x||'333'; t1=scan(x,1,,'s'); t2=scan(x,1,,'kd'); put t1= t2=; t1=scan(x,2,,'s'); t2=scan(x,2,,'kd'); put t1= t2=; t1=scan(x,3,,'s'); t2=scan(x,3,,'kd'); put t1= t2=; run;