Hello,
I have 10million records, all that look like this: 2005370806 20151201 2
The length of charcters between spaces are NOT the same across al records.
Ive tried compbl and then scan function and nothing works!! i cant seem to separate this string. Please help!!
I am afraid "Ive tried compbl and then scan function and nothing works!!" isn't really descriptive of the problem. Post some test data in the form of a datastep and what code you have tried. Scan is the perfect function for such a thing, as this will demonostrate:
data want; original="2005370806 20151201 2"; want1=scan(original,1," "); want2=scan(original,2," "); want3=scan(original,3," "); run;
data rmd.calls;
set rmd.calls;
new=compbl(poid);
run;
data rmd.calls;
set rmd.calls;
POID=SCAN (new,1,' ');
call_date =SCAN (new,2,' ');
calls =SCAN (new,3,' ');
run;
The above returns blanks for call_date and calls and puts the entre string again for poid
I can't really tell you as I can't see your data, follow this post and give us a couple of lines of test data so we can see it:
At a guess, I would say your delimiter - ' ' does not match the delimiter in your data, could it be that it is a secret tab or something else?
But this is why I did function compbl first
Look at the options for SCAN. I think there's an option for all space types.
I think the use of compbl() is not to be done because index() looks for embedded space to break the string.
Are you able to provide how to use index in this example? I know Ive used Index to look for certain pieces in a string but never spaces
X = scan(word, 1, , 'c');
What about this?
If that doesn't work, use a HEX format to display your results to identify your delimiter.
The function COMBL:
"The COMPBL function removes multiple blanks in a character string by translating each occurrence of two or more consecutive blanks into a single blank."
It does not remove special characters such as tabs. For example if you run this:
data want; text="abcdef"||'09'x||"defe"; run; ods rtf file="c:\test.rtf"; proc print data=want; run; ods rtf close;
The data will not show any blanks, however when printed the tab comes into effect - these are special characters.
You are right, I did index to look for spaces and there are not any although when printed i see spaces.
Now what?
You can try @Reeza's suggestion by using 's' modifier in the SCAN(),
data want;
text="abcdef"||'09'x||"defe";
do i=1 to 2;
_s1=scan(text,i,'','s');
output;
end;
run;
If this does not work out, try another suggestion also provided by @Reeza, to print out all chars in HEX, then you will be able to pinpoint your delimiter, of course, only try it on serveral obs to figure it out.
data want;
text="abcdef"||'09'x||"defe";
len=length(text);
do i=1 to len;
char=char(text,i);
hex=put(char,$hex2.);
output;
end;
run;
As you can see from above example, after running, 7th(i=7) is a tab, hex=09, so next is easy:
data want;
text="abcdef"||'09'x||"defe";
do i=1 to 2;
_s1=scan(text,i,'09'x);
output;
end;
run;
You could use more rigorous version of scan() with 'kd' option: data _null_; length t1 t2 x $ 40; x='111'||'09'x||'222'||'0A'x||'333'; t1=scan(x,1,,'s'); t2=scan(x,1,,'kd'); put t1= t2=; t1=scan(x,2,,'s'); t2=scan(x,2,,'kd'); put t1= t2=; t1=scan(x,3,,'s'); t2=scan(x,3,,'kd'); put t1= t2=; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.