I have a data string separated by a ":" colon delimiter. I want to separate the strings separated by delimiter into different columns in data set
string
:57D:3547264:56D:74635:
:52A:857373382:73D:827374928:
Desired Output
col_1 col_2 col_3 col_4
57D 3547264 56D 74635
52A 857373382 73D 827374928
I used scan function as below, but it did not work.
data want;
data have;
array parse(*) col_1 col_2 col_3 col_4;
do i =1 to dim(parse);
parse(i) = scan(string, i, ":");
drop i;
end;
run;
Your ARRAY statement is wrong. Has the LENGTH definition in the wrong place. Plus it is defining an array named A but the DO loop is looking for an array named COL_.
The length you set in the array is the length you want to use to define the new variables. Most likely that will be much shorter than the full 1,000 characters in the source string.
Do you know the maximum number of values that any given string could have? If you use this ARRAY method then you need define enough variables to hold all of the values.
data have;
input string :$1000. ;
cards4;
:57D:3547264:56D:74635:
:52A:857373382:73D:827374928:
;;;;
data want ;
set have ;
array new $32 a b c d ;
do i=1 to min(dim(new),countw(string,':'));
new[i] = scan(string,i,':');
end;
drop i;
run;
proc print;
run;
Obs string a b c d 1 :57D:3547264:56D:74635: 57D 3547264 56D 74635 2 :52A:857373382:73D:827374928: 52A 857373382 73D 827374928
data have;
input string :$50.;
datalines;
:57D:3547264:56D:74635:
:52A:857373382:73D:827374928:
;
data long;
set have;
n = _N_;
do i = 1 to countw(string, ':');
w = scan(string, i, ':');
if w ne '' then output;
end;
run;
proc transpose data = long out = want(drop=_:) prefix=col_;
by n;
id i;
var w;
run;
Did your log show anything about character to numeric conversion?
Your array Parse is defined as numeric.
So you can't put values like "57D" into it.
The code you show has no source either, should be as SET HAVE, not DATA HAVE.
Try:
data want; Set have; array Col_(4) $ 10 ; do i =1 to dim(Col_); Col_(i) = scan(string, i, ":"); drop i; end; run;
The 10 on the Array is the length of each character variable created. If your data may be longer than increase the size.
If I am making new variables I just use the base of the name in the array definition. You didn't provide any example of other variables that might be in your data so if the variables Col_1 etc. exist already they need to be character variables and could use your array statement.
Doesn't work is awful vague.
Are there errors in the log?: Post the code and log in a code box opened with the "<>" to maintain formatting of error messages.
No output? Post any log in a code box.
Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "<>" icon or attached as text to show exactly what you have and that we can test code against.
data want; Set have; array Col_(4) $ 10 ; do i =1 to dim(Col_); Col_(i) = scan(string, i, ":"); drop i; end; run;
Regarding this solution, I have different column names and the data length is $1000
so the modified code will be as below, correct?
data want;
Set have;
array A B C D $1000 ; <------- These are different column names. They are all different.
do i =1 to dim(Col_);
Col_(i) = scan(string, i, ":");
drop i;
end;
run;
I do not get any error, I get the data in the below format. I get the 2nd and 4th part of the string correctly but for 1st and 3rd part, I see dot (.) symbol.
A B C D
. 3547264 . 74635
. 857373382 . 827374928
Your ARRAY statement is wrong. Has the LENGTH definition in the wrong place. Plus it is defining an array named A but the DO loop is looking for an array named COL_.
The length you set in the array is the length you want to use to define the new variables. Most likely that will be much shorter than the full 1,000 characters in the source string.
Do you know the maximum number of values that any given string could have? If you use this ARRAY method then you need define enough variables to hold all of the values.
data have;
input string :$1000. ;
cards4;
:57D:3547264:56D:74635:
:52A:857373382:73D:827374928:
;;;;
data want ;
set have ;
array new $32 a b c d ;
do i=1 to min(dim(new),countw(string,':'));
new[i] = scan(string,i,':');
end;
drop i;
run;
proc print;
run;
Obs string a b c d 1 :57D:3547264:56D:74635: 57D 3547264 56D 74635 2 :52A:857373382:73D:827374928: 52A 857373382 73D 827374928
Are you reading a sas data set or an external file?
If it is an external file, like .csv or .txt you can define the delimiter in the infile staement:
infile "<path and filename>" dlm=':' <more options>;
and then
input col1 $ col2 col3 $ col 4;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.