Hi guys!!
I need help understanding how an array is used with SUBSTR in the code above.
The screenshot above comes from a video I am using to teach myself SAS programming.
The first DATA step creates two strings, "abcdefghij" and "klmnopqrst",
while the second DATA step splits each string into its individual characters: "a", "b", "c", and so on.
However, I do not understand why the loop is written as "Do j=2 to 11;" instead of "Do j=1 to 10;".
I have tried "Do j=1 to 10;" and I can see that the output is wrong, with the first value "a" being overwritten.
But I do not understand the reason for this.
Thanks in advance to anyone who can clarify!
Accepted Solutions
It's because of the array declaration.
array a{*} _character_ s1-s10;
which actually means
"create an array with all existing character variables and new variables s1 to s10, and set the dimension accordingly"
Since the variable string already exists when the array is defined, it ends up as a{1}, and needs to be skipped.
To see this for yourself, add this to the second DATA step:
x1 = vname(a{1});
and after testing that, try a slightly modified step:
data new;
array a{*} $1 s1-s10;
set old;
u = string;
do j = 1 to 10;
a{j} = substr(u,j,1);
end;
run;
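If it helps, here is a minimal sketch of that kind of check as a complete program. It is not the code from the video: it uses a simplified declaration (`_character_` alone, with s1-s10 defined beforehand by a LENGTH statement), and the names `check`, `n_elements`, `name1` and `name2` are made up for illustration. With this layout, string should show up as the first element and s1 as the second, which is the same offset that forces the original loop to start at 2.

```sas
/* Sample data as described in the question: two 10-character strings. */
data old;
  input string $ 1-16;
  datalines;
abcdefghij
klmnopqrst
;
run;

/* Assumption: a simplified variant of the array declaration, for inspection only. */
data check;
  set old;                   /* string is defined first */
  length s1-s10 $1;          /* then s1-s10 are defined as character */
  array a{*} _character_;    /* every character variable defined so far */
  n_elements = dim(a);       /* number of elements the {*} resolved to */
  length name1 name2 $32;
  name1 = vname(a{1});       /* name of the variable behind a{1} */
  name2 = vname(a{2});       /* name of the variable behind a{2} */
run;

proc print data=check;
run;
```

Because name1 and name2 are created after the ARRAY statement, they are not part of the array themselves.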
You say this came from a training video. Where did you get it? It seems to show some bad methodology. For instance:
- Why read 16 characters into the input string if you know you are only dealing with 10? This suggests problems for the future, i.e. you would miss up to 6 characters if the string were longer.
- Related to the above, the character limit is hardcoded in the array but not in the input.
- Why would the array need to include _character_ as well? That is asking for problems: add one more character variable to the dataset, for instance, and your program will no longer give the intended result.
Your code can be reduced to:
data old;
input string $ 1-16;
datalines;
abcdefghij
klmnopqrst
;
run;
data new;
set old;
array s{10} $1;
do i=1 to 10;
s{i}=char(string,i);
end;
run;
Depending on the length of the incoming string, you may want to take max(lengthn(string)) and use that rather than assuming 10 characters. Also, avoid mixed-case code; it's hard to read.
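One possible way to do that, as a rough sketch rather than the only approach: compute the longest string in a first pass and use it to size the array. The macro variable name `maxlen` is made up for illustration, and the sketch assumes `old` has at least one non-blank string (otherwise `maxlen` would not hold a usable array size).

```sas
/* Sample data as described in the question. */
data old;
  input string $ 1-16;
  datalines;
abcdefghij
klmnopqrst
;
run;

/* First pass: store the longest non-blank length in a macro variable. */
proc sql noprint;
  select max(lengthn(string)) into :maxlen trimmed
  from old;
quit;

/* Second pass: size the array from the data instead of hardcoding 10. */
data new;
  set old;
  array s{&maxlen} $1;             /* creates s1 to s&maxlen */
  do i = 1 to lengthn(string);     /* never exceeds &maxlen by construction */
    s{i} = char(string, i);
  end;
  drop i;
run;
```

Looping to lengthn(string) rather than to &maxlen leaves the trailing elements blank for shorter strings instead of reading past the end of the value.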