
Use of array and substr


SAS Q1.PNG

 

Hi guys!!

I need help understanding how an array is used with SUBSTR in the code above.
The screenshot was taken from a video I am using to teach myself SAS programming.
The first data step creates two strings: "abcdefghij" and "klmnopqrst";

while the second data step aims to separate the two strings into their respective components of "a", "b", "c" ... etc.

However, I do not understand why the DO statement "Do j=2 to 11;" is used instead of "Do j=1 to 10;".
I have tried "Do j=1 to 10;" and I can see that the output is wrong, with the first value "a" being overwritten.
But I am not sure what the rationale behind this is.

Thanks in advance to anyone that can provide some clarification on the above question!

 


Accepted Solutions
Solution
‎11-06-2017 09:34 PM
Super User
Posts: 9,618

Re: Use of array and substr

Posted in reply to FaithInFate

It's because of the array declaration.

array a{*} _character_ s1-s10;

which actually means

"create an array with all existing character variables and new variables s1 to s10, and set the dimension accordingly"

Since the variable string already exists when the array is defined, it ends up as a{1}, and needs to be skipped.

To see this for yourself, add this to the second DATA step:

x1 = vname(a{1});

and after testing that, try a slightly modified step:

data new;
array a{*} $1 s1-s10;
set old;
u = string;
do j = 1 to 10;
  a{j} = substr(u,j,1);
end;
run;
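
For reference, the vname check can be run as one self-contained step. This is only a sketch: the screenshot's code is not reproduced in this thread, so the array statement (including the $1, which I added so that s1-s10 are created as character variables), the j - 1 offset, and the names check, n_elements and first_element are my assumptions rather than the video's code.

```sas
/* Sample data as described in the question. */
data old;
  input string $ 1-16;
datalines;
abcdefghij
klmnopqrst
;
run;

/* Assumed reconstruction of the second step, with diagnostics added. */
data check;
  set old;
  /* string is already in the PDV here, so _character_ includes it */
  array a{*} $1 _character_ s1-s10;
  n_elements = dim(a);               /* how many elements the array has */
  length first_element $32;
  first_element = vname(a{1});       /* name of the variable behind a{1} */
  /* skip a{1}; the offset j - 1 maps a{2} to the first character */
  do j = 2 to dim(a);
    a{j} = substr(string, j - 1, 1);
  end;
  put n_elements= first_element= s1= s10=;
  drop j;
run;
```

If the array is built the way described above, the log should report one more element than the ten s variables, with string as the first one, which is why the loop in the video starts at 2.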


All Replies
Super User
Posts: 9,227

Re: Use of array and substr

Posted in reply to FaithInFate

You say you got this from a training video. Where did you get it from? It seems to show some bad methodology. For instance:
- Why read 16 characters into the input string if you know you are only dealing with 10? This suggests problems for the future to me, i.e. if a string were longer than 10 characters, the last 6 would be read in but never split out.

- As above, the limit of 10 characters is hardcoded in the array but not in the input.

- Why would the array need to include _character_ as well? That is asking for problems: add one more character variable to the dataset, for instance, and your program will no longer give you the intended result.

 

Your code can be simplified to:

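data old;
  /* reads columns 1-16, although the sample strings are only 10 long */
  input string $ 1-16;
datalines;
abcdefghij
klmnopqrst
;
run;

data new;
  set old;
  /* creates s1-s10, each one character long */
  array s{10} $1;
  do i=1 to 10;
    s{i}=char(string,i);  /* i-th character of string */
  end;
run;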

Depending on the length of the incoming strings, you may want to find the maximum of lengthn(string) across the dataset and size the array from that, rather than assuming 10 characters. Also, avoid mixed-case coding; it's hard to read.
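
One way to do that, as a rough sketch only: get the longest length with PROC SQL, put it in a macro variable, and use it as the array dimension. The macro variable name maxlen is my own choice, the TRIMMED option needs SAS 9.3 or later, and the step assumes old has at least one non-blank string (otherwise maxlen would not be a usable dimension).

```sas
/* Longest non-blank length of string in old, stored in &maxlen. */
proc sql noprint;
  select max(lengthn(string)) into :maxlen trimmed
  from old;
quit;

data new;
  set old;
  /* one $1 variable per character position: s1 to s&maxlen */
  array s{&maxlen} $1;
  /* only loop over the characters this row actually has */
  do i = 1 to lengthn(string);
    s{i} = char(string, i);
  end;
  drop i;
run;
```

Rows shorter than the longest string simply leave the remaining s variables blank.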
