MJP1
Calcite | Level 5

Hi!

 

I am trying to use an array to efficiently create a set of new variables.  But SAS doesn't seem to accept an array reference as the second argument in the substr function.  Is there another way to do this without having to just type separate code for each of the new variables?

 

My data consists of two character string variables of letter codes indicating a status over 48 periods. I want to know the letter code from string1 that corresponds to the start of each instance of at least 3 repeated "I"s from string2. I have already used some regex coding to determine the position of each of those instances (intpos1-intpos4).  Now I would like to create a new set of variables (intcode1-intcode4) that extracts the character from string1 that is at each of those same positions.  If intposX is missing then the corresponding intcodeX should also be missing.

 

Obviously with only 4 new variables, typing out the code for each would not be that onerous, but I would like to do it as elegantly as possible, and learn something in the process. ;) (4 is the current maximum number of instances of the recurring pattern in my data, but that could increase with new data.)

 

I'm using SAS 9.4 through SAS EG 7.12.

data work.test;
set work.test;
array intcode $1. intcode1-intcode4;
array intpos intpos1-intpos4;
do b=1 to 4;
     intcode(b)=substr(string1,intpos(b),1); /* generates error for invalid second argument*/
     end;
run;

Data:

 

data WORK.test;
   infile datalines dsd truncover;
   input string1:$48. string2:$48. intpos1:8. intpos2:8. intpos3:8. intpos4:8.;
 datalines;
FAAAAFFFFFFFFFFFUUUUUFFFFFFFFFFFFFFFFFFFFFFFFFFF FIIIIFFFFFFFFFFFIIIIIFFFFFFFFFFFFFFFFFFFFFFFFFFF 2 17 .
777777777777777777777777NNNNNNNNNSPPPPPNNNNNNNNN 777777777777777777777777IIIIIIIIIIPPPPPIIIIIIIII 25 40 . .
NNNNNNNNNNNNNNNNNNPPPPPPFFFFFFFFFFFFFFFFFFFFFFFF IIIIIIIIIIIIIIIIIIPPPPPPFFFFFFFFFFFFFFFFFFFFFFFF 1 . . .
SSSSSSSNNSSSSSSSSSSSSSSSNNNNNNNNNNNNNNNNNNNNNNNN IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 1 . . .
NNNNNNNNNNNNNNNNNNNNPPPPPPPPPPPPPPPPPPPPPPPPPPPP IIIIIIIIIIIIIIIIIIIIPPPPPPPPPPPPPPPPPPPPPPPPPPPP 1 . . .
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 1 . . .
PPPPPPPPPNNNNNNNNNNNNNNNAAAAAAAAAAAAAAAAAAAAAAAA PPPPPPPPPIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 10 . . .
NNNNNNNNNFFFFFFFFFFFRRRRRRRRRRRRRRRRRRRRRRRRRRRR IIIIIIIIIFFFFFFFFFFFRRRRRRRRRRRRRRRRRRRRRRRRRRRR 1 . . .
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFAAAF FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFIIIF 45 . . .
AAAFFFFFFFFFAAAFFFFFFFFFNSSSFFFFFFFNNSSSFFFFFFFN IIIFFFFFFFFFIIIFFFFFFFFFIIIIFFFFFFFIIIIIFFFFFFFI 1 13 25 36
FFFFFFFFFFFUUUUFFFFFFFUUFFFFFFFFFFSSNNNNFFFFFFFN FFFFFFFFFFFIIIIFFFFFFFIIFFFFFFFFFFIIIIIIFFFFFFFI 12 35 . .
PPPPPPPPNNNNFFFFFFFFFFFFPPPPPPPPPPPPPPPPPPPPNSSN PPPPPPPPIIIIFFFFFFFFFFFFPPPPPPPPPPPPPPPPPPPPIIII 9 45 . .
;;;;;;;;;;;

 

 

Accepted Solution
Tom
Super User

Make sure to use a valid index into the string. 

Your STRING1 variable is defined with length $48, so the index must be between 1 and 48.

if intpos(b) in (1:48) then intcode(b)=substr(string1,intpos(b),1);
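
For context, here is a minimal sketch of how that guard might sit inside the original loop. The output data set name work.test_codes is hypothetical (not from the thread), and using dim() in place of the hard-coded 4 is an assumption made so the loop follows the array size if more intpos variables are added later.

```sas
data work.test_codes;                        /* hypothetical output name */
   set work.test;
   array intpos  {*}    intpos1-intpos4;     /* start positions found earlier */
   array intcode {4} $1 intcode1-intcode4;   /* new one-character variables */
   do b = 1 to dim(intpos);
      /* call SUBSTR only when the position is a valid index into STRING1 */
      if intpos(b) in (1:48) then intcode(b) = substr(string1, intpos(b), 1);
   end;
   drop b;                                   /* loop counter is not needed in the output */
run;
```

When intposX is missing, the assignment is skipped and intcodeX keeps its initial blank value, which is the missing value for a character variable.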


Replies
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @MJP1 

 

There is nothing wrong with your code; the problem is in your input data. SUBSTR writes the invalid-second-argument note when its start position is missing (or otherwise invalid), and one or more start positions are missing in every input record except no. 10.

 

A tip: Never use the same data set name as input and output. When you run the second data step, the input is overwritten, so you can't go back and examine it or rerun the second step, unless you generate a new input.
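
To see the effect of a missing start position in isolation, a small throwaway step along these lines could be used. It is only a sketch: the variable names pos and code and the literal string are made up for illustration and do not come from the posted data.

```sas
/* Hypothetical one-row check: pos is deliberately missing */
data _null_;
   length code $1;
   string1 = 'FAAAAFFF';
   pos = .;
   /* guarded call: SUBSTR is skipped when pos is missing, so code stays blank */
   if not missing(pos) then code = substr(string1, pos, 1);
   put code=;
run;
```

Removing the IF condition should bring the note back in the log, while setting pos to a value such as 2 lets the guarded call run and pick up the second character.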

MJP1
Calcite | Level 5

D'oh! (Facepalm.) Thanks, Tom and Erik!

 

MJP
