Example:
Say I have observations with a character variable, string, like the following (its value differs across observations):
Person A) . ------XXX------XXXXXXXXXXXXXXXXXXXXX-------
Person B) . ------XXXXXXXXXXXXXXXXXXXXXXXXX-------
Person C) . ------XXX---XXXXXXXXXX----XXXXXXXXXXXXXXXXXXXX-------
I have the position of the red X (my time point of interest, somewhere inside a run of Xs) stored in a variable called POS.
How do I use POS to find the position of the blue X (the first X of that same run)?
Thank you!
So you're looking for the start of the substring?
Does your data actually contain the delimiters shown (the ---)? If so, I would consider using SCAN(), or you could loop through the characters one at a time until you reach the first - and use that position.
You need to confirm the structure of your data first though.
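One way to run that kind of structure check is sketched below. It assumes the string should contain nothing but X and -, and that POS should point at an X. The sample rows and the names bad_char_pos and pos_is_x are made up for illustration; the third row is deliberately malformed.

```sas
/* Hypothetical sample data: one token for STRING, then POS. */
data have;
  length string $60;
  input string $ pos;
  datalines;
------XXX------XXXXXXXXXXXXXXXXXXXXX------- 30
------XXX---XXXXXXXXXX----XXXXXXXXXXXXXXXXXXXX------- 40
------XXX--?XXXXXXXX------- 5
;
run;

data check;
  set have;
  /* VERIFY returns the position of the first character that is */
  /* not in the list 'X-', or 0 when every character is valid.  */
  bad_char_pos = verify(trim(string), 'X-');
  /* 1 when the character at POS is an X, 0 otherwise. */
  pos_is_x = (char(string, pos) = 'X');
run;
```

Rows with bad_char_pos > 0 or pos_is_x = 0 are worth inspecting before relying on any of the approaches in this thread.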
Like this?
data HAVE;
A='------123-----Aabcdefghijklmnopqrsx--------'; POS=25; output;
A='----BXXXXXXXXXXXXXXXXXXXXXXXX------- '; POS=20; output;
A='----XXX---XXXXXXXXXX----CXXXXXXXXXXXXXXXX--'; POS=40; output;
run;
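data WANT;
  set HAVE;
  /* Search REVERSE(A) for the first '-' from position length(A)-POS onward. */
  /* POS2 is therefore a position in the reversed string, not in A itself.  */
  POS2=findc(reverse(A),'-',length(A)-POS)-1;
  /* Character at the start of the run that contains POS. */
  B=char(reverse(A),POS2);
run;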
A | POS | POS2 | B |
---|---|---|---|
------123-----Aabcdefghijklmnopqrsx-------- | 25 | 29 | A |
----BXXXXXXXXXXXXXXXXXXXXXXXX------- | 20 | 39 | B |
----XXX---XXXXXXXXXX----CXXXXXXXXXXXXXXXX-- | 40 | 19 | C |
This might be brutal and inelegant, but it should work:
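data want;
  set have;
  /* Walk left from POS-1 and stop at the first character that is not an X. */
  /* If POS = 1 the loop body never runs and k stays at 0.                  */
  do k = POS - 1 to 1 by -1 until (substr(string, k, 1) ne "X");
  end;
  /* k is the last non-X position (or 0), so the run starts one to its right. */
  first_x = k + 1;
run;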
Thank you! This worked perfectly.
To answer the other questions (apologies for not being more clear):
The variable string contains only dashes and Xs (the dashes are real values, not filler). Observations differ in length and in their pattern of Xs and dashes. Each character indicates whether the person was present at a specific visit/time point (akin to yes/no), and its position in the string identifies the time point. POS holds the position of the red X because that is my time point of interest. I wanted the position of the blue X to find the visit from which each observation had continuous presence at every visit up to the visit of interest. I hope that makes it clearer.
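Given that description, the same result can probably be had without a loop. The sketch below is an alternative to the code posted in this thread, not a replacement for it: it relies on FINDC searching right to left when the start position is negative, and on the k modifier to look for a character that is not in the list. The sample rows and the name run_start are made up for illustration.

```sas
/* Hypothetical sample data: one token for STRING, then POS. */
data have;
  length string $60;
  input string $ pos;
  datalines;
------XXX------XXXXXXXXXXXXXXXXXXXXX------- 30
------XXXXXXXXXXXXXXXXXXXXXXXXX------- 20
XXXXXXXX---- 5
;
run;

data want;
  set have;
  /* Negative start position: search leftward, beginning at POS.      */
  /* 'k' modifier: find a character that is NOT in the list 'X'.      */
  /* FINDC returns 0 when there is no such character, which makes the */
  /* run start at position 1.                                         */
  run_start = findc(string, 'X', 'k', -pos) + 1;
run;
```

For these three rows run_start should come out as 16, 7 and 1. The third row covers the case where the run of Xs begins at the very first position.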
I would use CALL SCAN:
data have;
input key $10. string $50.;
cards;
Person A) ------XXX------XXXXXXXXXXXXXXXXXXyXX-------
Person B) ------XXXXXXXXXXXXXXXXXXXXXXyXX-------
Person C) ------XXX---XXXXXXXXXX----XXXXXXXXXXXXXXXXXyXX-------
;run;
data want;
set have;
pos=index(string,'y');
do _N_=1 by 1 until(pos2<=pos<pos2+len or pos2=0); /* <= so a word that starts at pos also matches */
call scan(string,_N_,pos2,len,'-');
end;
drop len;
run;
The position of the first letter in the word containing the letter "y" (a substitute for the red "X") should then be in the variable pos2.