Hi SAS experts,
I am trying to add 3 control conditions to a DO loop, but I am not able to figure out how.
What I have is pseudo logic from which I have to build SAS code that runs through my data to find the first instance of a specific number in the sequence.
For this example let's use 2. The pseudo logic and sample data are attached.
set sequence_event to 0
set in_sequence_position to -1
set k to 0
repeat until ((character in sequence >= 2 or blank) or (k = length of sequence)):
{
get next character in sequence
increment in_sequence_position by 1
increment k by 1
}
if character in sequence > 2 then increment s by 1 and in_sequence_position + 24 } else if s = 'n/a' }
return (s, in_sequence_position)
Sample data
id | sub id | sequence |
1 | 1 | 1111111111111111XX111111111111121111121111111111 |
1 | 2 | 111111111111111111111111111111111111111111111111 |
1 | 3 | 1 |
2 | 1 | 554321125555555555555555555555555555555555555555 |
3 | 2 | 111111111111111111111111111111111111111111X11111 |
3 | 1 | 554321111111111111111111111111111111555555555555 |
4 | 1 | 111111111111111111111 |
4 | 2 | 11 |
4 | 3 | 111211111111111111111111111111111111111111111111 |
Desired output
id | sub id | sequence | S | in_sequence_position |
1 | 1 | 1111111111111111XX111111111111121111121111111111 | 1 | 32 |
1 | 2 | 111111111111111111111111111111111111111111111111 | 0 | 0 |
1 | 3 | 1 | 0 | 0 |
2 | 1 | 554321125555555555555555555555555555555555555555 | 1 | 5 |
3 | 2 | 111111111111111111111111111111111111111111X11111 | 0 | 0 |
3 | 1 | 554321111111111111111111111111111111555555555555 | 1 | 5 |
4 | 1 | 111111111111111111111 | 0 | 0 |
4 | 2 | 11 | 0 | 0 |
4 | 3 | 111211111111111111111111111111111111111111111111 | 1 | 4 |
Any help is much appreciated.
Is that input supposed to be a representation of a dataset structured like this one?
data have;
input id subid sequence :$50. ;
cards;
1 1 1111111111111111XX111111111111121111121111111111
1 2 111111111111111111111111111111111111111111111111
1 3 1
2 1 554321125555555555555555555555555555555555555555
3 2 111111111111111111111111111111111111111111X11111
3 1 554321111111111111111111111111111111555555555555
4 1 111111111111111111111
4 2 11
4 3 111211111111111111111111111111111111111111111111
;
What is the desired output for that input?
I should have included the full pseudocode.
This part comes after the DO loop:
if character in seq > 2 then increment s by 1 and in_sequence_position + 24 } else if s = 'n/a' } return (s,in_sequence_position)
My desired output should have the "s" and "in_sequence_position" values returned.
I don't know what you mean by those two quoted words.
Also, what are the actual values you expect to get for those two variables for the input you provided?
Here: just replace the missing values in this data step and re-post it to show what answers you expect.
data want;
input id subid sequence :$50. s in_sequence_position;
cards;
1 1 1111111111111111XX111111111111121111121111111111 . .
1 2 111111111111111111111111111111111111111111111111 . .
1 3 1 . .
2 1 554321125555555555555555555555555555555555555555 . .
3 2 111111111111111111111111111111111111111111X11111 . .
3 1 554321111111111111111111111111111111555555555555 . .
4 1 111111111111111111111 . .
4 2 11 . .
4 3 111211111111111111111111111111111111111111111111 . .
;
I cannot understand what you want.
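But to answer your question: you can include a WHILE condition (or an UNTIL condition) on top of a normal start/stop interval. WHILE is tested before each iteration and keeps the loop going while it is true; UNTIL is tested after each iteration and stops the loop once it is true.
do k=1 to length(sequence) while ( <condition to keep looping> );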
...
end;
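As a rough sketch of how the three conditions from the pseudocode (end of string, blank, target digit found) might be combined in one loop, assuming the HAVE dataset shown earlier in this thread. The dataset name want_loop and the helper variables k and ch are illustrative only. Note that the desired output posted above matches an exact test for '2', not ">= 2", so that is what this sketch uses.

```sas
data want_loop;
  set have;
  s = 0;                      /* 1 once a '2' has been found        */
  in_sequence_position = 0;   /* 0 means no '2' in the sequence     */
  /* Condition 1: stop at the end of the string (TO length).        */
  /* Condition 2: stop once the target has been found (WHILE).      */
  do k = 1 to length(sequence) while (s = 0);
    ch = char(sequence, k);
    /* Condition 3: stop at an embedded blank.                      */
    if ch = ' ' then leave;
    if ch = '2' then do;
      s = 1;
      in_sequence_position = k;
    end;
  end;
  drop k ch;
run;
```

The position is stored in a separate variable instead of reusing the loop index, because the index is incremented once more before the WHILE condition ends the loop.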
You could do that, but have you seen FINDC() or FINDW()?
Do you want to learn how to implement your logic specifically, or would you like to learn the SAS methods of accomplishing your objective?
If the latter, it may be easier to specify the logic you'd like to implement. Using built-in, optimized functions is definitely more efficient and faster.
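For illustration, a minimal sketch of the FINDC() approach, assuming a HAVE dataset with a character variable named sequence and that the target is the digit 2. The output dataset name want_findc is hypothetical; s and in_sequence_position follow the naming in the original question.

```sas
data want_findc;
  set have;
  /* FINDC returns the position of the first character of SEQUENCE */
  /* that appears in the list '2', or 0 if there is none.          */
  in_sequence_position = findc(sequence, '2');
  /* A comparison evaluates to 1 (true) or 0 (false) in SAS.       */
  s = (in_sequence_position > 0);
run;
```

To search for a different digit, only the character list in the second argument would need to change.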
You might describe why you are doing this and what the actual expected output would be.
I would guess that you may be doing something related to sequential values. If that is the case, it might be easier, depending on what you actually want, than processing each character.
For instance, you might want to find out how many 1's are in each run within the first sequence, "1111111111111111XX111111111111121111121111111111".
For your consideration:
data example;
   x="1111111111111111XX111111111111121111121111111111";
   sequences = countw(x,'1','k');
   do i = 1 to sequences;
      thisseq = scan(x,i,'1','k');
      lengthseq = countc(scan(x,i,'1','k'),'1');
      output;
   end;
run;
In this example the COUNTW function, with the 'k' modifier, uses everything except the character 1 as a delimiter. So it reports that there are 4 sequences containing only the character 1. The SCAN function can then extract each actual sequence, and the COUNTC function counts how many 1's it contains. Because each extracted sequence consists of a single repeated character, the LENGTH function would work as well.
So, what is it you are actually trying to do?
So now that you have posted what you expect as output, it looks like you just want to find the first occurrence of the digit 2.
You also appear to want to create an extra variable that is true when the location was found (not sure what value that adds).
That is what the INDEX() or INDEXC() function will do.
data have;
input id subid sequence :$50. s in_sequence_position;
cards;
1 1 1111111111111111XX111111111111121111121111111111 1 32
1 2 111111111111111111111111111111111111111111111111 0 0
1 3 1 0 0
2 1 554321125555555555555555555555555555555555555555 1 5
3 2 111111111111111111111111111111111111111111X11111 0 0
3 1 554321111111111111111111111111111111555555555555 1 5
4 1 111111111111111111111 0 0
4 2 11 0 0
4 3 111211111111111111111111111111111111111111111111 1 4
;
data want;
set have;
location = indexc(sequence,'2');
found = location > 0 ;
run;
proc print;
var id subid s found in: location sequence;
run;
Result
Obs | id | subid | s | found | in_sequence_position | location | sequence |
1 | 1 | 1 | 1 | 1 | 32 | 32 | 1111111111111111XX111111111111121111121111111111 |
2 | 1 | 2 | 0 | 0 | 0 | 0 | 111111111111111111111111111111111111111111111111 |
3 | 1 | 3 | 0 | 0 | 0 | 0 | 1 |
4 | 2 | 1 | 1 | 1 | 5 | 5 | 554321125555555555555555555555555555555555555555 |
5 | 3 | 2 | 0 | 0 | 0 | 0 | 111111111111111111111111111111111111111111X11111 |
6 | 3 | 1 | 1 | 1 | 5 | 5 | 554321111111111111111111111111111111555555555555 |
7 | 4 | 1 | 0 | 0 | 0 | 0 | 111111111111111111111 |
8 | 4 | 2 | 0 | 0 | 0 | 0 | 11 |
9 | 4 | 3 | 1 | 1 | 4 | 4 | 111211111111111111111111111111111111111111111111 |
If you did want to re invent the wheel using your own DO loops here is one way.
data want;
set have;
found=0;
do location=1 to length(sequence) while (char(sequence,location) ne '2');
end;
if location > length(sequence) then location=0;
else found=1;
run;