Hi All,
I am running into a bit of a snag. I am using the following code to create a new variable for data mapping purposes:
data txttest1;
set work.ptxt;
if find(tx, 'mg', "i")>0 then P2_PR_Dose_Units=01;
else if find(tx, 'g', "i", "-1")=1 and find(tx, 'mg', "i")=0 then P2_PR_Dose_Units=02;
else if find(tx, 'unit', "i")>0 then P2_PR_Dose_Units=03;
run;
So, my 1s and 3s are coming up okay, but all 2s are coming up as missing values. Basically, I am trying to search from right to left: if the first letter from the right is a g and the value does not contain mg, then it should be a 2; if a g appears anywhere else, then it needs to be either a 1 or a 3 depending on the category. I am mainly having issues getting the start position to work.
It runs without errors, but I still get missing values.
For example, one of my response categories could be Tylenol 1g, I would want this to show up in the new variable with a value of 2. This is opposed to guanfacin 5mg, which I would want to show up as a 1 in the new variable. I was thinking that since I am counting from right to left this would work, but it is not.
This doesn't make any sense.
find(tx, 'g', "i", "-1")
What is the "-1" supposed to mean?
Your description makes it sound like you want to test if the last character is a G.
indexc(char(tx,length(tx)),'gG')
With the -1 I am trying to set a starting position so that it is being read right to left.
And sort of, the data I have lists grams for each medication with a g at the end, but those measured in mg need to be differentiated.
@GScottEpi wrote:
With the -1 I am trying to set a starting position so that it is being read right to left.
And sort of, the data I have lists grams for each medication with a g at the end, but those measured in mg need to be differentiated.
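In that case you need to give FIND() a NUMBER, not a string. Note that with a negative start position the search begins at the absolute value of that number and moves left, so -1 would only look at position 1; to search from the end, start at -length(tx). Then test whether the result is the last position in the string.
find(tx, 'g', "i", -length(tx))=length(tx)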
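To see how a numeric, negative start position behaves, here is a minimal sketch. The data set name demo, the variables last_g and ends_in_g, and the three sample values are made up for illustration; only tx comes from the original post.

```sas
data demo;
  length tx $40;
  infile datalines truncover;
  input tx $40.;
  /* LENGTH() returns the position of the last non-blank character. */
  /* A negative start position makes FIND search leftward from      */
  /* that position, so the first match is the rightmost "g".        */
  last_g = find(tx, 'g', 'i', -length(tx));
  /* 1 when the rightmost "g" is also the last character, else 0.   */
  ends_in_g = (last_g = length(tx));
  datalines;
Tylenol 1g
guanfacin 5mg
insulin 10 units
;
run;

proc print data=demo;
run;
```

A value ending in mg also ends in g, so this check alone does not separate grams from milligrams; the mg test still has to come first, as it does in the original IF/ELSE chain.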
Sorry, I don't fully understand your response, when you say give it a number instead of a string? And what does "length" mean in the part that says =length(tx)?
I strongly suggest you start reading the documentation, because right now you seem to be quite clueless.
And consult the documentation whenever something is not completely clear to you (read: always; every good SAS coder does it). There is a VERY BIG reason why Maxim 1 is number one.
PS: "-1" is a string, -1 is a number.
@GScottEpi wrote:
Sorry, I don't fully understand your response, when you say give it a number instead of a string? And what does "length" mean in the part that says =length(tx)?
The documentation is clear that start position is a number.
start-position
is a numeric constant, variable, or expression with an integer value that specifies the position at which the search should start and the direction of the search.
It is also clear that what it returns is the position where the substring was found.
The FIND function searches string for the first occurrence of the specified substring, and returns the position of that substring. If the substring is not found in string, FIND returns a value of 0.
By testing for a result of 1 when searching from the back you are testing if the ONLY place there is G is the first position.
Okay, that makes more sense. Thank you so much for your responses.
Note that you need to worry about the trailing spaces SAS adds to pad strings to the storage length.
if lowcase(reverse(trim(tx))) =: 'gm' then units=1;
else if lowcase(reverse(trim(tx))) =: 'g' then units=2;
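Putting this together with the original data step, one possible version is sketched below. It assumes that the unit always sits at the very end of tx, so mg is tested as a suffix rather than anywhere in the string, which is a slight change from the original first condition.

```sas
data txttest1;
  set work.ptxt;
  /* TRIM drops the trailing padding, REVERSE puts the last        */
  /* character first, and =: compares only as many characters as   */
  /* the shorter operand has, so this tests the end of tx.         */
  /* Test "mg" before "g" so that 5mg is never classified as grams. */
  if lowcase(reverse(trim(tx))) =: 'gm' then P2_PR_Dose_Units = 1;
  else if lowcase(reverse(trim(tx))) =: 'g' then P2_PR_Dose_Units = 2;
  else if find(tx, 'unit', 'i') > 0 then P2_PR_Dose_Units = 3;
run;
```

Rows that match none of the three conditions keep a missing value in P2_PR_Dose_Units, which makes unexpected unit strings easy to spot with a quick PROC FREQ.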