Hi: is it possible to trim zero off of a variable during left join.
For example, i want to do a left join with these two tables by ID_num. But ID_num in one table leads with zero in front of the number. Can i trim the zero while while doing a left join with it? or would i have to trim it using prxchange function then do the sql left join data step.
data fruit;
input ID_num Fruit_Type$ rate$;
datalines;
0123 Apples 1.2
0456 Oranges 1.3
0567 Bananas 1.4
0890 Pineapples 1.5
;
run;
data fruit_options;
input ID_num Fruit_Type2$ Add_on$ Price$;
datalines;
123 Oranges wine 75
456 Watermelon cheese 25
567 Plum cheese 50
890 OneFruit wine 35
;
run;
@Scooby3g wrote:
What if the variables joining are characters and one variable in a table has zeros while the other does not? In a scenario where you want to keep it as Character variables.
data fruit;
input ID_char$ Fruit_Type$ rate$;
datalines;
0123 Apples 1.2
0456 Oranges 1.3
0567 Bananas 1.4
0890 Pineapples 1.5
;
run;data fruit_options;
input ID_char$ Fruit_Type2$ Add_on$ Price$;
datalines;
123 Oranges wine 75
456 Watermelon cheese 25
567 Plum cheese 50
890 OneFruit wine 35
;
run;
There are a number of approaches depending on the actual values. If the code to match on is essentially numbers where the leading zero is actually insignificant (i.e. "0123" and "00123" are not both in the data a mean something else) you can INPUT then character values in the On
On input(a.id,12.) = input(b.id,12.)
or if only one is character and the other numeric input the character and not the numeric.
HOWEVER: if you have values like "0ABC123" and are matching "123" good luck.
You can use almost any function in the ON to modify values. If all the character values in one set have exactly one leading zero and all of the values in the other set have no leading zeroes you could use something like
On a.id = cats('0',b.id) to add exactly one leading 0 to the values without the leading 0.
If there are random numbers of leading zeroes and the second set has some values with the leading 0 and others without then you need to provide the logic as to which values have too many or too few leading (or trailing) 0 characters.
Ok, question why trim when ID_num is a numeric in both datasets. Which means this works:
proc sql ;
create table temp as
select * from fruit left join fruit_options
on fruit.ID_num = fruit_options.ID_num
;
quit ;
proc sql ;
create table temp as
select * from fruit left join fruit_options
on input(fruit.ID_num, 8.) = fruit_options.ID_num
;
quit ;
There is a mistake in the example code, in both data sets id_num is read in as a numeric variable so the leading 0 is removed from the example data.
If it is as shown, it would likely be a character variable, then use INPUT() to convert it on the join. If both are actually numeric you should not have any issues. Note that leading zeroes can be on a character variable. If the value is numeric with a Z format, SAS should still merge it correct, so if that's the case you have something else causing your non matches. If that is the case, please show an example of when this doesn't work.
What if the variables joining are characters and one variable in a table has zeros while the other does not? In a scenario where you want to keep it as Character variables.
data fruit;
input ID_char$ Fruit_Type$ rate$;
datalines;
0123 Apples 1.2
0456 Oranges 1.3
0567 Bananas 1.4
0890 Pineapples 1.5
;
run;
data fruit_options;
input ID_char$ Fruit_Type2$ Add_on$ Price$;
datalines;
123 Oranges wine 75
456 Watermelon cheese 25
567 Plum cheese 50
890 OneFruit wine 35
;
run;
@Scooby3g wrote:
What if the variables joining are characters and one variable in a table has zeros while the other does not? In a scenario where you want to keep it as Character variables.
data fruit;
input ID_char$ Fruit_Type$ rate$;
datalines;
0123 Apples 1.2
0456 Oranges 1.3
0567 Bananas 1.4
0890 Pineapples 1.5
;
run;data fruit_options;
input ID_char$ Fruit_Type2$ Add_on$ Price$;
datalines;
123 Oranges wine 75
456 Watermelon cheese 25
567 Plum cheese 50
890 OneFruit wine 35
;
run;
There are a number of approaches depending on the actual values. If the code to match on is essentially numbers where the leading zero is actually insignificant (i.e. "0123" and "00123" are not both in the data a mean something else) you can INPUT then character values in the On
On input(a.id,12.) = input(b.id,12.)
or if only one is character and the other numeric input the character and not the numeric.
HOWEVER: if you have values like "0ABC123" and are matching "123" good luck.
You can use almost any function in the ON to modify values. If all the character values in one set have exactly one leading zero and all of the values in the other set have no leading zeroes you could use something like
On a.id = cats('0',b.id) to add exactly one leading 0 to the values without the leading 0.
If there are random numbers of leading zeroes and the second set has some values with the leading 0 and others without then you need to provide the logic as to which values have too many or too few leading (or trailing) 0 characters.
Thanks all for you feedback and suggestions. It seems like the easier way to go about it is to cut the data then trim the zero using this. Then do a left join and assign what is A and what is B. I originally wanted to avoid this way because the datasets are large and I may run out of space when running it this way.
data fruit2;
set fruit;
ID_1= prxchange ('s/^0+//o', ID_char);
run;
proc sql;
create table temp as
select * from Fruit2 as a
left join fruit_options (keep=ID_char Fruit_Type2) as b
on a.ID_1=b.ID_char;
quit;
I've had issues with converting from btw with Char and Num before using the Input and Put functions so I am trying to avoid using those during join when possible. I was trying to go about it this way:
proc sql;
create table TEMP as select *
from ((select * from fruit where ID_num NOT LIKE '0%') as a
left join (select ID_num, Fruit_Type2, Add_on from fruit_options ) as b
on a.ID_num=b.ID_num);
quit;
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.