BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Scooby3g
Obsidian | Level 7

Hi: is it possible to trim zero off of a variable during left join.

 

For example, i want to do a left join with these two tables by ID_num. But ID_num in one table leads with zero in front of the number. Can i trim the zero while while doing a left join with it? or would i have to trim it using prxchange function then do the sql left join data step.

 

data fruit;
input ID_num Fruit_Type$ rate$;
datalines;
0123 Apples 1.2
0456 Oranges 1.3
0567 Bananas 1.4
0890 Pineapples 1.5
;
run;

data fruit_options;
input ID_num Fruit_Type2$ Add_on$ Price$;
datalines;
123 Oranges wine 75
456 Watermelon cheese 25
567 Plum cheese 50
890 OneFruit wine 35
;
run;

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@Scooby3g wrote:

What if the variables joining are characters and one variable in a table has zeros while the other does not? In a scenario where you want to keep it as Character variables.

 

data fruit;
input ID_char$ Fruit_Type$ rate$;
datalines;
0123 Apples 1.2
0456 Oranges 1.3
0567 Bananas 1.4
0890 Pineapples 1.5
;
run;

data fruit_options;
input ID_char$ Fruit_Type2$ Add_on$ Price$;
datalines;
123 Oranges wine 75
456 Watermelon cheese 25
567 Plum cheese 50
890 OneFruit wine 35
;
run;


There are a number of approaches depending on the actual values. If the code to match on is essentially numbers where the leading zero is actually insignificant (i.e. "0123" and "00123" are not both in the data a mean something else) you can INPUT then character values in the On

 

On input(a.id,12.) = input(b.id,12.)

or if only one is character and the other numeric input the character and not the numeric.

 

HOWEVER: if you have values like "0ABC123" and are matching "123" good luck.

You can use almost any function in the ON to modify values. If all the character values in one set have exactly one leading zero and all of the values in the other set have no leading zeroes you could use something like

On a.id = cats('0',b.id)  to add exactly one leading 0 to the values without the leading 0.

If there are random numbers of leading zeroes and the second set has some values with the leading 0 and others without then you need to provide the logic as to which values have too many or too few leading (or trailing) 0 characters.

View solution in original post

8 REPLIES 8
AMSAS
SAS Super FREQ

Ok, question why trim when ID_num is a numeric in both datasets. Which means this works:

 

proc sql ;
	create table temp as
	select * from fruit left join fruit_options 
	on fruit.ID_num = fruit_options.ID_num
;
quit ;
Scooby3g
Obsidian | Level 7
Thanks for your response!
What if it is in CHAR? Would I do a step like this while I only keep some variables?

proc sql;
create table FRUIT_basket2 as select *
from ((select * from fruit_options) as a
left join (select ID_num, Fruit_Type2 from fruit trim '0%' ) as b
on a.ID_num=b.ID_num);
quit;
Scooby3g
Obsidian | Level 7
The reason for trimming is that sometimes with the zero present it doesn't always merge right when the datasets are large.
Reeza
Super User
proc sql ;
	create table temp as
	select * from fruit left join fruit_options 
	on input(fruit.ID_num, 8.) = fruit_options.ID_num
;
quit ;

 

There is a mistake in the example code, in both data sets id_num is read in as a numeric variable so the leading 0 is removed from the example data. 

If it is as shown, it would likely be a character variable, then use INPUT() to convert it on the join. If both are actually numeric you should not have any issues. Note that leading zeroes can be on a character variable. If the value is numeric with a Z format, SAS should still merge it correct, so if that's the case you have something else causing your non matches. If that is the case, please show an example of when this doesn't work. 

 

 

 

 

Scooby3g
Obsidian | Level 7

What if the variables joining are characters and one variable in a table has zeros while the other does not? In a scenario where you want to keep it as Character variables.

 

data fruit;
input ID_char$ Fruit_Type$ rate$;
datalines;
0123 Apples 1.2
0456 Oranges 1.3
0567 Bananas 1.4
0890 Pineapples 1.5
;
run;

data fruit_options;
input ID_char$ Fruit_Type2$ Add_on$ Price$;
datalines;
123 Oranges wine 75
456 Watermelon cheese 25
567 Plum cheese 50
890 OneFruit wine 35
;
run;

ballardw
Super User

@Scooby3g wrote:

What if the variables joining are characters and one variable in a table has zeros while the other does not? In a scenario where you want to keep it as Character variables.

 

data fruit;
input ID_char$ Fruit_Type$ rate$;
datalines;
0123 Apples 1.2
0456 Oranges 1.3
0567 Bananas 1.4
0890 Pineapples 1.5
;
run;

data fruit_options;
input ID_char$ Fruit_Type2$ Add_on$ Price$;
datalines;
123 Oranges wine 75
456 Watermelon cheese 25
567 Plum cheese 50
890 OneFruit wine 35
;
run;


There are a number of approaches depending on the actual values. If the code to match on is essentially numbers where the leading zero is actually insignificant (i.e. "0123" and "00123" are not both in the data a mean something else) you can INPUT then character values in the On

 

On input(a.id,12.) = input(b.id,12.)

or if only one is character and the other numeric input the character and not the numeric.

 

HOWEVER: if you have values like "0ABC123" and are matching "123" good luck.

You can use almost any function in the ON to modify values. If all the character values in one set have exactly one leading zero and all of the values in the other set have no leading zeroes you could use something like

On a.id = cats('0',b.id)  to add exactly one leading 0 to the values without the leading 0.

If there are random numbers of leading zeroes and the second set has some values with the leading 0 and others without then you need to provide the logic as to which values have too many or too few leading (or trailing) 0 characters.

Reeza
Super User
Then do the INPUT on BOTH of them, assuming they can be numeric or you need to standardize them before you join.

Scooby3g
Obsidian | Level 7

Thanks all for you feedback and suggestions. It seems like the easier way to go about it is to cut the data then trim the zero using this. Then do a left join and assign what is A and what is B. I originally wanted to avoid this way because the datasets are large and I may run out of space when running it this way.

data fruit2;
set fruit;
ID_1= prxchange ('s/^0+//o', ID_char);
run;

proc sql;
create table temp as
select * from Fruit2 as a
left join fruit_options (keep=ID_char Fruit_Type2) as b
on a.ID_1=b.ID_char;
quit;

I've had issues with converting from btw with Char and Num before using the Input and Put functions so I am trying to avoid using those during join when possible. I was trying to go about it this way:

proc sql;
create table TEMP as select *
from ((select * from fruit where ID_num NOT LIKE '0%') as a
left join (select ID_num, Fruit_Type2, Add_on from fruit_options ) as b
on a.ID_num=b.ID_num);
quit;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 3151 views
  • 0 likes
  • 4 in conversation