BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SAShole
Pyrite | Level 9

I'm not sure why my merge is failing:

 


data temp_table;
	input start_ip_int 1-11 end_ip_int 12-21  edge_country $23-24;
	datalines;
1227141270 1227141290 GB
415250900  415250930  US
403411530  403411540  DE
;	
run;


data temp_table2;
	input ip_numeric 1-11;
	datalines;
1844552390
2452812925 
3001796479
1844552383
;	
run;


proc sql;
	  create table ip_match as
	  select  a.start_ip_int,
	  		  a.end_ip_int,
	  		  a.edge_country,
	  		  b.ip_numeric	  		
	  from  temp_table a
	  		left join
	  		temp_table2 b
	  		on a.start_ip_int <= b.ip_numeric <= a.end_ip_int;   
quit;

 

Edit: Thank you everyone for the responses!

 

1 ACCEPTED SOLUTION

Accepted Solutions
jimbarbour
Meteorite | Level 14

Are you sort of trying to do an IP country look up?  In other words, if I have an IP address and it's within one of those ranges, I return the corresponding country as the country for that IP address?

 

If so, you have a bit of a data problem.  None of those IP addresses are within any of those ranges.  You can write SQL until you're blue in the face and not get any matches.

 

However, if we re-make our test data like so, now we're in business:

 

data temp_table;
	input	_start_ip_int	1-11 
			_end_ip_int		12-21  
			Edge_Country	$23-24;
	datalines;
1227141270 1227141290 GB
415250900  415250930  US
403411530  403411540  DE
;	
run;


data temp_table2;
	input IP_Numeric 1-11;
	datalines;
1227141285
415250910
415250920
415250930
403411535
;	
run;

 

 

Now, this is pretty old school, but I'm going to do the look up using a multi-level array.  I'll say more about why I'm doing this, below.  For now, here's the code:

PROC	SQL;
	SELECT		STRIP(PUT(COUNT(1),10.))
		INTO	:	Rows
		FROM	Temp_Table;
QUIT;

%PUT	NOTE:  &=Rows;

DATA	Temp_Table3;
	DROP	_:;
	FORMAT	IP_Numeric	11.;
	IF	_N_									=	1			THEN
		DO;
			DO	UNTIL	(End_Of_Data);
				_Iteration					+	1;
				SET		Temp_Table	END	=	End_Of_Data;
				ARRAY	IP_Range	[&Rows, 2]	_TEMPORARY_;
				ARRAY	Country		[&Rows]	$2	_TEMPORARY_;
				IP_Range	[_Iteration, 1]	=	_start_ip_int;
				IP_Range	[_Iteration, 2]	=	_end_ip_int;
				Country		[_Iteration]	=	Edge_Country;
			END;
		END;

	CALL	MISSING(_Start_IP_Int, _End_IP_Int, Edge_Country);

	SET	Temp_Table2;
	DO	_i									=	1	TO	&Rows;
		_Start_IP_Int						=	IP_Range[_i, 1];
		_End_IP_Int							=	IP_Range[_i, 2];
		IF	_Start_IP_Int	<=	IP_Numeric	<=	_End_IP_Int	THEN
			Edge_Country					=	Country[_i];
	END;
RUN;

And of course we want results.  Notice now that each IP address has been assigned a country based on the ranges.

jimbarbour_0-1602647985474.png

 

OK, so why this (very) old school multi-level array business?  Well, while it is more work to code a Data step (generally) than SQL, you have much more powerful tools debugging tools (Typically PUTLOG and LDEBUG) available.  If something isn't working, I fall back to a Data step where I can do a PUTLOG after every line (if necessary) or execute interactively, line by line, with the LDEBUG option.  Display Manager supports LDEBUG and Enterprise Guide from version 7.1.3 onward also supports LDEBUG.  I believe SAS Studio does too.  Once I turned on LDEBUG, the problem was obvious:  Those IP addresses and those ranges are a no match.  They don't correspond to one another.  A quick data makeover rectifies the problem, and, voila, we have assigned each IP a country.

 

I hope this was what you were after, but even if not, you have been exposed to a technique I don't see much any more, multi-level TEMPORARY arrays used as a look up table.

 

Jim

View solution in original post

3 REPLIES 3
CurtisMackWSIPP
Lapis Lazuli | Level 10

If I understand your question, you have your left join inverted.  But I don't think I do because none of your "IP" ranges overlap any of the values in table2.

 

Are you trying to do something like this?

 

data temp_table;
	input start_ip_int 1-11 end_ip_int 13-22  edge_country $25-26;
	datalines;
1844552000  1844552999  GB
2452812000  2452813000  US
3001796000  3001796999  DE
;	
run;


data temp_table2;
	input ip_numeric 1-11;
	datalines;
1844552390
2452812925 
3001796479
1844552383
;	
run;


proc sql;
	  create table ip_match as
	  select  a.start_ip_int,
	  		  a.end_ip_int,
	  		  a.edge_country,
	  		  b.ip_numeric	  		
	  from temp_table2 b 
	  	  left join
	  		temp_table a
	  		on a.start_ip_int <= b.ip_numeric <= a.end_ip_int;   
quit;

 

jimbarbour
Meteorite | Level 14

Are you sort of trying to do an IP country look up?  In other words, if I have an IP address and it's within one of those ranges, I return the corresponding country as the country for that IP address?

 

If so, you have a bit of a data problem.  None of those IP addresses are within any of those ranges.  You can write SQL until you're blue in the face and not get any matches.

 

However, if we re-make our test data like so, now we're in business:

 

data temp_table;
	input	_start_ip_int	1-11 
			_end_ip_int		12-21  
			Edge_Country	$23-24;
	datalines;
1227141270 1227141290 GB
415250900  415250930  US
403411530  403411540  DE
;	
run;


data temp_table2;
	input IP_Numeric 1-11;
	datalines;
1227141285
415250910
415250920
415250930
403411535
;	
run;

 

 

Now, this is pretty old school, but I'm going to do the look up using a multi-level array.  I'll say more about why I'm doing this, below.  For now, here's the code:

PROC	SQL;
	SELECT		STRIP(PUT(COUNT(1),10.))
		INTO	:	Rows
		FROM	Temp_Table;
QUIT;

%PUT	NOTE:  &=Rows;

DATA	Temp_Table3;
	DROP	_:;
	FORMAT	IP_Numeric	11.;
	IF	_N_									=	1			THEN
		DO;
			DO	UNTIL	(End_Of_Data);
				_Iteration					+	1;
				SET		Temp_Table	END	=	End_Of_Data;
				ARRAY	IP_Range	[&Rows, 2]	_TEMPORARY_;
				ARRAY	Country		[&Rows]	$2	_TEMPORARY_;
				IP_Range	[_Iteration, 1]	=	_start_ip_int;
				IP_Range	[_Iteration, 2]	=	_end_ip_int;
				Country		[_Iteration]	=	Edge_Country;
			END;
		END;

	CALL	MISSING(_Start_IP_Int, _End_IP_Int, Edge_Country);

	SET	Temp_Table2;
	DO	_i									=	1	TO	&Rows;
		_Start_IP_Int						=	IP_Range[_i, 1];
		_End_IP_Int							=	IP_Range[_i, 2];
		IF	_Start_IP_Int	<=	IP_Numeric	<=	_End_IP_Int	THEN
			Edge_Country					=	Country[_i];
	END;
RUN;

And of course we want results.  Notice now that each IP address has been assigned a country based on the ranges.

jimbarbour_0-1602647985474.png

 

OK, so why this (very) old school multi-level array business?  Well, while it is more work to code a Data step (generally) than SQL, you have much more powerful tools debugging tools (Typically PUTLOG and LDEBUG) available.  If something isn't working, I fall back to a Data step where I can do a PUTLOG after every line (if necessary) or execute interactively, line by line, with the LDEBUG option.  Display Manager supports LDEBUG and Enterprise Guide from version 7.1.3 onward also supports LDEBUG.  I believe SAS Studio does too.  Once I turned on LDEBUG, the problem was obvious:  Those IP addresses and those ranges are a no match.  They don't correspond to one another.  A quick data makeover rectifies the problem, and, voila, we have assigned each IP a country.

 

I hope this was what you were after, but even if not, you have been exposed to a technique I don't see much any more, multi-level TEMPORARY arrays used as a look up table.

 

Jim

andreas_lds
Jade | Level 19

Storing ip-addresses as numerics can cause trouble:

100.1.201.55 = 100120155

100.120.15.5 = 100120155

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 747 views
  • 4 likes
  • 4 in conversation