I'm not sure why my merge is failing:
data temp_table;
input start_ip_int 1-11 end_ip_int 12-21 edge_country $23-24;
datalines;
1227141270 1227141290 GB
415250900 415250930 US
403411530 403411540 DE
;
run;
data temp_table2;
input ip_numeric 1-11;
datalines;
1844552390
2452812925
3001796479
1844552383
;
run;
proc sql;
create table ip_match as
select a.start_ip_int,
a.end_ip_int,
a.edge_country,
b.ip_numeric
from temp_table a
left join
temp_table2 b
on a.start_ip_int <= b.ip_numeric <= a.end_ip_int;
quit;
Edit: Thank you everyone for the responses!
Are you sort of trying to do an IP country look up? In other words, if I have an IP address and it's within one of those ranges, I return the corresponding country as the country for that IP address?
If so, you have a bit of a data problem. None of those IP addresses are within any of those ranges. You can write SQL until you're blue in the face and not get any matches.
However, if we re-make our test data like so, now we're in business:
data temp_table;
input _start_ip_int 1-11
_end_ip_int 12-21
Edge_Country $23-24;
datalines;
1227141270 1227141290 GB
415250900 415250930 US
403411530 403411540 DE
;
run;
data temp_table2;
input IP_Numeric 1-11;
datalines;
1227141285
415250910
415250920
415250930
403411535
;
run;
Now, this is pretty old school, but I'm going to do the look up using a multi-level array. I'll say more about why I'm doing this, below. For now, here's the code:
PROC SQL;
SELECT STRIP(PUT(COUNT(1),10.))
INTO : Rows
FROM Temp_Table;
QUIT;
%PUT NOTE: &=Rows;
DATA Temp_Table3;
DROP _:;
FORMAT IP_Numeric 11.;
IF _N_ = 1 THEN
DO;
DO UNTIL (End_Of_Data);
_Iteration + 1;
SET Temp_Table END = End_Of_Data;
ARRAY IP_Range [&Rows, 2] _TEMPORARY_;
ARRAY Country [&Rows] $2 _TEMPORARY_;
IP_Range [_Iteration, 1] = _start_ip_int;
IP_Range [_Iteration, 2] = _end_ip_int;
Country [_Iteration] = Edge_Country;
END;
END;
CALL MISSING(_Start_IP_Int, _End_IP_Int, Edge_Country);
SET Temp_Table2;
DO _i = 1 TO &Rows;
_Start_IP_Int = IP_Range[_i, 1];
_End_IP_Int = IP_Range[_i, 2];
IF _Start_IP_Int <= IP_Numeric <= _End_IP_Int THEN
Edge_Country = Country[_i];
END;
RUN;
And of course we want results. Notice now that each IP address has been assigned a country based on the ranges.
OK, so why this (very) old school multi-level array business? Well, while it is more work to code a Data step (generally) than SQL, you have much more powerful tools debugging tools (Typically PUTLOG and LDEBUG) available. If something isn't working, I fall back to a Data step where I can do a PUTLOG after every line (if necessary) or execute interactively, line by line, with the LDEBUG option. Display Manager supports LDEBUG and Enterprise Guide from version 7.1.3 onward also supports LDEBUG. I believe SAS Studio does too. Once I turned on LDEBUG, the problem was obvious: Those IP addresses and those ranges are a no match. They don't correspond to one another. A quick data makeover rectifies the problem, and, voila, we have assigned each IP a country.
I hope this was what you were after, but even if not, you have been exposed to a technique I don't see much any more, multi-level TEMPORARY arrays used as a look up table.
Jim
If I understand your question, you have your left join inverted. But I don't think I do because none of your "IP" ranges overlap any of the values in table2.
Are you trying to do something like this?
data temp_table;
input start_ip_int 1-11 end_ip_int 13-22 edge_country $25-26;
datalines;
1844552000 1844552999 GB
2452812000 2452813000 US
3001796000 3001796999 DE
;
run;
data temp_table2;
input ip_numeric 1-11;
datalines;
1844552390
2452812925
3001796479
1844552383
;
run;
proc sql;
create table ip_match as
select a.start_ip_int,
a.end_ip_int,
a.edge_country,
b.ip_numeric
from temp_table2 b
left join
temp_table a
on a.start_ip_int <= b.ip_numeric <= a.end_ip_int;
quit;
Are you sort of trying to do an IP country look up? In other words, if I have an IP address and it's within one of those ranges, I return the corresponding country as the country for that IP address?
If so, you have a bit of a data problem. None of those IP addresses are within any of those ranges. You can write SQL until you're blue in the face and not get any matches.
However, if we re-make our test data like so, now we're in business:
data temp_table;
input _start_ip_int 1-11
_end_ip_int 12-21
Edge_Country $23-24;
datalines;
1227141270 1227141290 GB
415250900 415250930 US
403411530 403411540 DE
;
run;
data temp_table2;
input IP_Numeric 1-11;
datalines;
1227141285
415250910
415250920
415250930
403411535
;
run;
Now, this is pretty old school, but I'm going to do the look up using a multi-level array. I'll say more about why I'm doing this, below. For now, here's the code:
PROC SQL;
SELECT STRIP(PUT(COUNT(1),10.))
INTO : Rows
FROM Temp_Table;
QUIT;
%PUT NOTE: &=Rows;
DATA Temp_Table3;
DROP _:;
FORMAT IP_Numeric 11.;
IF _N_ = 1 THEN
DO;
DO UNTIL (End_Of_Data);
_Iteration + 1;
SET Temp_Table END = End_Of_Data;
ARRAY IP_Range [&Rows, 2] _TEMPORARY_;
ARRAY Country [&Rows] $2 _TEMPORARY_;
IP_Range [_Iteration, 1] = _start_ip_int;
IP_Range [_Iteration, 2] = _end_ip_int;
Country [_Iteration] = Edge_Country;
END;
END;
CALL MISSING(_Start_IP_Int, _End_IP_Int, Edge_Country);
SET Temp_Table2;
DO _i = 1 TO &Rows;
_Start_IP_Int = IP_Range[_i, 1];
_End_IP_Int = IP_Range[_i, 2];
IF _Start_IP_Int <= IP_Numeric <= _End_IP_Int THEN
Edge_Country = Country[_i];
END;
RUN;
And of course we want results. Notice now that each IP address has been assigned a country based on the ranges.
OK, so why this (very) old school multi-level array business? Well, while it is more work to code a Data step (generally) than SQL, you have much more powerful tools debugging tools (Typically PUTLOG and LDEBUG) available. If something isn't working, I fall back to a Data step where I can do a PUTLOG after every line (if necessary) or execute interactively, line by line, with the LDEBUG option. Display Manager supports LDEBUG and Enterprise Guide from version 7.1.3 onward also supports LDEBUG. I believe SAS Studio does too. Once I turned on LDEBUG, the problem was obvious: Those IP addresses and those ranges are a no match. They don't correspond to one another. A quick data makeover rectifies the problem, and, voila, we have assigned each IP a country.
I hope this was what you were after, but even if not, you have been exposed to a technique I don't see much any more, multi-level TEMPORARY arrays used as a look up table.
Jim
Storing ip-addresses as numerics can cause trouble:
100.1.201.55 = 100120155
100.120.15.5 = 100120155
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.