Hello
I have a data set with list of customer accounts and ID.
(Each customer can be owned by multiple ID's and specific ID can be also in multiple accounts).
I want to select rows from "have data set" of all ID's of customer 1234567 .
Way1 of solution i working well.
I want to ask why way2 is not working well and what is the way to fix it?
Are there other ways to solve it to get the wanted data set?
thanks
Data have;
input IP customerID;
cards;
111 1234567
222 1234567
333 8788888
444 8798777
555 8765432
222 8765432
222 7654221
111 8273774
;
Run;
PROC SQL;
create table tbl1 as
select distinct IP
from have
where customerID=1234567
;
QUIT;
PROC SQL;
create table wanted_way1 as
select a.*
from have as a
inner join tbl1 as b
on a.IP=b.IP
;
QUIT;
Data wanted_way2;
set have(where=(IP IN (select distinct IP from have where customerID=1234567)));
Run;
Well, your second method is a data step. Inside the data step, you have a WHERE with a Select. As far as I know, Select is valid in Proc SQL but is not valid in a Data step.
There are multiple ways you could get it working including a merge, a hash table, or arrays.
Given what you've already got, perhaps a merge is simplest. Here is an example:
Proc SORT DATA=Have;
BY IP CustomerID;
RUN;
Proc SORT DATA=Tbl1;
BY IP;
RUN;
Data wanted_way2_merge;
MERGE Have (IN = Have)
Tbl1 (IN = IP_List);
BY IP;
IF Have and IP_List;
Run;
Jim
And here are examples using an array and a hash table below. Note that for the array I inserted %LET IP_Cnt = &SQLOBS; right after the SQL that creates Tbl1.
So, now you have four methods:
Take your pick.
Data wanted_way2_array;
DROP _:;
IF _N_ = 1 THEN
DO;
DO _i = 1 TO &IP_Cnt;
SET Tbl1;
ARRAY IP_List {&IP_Cnt} _TEMPORARY_;
IP_List{_i} = IP;
END;
END;
SET Have;
IF IP IN IP_List;
Run;
Data wanted_way2_hash;
IF 0 THEN
SET Tbl1;
IF _N_ = 1 THEN
DO;
DECLARE Hash h_IP_List(DATASET: 'Tbl1');
h_IP_List.DEFINEKEY ('IP');
h_IP_List.DEFINEDONE();
END;
SET Have;
IF h_IP_List.FIND() = 0;
Run;
Jim
Way 2 is not just "not working well", it doesn't work at all!
Please follow Maxim 2 and read the log!
To use a WHERE= in a data step, you need an intermediate step and a macro variable:
proc sql noprint;
select distinct ip into :iplist separated by ","
from have
where customerID=1234567;
quit;
data wanted_way2;
set have (where=(ip in (&iplist.)));
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.