proc sql; create table want as select t1.* from t1 where id in (select distinct id from t2) ; quit;
Replace table and column names as needed
Depending on dataset size, a DATA step with hash approach might perform better.
Thanks @PeterClemmensen , @Kurt_Bremser and @PGStats for your response.
I tried the proc sql as you advised but the code never stop running even for an entire day.
I wonder if it is due to the huge datasets i.e 10000k rows of data.
What other ways would you suggest please?
I am completely new to hash object so I don't understand how to write the syntax.
Thanking you in advance.
Then a hash approach should be tried.
Which would look like this:
data want; set t1; if _n_ = 1 then do; declare hash t2 (dataset:("t2"); t2.definekey("id","date"); t2.definedone(); end; if t2.check() = 0; run;
This will work as long as the two variables * observations from t2 fit into memory.
How do we write the proc sql to retrieve only the observation rows when the same ID and the same date are present in both the data please?
Thanking you all in advance.
Why would you use SQL for such a simple request?
data want ; merge dataset1 (in=in1) dataset2(in=in2); by id date; if in1 and in2; run;
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.