i have two datasets namely data1 and data2
data data1;
input x;
cards;
12
24
45
;
data data2;
input c;
cards;
10
;
i want another dataset namely data3 to contain
x c
12 10
24 10
45 10
Any help would be appreciated. I'm currently using sas 9.4 on windows.
proc sql; create table want as select data1.x, data2.c from data1,data2 ; quit;
proc sql; create table want as select data1.x, data2.c from data1,data2 ; quit;
proc sql carterian or datastep one to one merge:
data data1;
input x;
cards;
12
24
45
;
data data2;
input c;
cards;
10
;
proc sql;
create table want as
select x,c
from data1,data2;
quit;
data want;
set data1;
if _n_=1 then set data2;
run;
thank you very much 😄
Like this?
data data1;
input x;
cards;
12
24
45
;
data data2;
input c;
cards;
10
;
proc sql;
create table data3 as
select data1.*, data2.c
from data1 join data2 on c;
quit;
You can take advantage of the fact the any variables in the program data vector that result from a SET (or MERGE) statement are automatically ratained until the next iteration of a SET (or MERGE) of the same data set.
So make sure you only SET DATA2 for one iteraction of the data step:
data want;
set data1;
if _n_=1 then set data2;
run;
The above only reads in DATA2 in the first iteration of the data step, while DATA1 is read in for every iteration. The data step ends when a SET attempts to read beyond the end of an incoming data set.
And the data step method:
data want;
set data1;
if _n_=1 then set data2;
run;
@variathu wrote:
i have two datasets namely data1 and data2
data data1;
input x;
cards;
12
24
45
;
data data2;
input c;
cards;
10
;
i want another dataset namely data3 to contain
x c
12 10
24 10
45 10
Any help would be appreciated. I'm currently using sas 9.4 on windows.
Another data step approach besides the classical solution that many have mentioned:
data want;
set data1;
n=1;
set data2 point= n;
run;
I believe there is a performance problem with the direct-access method (set data2 point=n;). The first DATA _NULL_ below took 5.1 seconds on my windows machine, while the second (using the "point=" technique) took 28.8 seconds.
data data1;
do x=1 to 100000000;output;end;
run;
data data2;
c=0;
run;
data _null_;
set data1;
if _n_=1 then set data2;
run;
data _null_;
set data1;
retain n 1;
set data2 point=n;
run;
Yes, looks like the unconditional execution is the culprit?
23 data _null_; 24 set data1; 25 if _n_=1 then set data2; 26 run; NOTE: There were 100000000 observations read from the data set WORK.DATA1. NOTE: There were 1 observations read from the data set WORK.DATA2. NOTE: DATA statement used (Total process time): real time 3.43 seconds cpu time 3.40 seconds 27 28 data _null_; 29 set data1; 30 if _n_=1 then set data2 point=_n_; 31 32 run; NOTE: There were 100000000 observations read from the data set WORK.DATA1. NOTE: DATA statement used (Total process time): real time 3.48 seconds cpu time 3.44 seconds
Take the advantage of automatic retention. Read pitfalls of automatic retention in this paper (http://analytics.ncsu.edu/sesug/2015/CC-34.pdf)
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.