i have two datasets namely data1 and data2
data data1;
input x;
cards;
12
24
45
;
data data2;
input c;
cards;
10
;
i want another dataset namely data3 to contain
x c
12 10
24 10
45 10
Any help would be appreciated. I'm currently using sas 9.4 on windows.
proc sql; create table want as select data1.x, data2.c from data1,data2 ; quit;
proc sql; create table want as select data1.x, data2.c from data1,data2 ; quit;
proc sql carterian or datastep one to one merge:
data data1;
input x;
cards;
12
24
45
;
data data2;
input c;
cards;
10
;
proc sql;
create table want as
select x,c
from data1,data2;
quit;
data want;
set data1;
if _n_=1 then set data2;
run;
thank you very much 😄
Like this?
data data1;
input x;
cards;
12
24
45
;
data data2;
input c;
cards;
10
;
proc sql;
create table data3 as
select data1.*, data2.c
from data1 join data2 on c;
quit;
You can take advantage of the fact the any variables in the program data vector that result from a SET (or MERGE) statement are automatically ratained until the next iteration of a SET (or MERGE) of the same data set.
So make sure you only SET DATA2 for one iteraction of the data step:
data want;
set data1;
if _n_=1 then set data2;
run;
The above only reads in DATA2 in the first iteration of the data step, while DATA1 is read in for every iteration. The data step ends when a SET attempts to read beyond the end of an incoming data set.
And the data step method:
data want;
set data1;
if _n_=1 then set data2;
run;
@variathu wrote:
i have two datasets namely data1 and data2
data data1;
input x;
cards;
12
24
45
;
data data2;
input c;
cards;
10
;
i want another dataset namely data3 to contain
x c
12 10
24 10
45 10
Any help would be appreciated. I'm currently using sas 9.4 on windows.
Another data step approach besides the classical solution that many have mentioned:
data want;
set data1;
n=1;
set data2 point= n;
run;
I believe there is a performance problem with the direct-access method (set data2 point=n;). The first DATA _NULL_ below took 5.1 seconds on my windows machine, while the second (using the "point=" technique) took 28.8 seconds.
data data1;
do x=1 to 100000000;output;end;
run;
data data2;
c=0;
run;
data _null_;
set data1;
if _n_=1 then set data2;
run;
data _null_;
set data1;
retain n 1;
set data2 point=n;
run;
Yes, looks like the unconditional execution is the culprit?
23 data _null_; 24 set data1; 25 if _n_=1 then set data2; 26 run; NOTE: There were 100000000 observations read from the data set WORK.DATA1. NOTE: There were 1 observations read from the data set WORK.DATA2. NOTE: DATA statement used (Total process time): real time 3.43 seconds cpu time 3.40 seconds 27 28 data _null_; 29 set data1; 30 if _n_=1 then set data2 point=_n_; 31 32 run; NOTE: There were 100000000 observations read from the data set WORK.DATA1. NOTE: DATA statement used (Total process time): real time 3.48 seconds cpu time 3.44 seconds
Take the advantage of automatic retention. Read pitfalls of automatic retention in this paper (http://analytics.ncsu.edu/sesug/2015/CC-34.pdf)
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.