i have two datasets namely data1 and data2
data data1;
input x;
cards;
12
24
45
;
data data2;
input c;
cards;
10
;
i want another dataset namely data3 to contain
x c
12 10
24 10
45 10
Any help would be appreciated. I'm currently using sas 9.4 on windows.
proc sql; create table want as select data1.x, data2.c from data1,data2 ; quit;
proc sql; create table want as select data1.x, data2.c from data1,data2 ; quit;
proc sql carterian or datastep one to one merge:
data data1;
input x;
cards;
12
24
45
;
data data2;
input c;
cards;
10
;
proc sql;
create table want as
select x,c
from data1,data2;
quit;
data want;
set data1;
if _n_=1 then set data2;
run;
thank you very much 😄
Like this?
data data1;
input x;
cards;
12
24
45
;
data data2;
input c;
cards;
10
;
proc sql;
create table data3 as
select data1.*, data2.c
from data1 join data2 on c;
quit;
You can take advantage of the fact the any variables in the program data vector that result from a SET (or MERGE) statement are automatically ratained until the next iteration of a SET (or MERGE) of the same data set.
So make sure you only SET DATA2 for one iteraction of the data step:
data want;
set data1;
if _n_=1 then set data2;
run;
The above only reads in DATA2 in the first iteration of the data step, while DATA1 is read in for every iteration. The data step ends when a SET attempts to read beyond the end of an incoming data set.
And the data step method:
data want;
set data1;
if _n_=1 then set data2;
run;
@variathu wrote:
i have two datasets namely data1 and data2
data data1;
input x;
cards;
12
24
45
;
data data2;
input c;
cards;
10
;
i want another dataset namely data3 to contain
x c
12 10
24 10
45 10
Any help would be appreciated. I'm currently using sas 9.4 on windows.
Another data step approach besides the classical solution that many have mentioned:
data want;
set data1;
n=1;
set data2 point= n;
run;
I believe there is a performance problem with the direct-access method (set data2 point=n;). The first DATA _NULL_ below took 5.1 seconds on my windows machine, while the second (using the "point=" technique) took 28.8 seconds.
data data1;
do x=1 to 100000000;output;end;
run;
data data2;
c=0;
run;
data _null_;
set data1;
if _n_=1 then set data2;
run;
data _null_;
set data1;
retain n 1;
set data2 point=n;
run;
Yes, looks like the unconditional execution is the culprit?
23 data _null_;
24 set data1;
25 if _n_=1 then set data2;
26 run;
NOTE: There were 100000000 observations read from the data set WORK.DATA1.
NOTE: There were 1 observations read from the data set WORK.DATA2.
NOTE: DATA statement used (Total process time):
real time 3.43 seconds
cpu time 3.40 seconds
27
28 data _null_;
29 set data1;
30 if _n_=1 then set data2 point=_n_;
31
32 run;
NOTE: There were 100000000 observations read from the data set WORK.DATA1.
NOTE: DATA statement used (Total process time):
real time 3.48 seconds
cpu time 3.44 seconds
Take the advantage of automatic retention. Read pitfalls of automatic retention in this paper (http://analytics.ncsu.edu/sesug/2015/CC-34.pdf)
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.