BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
variathu
Fluorite | Level 6

i have two datasets namely data1 and data2

 

data data1;

input x;

cards;

12

24

45

;

data data2;

input c;

cards;

10

;

 

i want another dataset namely data3 to contain

   x     c

  12   10

  24   10

  45   10

 

Any help would be appreciated. I'm currently using sas 9.4 on windows.

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User
proc sql;
  create table want as
  select data1.x, data2.c
  from data1,data2
  ;
quit;

View solution in original post

10 REPLIES 10
ballardw
Super User
proc sql;
  create table want as
  select data1.x, data2.c
  from data1,data2
  ;
quit;
novinosrin
Tourmaline | Level 20

proc sql carterian or datastep one to one merge:

 


data data1;
input x;
cards;
12
24
45
;
data data2;
input c;
cards;
10
;

proc sql;
create table want as
select x,c
from data1,data2;
quit;

 

data want;
set data1;
if _n_=1 then set data2;
run;

PeterClemmensen
Tourmaline | Level 20

Like this?

 

data data1;
input x;
cards;
12
24
45
;
data data2;
input c;
cards;
10
;

proc sql;
   create table data3 as
   select data1.*, data2.c
   from data1 join data2 on c;
quit;
mkeintz
PROC Star

You can take advantage of the fact the any variables in the program data vector that result from a SET (or MERGE) statement are automatically ratained until the next iteration of a SET (or MERGE) of the same data set.

 

So make sure you only SET DATA2 for one iteraction of the data step:

 

data want; 
  set data1;
  if _n_=1 then set data2;
run;

 

The above only reads in DATA2 in the first iteration of the data step, while DATA1 is read in for every iteration.  The data step ends when a SET attempts to read beyond the end of an incoming data set.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Reeza
Super User

And the data step method:

 

data want;
set data1;
if _n_=1 then set data2;
run;

@variathu wrote:

i have two datasets namely data1 and data2

 

data data1;

input x;

cards;

12

24

45

;

data data2;

input c;

cards;

10

;

 

i want another dataset namely data3 to contain

   x     c

  12   10

  24   10

  45   10

 

Any help would be appreciated. I'm currently using sas 9.4 on windows.


 

Haikuo
Onyx | Level 15

Another data step approach besides the classical solution that many have mentioned:

 

data want;
set data1;
n=1;
set data2 point= n;
run;
mkeintz
PROC Star

I believe there is a performance problem with the direct-access method (set data2 point=n;).  The first DATA _NULL_ below took 5.1 seconds on my windows machine, while the second (using the "point=" technique) took 28.8 seconds.

 

data data1;
  do x=1 to 100000000;output;end;
run;
data data2;
  c=0;
run;

data _null_;
  set data1;
  if _n_=1 then set data2;
run;

data _null_;
  set data1;
  retain n 1;
  set data2 point=n;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Haikuo
Onyx | Level 15

Yes, looks like the unconditional execution is the culprit?

23         data _null_;
24           set data1;
25           if _n_=1 then set data2;
26         run;

NOTE: There were 100000000 observations read from the data set WORK.DATA1.
NOTE: There were 1 observations read from the data set WORK.DATA2.
NOTE: DATA statement used (Total process time):
      real time           3.43 seconds
      cpu time            3.40 seconds
      

27         
28         data _null_;
29           set data1;
30         if _n_=1 then  set data2 point=_n_;
31         
32         run;

NOTE: There were 100000000 observations read from the data set WORK.DATA1.
NOTE: DATA statement used (Total process time):
      real time           3.48 seconds
      cpu time            3.44 seconds
SAS_inquisitive
Lapis Lazuli | Level 10

Take the advantage of automatic retention. Read pitfalls of automatic retention in this paper (http://analytics.ncsu.edu/sesug/2015/CC-34.pdf)

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 948 views
  • 7 likes
  • 8 in conversation