BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
atul_desh
Quartz | Level 8

Hello,

 

I've two datasets, Lets say A and B,  there are  215 Variables in each of the dataset and datasize is 90 G.

 

I want to append B dataset into A but not to append the duplicate values which are common based on "Three Variables, not based on all 215 variable".

 

I can't use (syncadd=no uniquesave=yes) and also can't use below code, as it will check all variables for duplicity. 

 


proc sql feedback;
create table lib_w.want
select * from lib_A.dsA
union corresponding
select * from lib_B.dsB
;
quit;

 

I'm thininking to creat new varialbe as  catx(all three variable) in both A and B and then  "A.catx_var ne B.Catx_Var"

 

is there is other way around ?

 

Please help, Thanks in Advance. 

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
atul_desh
Quartz | Level 8

This is how I'm doing it now.. now it is less time consuming and giving what I wanted : 

 


proc sort data=A;
by Var1 Var2 Var3;
run;

 

proc sort data=B;
by Var1 Var2 Var3;
run;


data pre_want;
merge A (in=a) B (in=b);
byVar1 Var2 Var3;
if not a and b;
run;

proc append base=A data=pre_want;
run;

View solution in original post

7 REPLIES 7
RW9
Diamond | Level 26 RW9
Diamond | Level 26

So, you have 2 datasets which are 90gb each?  That is a really large amount of data.  Do you want to drop data from A or B based on being in both?  As a suggestion, you could get a distinct list of the 3 variables from the table you want to keep data, and then drop from the other before doing a proc append:

proc sort data=a out=distlist nodupkey;
  by var1 var2 var3;
run;

data _null_;
  set distlist end=last;
  if _n_=1 then call execute('data b_proc; set b;');
  call execute('if var1="'||strip(var1)||'" and var2="'||strip(var2)||'" and var3="'||strip(var3)||'" then delete;');
  if last then call execute('run;');
run;

proc append base=a data=b_proc force;
run;

This will create a big datastep with lots of if statements to drop the data from B based on a distinct list of var1-var3 from A.  

 

 

 

 

atul_desh
Quartz | Level 8

what if I append all the observation and in last I do nodupkey ?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Theres only one way to know, try it.  The proc append literally should just drop the header block and tag the data onto the existing dataset, so that shouldn't matter either way, but the proc sort/datastep is where the real work is being done, so anything to minimize that.

Ksharp
Super User

Did you try Hash Table ? Or try SQL.

 

proc sql;
create table only_in_B as
select var1,var2,var3  from B
except
select var1,var2,var3 from A ;
quit;
atul_desh
Quartz | Level 8
how can use only_in_b for append b to a ??
atul_desh
Quartz | Level 8

This is how I'm doing it now.. now it is less time consuming and giving what I wanted : 

 


proc sort data=A;
by Var1 Var2 Var3;
run;

 

proc sort data=B;
by Var1 Var2 Var3;
run;


data pre_want;
merge A (in=a) B (in=b);
byVar1 Var2 Var3;
if not a and b;
run;

proc append base=A data=pre_want;
run;

s_lassen
Meteorite | Level 14

@atul_desh: You could try with a hash table for checking if a row is in table A (assuming your key variables are called VAR1, VAR2 and VAR3):

Data A; /* we are modifying A in place */
  if 0 then modify A; /* we are not reading any obs. here */
  if _N_=1 then do;
    declare hash h(dataset: 'A(keep=var1 var2 var3)');
    rc=h.definekey('var1','var2','var3');
    h.definedone();
    end;
  set B;
  if h.find() then
    output A; /* if key not found, append to A */
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 3188 views
  • 0 likes
  • 4 in conversation