Hi,
I have a base table of a million or above rows which has 4 variables as provided below.
i wish to update the date and name column in the base table using the values provided in the small_table. The join is to be on the ID and the SUB_ID variables.
My base table has a composite index defined on it hence i would require a update solution which would not delete the index.
Could someone guide me in the correct direction.
DATA BASE_TABLE;
INFILE DATALINES DELIMITER = '|';
INPUT ID1 : $10. SUB_ID : $10. NAME : $30. DATE : DATE9.
DATALINES;
AA|123|NAME1|01JAN2012
AA|134|NAME2|01JAN2012
AB|123|NAME3|01JAN2012
AA|145|NAME4|01JAN2012
;
RUN;
DATA SMALL_TABLE;
INPUT ID1 : $10. SUB_ID : $10. NAME : $30. DATE : DATE9.
DATALINES;
AA|134|NAME6|01MAR2012
AB|123|NAME7|10MAR2012
;
RUN;
Two methods that will maintain the index :
DATA BASE(index=(myIndex=(ID1 SUB_ID)));
INFILE DATALINES DELIMITER = '|';
INPUT ID1 : $10. SUB_ID : $10. NAME : $30. DATE : DATE9. ;
DATALINES;
AA|123|NAME1|01JAN2012
AA|134|NAME2|01JAN2012
AB|123|NAME3|01JAN2012
AA|145|NAME4|01JAN2012
;
DATA SMALL;
INFILE DATALINES DELIMITER = '|';
INPUT ID1 : $10. SUB_ID : $10. NAME : $30. DATE : DATE9. ;
DATALINES;
AA|134|NAME6|01MAR2012
AB|123|NAME7|10MAR2012
;
proc sql;
select * from base;
update base as B
set name=(select name from small as S where B.ID1=S.ID1 and B.SUB_ID=S.SUB_ID),
date=(select date from small as S where B.ID1=S.ID1 and B.SUB_ID=S.SUB_ID)
where exists (select * from small as S where B.ID1=S.ID1 and B.SUB_ID=S.SUB_ID);
select * from base;
quit;
or
data base;
set small(rename=(name=newName date=newDate));
modify base key=myIndex;
name=newname;
date=newDate;
if _iorc_=0 then replace;
run;
proc print data=base; run;
PG
Two methods that will maintain the index :
DATA BASE(index=(myIndex=(ID1 SUB_ID)));
INFILE DATALINES DELIMITER = '|';
INPUT ID1 : $10. SUB_ID : $10. NAME : $30. DATE : DATE9. ;
DATALINES;
AA|123|NAME1|01JAN2012
AA|134|NAME2|01JAN2012
AB|123|NAME3|01JAN2012
AA|145|NAME4|01JAN2012
;
DATA SMALL;
INFILE DATALINES DELIMITER = '|';
INPUT ID1 : $10. SUB_ID : $10. NAME : $30. DATE : DATE9. ;
DATALINES;
AA|134|NAME6|01MAR2012
AB|123|NAME7|10MAR2012
;
proc sql;
select * from base;
update base as B
set name=(select name from small as S where B.ID1=S.ID1 and B.SUB_ID=S.SUB_ID),
date=(select date from small as S where B.ID1=S.ID1 and B.SUB_ID=S.SUB_ID)
where exists (select * from small as S where B.ID1=S.ID1 and B.SUB_ID=S.SUB_ID);
select * from base;
quit;
or
data base;
set small(rename=(name=newName date=newDate));
modify base key=myIndex;
name=newname;
date=newDate;
if _iorc_=0 then replace;
run;
proc print data=base; run;
PG
Thanks a lot...
The datastep solution works for me.. for some reason the sql process just takes too much time. however the datastep works a lot faster...not sure why..
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.