Hi,
I have a base table of a million or above rows which has 4 variables as provided below.
i wish to update the date and name column in the base table using the values provided in the small_table. The join is to be on the ID and the SUB_ID variables.
My base table has a composite index defined on it hence i would require a update solution which would not delete the index.
Could someone guide me in the correct direction.
DATA BASE_TABLE;
INFILE DATALINES DELIMITER = '|';
INPUT ID1 : $10. SUB_ID : $10. NAME : $30. DATE : DATE9.
DATALINES;
AA|123|NAME1|01JAN2012
AA|134|NAME2|01JAN2012
AB|123|NAME3|01JAN2012
AA|145|NAME4|01JAN2012
;
RUN;
DATA SMALL_TABLE;
INPUT ID1 : $10. SUB_ID : $10. NAME : $30. DATE : DATE9.
DATALINES;
AA|134|NAME6|01MAR2012
AB|123|NAME7|10MAR2012
;
RUN;
Two methods that will maintain the index :
DATA BASE(index=(myIndex=(ID1 SUB_ID)));
INFILE DATALINES DELIMITER = '|';
INPUT ID1 : $10. SUB_ID : $10. NAME : $30. DATE : DATE9. ;
DATALINES;
AA|123|NAME1|01JAN2012
AA|134|NAME2|01JAN2012
AB|123|NAME3|01JAN2012
AA|145|NAME4|01JAN2012
;
DATA SMALL;
INFILE DATALINES DELIMITER = '|';
INPUT ID1 : $10. SUB_ID : $10. NAME : $30. DATE : DATE9. ;
DATALINES;
AA|134|NAME6|01MAR2012
AB|123|NAME7|10MAR2012
;
proc sql;
select * from base;
update base as B
set name=(select name from small as S where B.ID1=S.ID1 and B.SUB_ID=S.SUB_ID),
date=(select date from small as S where B.ID1=S.ID1 and B.SUB_ID=S.SUB_ID)
where exists (select * from small as S where B.ID1=S.ID1 and B.SUB_ID=S.SUB_ID);
select * from base;
quit;
or
data base;
set small(rename=(name=newName date=newDate));
modify base key=myIndex;
name=newname;
date=newDate;
if _iorc_=0 then replace;
run;
proc print data=base; run;
PG
Two methods that will maintain the index :
DATA BASE(index=(myIndex=(ID1 SUB_ID)));
INFILE DATALINES DELIMITER = '|';
INPUT ID1 : $10. SUB_ID : $10. NAME : $30. DATE : DATE9. ;
DATALINES;
AA|123|NAME1|01JAN2012
AA|134|NAME2|01JAN2012
AB|123|NAME3|01JAN2012
AA|145|NAME4|01JAN2012
;
DATA SMALL;
INFILE DATALINES DELIMITER = '|';
INPUT ID1 : $10. SUB_ID : $10. NAME : $30. DATE : DATE9. ;
DATALINES;
AA|134|NAME6|01MAR2012
AB|123|NAME7|10MAR2012
;
proc sql;
select * from base;
update base as B
set name=(select name from small as S where B.ID1=S.ID1 and B.SUB_ID=S.SUB_ID),
date=(select date from small as S where B.ID1=S.ID1 and B.SUB_ID=S.SUB_ID)
where exists (select * from small as S where B.ID1=S.ID1 and B.SUB_ID=S.SUB_ID);
select * from base;
quit;
or
data base;
set small(rename=(name=newName date=newDate));
modify base key=myIndex;
name=newname;
date=newDate;
if _iorc_=0 then replace;
run;
proc print data=base; run;
PG
Thanks a lot...
The datastep solution works for me.. for some reason the sql process just takes too much time. however the datastep works a lot faster...not sure why..
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.