I have a data set1:
ID TIME PRED new_study
1 0 0 1
1 0.25 1.1063 1
1 0.5 5.1534 1
2 0 0 1
2 0.25 1.0462 1
2 0.5 4.8104 1
1 0 0 2
1 0.25 1.1063 2
1 0.5 5.1534 2
2 0 0 2
2 0.25 1.0462 2
2 0.5 4.8104 2
and a data set2
id time ratio
1 0 0
1 0.25 0.937629938
1 0.5 0.925815966
2 0 0
2 0.25 0.991493022
2 0.5 0.991830201
I would like to merge these data sets into a data set final (example below) in which the same ratio values from dataset2 are repeated for each subject in each new_study
ID TIME PRED new_study ratio
1 0 0 1 0
1 0.25 1.1063 1 0.937629938
1 0.5 5.1534 1 0.925815966
2 0 0 1 0
2 0.25 1.0462 1 0.991493022
2 0.5 4.8104 1 0.991830201
1 0 0 2 0
1 0.25 1.1063 2 0.937629938
1 0.5 5.1534 2 0.925815966
2 0 0 2 0
2 0.25 1.0462 2 0.991493022
2 0.5 4.8104 2 0.991830201
I have tried some merges but each time the study and ratios get sorted which is not what I want. I need the newstudy and ratio values not to be sorted. I have included my simple merge code.
proc sort data=data1; by id time;run;
proc sort data=data2; by id time; run;
Data final;
merge data1 data2;
by id time;
run;
data have;
input ID TIME PRED new_study;
datalines;
1 0 0 1
1 0.25 1.1063 1
1 0.5 5.1534 1
2 0 0 1
2 0.25 1.0462 1
2 0.5 4.8104 1
1 0 0 2
1 0.25 1.1063 2
1 0.5 5.1534 2
2 0 0 2
2 0.25 1.0462 2
2 0.5 4.8104 2
;
data have1;
input id time ratio;
datalines;
1 0 0
1 0.25 0.937629938
1 0.5 0.925815966
2 0 0
2 0.25 0.991493022
2 0.5 0.991830201
;
data final;
if _N_ = 1 then do;
if 0 then do;
set have;
set have1;
end;
declare hash myhash(dataset:'have1', multidata:'yes' );
myhash.defineKey('id','time');
myhash.defineData('ratio');
myhash.defineDone( );
end;
set have;
if myhash.find() ne 0 then ratio=0;
run;
Most simple is hash find method. Is it ok for you to implement hash?
So far, you're doing the right thing (although using PROC SQL might simplify the process). All you are missing is a final PROC SORT to put the data back into the right order:
proc sort data=final;
by new_study id time;
run;
data have;
input ID TIME PRED new_study;
datalines;
1 0 0 1
1 0.25 1.1063 1
1 0.5 5.1534 1
2 0 0 1
2 0.25 1.0462 1
2 0.5 4.8104 1
1 0 0 2
1 0.25 1.1063 2
1 0.5 5.1534 2
2 0 0 2
2 0.25 1.0462 2
2 0.5 4.8104 2
;
data have1;
input id time ratio;
datalines;
1 0 0
1 0.25 0.937629938
1 0.5 0.925815966
2 0 0
2 0.25 0.991493022
2 0.5 0.991830201
;
data final;
if _N_ = 1 then do;
if 0 then do;
set have;
set have1;
end;
declare hash myhash(dataset:'have1', multidata:'yes' );
myhash.defineKey('id','time');
myhash.defineData('ratio');
myhash.defineDone( );
end;
set have;
if myhash.find() ne 0 then ratio=0;
run;
I made an error in coding in your response. When corrected it worked perfectly by listing all of the studies..
Thanks for the help.
It performed the sort as you stated but it only output new_study1 there was no new_study 2 output.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.