Dear all,
supposing I have a table A and B as shown below
data A;
infile datalines;
input patientID $ sex $ diagnosis $8. therapie $15. age;
datalines;
MD10 F cancer radiotherapy 18
MD56 F Athritis electrotherapy 25
MD77 M headache medication 35
MD15 M stroke reanimation 60
;
run;
data B;
infile datalines;
input patientID $ sex $ diagnosis $9. therapie $16. age hobbys $9. pre_illness $4.;
datalines;
MD12 M cancer operation 58 cinema yes
MD66 F allergies medication 95 reading no
MD17 F headache medication 88 swimming
MD11 M Cold medication 60 hockey yes
MD10 F cancer radiotherapy 18 football no
MD56 F Athritis electrotherapy 25 no
MD77 M headache medication 35 badminton yes
MD15 M stroke reanimation 60 cooking
;
run;
and I wish to search for the patients with the same sex, diagnosis , therapy and age in table B using a macro (preferably proc sql) without having to type in the values manually (I mean using a sort of iteration macro). How can I do that?
It's not clear what you mean by
patients with the same sex, diagnosis , therapy and age
Do you mean that they have to have ALL four of those variables EXACTLY match in both tables? It would be helpful if you provided the desired output from these two data sets.
It's also not clear why you are saying the solution has to contain macros, as this is often a requirement that un-necessarily complicates the coding in situations where you can achieve the desired output without macros.
Here is my attempt to select records from B that match A without macros (pending on your clarification of the issues above)
proc sql;
create table c as select b.* from
b right join a
on b.sex=a.sex and b.diagnosis=a.diagnosis and b.therapie=a.therapie and b.age=a.age;
quit;
It's not clear what you mean by
patients with the same sex, diagnosis , therapy and age
Do you mean that they have to have ALL four of those variables EXACTLY match in both tables? It would be helpful if you provided the desired output from these two data sets.
It's also not clear why you are saying the solution has to contain macros, as this is often a requirement that un-necessarily complicates the coding in situations where you can achieve the desired output without macros.
Here is my attempt to select records from B that match A without macros (pending on your clarification of the issues above)
proc sql;
create table c as select b.* from
b right join a
on b.sex=a.sex and b.diagnosis=a.diagnosis and b.therapie=a.therapie and b.age=a.age;
quit;
What would the output look like?
When you say "same age" does that have to be exactly the same or within a range?
How many records are in your sets? Matching on 4 requirements may mean there are no exact matches if your data sets are not large.
How consistent is the spelling of the therapy and diagnosis? Even you small examples show that case may not be consistent which makes matching a bit more problematic.
Hi @ballardw @PaigeMiller @Reeza
Thanks for the quick reply. It's all about a research project in which a set of patients (with a specified characteristics are to be evaluated).
Data A shows how these characteristics are defined (For example the patient, must be a female, be a cancer patient, must have received a radiotherapy and should be 18 years old).
Data B is a hospital DB (with about 100 000 datasets) in which I want to search for the patients with such defined characteristics. That is the idea behind my question.
You haven't really specifically addressed the issues raised by @ballardw or myself, you just gone ahead and provided the same information in different words (and you have informed us the true size of data set B).
Does the code I provided do what you want?
Please show us your desired result given your (very nicely posted) sample data.
@PeterClemmensen @PaigeMiller , the desired results should look like @PaigeMiller 's output, only that the comparism shouldn't be exactly 1 to 1, let's say range or similar diagnosis/therapy spelling should do.
@Anita_n wrote:
@PeterClemmensen @PaigeMiller , the desired results should look like @PaigeMiller 's output, only that the comparism shouldn't be exactly 1 to 1, let's say range or similar diagnosis/therapy spelling should do.
Please be more specific ... what "range" do you mean?
We can't help you write code if you are not specific and tell us in detail exactly what you want.
proc sql;
create table want as
select * from b
where catx(' ',sex,diagnosis,therapie,age) in
(select catx(' ',sex,diagnosis,therapie,age) from a);
quit;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.