Forgive me, I am a self-taught user so if I use language that is not the "norm," that's why. I'm using 9.2
Here's my issue. I recently needed to pull a specific set of records from a relational database. I needed phone numbers and addresses for customers who had made a complaint in late 2013 through the end of 2015. I first went into one specific table (complaints) and pulled complaint ID numbers based on the date the complaint was received (October 1, 2013 to December 31, 2015). These complaint numbers are unique identifiers that reside on most, but not all, tables in this database. Because I needed data from other tables residing in the same database, I then used the complaint ID numbers to pull data from these other tables so that I would end up with the same subset of customer complaints for each additional set of data I drew. Because there were more than 1,000 complaint IDs, I ran 999 of these IDs against the table first and created a file (person1) and then ran the remaining 222 IDs against the same table to produce a second file (person2). When I merged person1 and person2, I lost 41 records, i.e., 41 complaint IDs. I can see the IDs in my syntax. They are there but these same 41 complaints IDs and accompanying data do not show up in the resulting dataset.
Here's the syntax. Because including the 1,221 IDs would make this long and cumbersome, I didn't include them. I did reference where they resided within the syntax.
Step 1: This is the syntax to pull the complaint IDs:
data complaintid (keep=complaint_id);
set database.complaint_vw;
if date_reported >= '1oct2013:00:00:00'dt
& date_reported <='31dec2015:00:00:00'dt ;
if complaint_type ne 99;
if inactive ne 'Y';
run;
Step 2: Using the complaint IDs from Step 1 (I exported the data into Excel then copied and pasted the complaint IDs into my syntax), I start pulling records from another table.
data person1 (keep = case_number complaint_id person_id date_reported);
set database.complaint;
where complaint_id = 168
or complaint_id = 174
or complaint_id = 176
or complaint_id = 177 etc. etc., 999 in all ;
run;
data person2 (keep = case_number complaint_id person_id date_reported);
set database.complaint;
where complaint_id = 678
or complaint_id = 889
or complaint_id = 989 etc. etc., 222 in all ;
run;
proc sort data = person1;
by person_id;
run;
proc sort data = person2;
by person_id;
run;
Now, right here is where I lose my 41 records. They simply do not make it through this merge. They are in person1 before the merge.
data personfinal;
merge person1 person2;
by person_id;
run;
Any ideas? Thank you in advance.
... View more