Can someone help me build a complex query?
I want to go thru all inpatient(ABC files) and outpatient (DEF files) files that we have. They are stored with same name and then FYNN. I need 10 years worth.
So I need it to go thrru each inpatient file by year and see if the ssn matches one in a file of 200k numbers. If it matches, then I want it to look thru a group of 10 diagnoses and exclude any records with dx 345, 346, or 347.
I want to end up with all the records that match the ssn's I was looking for that do NOt have the excluded diagnoses.
So how to do the looping for the years and possibly an array of the diagnoses...?
1) use a SAS format for your look-up on SSN values.
2) you can use the INTNX and INTCK functions in the SAS DATA step to increment (loop) through the time-period dates (year) - a DO / END loop could work, depending on your SAS data variable values.
3) if you need it, you can pre-load an array with constant values, and use a DATA step code construct to find a match, like below:
DO I=1 TO DIM();
* your match-test-logic goes here. ;
Suggest you post a small example of your input data, code up a SAS DATA step to read the DATALINES (sample input), creating a SAS file, then begin putting together your overall program, one small component at a time, and then post a reply to the forum subscribers for feedback and guidance.
It looks like you just have 20 files to examine. It may be easier to hard-code the concatenation than to build the looping code.
The ARRAY approach is the easiest way to go through the 10 diagnosis coded. In the loop, set a flag for the exclusion; outside the loop, examine the flag to decide to output the record or not.
Your expressed request, and this approach, will NOT exclude persons with these diagnoses on ANY encounter. To do that, output an ID list of the subset with the diagnosis on any encounter and re-merge that with the original set of hits. That is an easy SQL code:
SELECT * FROM keepers
WHERE ssn NOT IN (SELECT DISTINCT ssn FROM excludes)
I am trying to make the code as efficient as possible too, which is why I thought it might be better to loop through the years and accumulate everything at the end.
I am hoping that I can create a model that we can use repetitvely as well.
As suggested previously, explore using a DO / END loop to output SAS (numeric) DATE variables, using the INTNX function - see the SAS support http://support.sas.com/ website supplemental technical paper (link below) on this topic.