
Data missing in output for "NO" values

Frequent Contributor
Posts: 104

Good morning,

 

This is my program for a diabetes screening. Towards the end of the program I assign "YES" or "NO" to people who have or have not had a diabetes screening. For the enrollees assigned "YES", the output shows all the information pertaining to the individual. For those assigned "NO", the output contains only the "NO" and nothing else. I'm not sure why full data comes in for the "YES" records while the other variables/columns are empty for the "NO" records. Any help you can provide is greatly appreciated.

 

I attached the log and an example of what the output looks like.

 

Thank you again.

Super User
Posts: 9,913

Re: Data missing in output for "NO" values

Start by inspecting the two datasets that go into that last data step.

If that does not immediately provide a clue, post samples of both (in a data step, see my footnotes).

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Super User
Posts: 13,330

Re: Data missing in output for "NO" values



Apparently the relevant code is:

DATA NUMERATOR;
   MERGE HEDIS.DENOM2C (IN=A)
         HEDIS.SCREENING (IN=B);
   BY EDIPN;
   IF A;
   IF B THEN DIABSCREEN="Y";
   ELSE DIABSCREEN="N";
RUN;

Your statement:

 

IF A;

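means that the only records kept are those that come from the data set flagged IN=A (HEDIS.DENOM2C), whether or not a matching Edipn exists in HEDIS.SCREENING.

So it is very likely that your concern with "NO" (actually "N") is that those Edipn values have no match in HEDIS.SCREENING, which leaves every variable contributed only by that data set missing.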
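
One way to confirm this before looking at individual records is to count how many Edipn values fall into each match category. The following is only a sketch: it assumes both data sets are already sorted by EDIPN (as the original merge requires), and the names work.match_check and source are made up for illustration.

```sas
/* Sketch only: classify every EDIPN by where it was found.          */
/* work.match_check and source are hypothetical names.               */
data work.match_check;
   merge HEDIS.DENOM2C   (in=A keep=EDIPN)
         HEDIS.SCREENING (in=B keep=EDIPN);
   by EDIPN;
   length source $14;
   if A and B then source = "both";
   else if A  then source = "DENOM2C only";
   else            source = "SCREENING only";
run;

/* One row per category with its record count */
proc freq data=work.match_check;
   tables source;
run;
```

Records counted as "DENOM2C only" are the ones that would end up with DIABSCREEN="N" and with missing values for anything that exists only in HEDIS.SCREENING.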

 

But since no actual input data is included at any point, and since nearly 20 data steps or SQL steps manipulate the data, I am not even going to attempt to trace anything prior to that.

 

Select some values of Edipn from your numerator data set. Print the values from both HEDIS.DENOM2C and HEDIS.SCREENING for those values.

If there aren't any values in HEDIS.SCREENING for them, that's your issue: no matches.
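
The check described above could look something like this. It is a sketch under assumptions: work.sample_n is a hypothetical scratch table, the sample size of 10 is arbitrary, and NUMERATOR is assumed to still be available in the WORK library.

```sas
/* Sketch only: grab a few EDIPN values that were flagged "N".       */
/* OUTOBS= stops after 10 rows and writes a warning to the log.      */
proc sql outobs=10;
   create table work.sample_n as
   select EDIPN
   from NUMERATOR
   where DIABSCREEN = "N";
quit;

/* Print whatever each source data set holds for those EDIPN values */
proc sql;
   title "HEDIS.DENOM2C records for sampled EDIPN values";
   select *
   from HEDIS.DENOM2C
   where EDIPN in (select EDIPN from work.sample_n);

   title "HEDIS.SCREENING records for sampled EDIPN values";
   select *
   from HEDIS.SCREENING
   where EDIPN in (select EDIPN from work.sample_n);
quit;
title;
```

If the second query reports that no rows were selected, the sampled enrollees have nothing in HEDIS.SCREENING to bring in.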

Frequent Contributor
Posts: 104

Re: Data missing in output for "NO" values

Right, but the other variables are not coming in for all the "N" values. Someone QA'd it and found it strange that no other variables are there except the EDIPN numbers.
Super User
Posts: 9,913

Re: Data missing in output for "NO" values

Without seeing the data for that last step, we can only make guesses.

Super User
Posts: 13,330

Re: Data missing in output for "NO" values


@essdee wrote:
Right, but the other variables are not coming in for all the "N" values. Someone QA'd it and found it strange that no other variables are there except the EDIPN numbers.

Show the results of the Proc Prints that I suggested. If there are no matches in the second set then those values cannot be brought in because they do not exist.

 

An example of what may be going on with your data:

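/* First data set: plays the role of the IN=A (denominator) set */
data work.junk1;
   input id val1 val2;
datalines;
1 2 3
2 4 5
5 6 7
;
run;

/* Second data set: none of its id values appear in work.junk1 */
data work.junk2;
   input id val3 val4;
datalines;
6 6 6
7 7 7
;
run;

/* Keep every record from work.junk1 and flag whether work.junk2 matched */
data work.out;
   merge work.junk1 (in=A)
         work.junk2 (in=B)
   ;
   by id;
   if A;
   if B then newvar='Y';
   else newvar='N';
run;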

The final data has missing values for the variables from the in=B set because there are no matches of the by variable between the two data sets. The variable Newvar is set to 'N' because B is never true for the data kept by the IF A; statement in this case.
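
To see the contrast, here is a small extension of the example above. It is only an illustration: work.junk3 and work.out2 are made-up names, and the step assumes work.junk1 from the example already exists.

```sas
/* Result of the example: ids 1, 2 and 5 are kept,                   */
/* val3 and val4 are missing on every row, and newvar is 'N'.        */
proc print data=work.out;
run;

/* Hypothetical second set that DOES share an id (2) with junk1 */
data work.junk3;
   input id val3 val4;
datalines;
2 8 9
6 6 6
;
run;

data work.out2;
   merge work.junk1 (in=A)
         work.junk3 (in=B)
   ;
   by id;
   if A;
   if B then newvar='Y';
   else newvar='N';
run;

/* Only id 2 gets val3=8, val4=9 and newvar='Y';                     */
/* ids 1 and 5 still have missing val3/val4 and newvar='N'.          */
proc print data=work.out2;
run;
```

The variables from the in=B set are filled in only on the rows where the by variable matched, which is the same pattern as "YES" rows carrying full data and "N" rows carrying missing values.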

 

Frequent Contributor
Posts: 104

Re: Data missing in output for "NO" values

I just realized my attachments don't appear to be showing.