Solved: Re: Do loop to keep only first observations

LEINAARE · Posted 12-14-2018 10:26 AM

Hello,

I have a dataset that contains numerous observations for each unique id (i.e. 'orsid'). I am preparing my data to run a program that selects Medicaid claims data specific to family planning. In order for eligible enrollees who do not have claims data in a given year to be included in the denominator for calculations, I am supposed to retain one observation for that individual, and set the date to January 1 of the year.

I know that my use of 'first.orsid' in the code below is incorrect, but I am including it to demonstrate the logic that I have been trying to use. Could anyone help me to devise a strategy for retaining the first observation and changing the date only for individuals who do not have claims records for that year? If an individual (orsid) has no claims data for the year, then they would have 12 observations for that year (Jan 01 - Dec 01) in the input dataset. If they do have claims data, they would have many more observations than that.

data claim12b;
    set claim12;
    by orsid;
    if (icd1='' and icd2='' and icd3='' and icd4='' and icd5='')
        and (prc1='' and prc2='' and prc3='')
        and (ndc1='' and ndc2='' and ndc3='' and ndc4='' and ndc5='' and ndc6='')
        then do;
            keep first.orsid;
            date="01jan2012"d;
        end;
run;

I know this is incorrect in two ways. First, even if the logic worked, it would evaluate each observation on its own, rather than reading all 12 observations for the unique ID and determining whether the individual had no claims during the course of 12 months. Second, the 'keep first.orsid' statement does not work.

I greatly appreciate any guidance you may offer.

Thank you,

Ted

PaigeMiller · Posted 12-14-2018 10:30 AM

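data claim12b;
    set claim12;
    by orsid;   /* claim12 must be sorted by orsid */
    /* Observation carries no diagnosis, procedure, or drug codes */
    if (icd1='' and icd2='' and icd3='' and icd4='' and icd5='')
        and (prc1='' and prc2='' and prc3='')
        and (ndc1='' and ndc2='' and ndc3='' and ndc4='' and ndc5='' and ndc6='')
        then do;
            /* Keep a code-free observation only when it is the first one for the orsid */
            if first.orsid then do;
                date="01jan2012"d;
                output;
            end;
        end;
    /* Observations with at least one code are written out unchanged */
    else output;
run;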

--
Paige Miller
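
The code above tests each observation on its own, so it does not by itself establish that an orsid had no claims across the whole year, which is the first concern raised in the question. As for the second concern, KEEP selects variables rather than observations, and first.orsid is a temporary variable that is never written to the output dataset. A minimal sketch of a group-level check follows, using two passes over each BY group. It rests on several assumptions that are not confirmed in the thread: claim12 is sorted by orsid (and by date within orsid), the fourteen code variables are the only indicators of a claim, and enrollees who do have claims should keep all of their observations unchanged. The variable has_claim is a hypothetical helper name.

```sas
data claim12b;
    /* Pass 1: read every observation for the current orsid and
       note whether any of the 14 code variables is non-missing */
    has_claim = 0;   /* hypothetical helper flag, dropped below */
    do until (last.orsid);
        set claim12;
        by orsid;
        if cmiss(of icd1-icd5, of prc1-prc3, of ndc1-ndc6) < 14 then has_claim = 1;
    end;

    /* Pass 2: re-read the same observations and decide what to write */
    do until (last.orsid);
        set claim12;
        by orsid;
        /* Assumption: enrollees with claims keep all observations as they are */
        if has_claim then output;
        /* Enrollees without claims keep one observation, dated January 1 */
        else if first.orsid then do;
            date = "01jan2012"d;
            output;
        end;
    end;

    drop has_claim;
run;
```

Because both SET statements read claim12 in the same BY order, the second loop revisits exactly the observations that the first loop just scanned. If enrollees with claims should instead lose their code-free observations, the first branch of the second loop would need an additional per-observation test.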

LEINAARE · Posted 12-14-2018 01:25 PM

Hi @PaigeMiller,

Thank you for offering a solution to this problem!

Ted
