Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

Data Mining within group

Reply
Regular Contributor
Posts: 215

Data Mining within group

Hi All,

I have the following table. can anyone help me with the code so that I can get student list (and their course) who has taken English 1 or English 2 first.

I have written the following code, but t's not working.

data have;

set out;

if first.Course='English 2' or if first.course='English 1';

run;

Student_ID        Course

101                   English 2

101                   Math 1

101                   Math 3

102                   English 1

102                   Math 2

102                   Math 2

103                   Math 2

103                   English 1

104                   English 1

104                   Math 3

My expected output would be:

Student_ID        Course

101                   English 2

101                   Math 1

101                   Math 3

102                   English 1

102                   Math 2

102                   Math 2

104                   English 1

104                   Math 3

SAS Super FREQ
Posts: 272

Re: Data Mining within group

There might be a better way to do this, but this should work:

data have;

  set out;

  by Student_ID;

  retain write;

  if first.Student_ID then do;

    if course in ('English 1', 'English 2') then write=1;

    else write=0;

  end;

  if write then output;

  drop write;

run;

Regular Contributor
Posts: 215

Re: Data Mining within group

Hi WendyCzika,

Thanks for the help. It worked!!!

Regular Contributor
Posts: 215

Re: Data Mining within group

Hi WendyCzika,

Would you please describe me a bit how the following part of your code worked, as I want to learn so that it help me in future.

Thanks,

data have;

  set out;

  by Student_ID;

  retain write;

if first.Student_ID then do;

    if course in ('English 1', 'English 2') then write=1;

    else write=0;

  end;

  if write then output;

  drop write;

run;

SAS Super FREQ
Posts: 272

Re: Data Mining within group

Adding comments to the code, hope this helps! 

data have;

  set out;

/****************************************************************************************************************************

    Use the BY statement to identify Student_ID as a BY variable so that first.student_id and last.student_id

    temporary variables are created:

    SAS sets the value of FIRST.variable to 1 when it reads the first observation in a BY group, and sets the value of        

   LAST.variable to 1 when it reads the last observation in a BY group. These temporary variables are

    available for DATA step programming but are not added to the output data set.

****************************************************************************************************************************/

  by Student_ID;

/****************************************************************************************************************************

Specify a variable in the RETAIN statement to keep the same value from observation to the next, instead of resetting to missing at each iteration of the

DATA step - so the WRITE variable will keep the same value for all observations with the same student_ID since it is only being changed for the first obs

of each student_id

****************************************************************************************************************************/

  retain write;

/****************************************************************************************************************************

So at the first observation in each BY group (first obs for each student_ID), set the variable WRITE to 1 if the course is English 1 or English 2, otherwise

  set WRITE to 0, and keep this value for the rest of the obs with this same student_ID.

****************************************************************************************************************************/

  if first.Student_ID then do;

    if course in ('English 1', 'English 2') then write=1;

    else write=0;

  end;


/**************************************************************************************************************************** 

Output all the obs that have WRITE=1.

****************************************************************************************************************************/

  if write then output;

  drop write;

run;

Regular Contributor
Posts: 215

Re: Data Mining within group

Hi WendyCzika,

I need one more help here. Can you please tell me the code to find out the students who has taken "English 1" and "Math 3"?


output would be:


Student_ID        Course

104                   English 1

104                   Math 3

Super Contributor
Posts: 275

Re: Data Mining within group

proc sql;

      select * from have group by student_id having course='English 1'  and course='Math 3';

quit;

Super User
Posts: 9,681

Re: Data Mining within group

data have;

input Student_ID        Course & $20.;

cards;

101                   English 2

101                   Math 1

101                   Math 3

102                   English 1

102                   Math 2

102                   Math 2

103                   Math 2

103                   English 1

104                   English 1

104                   Math 3

;

run;

data want;

do until(last.Student_ID);

  set have;

  by Student_ID;

  if Course='English 1' then a=1;

  if Course='Math 3' then b=1;

end;

do until(last.Student_ID);

  set have;

  by Student_ID;

  if a and b then output;

end;

drop a b;

run;

Regular Contributor
Posts: 215

Re: Data Mining within group

Hi Xia,

Thanks for your code, it's working perfect!!! Would you please answer the following two things:

1. describe the following highlighted section how it works? It will help me understanding you code.

data want;

do until(last.Student_ID);

  set have;

  by Student_ID;

  if Course='English 1' then a=1;

  if Course='Math 3' then b=1;

end;

do until(last.Student_ID);

  set have;

  by Student_ID;

  if a and b then output;

end;

drop a b;

run;

2. If I want student who has taken 'English I' or 'Math 3' should I just write 'if a or b then output' instead of 'if a and b then output'?

Super User
Posts: 9,681

Re: Data Mining within group

1. describe the following highlighted section how it works? It will help me understanding you code.

data want;

do until(last.Student_ID);

  set have;

  by Student_ID;

  if Course='English 1' then a=1;   /*Mark  this Student_ID group has  English 1 */

  if Course='Math 3' then b=1;     /*Mark  this Student_ID group has  Math 3 */

end;

do until(last.Student_ID);

  set have;

  by Student_ID;

  if a and b then output;    /*If these two conditions are satisfied then output. NOTE: it will output the whole Student_ID group */

end;

drop a b;

run;

2. If I want student who has taken 'English I' or 'Math 3' should I just write 'if a or b then output' instead of 'if a and b then output'?

Yes. You can do that .

Xia Keshan

Ask a Question
Discussion stats
  • 9 replies
  • 693 views
  • 4 likes
  • 4 in conversation