Help using Base SAS procedures

Help with data manipulation in data step

Occasional Contributor
Posts: 7

Help with data manipulation in data step

Hi All,

I am trying to link records in my data by whether a few conditions apply. I want to create a new variable to determine contact with an infected classmate. I have my data so that for every day a student is in a particular school there is a separate record. I want to create a new variable "contact" that indicates whether a student was at the same school on the same day as any of the infected students (does not matter if it was one infected student or 10). I also need a way to prevent it saying it was in contact with an infected student if the infected student was himself (I hope that makes sense).

Please help!

Thank you in advance.

Posts: 22

Re: Help with data manipulation in data step

I may be mis-reading your question and if so, I apologize. But it sounds like you can have a dataset (healthy) with variables     (school, student, date-attended) and another dataset (infected) with variables (school,infected-student,date-attended) then do a merge:

proc sort data=healthy; by school date-attended;

proc sort data=infected; by school date-attended;

data mergem;

set healthy; by school date-attended;

set infected; by school date-attended;

merge healthy (in=inh) infected (in=inf);

if inh and inf then output;

proc sort data=mergem; by student infected-student;

proc freq data=mergem; tables student*infected-student;

I think that would give you a table by healthy-student and each infected-student with the # of potential interactions for each combination.

Hope this helps.

Esteemed Advisor
Posts: 5,535

Re: Help with data manipulation in data step

I will make a slightly different guess about your data structure and assume you have a single dataset with variables School, Student, Date, and Contagious. The following query would give you useful information (untested) :

proc sql;

create table contact as



     coalesce(count(distinct B.student), 0) as nbStudents,

     min( as firstContact format=date9.,

     max( as lastContact format=date9.


     (select * from attendance where not contagious) as A left join

     (select * from attendance where contagious) as B

          on and and A.student ne B.student

group by, A.student;



Occasional Contributor
Posts: 7

Re: Help with data manipulation in data step

I think this worked! Thank you so much.

Ask a Question
Discussion stats
  • 3 replies
  • 3 in conversation