Help using Base SAS procedures

How to Match Observations from one Variable to another Variable?

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 6
Accepted Solution

How to Match Observations from one Variable to another Variable?

I have recently migrated over from STATA. I have a set of panel data and want to observe the change in observations for a specific variable from one period to the next. I initially used the compare procedure; however, this does not work if the two variables are not sorted the same, etc. How should I go about comparing these variables?


Accepted Solutions
Solution
‎05-15-2014 11:34 AM
Occasional Contributor
Posts: 5

Re: How to Match Observations from one Variable to another Variable?

Posted in reply to brettmullga

Hi brettmullga,

Please find below the solution using datastep and sql. I have created a dataset based on the information you gave and tried to get the result in the final dataset. If you have more months column then the coding would need to be updated. Please let me know if you have any questions.

All the best.

Thanks.

Rafi

DATA original;

INPUT Participant Month1 Month2 Month3 Month4;

CARDS;

1 8 1 . 9

2 . . 3 15

3 4 5 . 12

4 1 . . 10

;

RUN;

* Count total for each month;

PROC SQL;

  CREATE TABLE m_tot AS

  SELECT *, COUNT (*) as total, COUNT(month1) AS m1_tot, COUNT(month2) AS m2_tot, COUNT(month3) AS m3_tot, COUNT(month4) AS m4_tot

  FROM original

  ORDER BY participant;

QUIT;

* Using datasetp - 1) flag patient movement in each month and 2) flag patient who has value in previous month(s);

DATA flag;

  SET m_tot;

  IF month1 NE . AND month2 NE . THEN m1tom2f = 1;

  IF month2 NE . AND month3 NE . THEN m2tom3f = 1;

  IF month3 NE . AND month4 NE . THEN m3tom4f = 1;

  

  IF month2 NE . AND month1 NE . THEN m2tom1f = 1;

  IF month3 NE . AND N(month1,month2)  GE 1 THEN m3tom1f = 1;

  IF month4 NE . AND N(month1,month2,month3) GE 1 THEN m4tom1f = 1;

RUN;

* Using proc sql - count necessary values;

PROC SQL;

  CREATE TABLE finalx AS

  SELECT *, COUNT(m1tom2f) AS m1tom2, COUNT(m2tom3f) AS m2tom3, COUNT(m3tom4f) AS m3tom4, COUNT(m2tom1f) AS m2tom1, COUNT(m3tom1f) AS m3tom1, COUNT(m4tom1f) AS m4tom1

  FROM flag

  ORDER BY participant;

QUIT;

* Final dataset where all the necessary information for outputs will be found;

DATA final;

  SET finalx (DROP=participant month1 month2 month3 month4  m1tom2f m2tom3f m3tom4f m2tom1f m3tom1f m4tom1f) END=eof ;

  IF eof;

  LABEL total = 'Toatl participant'

        m1_tot = 'Total participant in Month 1'

        m2_tot = 'Total participant in Month 2'

        m3_tot = 'Total participant in Month 3'

        m4_tot = 'Total participant in Month 4'

        m1tom2 = 'Participated in Month 1 and Month 2'

        m2tom3 = 'Participated in Month 2 and Month 3'

        m3tom4 = 'Participated in Month 3 and Month 4'

        m2tom1 = 'Participated in Month 2 and in earlier month(s)'

        m3tom1 = 'Participated in Month 3 and in earlier month(s)'

        m4tom1 = 'Participated in Month 4 and in earlier month(s)'

        ;

RUN;

View solution in original post


All Replies
Super User
Posts: 19,815

Re: How to Match Observations from one Variable to another Variable?

Posted in reply to brettmullga

What does your data look like? What type of output do you expect?

Occasional Contributor
Posts: 6

Re: How to Match Observations from one Variable to another Variable?

The variable I'm concerned is a list of participants. I have monthly data that goes back a decade or so. From one month to the next, I want to see how many drop out and how many new participants enter. I'm currently getting descriptive statistics as a part of a larger project.

Super User
Posts: 19,815

Re: How to Match Observations from one Variable to another Variable?

Ok...but how do your rows and columns look like. Do you have one record per participant and different variables for different time points or one record per participant per time?

It's really helpful to post some sample data and expected sample output.

Occasional Contributor
Posts: 6

Re: How to Match Observations from one Variable to another Variable?

Thank you, Reeza. I was confused about what I was asking for.

ParticipantMonth1Month2Month3Month4

1

81.9
2..315
345.12
41..10

I want to be able to count the number of participants in each month and the change in participants each month.

Here is my first attempt at this:

proc sql;

     select count(*) as Total,

     count(month1) as Month_1,

     count(month2) as Month_2,

     count(month3) as Month_3,

     count(month4) as Month_4

     from work.charity;

     quit;

Output:

Total     Month_1     Month_2     Month_3     Month_4 

    4            3                 2               1               4

This works to count individual columns but not the intersection of two months. The intersection will illustrate how many from one month continued on to the next month. Is it possible to specify this within the count statement?

Super User
Posts: 19,815

Re: How to Match Observations from one Variable to another Variable?

Posted in reply to brettmullga

What do you want the output to look like.

Occasional Contributor
Posts: 6

Re: How to Match Observations from one Variable to another Variable?

Output 1:

Month1intersectMonth2     Month2intersectMonth3     Month3intersectMonth4

               2                                        0                                   1

    

Output 2:

Month2intersectMonth1     Month3intersectunion(2,1)     Month4intersectunion(3,2,1)

               2                                        0                                        4

Output 1 is the number of rows that have values from one month to the next.

Output 2 is the number of rows that have non-unique entries in the preceding columns.

Note: for this to work, an if then else statement may have to be applied to turn the data set into 1s and missing.

Super User
Super User
Posts: 7,955

Re: How to Match Observations from one Variable to another Variable?

Posted in reply to brettmullga

Hi,


Try:

data have;
  attrib p format=$20. m1-m4 format=best.;
  infile datalines delimiter=',';
  input p $ m1-m4;
datalines;
1,8,1,., 9
2,.,.,3,15
3,4,5,.,12
4,1,.,.,10
;
quit;
proc sql;
  create table OUTPUT1
  (
    MONTH1INTERSECTMONTH2 num,
    MONTH2INTERSECTMONTH3 num,
    MONTH3INTERSECTMONTH4 num
  );
  create table OUTPUT2
  (
    MONTH2INTERSECTMONTH1 num,
    MONTH3INTERSECTMONTH21 num,
    MONTH4INTERSECTMONTH321 num
  );
  insert into OUTPUT1
  set MONTH1INTERSECTMONTH2=(select count(P) from WORK.HAVE where M1 is not null and M2 is not null),
      MONTH2INTERSECTMONTH3=(select count(P) from WORK.HAVE where M2 is not null and M3 is not null),
      MONTH3INTERSECTMONTH4=(select count(P) from WORK.HAVE where M3 is not null and M4 is not null);
  insert into OUTPUT2
  set MONTH2INTERSECTMONTH1=(select count(P) from WORK.HAVE where M2 is not null and M1 is not null),
      MONTH3INTERSECTMONTH21=(select count(P) from WORK.HAVE where M3 is not null and (M2 is not null or M1 is not null)),
      MONTH4INTERSECTMONTH321=(select count(P) from WORK.HAVE where M4 is not null and (M3 is not null or M2 is not null or M1 is not null));
quit;


Frequent Contributor
Posts: 83

Re: How to Match Observations from one Variable to another Variable?

Or you could use this after the initial read in:

data have2;

    set have;

    array mon (4) m1-m4;

    array moni (3) mi1-mi3;

    do i = 1 to 3;

    if mon(i) ~=. and mon(i+1)~=. then moni(i) = 1;

    else moni(i) = 0;

    end;

run

proc sql;

create table want as

select distinct sum(mi1) as mi1, sum(mi2) as mi2, sum(mi3) as mi3, sum(case when m2~=. and m1 ~=. then 1 else 0 end) as m2mi1,

sum(case when m3~=. and (m1~=. or m2~=.) then 1 else 0 end) as m3mi12, sum(case when m4 ~=. and (m1~=. or m2 ~=. or m3 ~=.) then 1 else 0 end) as m4mi123

from have2;

;

Solution
‎05-15-2014 11:34 AM
Occasional Contributor
Posts: 5

Re: How to Match Observations from one Variable to another Variable?

Posted in reply to brettmullga

Hi brettmullga,

Please find below the solution using datastep and sql. I have created a dataset based on the information you gave and tried to get the result in the final dataset. If you have more months column then the coding would need to be updated. Please let me know if you have any questions.

All the best.

Thanks.

Rafi

DATA original;

INPUT Participant Month1 Month2 Month3 Month4;

CARDS;

1 8 1 . 9

2 . . 3 15

3 4 5 . 12

4 1 . . 10

;

RUN;

* Count total for each month;

PROC SQL;

  CREATE TABLE m_tot AS

  SELECT *, COUNT (*) as total, COUNT(month1) AS m1_tot, COUNT(month2) AS m2_tot, COUNT(month3) AS m3_tot, COUNT(month4) AS m4_tot

  FROM original

  ORDER BY participant;

QUIT;

* Using datasetp - 1) flag patient movement in each month and 2) flag patient who has value in previous month(s);

DATA flag;

  SET m_tot;

  IF month1 NE . AND month2 NE . THEN m1tom2f = 1;

  IF month2 NE . AND month3 NE . THEN m2tom3f = 1;

  IF month3 NE . AND month4 NE . THEN m3tom4f = 1;

  

  IF month2 NE . AND month1 NE . THEN m2tom1f = 1;

  IF month3 NE . AND N(month1,month2)  GE 1 THEN m3tom1f = 1;

  IF month4 NE . AND N(month1,month2,month3) GE 1 THEN m4tom1f = 1;

RUN;

* Using proc sql - count necessary values;

PROC SQL;

  CREATE TABLE finalx AS

  SELECT *, COUNT(m1tom2f) AS m1tom2, COUNT(m2tom3f) AS m2tom3, COUNT(m3tom4f) AS m3tom4, COUNT(m2tom1f) AS m2tom1, COUNT(m3tom1f) AS m3tom1, COUNT(m4tom1f) AS m4tom1

  FROM flag

  ORDER BY participant;

QUIT;

* Final dataset where all the necessary information for outputs will be found;

DATA final;

  SET finalx (DROP=participant month1 month2 month3 month4  m1tom2f m2tom3f m3tom4f m2tom1f m3tom1f m4tom1f) END=eof ;

  IF eof;

  LABEL total = 'Toatl participant'

        m1_tot = 'Total participant in Month 1'

        m2_tot = 'Total participant in Month 2'

        m3_tot = 'Total participant in Month 3'

        m4_tot = 'Total participant in Month 4'

        m1tom2 = 'Participated in Month 1 and Month 2'

        m2tom3 = 'Participated in Month 2 and Month 3'

        m3tom4 = 'Participated in Month 3 and Month 4'

        m2tom1 = 'Participated in Month 2 and in earlier month(s)'

        m3tom1 = 'Participated in Month 3 and in earlier month(s)'

        m4tom1 = 'Participated in Month 4 and in earlier month(s)'

        ;

RUN;

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 550 views
  • 8 likes
  • 5 in conversation