Solved: Conditional formatting in proc report for a variable that is both nume...

asinusdk · Posted 10-12-2018 08:52 PM

My data is sorted by both year and personal id and there are plural observations per each year as well.

While some predictors are time-invariant, others are time-varying (but categorical).

Some are dead, so there are variables indicating dates of death, but it is only shown in the last year.

I should manage this data so that I can exclude some IDs by criteria (such as excluding people who didn't show any change in time-varying variables and I also have to count the number of the event if it happens.)

My data looks like this.

year id sex inc visit1 visit2 death

1999 1 1 3 1 . .

2000 1 1 3 1 . .

2000 1 1 2 1 . .

2001 1 1 2 . . .

2002 1 1 2 . . .

1999 2 2 3 1 1 .

2000 2 2 3 1 1 .

2001 2 2 2 1 1 200108

1999 3 2 3 . . .

2000 3 2 3 . . .

2001 3 2 3 . . .

2002 3 2 3 . . .

so I want to exclude id3 because the value in inc is constant and also want to calculate the number of visit1 and visit 2 for id 1 and id2 by year. I wonder how I can manipulate data. Should I stick to person-period format? Otherwise, should I transpose the data so that it becomes person-level data for data handling.

+addition

Thanks for the suggestion. Here is the data I want to get

year id sex inc visit1 visit2 death total_visit1 total_visit2

1999 1 1 3 1 . . 3 0

2000 1 1 3 1 . . 3 0

2000 1 1 2 1 . . 3 0

2001 1 1 2 . . . 3 0

2002 1 1 2 . . . 3 0

1999 2 2 3 1 1 . 5 5

2000 2 2 3 1 1 . 5 5

2001 2 2 2 1 1 200108 5 5

I eliminated id3 because there is no variation in inc. I want IDs that showed variation in inc. How can I eliminate those who don't meet the criteria?

Thanks in advance.

Ksharp · Posted 10-14-2018 06:21 AM

data have;
input year   id   sex   inc   visit1  visit2 death;
cards;
1999  1     1       3      1        .         .
2000  1     1       3      1        .         .
2000  1     1       2      1        .         .
2001  1     1       2      .        .         .
2002  1     1       2      .         .         .
1999  2     2       3      1        1        .
1999  2     2       3      1        1        .
1999  2     2       3      1        1        .
2000  2     2       3      1        1        .
2001  2     2       2      1        1        200108
1999  3     2       3      .        .        .
2000  3     2       3     .        .        .
2001  3     2       3      .        .        .
2002  3     2       3      .        .        .
;
run;
proc sql;
create table want as
 select *,coalesce(sum(visit1),0) as total_visit1 ,
coalesce(sum(visit2),0) as total_visit2
  from have
   group by id
    having count(distinct inc) ne 1
	 order by id,year;
quit;

View solution in original post

ChrisNZ · Posted 10-13-2018 12:04 AM

Your explanations are hard to follow.

Please show the expected output, and the logic used to achieve the values shown.

High-Performance SAS Coding - Third Edition

mkeintz · Posted 10-13-2018 09:47 AM

There is a good reason for @ChrisNZ to ask for a sample desired output. For instance, it is not clear if you want to generate one record per year (edited in "year") with total visit counts, or preserve multiple records per year, but including a total yearly visit1 and visit2 count.

However, as to your question, there is no apparent reason to reorganize the data layout to accommodate your requests.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Ksharp · Posted 10-14-2018 06:21 AM

data have;
input year   id   sex   inc   visit1  visit2 death;
cards;
1999  1     1       3      1        .         .
2000  1     1       3      1        .         .
2000  1     1       2      1        .         .
2001  1     1       2      .        .         .
2002  1     1       2      .         .         .
1999  2     2       3      1        1        .
1999  2     2       3      1        1        .
1999  2     2       3      1        1        .
2000  2     2       3      1        1        .
2001  2     2       2      1        1        200108
1999  3     2       3      .        .        .
2000  3     2       3     .        .        .
2001  3     2       3      .        .        .
2002  3     2       3      .        .        .
;
run;
proc sql;
create table want as
 select *,coalesce(sum(visit1),0) as total_visit1 ,
coalesce(sum(visit2),0) as total_visit2
  from have
   group by id
    having count(distinct inc) ne 1
	 order by id,year;
quit;

asinusdk · Posted 10-15-2018 10:32 AM

Thank you so much. It really helps a lot.

Conditional formatting in proc report for a variable that is both numerical and categorical

Re: Conditional formatting in proc report for a variable that is both numerical and categorical

Re: Conditional formatting in proc report for a variable that is both numerical and categorical

Re: Conditional formatting in proc report for a variable that is both numerical and categorical

Re: Conditional formatting in proc report for a variable that is both numerical and categorical

Re: Conditional formatting in proc report for a variable that is both numerical and categorical

Catch up on SAS Innovate 2026

Conditional formatting in proc report for a variable that is both numerical and categorical

Re: Conditional formatting in proc report for a variable that is both numerical and categorical

Re: Conditional formatting in proc report for a variable that is both numerical and categorical

Re: Conditional formatting in proc report for a variable that is both numerical and categorical

Re: Conditional formatting in proc report for a variable that is both numerical and categorical

Re: Conditional formatting in proc report for a variable that is both numerical and categorical

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away