BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
asinusdk
Calcite | Level 5

My data is sorted by both year and personal id and there are plural observations per each year as well.

While some predictors are time-invariant, others are time-varying (but categorical). 

Some are dead, so there are variables indicating dates of death, but it is only shown in the last year. 

I should manage this data so that I can exclude some IDs by criteria (such as excluding people who didn't show any change in time-varying variables and I also have to count the number of the event if it happens.)

 

My data looks like this.


year   id   sex   inc   visit1  visit2 death

1999  1     1       3      1        .         .

2000  1     1       3      1        .         .

2000  1     1       2      1        .         .

2001  1     1       2      .        .         .

2002  1     1       2      .         .         .

1999  2     2       3      1        1        .

1999  2     2       3      1        1        .

1999  2     2       3      1        1        .

2000  2     2       3      1        1        .

2001  2     2       2      1        1        200108

1999  3     2       3      .        .        .

2000  3     2       3     .        .        .

2001  3     2       3      .        .        .

2002  3     2       3      .        .        .

 

 

so I want to exclude id3 because the value in inc is constant and also want to calculate the number of visit1 and visit 2 for id 1 and id2 by year. I wonder how I can manipulate data. Should I stick to person-period format? Otherwise, should I transpose the data so that it becomes person-level data for data handling.

 

+addition

Thanks for the suggestion. Here is the data I want to get

 

year   id   sex   inc   visit1  visit2 death total_visit1 total_visit2   

1999  1     1       3      1        .         .              3                    0

2000  1     1       3      1        .         .              3                    0   

2000  1     1       2      1        .         .              3                    0

2001  1     1       2      .        .         .               3                    0

2002  1     1       2      .         .         .              3                    0

1999  2     2       3      1        1        .              5                    5

1999  2     2       3      1        1        .              5                    5

1999  2     2       3      1        1        .              5                    5

2000  2     2       3      1        1        .              5                    5

2001  2     2       2      1        1        200108   5                    5

 

 

I eliminated id3 because there is no variation in inc. I want IDs that showed variation in inc. How can I eliminate those who don't meet the criteria? 

 

Thanks in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
input year   id   sex   inc   visit1  visit2 death;
cards;
1999  1     1       3      1        .         .
2000  1     1       3      1        .         .
2000  1     1       2      1        .         .
2001  1     1       2      .        .         .
2002  1     1       2      .         .         .
1999  2     2       3      1        1        .
1999  2     2       3      1        1        .
1999  2     2       3      1        1        .
2000  2     2       3      1        1        .
2001  2     2       2      1        1        200108
1999  3     2       3      .        .        .
2000  3     2       3     .        .        .
2001  3     2       3      .        .        .
2002  3     2       3      .        .        .
;
run;
proc sql;
create table want as
 select *,coalesce(sum(visit1),0) as total_visit1 ,
coalesce(sum(visit2),0) as total_visit2
  from have
   group by id
    having count(distinct inc) ne 1
	 order by id,year;
quit;

 

View solution in original post

4 REPLIES 4
ChrisNZ
Tourmaline | Level 20

Your explanations are hard to follow.

Please show the expected output, and the logic used to achieve the values shown.

mkeintz
PROC Star

There is a good reason for @ChrisNZ to ask for a sample desired output.  For instance, it is not clear if you want to generate one record per year (edited in "year") with total visit counts, or preserve multiple records per year, but including a total yearly visit1 and visit2 count.

 

However, as to your question, there is no apparent reason to reorganize the data layout to accommodate your requests.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User
data have;
input year   id   sex   inc   visit1  visit2 death;
cards;
1999  1     1       3      1        .         .
2000  1     1       3      1        .         .
2000  1     1       2      1        .         .
2001  1     1       2      .        .         .
2002  1     1       2      .         .         .
1999  2     2       3      1        1        .
1999  2     2       3      1        1        .
1999  2     2       3      1        1        .
2000  2     2       3      1        1        .
2001  2     2       2      1        1        200108
1999  3     2       3      .        .        .
2000  3     2       3     .        .        .
2001  3     2       3      .        .        .
2002  3     2       3      .        .        .
;
run;
proc sql;
create table want as
 select *,coalesce(sum(visit1),0) as total_visit1 ,
coalesce(sum(visit2),0) as total_visit2
  from have
   group by id
    having count(distinct inc) ne 1
	 order by id,year;
quit;

 
asinusdk
Calcite | Level 5

Thank you so much. It really helps a lot.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 706 views
  • 0 likes
  • 4 in conversation