BookmarkSubscribeRSS Feed
juliajulia
Obsidian | Level 7

I have a longitudinal data set from 2013 to 2016 and I want to know how many people in the 2013 are not in 2014, 2015 and 2016 and how many people are new in 2014, 2015 and 2016. not sure what is the best way to do this. my data looks like this:

 

id year x1 x2

1 2013 1 2

1 2014 2 1

1 2015 1 0

1 2016 0 1

2 2015 0 1

2 2016 0 1

3 2014 1 0

3 2015 1 0

4 2013 0 1

4 2014 01

4 2016 01

;

5 REPLIES 5
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Should just be a matter of proc freq:

proc sort data=have out=want;
  by year id;
run;
proc freq;
  tables year / output=num_id_by_year;
run;

It really depends, can't tell what output you are looking for, show an example.  You could take min/max for each by group - assumes of course there are no gaps that way.  Or you could transpose the data to get years across the top, then proc summary that  to get overall per year.  Really just guessing what you want though.

Astounding
PROC Star

The best way would give you all patterns that exist in the data.  That way, you can even track if a person left and later returned.  For example (and assuming your data is sorted by ID):

 

data want;

set have;

by id;

if first.id then pattern = 'NNNN';

retain pattern;

if year = 2013 then substr(pattern, 1, 1) = 'Y';

else if year = 2014 then substr(pattern, 2, 1) = 'Y';

else if year = 2015 then substr(pattern, 3, 1) = 'Y';

else if year = 2016 then substr(pattern, 4, 1) = 'Y';

if last.id;

run;

 

That gives you a four-character PATTERN variable for each ID that shows the pattern of participation over the years.  To count them, use PROC FREQ:

 

proc freq data=want;

tables pattern;

run;

juliajulia
Obsidian | Level 7

Thanks. this works. here is the code I used:

PROC SORT DATA=out2.Client_VL_Euci44_f

OUT=test1;

BY euci44 year;

RUN ;

data test;

set test1;

by euci44;

if first.euci then pattern = '3456';

if year = 2013 then substr(pattern, 1, 1) = '3';

else if year = 2014 then substr(pattern, 2, 1) = '4';

else if year = 2015 then substr(pattern, 3, 1) = '5';

else if year = 2016 then substr(pattern, 4, 1) = '6';

if last.euci44;

run;

proc freq data=test ;

table pattern;

run;

 

pattern Frequency Percent Cumulative
Frequency
Cumulative
Percent
6 310244 67.23 310244 67.23
5 71554 15.51 381798 82.73
4 46492 10.07 428290 92.81
3 33186 7.19 461476 100.00
Astounding
PROC Star

Close, but not there yet.  This code gives you the last year of participation.  There are two things missing.  First, this statement needs to appear in the DATA step:

 

retain pattern;

 

Second, the initial value for PATTERN cannot be "3456".  Otherwise how can you tell when "3" means you put it there initially, or it was put there by the appearance of 2013 in the data?  This would be an acceptable initial value for PATTERN:

 

if first.euci44 then pattern = '----';

 

That way, any digits that appear in the end would be the result of a matching value for YEAR in the data.

ballardw
Super User

Without knowing what you actually want the output to look like it is hard to provide lots of details.

Here are a couple of ideas.

data have;
   input id year x1 x2;
datalines;
1 2013 1 2
1 2014 2 1
1 2015 1 0
1 2016 0 1
2 2015 0 1
2 2016 0 1
3 2014 1 0
3 2015 1 0
4 2013 0 1
;
;
proc sql;
   create table missing2014 as
   select distinct id from have where year=2013
   except
   select distinct id from have where year=2014
   ;
   create table new2014 as
   select distinct id from have where year=2014
   except
   select distinct id from have where year=2013
   ;
quit;

proc format library=work;
value present
., 0 = ' '
other= 'X'
;
run;

proc tabulate data=have;
   class id year;
   tables id, year*n=''*f=present.;
run;

The Proc SQL could include multiple years for comparison by using: where year in (2014 2015 2016) for example to get ids that appeared in ANY of those years.

 

The tabulate provides a table indicating the years that each Id appears.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1496 views
  • 0 likes
  • 4 in conversation