BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
wj2
Quartz | Level 8 wj2
Quartz | Level 8

Hello Community,

 

I need to create a dataset from a raw data file that only retains individuals in a study who completed a study assessment at two data points- one at study baseline (VISNO=01) AND one at study follow-up (VISNO=02). Individuals who only completed a study assessment at only one time point should not be included in the new dataset. I have provided an example of my input file as well as my desired output file below. Any help with how to program this would be greatly appreciated!

 

Have:

PID

VISNO

Other variables

01

01

01

02

02

01

03

01

03

02

04

02

05

02

 

Want:

PID

VISNO

Other variables

01

01

01

02

03

01

03

02

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

If your data is sorted as shown this might work:

 

data want;
  set have;
  by pid ;
  if not (first.pid and last.pid);
run;

BY creates automatic boolean variables that indicate whether a record is the first or last of a by group. The variables are accessed using the First. and Last. (note the dots there) before the variable.

So if there is only a single record then it is both first and last at the same time.

 

 


@wj2 wrote:

Hello Community,

 

I need to create a dataset from a raw data file that only retains individuals in a study who completed a study assessment at two data points- one at study baseline (VISNO=01) AND one at study follow-up (VISNO=02). Individuals who only completed a study assessment at only one time point should not be included in the new dataset. I have provided an example of my input file as well as my desired output file below. Any help with how to program this would be greatly appreciated!

 

Have:

PID

VISNO

Other variables

01

01

01

02

02

01

03

01

03

02

04

02

05

02

 

Want:

PID

VISNO

Other variables

01

01

01

02

03

01

03

02


 

View solution in original post

2 REPLIES 2
ballardw
Super User

If your data is sorted as shown this might work:

 

data want;
  set have;
  by pid ;
  if not (first.pid and last.pid);
run;

BY creates automatic boolean variables that indicate whether a record is the first or last of a by group. The variables are accessed using the First. and Last. (note the dots there) before the variable.

So if there is only a single record then it is both first and last at the same time.

 

 


@wj2 wrote:

Hello Community,

 

I need to create a dataset from a raw data file that only retains individuals in a study who completed a study assessment at two data points- one at study baseline (VISNO=01) AND one at study follow-up (VISNO=02). Individuals who only completed a study assessment at only one time point should not be included in the new dataset. I have provided an example of my input file as well as my desired output file below. Any help with how to program this would be greatly appreciated!

 

Have:

PID

VISNO

Other variables

01

01

01

02

02

01

03

01

03

02

04

02

05

02

 

Want:

PID

VISNO

Other variables

01

01

01

02

03

01

03

02


 

PaigeMiller
Diamond | Level 26

If you make VISNO numeric instead of character, this is pretty simple

 

proc sql;
    create table want as select * from have 
    where max(visno)=2 and min(visno)=1 
    group by pid;
quit;
--
Paige Miller

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 859 views
  • 2 likes
  • 3 in conversation