BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
wj2
Quartz | Level 8 wj2
Quartz | Level 8

Hello Community,

 

I need to create a dataset from a raw data file that only retains individuals in a study who completed a study assessment at two data points- one at study baseline (VISNO=01) AND one at study follow-up (VISNO=02). Individuals who only completed a study assessment at only one time point should not be included in the new dataset. I have provided an example of my input file as well as my desired output file below. Any help with how to program this would be greatly appreciated!

 

Have:

PID

VISNO

Other variables

01

01

01

02

02

01

03

01

03

02

04

02

05

02

 

Want:

PID

VISNO

Other variables

01

01

01

02

03

01

03

02

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

If your data is sorted as shown this might work:

 

data want;
  set have;
  by pid ;
  if not (first.pid and last.pid);
run;

BY creates automatic boolean variables that indicate whether a record is the first or last of a by group. The variables are accessed using the First. and Last. (note the dots there) before the variable.

So if there is only a single record then it is both first and last at the same time.

 

 


@wj2 wrote:

Hello Community,

 

I need to create a dataset from a raw data file that only retains individuals in a study who completed a study assessment at two data points- one at study baseline (VISNO=01) AND one at study follow-up (VISNO=02). Individuals who only completed a study assessment at only one time point should not be included in the new dataset. I have provided an example of my input file as well as my desired output file below. Any help with how to program this would be greatly appreciated!

 

Have:

PID

VISNO

Other variables

01

01

01

02

02

01

03

01

03

02

04

02

05

02

 

Want:

PID

VISNO

Other variables

01

01

01

02

03

01

03

02


 

View solution in original post

2 REPLIES 2
ballardw
Super User

If your data is sorted as shown this might work:

 

data want;
  set have;
  by pid ;
  if not (first.pid and last.pid);
run;

BY creates automatic boolean variables that indicate whether a record is the first or last of a by group. The variables are accessed using the First. and Last. (note the dots there) before the variable.

So if there is only a single record then it is both first and last at the same time.

 

 


@wj2 wrote:

Hello Community,

 

I need to create a dataset from a raw data file that only retains individuals in a study who completed a study assessment at two data points- one at study baseline (VISNO=01) AND one at study follow-up (VISNO=02). Individuals who only completed a study assessment at only one time point should not be included in the new dataset. I have provided an example of my input file as well as my desired output file below. Any help with how to program this would be greatly appreciated!

 

Have:

PID

VISNO

Other variables

01

01

01

02

02

01

03

01

03

02

04

02

05

02

 

Want:

PID

VISNO

Other variables

01

01

01

02

03

01

03

02


 

PaigeMiller
Diamond | Level 26

If you make VISNO numeric instead of character, this is pretty simple

 

proc sql;
    create table want as select * from have 
    where max(visno)=2 and min(visno)=1 
    group by pid;
quit;
--
Paige Miller
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 939 views
  • 2 likes
  • 3 in conversation