BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jcooper3
Calcite | Level 5

I currently have a dataset that looks like this:

pat_idtimesequence
11M
12MS
13M
14M
15M
16M
17M
18M
19M
110M
21M
22MST
23ST
24MT
25M
26MS
27S
28S
29S
210S

I am trying to make it so that whenever there is a repeat observation from line to line it only records the first observation of that value. I'm looking speciifcally at the sequence variable. For example, I would like to keep only the first three rows of data for patient 1, so I would like to basically get rid of all the 'M' observations for time 4-10.

Ultimately I would like to transpose the data and make a string variable, so I'd like to get my final dataset to look like this, but I'm hung up on the above part....

pat_idsequence
1M-MS-M
2M-MST-ST-MT-M-MS-S

Any thoughts? Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Use the notsorted option on your by statement and append the first time you see a new sequence. Output on the last patient ID.

data have;

input pat_id    time    sequence $;

cards;

1    1    M

1    2    MS

1    3    M

1    4    M

1    5    M

1    6    M

1    7    M

1    8    M

1    9    M

1    10    M

2    1    M

2    2    MST

2    3    ST

2    4    MT

2    5    M

2    6    MS

2    7    S

2    8    S

2    9    S

2    10    S

;

run;

data want;

set have;

by pat_id sequence notsorted;

length full_sequence $200.;

retain full_sequence;

if first.pat_id then full_sequence='';

if first.sequence then full_sequence=catx("-", full_sequence, sequence);

if last.pat_id then output;

keep pat_id full_sequence;

run;

View solution in original post

4 REPLIES 4
sashelpwanted
Calcite | Level 5

So do you need the time field at all?  If not then you can do this very simple by doing a nodupkey:

proc sort data=yourfile nodupkey out=yourfile2;

     by pat_id sequence;

run;

If you need to keep that time then you could do something like this:

proc sort data=yourfile;

     by sequence pat_id time;

data yourfile2;

     set yourfile;

     by sequence pat_id time;

     if first.pat_id;

run;

jcooper3
Calcite | Level 5

thanks, sashelpwanted!

I don't really care about the exact time (ie. what exact time value), but I'm trying to keep the order. So if you look at patient 1, when I try to nodupkey option, I lose the piece when the patient starts taking 'M' again, meaning I lose what happens during time 3-10. When I try the other option you recommend, it only gives me the first observation per patient. Ideally, I would like the first observation whenever there's a change in the sequence, if that makes any sense. Thank you.

Reeza
Super User

Use the notsorted option on your by statement and append the first time you see a new sequence. Output on the last patient ID.

data have;

input pat_id    time    sequence $;

cards;

1    1    M

1    2    MS

1    3    M

1    4    M

1    5    M

1    6    M

1    7    M

1    8    M

1    9    M

1    10    M

2    1    M

2    2    MST

2    3    ST

2    4    MT

2    5    M

2    6    MS

2    7    S

2    8    S

2    9    S

2    10    S

;

run;

data want;

set have;

by pat_id sequence notsorted;

length full_sequence $200.;

retain full_sequence;

if first.pat_id then full_sequence='';

if first.sequence then full_sequence=catx("-", full_sequence, sequence);

if last.pat_id then output;

keep pat_id full_sequence;

run;

jcooper3
Calcite | Level 5

thank you, Reeza! This worked perfectly. much appreciated! Smiley Happy

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1389 views
  • 3 likes
  • 3 in conversation