BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
y_fu
Fluorite | Level 6

I have a large data set, and one of the sorted column is like the following:

 

Sample_ID

A-001

A-002

A-003

A-004

A-010

A-011

A-012

B-001

B-002

and so on...

 

I want to add a new column to identify the samples that out of sequential order (discontinue  ID numbers), the new column I want is:

 

Sample_ID      Out_of_order

A-001

A-002

A-003

A-004   

A-010               Yes

A-011

A-012

B-001

B-002

B-004               Yes

and so on...

 

I want to know if there are any way to approach it or similar result, thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20
data have;
input Sample_ID $;
c=compress(Sample_ID,,'ka');
n=input(compress(Sample_ID,,'kd'),8.);
cards;
A-001
A-002
A-003
A-004
A-010
A-011
A-012
B-001
B-002
B-004
;
data want;
set have;
by c;
k=dif(n);
if not first.c and k ne 1 then Out_of_order='YES';
keep sample_id Out_of_order;
run;

View solution in original post

7 REPLIES 7
DBailey
Lapis Lazuli | Level 10

in a data step, you can use the lag function to look at previous rows:

 

options pagesize= linesize= pageno=1 nodate;

data one;
   input x @@;
   y=lag1(x);
   z=lag2(x);
   datalines;
1 2 3 4 5 6
;

proc print data=one;
   title 'LAG Output';
run;

 

Output from Generating Two Lagged Values

                                  LAG Output                                 1

                              Obs    x    y    z

                               1     1    .    .
                               2     2    1    .
                               3     3    2    1
                               4     4    3    2
                               5     5    4    3
                               6     6    5    4
y_fu
Fluorite | Level 6

I have a large data set that has more than 1000 obs. I do not think lag function will be efficient. 

PeterClemmensen
Tourmaline | Level 20

Why is Out_Of_Order not equal to 'Yes' for B-001?

ballardw
Super User

@y_fu wrote:

I have a large data set that has more than 1000 obs. I do not think lag function will be efficient. 


If the values are sorted then you only need to look at the previous record.

 

data have;
   input sample_id $;
datalines;
A-001
A-002
A-003
A-004
A-010 
A-011
A-012
B-001
B-002
B-004
;
run;

data want;
   set have;
   prefix=scan(sample_id,1,'-');
   lprefix=scan(lag(sample_id),1,'-');
   order = input(scan(sample_id,2,'-'),4.);
   lorder = input(scan(lag(sample_id),2,'-'),4.);
   Outoforder = (prefix=lprefix) and (order ne (lorder+1));
   drop prefix lprefix order lorder;
run;

You do not specify if the A, B etc value has to be checked in any way.

 

You also do not specify if the first A, B etc value has to be 001 or not.

y_fu
Fluorite | Level 6

Sorry, I should be more specific.

The A,B,C etc. check will not be necessary, and all of the group of samples will start with -001

novinosrin
Tourmaline | Level 20
data have;
input Sample_ID $;
c=compress(Sample_ID,,'ka');
n=input(compress(Sample_ID,,'kd'),8.);
cards;
A-001
A-002
A-003
A-004
A-010
A-011
A-012
B-001
B-002
B-004
;
data want;
set have;
by c;
k=dif(n);
if not first.c and k ne 1 then Out_of_order='YES';
keep sample_id Out_of_order;
run;
y_fu
Fluorite | Level 6

Thank you for the help! It is the result I want.

It is nice to learn something new about how to use compress function.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 3917 views
  • 1 like
  • 5 in conversation