/** How to assign a sequence number every 5 records, group-wise (sex, descending age) **/
name sex age height weight seq
Janet F 15 62.5 112.5 1
Mary F 15 66.5 112 1
Carol F 14 62.8 102.5 1
Judy F 14 64.3 90 1
Alice F 13 56.5 84 1
Barbara F 13 65.3 98 2
Jane F 12 59.8 84.5 2
Louise F 12 56.3 77 2
Joyce F 11 51.3 50.5 2
Philip M 16 72 150 2
Ronald M 15 67 133 3
William M 15 66.5 112 3
Alfred M 14 69 112.5 3
Henry M 14 63.5 102.5 3
Jeffrey M 13 62.5 84 3
James M 12 57.3 83 4
John M 12 59 99.5 4
Robert M 12 64.8 128 4
Thomas M 11 57.5 85 4
I want to assign a sequence number by sex and descending age, incrementing every 5 records. If a sex/age group would be split across two sequence numbers, the whole group should move forward to the next sequence number.
The output should look like this:
name sex age height weight seq
Janet F 15 62.5 112.5 1
Mary F 15 66.5 112 1
Carol F 14 62.8 102.5 1
Judy F 14 64.3 90 1
Alice F 13 56.5 84 2
Barbara F 13 65.3 98 2
Jane F 12 59.8 84.5 2
Louise F 12 56.3 77 2
Joyce F 11 51.3 50.5 2
Philip M 16 72 150 3
Ronald M 15 67 133 3
William M 15 66.5 112 3
Alfred M 14 69 112.5 3
Henry M 14 63.5 102.5 3
Jeffrey M 13 62.5 84 4
James M 12 57.3 83 4
John M 12 59 99.5 4
Robert M 12 64.8 128 4
Thomas M 11 57.5 85 4
How can I do this with a DO WHILE / DO UNTIL / DO loop?
You could just use arithmetic.
data want;
  set have;
  seq = 1 + int((_n_-1)/5);
run;
Or just count.
data want;
  seq+1;
  do subseq=1 to 5;
    set have;
    output;
  end;
  drop subseq;
run;
Thank you for your quick response, but this code does not meet my requirement.
All records in the same age group within a sex group should get the same sequence number.
Please see the screenshot. I sorted the data by sex and descending age. When I apply your code, it assigns one number to every five records, but I need something different:
if a group is split across two sequence numbers, its first record (or first few records) should move forward to the next sequence number.
For example:
The 5th and 6th records are in the same sex and age group, but their sequence numbers differ. The 5th record's sequence number should be 2.
I cannot understand what your restriction is.
Do you want to COUNT the existing groups?
Or do you want to CREATE the groups?
Or is it some combination of the two?
Your original example just set the first 5 observations to SEQ=1, the next 5 to SEQ=2 etc. So it was creating groups by simply assigning the first 5 to the first group, etc.
If instead you want to NUMBER the groups that exist then just use BY group processing.
For example, this will generate a new SEQ number for each unique SEX*AGE combination.
data want;
  set have;
  by sex age;
  seq + first.age;
run;
If the data is already grouped, but not necessarily sorted, then use the NOTSORTED keyword on the BY statement.
Or perhaps you want to split the groups into subgroups of 5 in a row by restarting the seq numbers from one when a new group starts?
data want;
  seq+1;
  do subseq=1 to 5 until(last.age);
    set have;
    by sex age;
    output;
  end;
  if last.sex then seq=0;
run;
Please share data as text, not photographs. Preferably as data steps that can be used to recreate the data.
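For reference, here is a minimal sketch of such a data step, built from the 19 rows posted in the question. The dataset name have matches the code used elsewhere in this thread; the informats are an assumption based on how the values appear in the post.

```sas
/* Recreate the sample data, already ordered by sex and descending age */
data have;
  input name $ sex $ age height weight;
datalines;
Janet F 15 62.5 112.5
Mary F 15 66.5 112
Carol F 14 62.8 102.5
Judy F 14 64.3 90
Alice F 13 56.5 84
Barbara F 13 65.3 98
Jane F 12 59.8 84.5
Louise F 12 56.3 77
Joyce F 11 51.3 50.5
Philip M 16 72 150
Ronald M 15 67 133
William M 15 66.5 112
Alfred M 14 69 112.5
Henry M 14 63.5 102.5
Jeffrey M 13 62.5 84
James M 12 57.3 83
John M 12 59 99.5
Robert M 12 64.8 128
Thomas M 11 57.5 85
;
run;
```

Posting the data this way lets anyone paste it into a SAS session and run the suggested code against exactly the same rows.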
Put simply, we need to generate a sequence number every 5 records.
The 5th record (seq=1) and the 6th record (seq=2) belong to the same unique sex*age group, so the 5th record's sequence number should also be 2.
The output should look like this:
name sex age height weight seq
Janet F 15 62.5 112.5 1
Mary F 15 66.5 112 1
Carol F 14 62.8 102.5 1
Judy F 14 64.3 90 1
Alice F 13 56.5 84 2
Barbara F 13 65.3 98 2
Jane F 12 59.8 84.5 2
Louise F 12 56.3 77 2
Joyce F 11 51.3 50.5 2
Philip M 16 72 150 3
Ronald M 15 67 133 3
William M 15 66.5 112 3
Alfred M 14 69 112.5 3
Henry M 14 63.5 102.5 3
Jeffrey M 13 62.5 84 4
James M 12 57.3 83 4
John M 12 59 99.5 4
Robert M 12 64.8 128 4
Thomas M 11 57.5 85 4
Let me see if I can translate that so you can confirm your intention.
You have the records in SEX*AGE groups. Put those groups into combinations of no more than 5 observations. In this example the first 5 records include 3 groups of 2 members each. So you just want the first 4 to be in the first new group (seq).
This raises a big question: What do you do when one of the groups has more than 5 members already? Does that new grouping (seq) include more than 5 in that case? Or does the group get split across two (or more) of the new SEQ groupings?
Let's assume you want any large group to be its own seq.
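data want;
  /* First loop: read one SEX*AGE group and count its observations in N */
  do n=1 by 1 until(last.age);
    set have;
    by sex age notsorted;
  end;
  /* Start a new SEQ if adding this group would push the running total past 5 */
  if (total+n) > 5 or _n_=1 then do;
    seq+1;
    total=n;
  end;
  else total+n;
  /* Second loop: re-read the same observations and write them out with SEQ */
  do n=1 to n;
    set have;
    output;
  end;
run;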
Result
Obs n name sex age height weight seq_want total seq
1 1 Janet F 15 62.5 112.5 1 2 1
2 2 Mary F 15 66.5 112.0 1 2 1
3 1 Carol F 14 62.8 102.5 1 4 1
4 2 Judy F 14 64.3 90.0 1 4 1
5 1 Alice F 13 56.5 84.0 2 2 2
6 2 Barbara F 13 65.3 98.0 2 2 2
7 1 Jane F 12 59.8 84.5 2 4 2
8 2 Louise F 12 56.3 77.0 2 4 2
9 1 Joyce F 11 51.3 50.5 2 5 2
10 1 Philip M 16 72.0 150.0 3 1 3
11 1 Ronald M 15 67.0 133.0 3 3 3
12 2 William M 15 66.5 112.0 3 3 3
13 1 Alfred M 14 69.0 112.5 3 5 3
14 2 Henry M 14 63.5 102.5 3 5 3
15 1 Jeffrey M 13 62.5 84.0 4 1 4
16 1 James M 12 57.3 83.0 4 4 4
17 2 John M 12 59.0 99.5 4 4 4
18 3 Robert M 12 64.8 128.0 4 4 4
19 1 Thomas M 11 57.5 85.0 4 5 4
Could you please explain how the code/program works?
Could you please explain it step by step?
Count how many observations are in the group.
Does adding that many observations make the size of the current SEQ group exceed the limit? If so start a new SEQ group.
Re-read the observations in the group and write them back out so that all of the observations in the group have the new SEQ variable.
It is just a twist on the basic double DOW loop. https://www.google.com/search?q=%40sas.com+dow+loop
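In case it helps to see the pattern on its own, here is a minimal sketch of a plain double DOW loop with the SEQ logic left out. It only counts the observations in each SEX*AGE group and attaches that count to every row of the group. The names counted, group_size and _i are made up for illustration and do not appear elsewhere in this thread.

```sas
data counted;
  /* First loop: read one whole SEX*AGE group, counting as we go.      */
  /* When the loop ends, GROUP_SIZE holds the number of rows read.     */
  do group_size = 1 by 1 until (last.age);
    set have;
    by sex age notsorted;
  end;
  /* Second loop: this SET has its own read pointer, so it re-reads    */
  /* the same rows. GROUP_SIZE is not in HAVE, so it is not overwritten */
  /* and every row is written out with the group's count.              */
  do _i = 1 to group_size;
    set have;
    output;
  end;
  drop _i;
run;
```

The SEQ solution inserts one decision between the two loops: whether the group just counted still fits under the limit or has to start a new SEQ.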
In your example:
name sex age height weight seq
Janet F 15 62.5 112.5 1
Mary F 15 66.5 112 1
Carol F 14 62.8 102.5 1
Judy F 14 64.3 90 1
Alice F 13 56.5 84 2
Barbara F 13 65.3 98 2
Jane F 12 59.8 84.5 2
Louise F 12 56.3 77 2
Joyce F 11 51.3 50.5 2
you change to seq=2 after only 4 observations, but then keep seq=2 for 5 observations. So do you increment with the 5th, or after 5?
Thank you for your response.
If a unique group has more than 5 records, for example 8, then I would use 8 as the limit and assign a sequence number to every 8 observations, keeping each unique group together.
5 is just a sample value.
As of now, the code below is working.
Thank you.
data want;
  do n=1 by 1 until(last.age);
    set have;
    by sex age notsorted;
  end;
  if (total+n) > 5 or _n_=1 then do;
    seq+1;
    total=n;
  end;
  else total+n;
  do n=1 to n;
    set have;
    output;
  end;
run;
@thanikondharish wrote:
Thank you for your response.
If a unique group has more than 5 records, for example 8, then I would use 8 as the limit and assign a sequence number to every 8 observations, keeping each unique group together.
5 is just a sample value.
In that case you need to process the dataset twice. Once to find the maximum group size. The second to make the new groups.
But why did you use 5 for your example instead of the actual maximum of 3?
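A rough sketch of that two-pass idea, assuming the limit should simply be the size of the largest SEX*AGE group. The macro variable name limit is my own choice, and the TRIMMED option on INTO needs SAS 9.3 or later.

```sas
/* Pass 1: find the size of the largest SEX*AGE group */
proc sql noprint;
  select max(n) into :limit trimmed
  from (select count(*) as n
        from have
        group by sex, age);
quit;

/* Pass 2: same double DOW loop as before, with &limit instead of 5 */
data want;
  do n=1 by 1 until(last.age);
    set have;
    by sex age notsorted;
  end;
  if (total+n) > &limit or _n_=1 then do;
    seq+1;
    total=n;
  end;
  else total+n;
  do n=1 to n;
    set have;
    output;
  end;
run;
```

Note that GROUP BY in PROC SQL merges all rows with the same SEX and AGE, while NOTSORTED in the data step only groups adjacent rows, so the two passes agree only when each SEX*AGE group is stored contiguously, as it is in the sample data.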