BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
niki0209
Obsidian | Level 7

Hello,

 

I have a data set for apartments with these variables:

Building_id: ID for the building that the apartment is located in.

Apartment_id: ID for the apartment itself.

 

I would like to generate a variable, Group_id, that groups the apartments together in groups. These groups have to include a minimum of 12 apartments and these apartments need the same building_id.

 

So, if there are 23 apartments with the same building_id they would simply be in one group, because there are not enough apartments for 2 groups of 12. If there are 24 apartments with the same building_id, they would be in two groups of 12. If there are 25 apartments with the same building_id, they would be in two groups of 12 and 13.

 

Another example: if there are 100 apartments in the same building (with the same building_id), I want the first 12 to be grouped with one Group_id, 13-24 to be grouped with another Group_id, 25-36 with a third Group_id,..., and then the last apartments from 84-100 to be grouped together with the same Group_id. The last group includes 16 apartments (larger than 12), but if we grouped 84-96 together, then the last 4 apartments would be too few to be in a group, because of the 12 minimum rule.

 

My data set include a lot of apartments with a lot of different building-id's, so I am hoping for a code that does this grouping automatically.

 

I hope this makes sense - please ask away if it doesn't. Hoping someone can help :-).

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

With computers, one should always start counting at zero 😉

 

data want;
count = 0;
do until (last.building_id);
  set have;
  by building_id;
  count + 1;
end;
group_id = 0;
count2 = 0;
do until (last.building_id);
  set have;
  by building_id;
  if mod(count2,12) = 0 and count - count2 ge 12 then group_id + 1;
  count2 + 1;
  output;
end;
drop count count2;
run;

View solution in original post

5 REPLIES 5
Kurt_Bremser
Super User

See this (with simple numeric ID's):

data have;
building_id = 1;
do appartment_id = 1 to 23;
  output;
end;
building_id = 2;
do appartment_id = 1 to 24;
  output;
end;
building_id = 3;
do appartment_id = 1 to 25;
  output;
end;
run;

data want;
count = 0;
do until (last.building_id);
  set have;
  by building_id;
  count + 1;
end;
group_id = 1;
count2 = 0;
do until (last.building_id);
  set have;
  by building_id;
  count2 + 1;
  if mod(count2,12) = 0 and count - count2 ge 12 then group_id + 1;
  output;
end;
drop count count2;
run;
niki0209
Obsidian | Level 7
Thank you very much Kurt. The code seems to be (almost) working.
I see, however, that the first group of each building_id contains only 11 apartments.

Do you have any idea why that would be the case?
Kurt_Bremser
Super User

With computers, one should always start counting at zero 😉

 

data want;
count = 0;
do until (last.building_id);
  set have;
  by building_id;
  count + 1;
end;
group_id = 0;
count2 = 0;
do until (last.building_id);
  set have;
  by building_id;
  if mod(count2,12) = 0 and count - count2 ge 12 then group_id + 1;
  count2 + 1;
  output;
end;
drop count count2;
run;
niki0209
Obsidian | Level 7
Good point! Thank you so much. Works perfectly now. You've made my day 1000 times easier. Have a good one!
s_lassen
Meteorite | Level 14

Something like this perhaps?

data want;
  do count=1 by 1 until(last.building_ID);
    set have(keep=building_ID);
    by Building_ID;
    end;
  group_ID+1;
  do _N_=1 by 1 until(last.building_ID);
    set have;
    by building_ID;
    output;
    if mod(_N_,12)=0 and count-_N_>12 then
      group_ID+1;
    end;
run;

The first DO loop just counts the number of apartments in each building, the second reads the data again and outputs, and for each 12 observations it adds to to the group_ID if there are enough apartments left.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 464 views
  • 4 likes
  • 3 in conversation