BookmarkSubscribeRSS Feed
jcochrane424
Calcite | Level 5

Hello everyone, 

 

So I'm working in SAS 9.4 with projected population data. I'm trying to alter the age group of "18 to 24" years to include people of the age of 16 and 17, making it 16-24 years. I want to do this by taking the difference of the Population_in_thousands for the age groups "16 years and over" and "18 years and over" and then summing that to the existing Population_in_thousands for "18 to 24 years". I have no problem renaming the group and then deleting the two groups of 16 and 18 years or over, I just need to know how to alter the Population_in_thousands for 18-24 to add in people 16 and 17. Thanks if you can help!

My problem today is trying to projected pop data.png

3 REPLIES 3
sustagens
Pyrite | Level 9

I would take out and segregate all 18 years and over data in one dataset, and another for 16 years and over.

When I have that I can join them on matching date and gender and get their difference.

 

Once I have the difference I can add it to the "18 to 24 years" value.

 

My code can probably be optimised to make it shorter.

 

/*Take all '18 years and over' rows*/
proc sql;
create table data_18 as(
select * from have 
where Age_group = '18 years and over'
);
quit;

/*Take all '16 years and over' rows*/
proc sql;
create table data_16 as(
select * from have 
where Age_group = '16 years and over'
);
quit;

/*Compute difference*/
proc sql;
create table diff_of_16_and_18 as(
select t1.Date,
t1.Gender,
'16 to 24 years' as Age_Group,
sum(t1.Population_in_thousands,-t2.Population_in_thousands) as Population_in_thousands
from data_18 t1 left join data_16 t2 
on t1.Date=t2.Date 
and t1.Gender=t2.Gender
);
quit;

/*Combine the two tables: (1) original table, without the "16- and 18- years"*/
/*						  (2) difference table, ready for joining to "18-24 years"*/
data want;
set have (where=(Age_group not in ('16 years and over' '18 years and over')))
diff_of_16_and_18
;

if Age_group in ('18 to 24 years') then new_age_group='16 to 24 years'; /*Rename "18 to 24 years" to correct grouping so they can be summed together in next step*/
else New_age_group = Age_group;

run;

/*Sum by new age group*/
proc sql;
create table sum_by_new_age_group as(
select Date, 
Gender, 
New_age_group,
sum(Population_in_Thousands) as Population_in_Thousands
from want
group by Date, Gender, New_age_group
);
quit;
andreas_lds
Jade | Level 19

Posting the data you have in a usable form would make it much easier to provide a solution. I think it is possible to solve the issue with a single data step using a hash-object.

 

EDIT: Maybe something like this:

data want;
   set work.have;
   where  Age_Group not in ("16 years and over", "18 years and over");

   length pop16 pop18 oldPop 8;
   drop pop16 pop18 oldPop;

   if _n_ = 1 then do;
      declare hash h(dataset: "work.have(where=(Age_Group in ('16 years and over', '18 years and over'))))");
      h.defineKey("Date", "Gender", "Age_Group");
      h.defineData("Population_in_thousands");
      h.defineDone();
   end;

   if Age_Group = "18 to 24 years" then do;
      Age_Group = "16 to 24 years";
      oldPop = Population_in_thousands;
      h.find(key: Date, key: Gender, key: "16 years and over");
      pop16 = Population_in_thousands;
      h.find(key: Date, key: Gender, key: "18 years and over");
      pop18 = Population_in_thousands;
      Population_in_thousands = oldPop + pop16 - pop18;
   end;
run;
Kurt_Bremser
Super User

Run a double DOW:

data want;
do until (last.date);
  set have;
  by date;
  select (age_group);
    when ("18 years and over") over18 = population_in_thousands;
    when ("16 years and over") over16 = population_in_thousands;
    otherwise;
  end;
end;
do until (last.date);
  set have;
  by date;
  if age_group = "18 to 24 years" and nmiss(over16,over18) = 0
  then population_in_thousands = population_in_thousands + over16 - over18;
  output;
end;
drop over16 over18;
run;

Please do not post example data in pictures, as such are useless for us. Post example data in a data step with datalines like this:

data have;
infile datalines dlm=',' dsd;
input date :$4. gender :$20. age_group :$25. population_in_thousands;
format population_in_thousands comma10.;
datalines;
2015,both sexes,18 to 24 years,30983
;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 4000 views
  • 0 likes
  • 4 in conversation