i have a dataset as below;
data city;
input TIMEZONE:$20. STATE:$20. CITY:$20. POPULATION;
infile datalines dlm=' ' dsd;
datalines;
Alaska Alaska Anchorage 291826
Atlantic Puerto Rico San Antonio 1327407
Atlantic Puerto Rico San Juan 395326
Central Minnesota Minneapolis 382578
Central Texas Dallas 1197816
Eastern Florida Miami 399457
Eastern Maryland Baltimore 620961
Eastern Michigan Detroit 713777
Eastern New York New York 8175133
Mountain Arizona Phoenix 1445631
Mountain Colorado Denver 600158
Mountain Texas El Paso 649121
Pacific California Los Angeles 3792621
Pacific California Sacramento 466488
Pacific California San Francisco 805235
Pacific Nevada Las Vegas 583756
;
the final output should be like
| TIMEZONE | STATE | CITY | POPULATION | CUMUL_POP | 
| Alaska | Alaska | Anchorage | 291826 | 291826 | 
| Alaska | Alaska | Total | 291826 | 291826 | 
| Alaska | Total | 
 | 291826 | 291826 | 
| Atlantic | Puerto Rico | San Antonio | 1327407 | 1619233 | 
| Atlantic | Puerto Rico | San Juan | 395326 | 2014559 | 
| Atlantic | Puerto Rico | Total | 1722733 | 2014559 | 
| Atlantic | Total | 
 | 1722733 | 2014559 | 
| Central | Minnesota | Minneapolis | 382578 | 2397137 | 
| Central | Minnesota | Total | 382578 | 2397137 | 
| Central | Texas | Dallas | 1197816 | 3594953 | 
| Central | Texas | Total | 1197816 | 3594953 | 
| Central | Total | 
 | 1580394 | 3594953 | 
| Eastern | Florida | Miami | 399457 | 3994410 | 
| Eastern | Florida | Total | 399457 | 3994410 | 
| Eastern | Maryland | Baltimore | 620961 | 4615371 | 
| Eastern | Maryland | Total | 620961 | 4615371 | 
| Eastern | Michigan | Detroit | 713777 | 5329148 | 
| Eastern | Michigan | Total | 713777 | 5329148 | 
| Eastern | New York | New York | 8175133 | 13504281 | 
| Eastern | New York | Total | 8175133 | 13504281 | 
| Eastern | Total | 
 | 9909328 | 13504281 | 
| Mountain | Arizona | Phoenix | 1445631 | 14949912 | 
| Mountain | Arizona | Total | 1445631 | 14949912 | 
| Mountain | Colorado | Denver | 600158 | 15550070 | 
| Mountain | Colorado | Total | 600158 | 15550070 | 
| Mountain | Texas | El Paso | 649121 | 16199191 | 
| Mountain | Texas | Total | 649121 | 16199191 | 
| Mountain | Total | 
 | 2694910 | 16199191 | 
| Pacific | California | Los Angeles | 3792621 | 19991812 | 
| Pacific | California | Sacramento | 466488 | 20458300 | 
| Pacific | California | San Francisco | 805235 | 21263535 | 
| Pacific | California | Total | 5064344 | 21263535 | 
| Pacific | Nevada | Las Vegas | 583756 | 21847291 | 
| Pacific | Nevada | Total | 583756 | 21847291 | 
| Pacific | Total | 
 | 5648100 | 21847291 | 
i have coded as below but not able to get the intermidiate summary:
proc sort data=city; by timezone state; run;
data want;
set city;
by timezone state;
if first.state then cum_pop=0;
cum_pop+population;
run;
could anyone please provide the solution both in proc sql as well as data step. Thanks in advance!!
Seems pretty straight forward.
data city;
  infile datalines dlm=' ' dsd;
  input TIMEZONE :$20. STATE :$20. CITY :$20. POPULATION;
datalines;
Alaska Alaska Anchorage 291826
Atlantic "Puerto Rico" "San Antonio" 1327407
Atlantic "Puerto Rico" "San Juan" 395326
Central Minnesota Minneapolis 382578
Central Texas Dallas 1197816
Eastern Florida Miami 399457
Eastern Maryland Baltimore 620961
Eastern Michigan Detroit 713777
Eastern "New York" "New York" 8175133
Mountain Arizona Phoenix 1445631
Mountain Colorado Denver 600158
Mountain Texas "El Paso" 649121
Pacific California "Los Angeles" 3792621
Pacific California Sacramento 466488
Pacific California "San Francisco" 805235
Pacific Nevada "Las Vegas" 583756
;
data want;
  set city ;
  by timezone state  ;
  if first.state then total=0;
  if first.timezone then tz_total=0;
  total+population;
  tz_total+population;
  cumul_pop+population;
  output;
  if last.state then do;
     city='TOTAL';
     population=total;
     output;
  end;
  if last.timezone then do;
     state='TOTAL';
     city=' ';
     population=tz_total;
     output;
  end;
  drop total tz_total;
run;
     It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
