How do I divide this data into regions (Northeast, Midwest, South, and West)? How do I create a box-plot with these regions?
Here is my data:
Rank2012 City State Estimate2012 Census2010 Change 1 New York New York 8,336,697 8,175,133 1.98% 2 Los Angeles California 3,857,799 3,792,621 1.72% 3 Chicago Illinois 2,714,856 2,695,598 0.71% 4 Houston Texas 2,160,821 2,100,263 2.88% 5 Philadelphia Pennsylvania 1,547,607 1,526,006 1.42% 6 Phoenix Arizona 1,488,750 1,445,632 2.98% 7 San Antonio Texas 1,382,951 1,327,407 4.18% 8 San Diego California 1,338,348 1,307,402 2.37% 9 Dallas Texas 1,241,162 1,197,816 3.62% 10 San Jose California 982,765 945,942 3.89% 11 Austin Texas 842,592 790,390 6.60% 12 Jacksonville Florida 836,507 821,784 1.79% 13 Indianapolis Indiana 834,852 820,445 1.76% 14 San Francisco California 825,863 805,235 2.56% 15 Columbus Ohio 809,798 787,033 2.89% 16 Fort Worth Texas 777,992 741,206 4.96% 17 Charlotte North Carolina 775,202 731,424 5.99% 18 Detroit Michigan 701,475 713,777 (1.72%) 19 El Paso Texas 672,538 649,121 3.61% 20 Memphis Tennessee 655,155 646,889 1.28% 21 Boston Massachusetts 636,479 617,594 3.06% 22 Seattle Washington 634,535 608,660 4.25% 23 Denver Colorado 634,265 600,158 5.68% 24 Washington District of Columbia 632,323 601,723 5.09% 25 Nashville Tennessee 624,496 601,222 3.87% 26 Baltimore Maryland 621,342 620,961 0.06% 27 Louisville Kentucky 605,110 597,337 1.30% 28 Portland Oregon 603,106 583,776 3.31% 29 Oklahoma City Oklahoma 599,199 579,999 3.31% 30 Milwaukee Wisconsin 598,916 594,833 0.69% 31 Las Vegas Nevada 596,424 583,756 2.17% 32 Albuquerque New Mexico 555,417 545,852 1.75% 33 Tucson Arizona 524,295 520,116 0.80% 34 Fresno California 505,882 494,665 2.27% 35 Sacramento California 475,516 466,488 1.94% 36 Long Beach California 467,892 462,257 1.22% 37 Kansas City Missouri 464,310 459,787 0.98% 38 Mesa Arizona 452,084 439,041 2.97% 39 Virginia Beach Virginia 447,021 437,994 2.06% 40 Atlanta Georgia 443,775 420,003 5.66% 41 Colorado Springs Colorado 431,834 416,427 3.70% 42 Raleigh North Carolina 423,179 403,892 4.78% 43 Omaha Nebraska 421,570 408,958 3.08% 44 Miami Florida 413,892 399,457 3.61% 45 Oakland California 400,740 390,724 2.56% 46 Tulsa Oklahoma 393,987 391,906 0.53% 47 Minneapolis Minnesota 392,880 382,578 2.69% 48 Cleveland Ohio 390,928 396,815 (1.48%) 49 Wichita Kansas 385,577 382,368 0.84% 50 Arlington Texas 375,600 365,438 2.78% 51 New Orleans Louisiana 369,250 343,829 7.39% 52 Bakersfield California 358,597 347,483 3.20% 53 Tampa Florida 347,645 335,709 3.56% 54 Honolulu Hawaii 345,610 337,256 2.48% 55 Anaheim California 343,248 336,265 2.08% 56 Aurora Colorado 339,030 325,078 4.29% 57 Santa Ana California 330,920 324,528 1.97% 58 St. Louis Missouri 318,172 319,294 (0.35%) 59 Riverside California 313,673 303,871 3.23% 60 Corpus Christi Texas 312,195 305,215 2.29% 61 Pittsburgh Pennsylvania 306,211 305,704 0.17% 62 Lexington Kentucky 310,573 295,803 4.99% 63 Anchorage Alaska 298,610 291,826 2.32% 64 Stockton California 297,984 291,707 2.15% 65 Cincinnati Ohio 296,550 296,943 (0.13%) 66 Saint Paul Minnesota 290,770 285,068 2.00% 67 Toledo Ohio 284,012 287,208 (1.11%) 68 Newark New Jersey 277,727 277,140 0.21% 69 Greensboro North Carolina 277,080 269,666 2.75% 70 Plano Texas 272,068 259,841 4.71% 71 Henderson Nevada 265,679 257,729 3.08% 72 Lincoln Nebraska 265,404 258,379 2.72% 73 Buffalo New York 259,384 261,310 (0.74%) 74 Fort Wayne Indiana 254,555 253,691 0.34% 75 Jersey City New Jersey 254,441 247,597 2.76% 76 Chula Vista California 252,422 243,916 3.49% 77 Orlando Florida 249,562 238,300 4.73% 78 St. Petersburg Florida 246,541 244,769 0.72% 79 Norfolk Virginia 245,782 242,803 1.23% 80 Chandler Arizona 245,628 236,123 4.03% 81 Laredo Texas 244,731 236,091 3.66% 82 Madison Wisconsin 240,323 233,209 3.05% 83 Durham North Carolina 239,358 228,330 4.83% 84 Lubbock Texas 236,065 229,573 2.83% 85 Winston–Salem North Carolina 234,349 229,617 2.06% 86 Garland Texas 233,564 226,876 2.95% 87 Glendale Arizona 232,143 226,721 2.39% 88 Hialeah Florida 231,941 224,669 3.24% 89 Reno Nevada 231,027 225,221 2.58% 90 Baton Rouge Louisiana 230,058 229,493 0.25% 91 Irvine California 229,985 212,375 8.29% 92 Chesapeake Virginia 228,417 222,209 2.79% 93 Irving Texas 225,427 216,290 4.22% 94 Scottsdale Arizona 223,514 217,385 2.82% 95 North Las Vegas Nevada 223,491 216,961 3.01% 96 Fremont California 221,986 214,089 3.69% 97 Gilbert Arizona 221,140 208,453 6.09% 98 San Bernardino California 213,295 209,924 1.61% 99 Boise Idaho 212,303 205,671 3.22% 100 Birmingham Alabama 212,038 212,237 (0.09%)
Box plot of what?
What is your definition of a region. Likely the easiest for use and construction is to create a custom format.
That would look something like:
Proc format; value $region "New York", "Pennsylvania" <list the rest comma delimited> = "Northeast" "Ohio","Indiana" <the rest> ="Midwest" <repeat for other regions> run;
Then you would use state as the Group variable on a VBOX or HBOX statement in Proc Sgplot.
Assign the format to the state with a statement like: Format state $region.; in the proc.
Box plot of what?
What is your definition of a region. Likely the easiest for use and construction is to create a custom format.
That would look something like:
Proc format; value $region "New York", "Pennsylvania" <list the rest comma delimited> = "Northeast" "Ohio","Indiana" <the rest> ="Midwest" <repeat for other regions> run;
Then you would use state as the Group variable on a VBOX or HBOX statement in Proc Sgplot.
Assign the format to the state with a statement like: Format state $region.; in the proc.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.