SAS Programming

geneshackman · Posted 07-21-2022 04:12 PM

Hi all

I hope everyone is healthy!

Here is part of my data set

Not Hispanic or Latino	Albany County, NY (36001)	American Indian or Alaska Native	1,370	300
Not Hispanic or Latino	Albany County, NY (36001)	Asian	62,742	4,520
Not Hispanic or Latino	Albany County, NY (36001)	Black or African American	116,750	11,993
Not Hispanic or Latino	Albany County, NY (36001)	Native Hawaiian or Other Pacific Islander	452	12
Not Hispanic or Latino	Albany County, NY (36001)	White	652,841	137,877
Not Hispanic or Latino	Albany County, NY (36001)	More than one race	22,848	1,135
Not Hispanic or Latino	Albany County, NY (36001)	Total	#####	#####
Not Hispanic or Latino	Allegany County, NY (36003)	American Indian or Alaska Native	295	62
Not Hispanic or Latino	Allegany County, NY (36003)	Asian	1,701	115
Not Hispanic or Latino	Allegany County, NY (36003)	Black or African American	1,709	79
Not Hispanic or Latino	Allegany County, NY (36003)	Native Hawaiian or Other Pacific Islander	38	3
Not Hispanic or Latino	Allegany County, NY (36003)	White	130,012	26,328
Not Hispanic or Latino	Allegany County, NY (36003)	More than one race	1,632	121
Not Hispanic or Latino	Allegany County, NY (36003)	Total	#####	26,708

I want to make this part "Albany County, NY (countynumber") into two variables

county name (e.g. "Albany")

and

county number (e.g. 36001)

Is that possible?

Second, one row is "Native Hawaiian or Other Pacific Islander" and another row is "Asian".

I'd like to add them together to make a row "Asian/Pacific Islander", and then remove the two component rows.

Thanks

Gene

Reeza · Posted 07-21-2022 04:29 PM

Not sure if this will scale but based on your post:

CountyName = scan(variable, 1, ",");
CountyNumber = scan(variable, 2, "()");

race_combined = race;
if race_combined in ('Asian', 'Native ...') then race_combined = 'Asian/Pacific Islander';

To combine your rows, you'd need to summarize your data - you can use PROC MEANS/SUMMARY for that. The summary stat is most likely sum, but may depend on the data. You didn't include variable names so I made guesses.

geneshackman · Posted 07-21-2022 04:35 PM

Thanks Reeza, I forgot. Yes, variable names would be Ethnicity, County, Race, Total, Age65Plus

Reeza · Posted 07-21-2022 04:44 PM

I'm going to assume you can modify the code with the correct variable names.

If you have issues, post your code, log and explain the issue.

ballardw · Posted 07-21-2022 04:41 PM

It might be nice to include your existing variable names.

The county bits can be solved with the SCAN function mostly. The key is setting the exact characters used for delimiters.

Tranwrd removes the word County from the result. I hope you don't have a "County County, NY ".

data example;
   County="Albany County, NY (36001)";
   length countyname $ 20;
   countyname = tranwrd(scan(county,1,','),'County','');
   countynum  = scan(county,2,'()');
run;

For your second bit about the race coding my preference would be to leave the values as they are and use a custom format to create text for the combined group.

Proc format;
value $myracecode
"Asian","Native Hawaiian or Other Pacific Islander" ="Asian/Pacific Islander"
;
run;

The use the format with the value, example:

Proc freq data=have;
   tables race;
   format race $myracecode. ;
run;

Why a format you may ask. First as soon as you remove or replace the values someone will ask for a report or analysis separating the two, especially as these are standard, more or less, values. So if you remove them then you get to go back in your code and an undo that. With the format just remove the format (or use a different format).

The groups created by a format will be honored by reporting, analysis and graphing procedures.

A comment: Inclusion of "Total" or similar summary rows inside a data set is not a good idea for many purposes. It is extremely easy to accidentally include the "total" is some other summary, such as a statewide total and get very incorrect results.

I would almost say this is the result of another procedure's output and would suggest looking if that is the case what the underlying data is.

geneshackman · Posted 07-21-2022 04:59 PM

Ballardw, your solution to splitting the county name works. I'll try the remaining part tomorrow. Reeza your part almost works. I'm not clear if I indicated that I didn't want the word "County" in the output, just the name of the county.

Thanks both.

Reeza · Posted 07-21-2022 05:05 PM

If you don't want the word COUNTY you'll want to use mine and then substr it to remove the COUNTY. Many counties in my experience do have two or three names, not just one.

Or map the county number to the name using one of the datasets from SASMAPS library.

geneshackman · Posted 07-22-2022 11:48 AM

For separating out the county name and ID, Ballardw's method worked for me

countyname = tranwrd(scan(county,1,','),'County','');

I'll look at the other part now, collapsing two groups.

Thanks

andreas_lds · Posted 07-22-2022 01:48 AM

Posting data in usable form makes it much easier to suggest solutions. The following code implements the idea of @Reeza to use a map dataset:

data work.CountyFmt;
   set mapsgfk.us_all_attr(keep= id idname rename=(idname = Label));
   length Start $ 10;
   retain FmtName '$CountyName';

   Start = compress(id,, 'kd');

   drop id;
run;

proc format cntlin=work.CountyFmt;
run;

data work.want;
   set work.have;
   
   length CountyName $ 55 CountyNumber $ 10;

   CountyNumber = compress(County,, 'kd');
   CountyName = put(CountyNumber, $CountyName.);
run;

SAS Programming

seperating variable text, and creating new line

Re: seperating variable text, and creating new line

Re: seperating variable text, and creating new line

Re: seperating variable text, and creating new line

Re: seperating variable text, and creating new line

Re: seperating variable text, and creating new line

Re: seperating variable text, and creating new line

Re: seperating variable text, and creating new line

Re: seperating variable text, and creating new line

Follow Us

What is...

SAS Programming

Join us for our biggest event of the year!

SAS Training: Just a Click Away

Follow Us

What is...