BookmarkSubscribeRSS Feed
kcvaldez98
Obsidian | Level 7

Hi SAS coders!

 

I am trying to extract the county from a string variable that is meshed in with the estimate and CI.

So the variable (called Regional) looks like this: 

 LARIMER County/Regional Estimate 14% (95% C.I.: 11.9 - 16.2)

 LAS ANIMASCounty/Regional Estimate 16.7% (95% C.I.: 9.6 - 23.8)

 

I am trying to extract the LARIMER and LAS ANIMAS into their own variable and call it "COUNTY"

I've used SCAN, SUBSTR, COMPRESS, all the extracting functions and they always come out not ideal. My end result should be

LARIMER

LAS ANIMAS

The code I was using is:

 

DATA WANT;
SET OLD;
COUNTY = COMPRESS(SCAN(Regional, 1,'C'),'');
RUN;

The end result of this code is 

LARIMER

LASANIMAS 

I am trying to produce the output where LASANIMAS has a space so it should look like LAS ANIMAS.

Also some of the county names that I am working with start with C like "Cheyenne" and because I put the delimiter as "C" the county name cuts off so it produces an output of "heyenne"

 

I don't know what I am doing wrong.

5 REPLIES 5
PaigeMiller
Diamond | Level 26

Find the location of the word COUNTY. On your first row, that word begins at position 9. So then take the text in positions 1 through 8 (and optionally trim the blank off the end)

 

position=find(regional,'county','i');
county_name=substr(regional,1,position-1);
--
Paige Miller
ballardw
Super User

Don't forget to set a length for your county name long enough to hold the longest expected value.

mkeintz
PROC Star

It appears that REGIONAL has the county name, followed by the string "County", followed by a forward slash, and then other information irrelevant to extracting the county name.  And the string "County" may or may not be preceded by a blank.

 

Then

(1) extract the text prior to the "/":       = scan(regional,1,'/')  

(2) then eliminate the last 6 non-blank characters:     =substr(of above,1,length(of above)-6).

 

data want;
  set have;
  length county $25;
  county=substr(scan(regional,1,'/')
               ,1
               ,length(scan(regional,1,'/'))-6
               );
run;

Now if the word "County" was always spelled in lowercase except the first letter, you could avoid using the substr function in (2).  Instead, you could apply a somewhat simpler expression:

 

data want;
  set have;
  length county $25;
  county=tranwrd(scan(regional,1,'/'),'County','');
run;

This, of course, assumes that the string 'County' does not occur twice (once before the slash, and once as part of the county's proper name) in the REGION variable.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Tom
Super User Tom
Super User

If you wanted the keep the spaces why did you remove them?  Remove the COMPRESS() function call.

 

Are you sure none of the county names have an uppercase letter C?  If they do your SCAN() function call is also not what you want.

 

 

Patrick
Opal | Level 21

And below yet another option using a regular expression.

data have;
  infile datalines truncover;
  input have_str $100.;
  datalines;
LARIMER County/Regional Estimate 14% (95% C.I.: 11.9 - 16.2)
LAS ANIMASCounty/Regional Estimate 16.7% (95% C.I.: 9.6 - 23.8)
xxx xxx/Regional Estimate 16.7% (95% C.I.: 9.6 - 23.8)
;

data want(drop=_:);
  set have;
  length county $40.;
  _prxid = prxparse('/^.*(?=county\/)/oi');
  call prxsubstr(_prxid, have_str, _pos, _len);
  if _len>0 then county = substr(have_str, _pos, _len);
run;

proc print data=want;
run;

Patrick_0-1698370407970.png

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 609 views
  • 0 likes
  • 6 in conversation