BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
newtriks
Obsidian | Level 7

I'm trying to remove bothersome and unnecessary characters from my dataset. A portion of my code:

 

DATA regions;
SET regionsc;
SLOE_1_Reg = TRANWRD (SLOE_1_Reg,'Midwest (IA, IL, IN, OH, KS, MI, MN, MO, ND, SD, WI)','Midwest');

RUN;

 

...This returns the identical column with which I started. There's no error message in the log. What am I doing wrong?

 

Thanks-

1 ACCEPTED SOLUTION

Accepted Solutions
AMSAS
SAS Super FREQ

A simple example, assuming you want to remove all the states using TRANWRD

I'm assuming you have the value I set in haveStr, and want to convert it to the value I have in wantStr. 

I suggest you provide a sample of the data you are attempting to transform, what you want as a result, what code you currently have and the log from that code

1456  data _null_ ;
1457      haveStr="Midwest (IA, IL, IN, OH, KS, MI, MN, MO, ND, SD, WI)" ;
1458      wantStr="Midwest" ;
1459
1460      convert=tranwrd(haveStr,"Midwest (IA, IL, IN, OH, KS, MI, MN, MO, ND, SD,
1460! WI)","MidWest") ;
1461
1462      put haveStr= ;
1463      put wantStr= ;
1464      put convert= ;
1465  run ;

haveStr=Midwest (IA, IL, IN, OH, KS, MI, MN, MO, ND, SD, WI)
wantStr=Midwest
convert=MidWest
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

 
Admin Note:
Marking this as the Accepted Solution, as it has a link to the TRANWRD documentation
Also, want to call out replies by:

 

@PaigeMiller which notes the importance of knowing your data:

The spaces and commas are critical if you are going to make TRANWRD work

@FreelanceReinh which offers a method to convert values that are "messy" inconsistent: 

SLOE_1_Reg = prxchange('s/(\w+)\s*\(.+?\)/$1/',-1,SLOE_1_Reg);

Basically, this would replace substrings of the form "word (unwanted details)" with "word".

 

 

View solution in original post

9 REPLIES 9
PaigeMiller
Diamond | Level 26

We need for you to provide a portion of your data in order to figure this out. Please provide the data as working SAS data step code, which you can type in yourself, or follow these instructions. Please do not provide data as Excel files or as screen captures or any other format.

--
Paige Miller
newtriks
Obsidian | Level 7

Ok, this is an imported Excel file but I've put a few datalines together to give you an idea:

 

DATALINES;

gender Med_Type SLOE_1_Reg 

M Private Midwest(IA, IL, IN, OH, KS, MI, MN, MO, ND, SD, WI)

F Public Northeast(CT, District of Columbia, DE, MA, MD, ME, NH, PA, NY, NJ, RI, VT)

F Private Southeast(AL, AR, FL, CA, KY, LA, PR, MS, NC, SC, TN, VA, WV)

M Public West(AK, AZ, CA, CO, ID, HI, MT, NM, NV, OK, OR, TX, UT, WA, WY)

;

 

You are correct, I'm just trying to get those overly-descriptive cells down to the basic single word which describes the region.

 

Thank you all for your time and efforts!

 
 

 

ballardw
Super User

@newtriks wrote:

Ok, this is an imported Excel file but I've put a few datalines together to give you an idea:

 

DATALINES;

gender Med_Type SLOE_1_Reg 

M Private Midwest(IA, IL, IN, OH, KS, MI, MN, MO, ND, SD, WI)

F Public Northeast(CT, District of Columbia, DE, MA, MD, ME, NH, PA, NY, NJ, RI, VT)

F Private Southeast(AL, AR, FL, CA, KY, LA, PR, MS, NC, SC, TN, VA, WV)

M Public West(AK, AZ, CA, CO, ID, HI, MT, NM, NV, OK, OR, TX, UT, WA, WY)

;

 

You are correct, I'm just trying to get those overly-descriptive cells down to the basic single word which describes the region.

 

Thank you all for your time and efforts


Those examples look like what you really want is the first word, which the SCAN function will get quite easily.

length region $ 10;
region = scan(SLOE_1_Reg ,1,'(');
PaigeMiller
Diamond | Level 26

@newtriks wrote:

Ok, this is an imported Excel file but I've put a few datalines together to give you an idea:

 

DATALINES;

gender Med_Type SLOE_1_Reg 

M Private Midwest(IA, IL, IN, OH, KS, MI, MN, MO, ND, SD, WI)

F Public Northeast(CT, District of Columbia, DE, MA, MD, ME, NH, PA, NY, NJ, RI, VT)

F Private Southeast(AL, AR, FL, CA, KY, LA, PR, MS, NC, SC, TN, VA, WV)

M Public West(AK, AZ, CA, CO, ID, HI, MT, NM, NV, OK, OR, TX, UT, WA, WY)

;

 

You are correct, I'm just trying to get those overly-descriptive cells down to the basic single word which describes the region.

 

Thank you all for your time and efforts!

 
 

 


You have typed different text on the Midwest line than what you showed in the original message. The spaces and commas are critical if you are going to make TRANWRD work, and so if we're going to help, you absolutely must provide data in the form I requested, as working SAS data step code, preferably using the instructions at the link I gave. That's absolutely MUST provide data in the form I requested. We are trying to help you, but you need to help us.

--
Paige Miller
AMSAS
SAS Super FREQ

A simple example, assuming you want to remove all the states using TRANWRD

I'm assuming you have the value I set in haveStr, and want to convert it to the value I have in wantStr. 

I suggest you provide a sample of the data you are attempting to transform, what you want as a result, what code you currently have and the log from that code

1456  data _null_ ;
1457      haveStr="Midwest (IA, IL, IN, OH, KS, MI, MN, MO, ND, SD, WI)" ;
1458      wantStr="Midwest" ;
1459
1460      convert=tranwrd(haveStr,"Midwest (IA, IL, IN, OH, KS, MI, MN, MO, ND, SD,
1460! WI)","MidWest") ;
1461
1462      put haveStr= ;
1463      put wantStr= ;
1464      put convert= ;
1465  run ;

haveStr=Midwest (IA, IL, IN, OH, KS, MI, MN, MO, ND, SD, WI)
wantStr=Midwest
convert=MidWest
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

 
Admin Note:
Marking this as the Accepted Solution, as it has a link to the TRANWRD documentation
Also, want to call out replies by:

 

@PaigeMiller which notes the importance of knowing your data:

The spaces and commas are critical if you are going to make TRANWRD work

@FreelanceReinh which offers a method to convert values that are "messy" inconsistent: 

SLOE_1_Reg = prxchange('s/(\w+)\s*\(.+?\)/$1/',-1,SLOE_1_Reg);

Basically, this would replace substrings of the form "word (unwanted details)" with "word".

 

 

newtriks
Obsidian | Level 7

Thanks for your time and effort!

 

The version below of what you put together seems to work. The only issue now is that it gives me just one observation. How can I modify it so that it gives me all 500+?


data sloeconvert7 ;
SLOE_1_Reg ="Midwest (IA, IL, IN, OH, KS, MI, MN, MO, ND, SD, WI)" ;
Region="Midwest" ;
Region=tranwrd(SLOE_1_Reg,"Midwest (IA, IL, IN, OH, KS, MI, MN, MO, ND, SD, WI)","Midwest");
PUT SLOE_1_Reg= ;
PUT Region= ;
RUN;

vijaypratap0195
Obsidian | Level 7

Please provide some sample data so that I can figure out the solutions and the cause.

 

-Vijay

FreelanceReinh
Jade | Level 19

Hello @newtriks,

 

If TRANWRD does not replace the substring specified in the second argument, the corresponding substring in variable SLOE_1_Reg must be different. Possible differences include missing blanks, double blanks, other white-space characters (such as 'A0'x or tab characters) instead of blanks, upper/lower case, etc. So, either you need to ensure that the search string matches exactly what you want to replace (use the $HEX. format to determine the exact content of variable SLOE_1_Reg) or use more flexible pattern matching, e.g. with PRXCHANGE.

 

Example:

SLOE_1_Reg = prxchange('s/(\w+)\s*\(.+?\)/$1/',-1,SLOE_1_Reg);

Basically, this would replace substrings of the form "word (unwanted details)" with "word".

 

Edit: Corrected typo: The intention was to include the case of no space between "word" and the opening parenthesis by using \s*, not \s+, in the regex.

newtriks
Obsidian | Level 7

Thank you all for your interest and help. I ultimately found it easier to fix the columns in Excel and then re-import and merge them.

 

I appreciate your time.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 2048 views
  • 9 likes
  • 6 in conversation