BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cheenaChuks
Obsidian | Level 7

Hi all,

 

I am quite confused about how to use the lengthn, cat, catx and substr functions all together for string manipulation. I am trying to add a variable that inserts a dash between the 2nd to last character and the last character of Item and wraps the value in parentheses. For example, Item 144L becomes (144-L). Use the LENGTHN, SUBSTR, CAT and CATX functions. The solution should be generalizable to any value of Item containing 2 or more characters. So far, I have been building the program using examples I have seen but I keep getting confused. Please can you help me fix my program and include explanations if u can. I just need to understand how all the functions work together to solve this problem. Thanks.

data CustomerDB ;
  input Customer  &      $18.
        Item      :      $11. ; 
datalines;
Bartco Corporation  143L
Cost Cutter's  142
Minimart Inc.  188SW
Bosco, LLC  908X321
K-MART.  A122L
Food Unlimited  1898XVWX
Shop and Drop  24L
Shop Marty's  222X
Pet's are Us  W100
Roger's Spirits  1522L
Artful Spirits  407XxxXxx
Up/Down mart Inc.  ABCdefghxxx
;
data newTemp;
	set CustomerDB;
	if lengthn(Item) ge 3 then
		secondtoLastRemoved = catx("-",(substr(Item,1,(lengthn(Item)-2))),(substr(Item,lengthn(Item), 1))
									); /*Dropping Second to last characters from string?*/
	else
		secondtoLastRemoved = Item;
	if findc(Item,'mart','i') gt 0 then
	bool = 1; /*setting boolean for words that have the string "mart" in them */
	else
	bool = 0;
	lowcaseCustomer = lowcase(compress(Customer,,'p')); /* Remove Punctuation and make letters lowercase */
run;
proc print data = newTemp;
title "CustomerDB string Manipulations";
run;
1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Before any explanation. Does this give you the desired result? 🙂

 

data CustomerDB ;
  input Customer  &      $18.
        Item      :      $11. ; 
datalines;
Bartco Corporation  143L
Cost Cutter's  142
Minimart Inc.  188SW
Bosco, LLC  908X321
K-MART.  A122L
Food Unlimited  1898XVWX
Shop and Drop  24L
Shop Marty's  222X
Pet's are Us  W100
Roger's Spirits  1522L
Artful Spirits  407XxxXxx
Up/Down mart Inc.  ABCdefghxxx
;

data want;
   set CustomerDB;
   length NewItem $50;
   NewItem=cats("(", substr(Item, 1, length(Item)-1), "-", char(Item, length(Item)), ")");
run;

proc print data=want;
run;

View solution in original post

8 REPLIES 8
PeterClemmensen
Tourmaline | Level 20

Before any explanation. Does this give you the desired result? 🙂

 

data CustomerDB ;
  input Customer  &      $18.
        Item      :      $11. ; 
datalines;
Bartco Corporation  143L
Cost Cutter's  142
Minimart Inc.  188SW
Bosco, LLC  908X321
K-MART.  A122L
Food Unlimited  1898XVWX
Shop and Drop  24L
Shop Marty's  222X
Pet's are Us  W100
Roger's Spirits  1522L
Artful Spirits  407XxxXxx
Up/Down mart Inc.  ABCdefghxxx
;

data want;
   set CustomerDB;
   length NewItem $50;
   NewItem=cats("(", substr(Item, 1, length(Item)-1), "-", char(Item, length(Item)), ")");
run;

proc print data=want;
run;
cheenaChuks
Obsidian | Level 7

Yes it does but please could you explain the program and the functions. That is what I was confused about.

andreas_lds
Jade | Level 19

When working with char-variables it is important to use the length-statement to define the variable with suitable length.

 

I think that

if findc(Item,'mart','i') gt 0 then

has at least to issues: The variable "Item" seems to be wrongly used, it should be "Customer", right? The function findc is used to verify the presence of letters in a string, not to find words. Maybe findw is what you need here.

 

To debug such a step use additional variables for lengthn(Item) and the substrings you take to see what sas does.

 

Please post the expected result, i am not sure that i fully understand the logic you want applied:

data newTemp;
    set CustomerDB;
    
    length secondtoLastRemoved $ 14;

    if lengthn(Item) ge 3 then do;
        secondtoLastRemoved = cats('(', substr(Item, 1, lengthn(Item)-1), '-', substr(Item, lengthn(Item)-1, 1), ')');
    end;
    else do;
        secondtoLastRemoved=Item;
    end;

    /*setting boolean for words that have the string "mart" in them */
    bool = index(lowcase(Customer), 'mart') > 0;
    *bool = findw(Customer, 'mart', , 'spit') > 0;
    
     /* Remove Punctuation and make letters lowercase */
    lowcaseCustomer = lowcase(compress(Customer, , 'p'));
run;

And again @PeterClemmensen  was faster 😉

PeterClemmensen
Tourmaline | Level 20

However, @andreas_lds was way more thorough in his explanation 🙂

cheenaChuks
Obsidian | Level 7

Hi Andreas,

 

Thanks for the explanation and also pointing out the mistake I made with the findc function. Yes, I was trying to find the name of the customer that had the string 'mart'.  Here is a snippet of the code and output.

data test ;
  length Customer $18 Item $11 Want $14 ;
  input Customer & Item; 
  want = cats('(',substr(item,1,length(item)-1),'-',char(item,length(item)),')');
  bool = index(lowcase(Customer), 'mart') > 0;
datalines;
Bartco Corporation  143L
Cost Cutter's  142
Minimart Inc.  188SW
Bosco, LLC  908X321
K-MART.  A122L
Food Unlimited  1898XVWX
Shop and Drop  24L
Shop Marty's  222X
Pet's are Us  W100
Roger's Spirits  1522L
Artful Spirits  407XxxXxx
Up/Down mart Inc.  ABCdefghxxx
;
proc print;
run;

 

Ksharp
Super User

You did not post the output yet .

 

 

data CustomerDB ;
  input Customer  &      $18.
        Item      :      $11. ; 
want=prxchange('s/^(\d+)([a-z]+)$/($1-$2)/i',1,strip(item));
datalines;
Bartco Corporation  143L
Cost Cutter's  142
Minimart Inc.  188SW
Bosco, LLC  908X321
K-MART.  A122L
Food Unlimited  1898XVWX
Shop and Drop  24L
Shop Marty's  222X
Pet's are Us  W100
Roger's Spirits  1522L
Artful Spirits  407XxxXxx
Up/Down mart Inc.  ABCdefghxxx
;
Tom
Super User Tom
Super User

You didn't show what you want the result to look like.  Here is what your words said to do.

data test ;
  length Customer $18 Item $11 Want $14 ;
  input Customer & Item; 
  want = cats('(',substr(item,1,length(item)-1),'-',char(item,length(item)),')');
datalines;
Bartco Corporation  143L
Cost Cutter's  142
Minimart Inc.  188SW
Bosco, LLC  908X321
K-MART.  A122L
Food Unlimited  1898XVWX
Shop and Drop  24L
Shop Marty's  222X
Pet's are Us  W100
Roger's Spirits  1522L
Artful Spirits  407XxxXxx
Up/Down mart Inc.  ABCdefghxxx
;
proc print;
run;
Obs    Customer              Item           Want

  1    Bartco Corporation    143L           (143-L)
  2    Cost Cutter's         142            (14-2)
  3    Minimart Inc.         188SW          (188S-W)
  4    Bosco, LLC            908X321        (908X32-1)
  5    K-MART.               A122L          (A122-L)
  6    Food Unlimited        1898XVWX       (1898XVW-X)
  7    Shop and Drop         24L            (24-L)
  8    Shop Marty's          222X           (222-X)
  9    Pet's are Us          W100           (W10-0)
 10    Roger's Spirits       1522L          (1522-L)
 11    Artful Spirits        407XxxXxx      (407XxxXx-x)
 12    Up/Down mart Inc.     ABCdefghxxx    (ABCdefghxx-x)
PGStats
Opal | Level 21

If you want to avoid problems with short Item strings, tweak @Tom 's idea a bit:

 

data test ;
  length Customer $18 Item $11 Want $14 ;
  input Customer & Item; 
  want = cats('(',catx("-", substrn(item,1,length(item)-1),char(item,length(item))),')');
datalines;
Bartco Corporation  143L
Cost Cutter's  142
Minimart Inc.  188SW
Bosco, LLC  908X321
K-MART.  A122L
Food Unlimited  1898XVWX
Shop and Drop  24L
Shop Marty's  222X
Pet's are Us  W100
Roger's Spirits  1522L
Artful Spirits  407XxxXxx
Up/Down mart Inc.  ABCdefghxxx
Tiny test  a
Nasty test  .
;
proc print;
run;
             Obs    Customer              Item           Want

               1    Bartco Corporation    143L           (143-L)
               2    Cost Cutter's         142            (14-2)
               3    Minimart Inc.         188SW          (188S-W)
               4    Bosco, LLC            908X321        (908X32-1)
               5    K-MART.               A122L          (A122-L)
               6    Food Unlimited        1898XVWX       (1898XVW-X)
               7    Shop and Drop         24L            (24-L)
               8    Shop Marty's          222X           (222-X)
               9    Pet's are Us          W100           (W10-0)
              10    Roger's Spirits       1522L          (1522-L)
              11    Artful Spirits        407XxxXxx      (407XxxXx-x)
              12    Up/Down mart Inc.     ABCdefghxxx    (ABCdefghxx-x)
              13    Tiny test             a              (a)
              14    Nasty test                           ()
PG

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 7146 views
  • 1 like
  • 6 in conversation