BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cheenaChuks
Obsidian | Level 7

Hi all,

 

I am quite confused about how to use the lengthn, cat, catx and substr functions all together for string manipulation. I am trying to add a variable that inserts a dash between the 2nd to last character and the last character of Item and wraps the value in parentheses. For example, Item 144L becomes (144-L). Use the LENGTHN, SUBSTR, CAT and CATX functions. The solution should be generalizable to any value of Item containing 2 or more characters. So far, I have been building the program using examples I have seen but I keep getting confused. Please can you help me fix my program and include explanations if u can. I just need to understand how all the functions work together to solve this problem. Thanks.

data CustomerDB ;
  input Customer  &      $18.
        Item      :      $11. ; 
datalines;
Bartco Corporation  143L
Cost Cutter's  142
Minimart Inc.  188SW
Bosco, LLC  908X321
K-MART.  A122L
Food Unlimited  1898XVWX
Shop and Drop  24L
Shop Marty's  222X
Pet's are Us  W100
Roger's Spirits  1522L
Artful Spirits  407XxxXxx
Up/Down mart Inc.  ABCdefghxxx
;
data newTemp;
	set CustomerDB;
	if lengthn(Item) ge 3 then
		secondtoLastRemoved = catx("-",(substr(Item,1,(lengthn(Item)-2))),(substr(Item,lengthn(Item), 1))
									); /*Dropping Second to last characters from string?*/
	else
		secondtoLastRemoved = Item;
	if findc(Item,'mart','i') gt 0 then
	bool = 1; /*setting boolean for words that have the string "mart" in them */
	else
	bool = 0;
	lowcaseCustomer = lowcase(compress(Customer,,'p')); /* Remove Punctuation and make letters lowercase */
run;
proc print data = newTemp;
title "CustomerDB string Manipulations";
run;
1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Before any explanation. Does this give you the desired result? 🙂

 

data CustomerDB ;
  input Customer  &      $18.
        Item      :      $11. ; 
datalines;
Bartco Corporation  143L
Cost Cutter's  142
Minimart Inc.  188SW
Bosco, LLC  908X321
K-MART.  A122L
Food Unlimited  1898XVWX
Shop and Drop  24L
Shop Marty's  222X
Pet's are Us  W100
Roger's Spirits  1522L
Artful Spirits  407XxxXxx
Up/Down mart Inc.  ABCdefghxxx
;

data want;
   set CustomerDB;
   length NewItem $50;
   NewItem=cats("(", substr(Item, 1, length(Item)-1), "-", char(Item, length(Item)), ")");
run;

proc print data=want;
run;

View solution in original post

8 REPLIES 8
PeterClemmensen
Tourmaline | Level 20

Before any explanation. Does this give you the desired result? 🙂

 

data CustomerDB ;
  input Customer  &      $18.
        Item      :      $11. ; 
datalines;
Bartco Corporation  143L
Cost Cutter's  142
Minimart Inc.  188SW
Bosco, LLC  908X321
K-MART.  A122L
Food Unlimited  1898XVWX
Shop and Drop  24L
Shop Marty's  222X
Pet's are Us  W100
Roger's Spirits  1522L
Artful Spirits  407XxxXxx
Up/Down mart Inc.  ABCdefghxxx
;

data want;
   set CustomerDB;
   length NewItem $50;
   NewItem=cats("(", substr(Item, 1, length(Item)-1), "-", char(Item, length(Item)), ")");
run;

proc print data=want;
run;
cheenaChuks
Obsidian | Level 7

Yes it does but please could you explain the program and the functions. That is what I was confused about.

andreas_lds
Jade | Level 19

When working with char-variables it is important to use the length-statement to define the variable with suitable length.

 

I think that

if findc(Item,'mart','i') gt 0 then

has at least to issues: The variable "Item" seems to be wrongly used, it should be "Customer", right? The function findc is used to verify the presence of letters in a string, not to find words. Maybe findw is what you need here.

 

To debug such a step use additional variables for lengthn(Item) and the substrings you take to see what sas does.

 

Please post the expected result, i am not sure that i fully understand the logic you want applied:

data newTemp;
    set CustomerDB;
    
    length secondtoLastRemoved $ 14;

    if lengthn(Item) ge 3 then do;
        secondtoLastRemoved = cats('(', substr(Item, 1, lengthn(Item)-1), '-', substr(Item, lengthn(Item)-1, 1), ')');
    end;
    else do;
        secondtoLastRemoved=Item;
    end;

    /*setting boolean for words that have the string "mart" in them */
    bool = index(lowcase(Customer), 'mart') > 0;
    *bool = findw(Customer, 'mart', , 'spit') > 0;
    
     /* Remove Punctuation and make letters lowercase */
    lowcaseCustomer = lowcase(compress(Customer, , 'p'));
run;

And again @PeterClemmensen  was faster 😉

PeterClemmensen
Tourmaline | Level 20

However, @andreas_lds was way more thorough in his explanation 🙂

cheenaChuks
Obsidian | Level 7

Hi Andreas,

 

Thanks for the explanation and also pointing out the mistake I made with the findc function. Yes, I was trying to find the name of the customer that had the string 'mart'.  Here is a snippet of the code and output.

data test ;
  length Customer $18 Item $11 Want $14 ;
  input Customer & Item; 
  want = cats('(',substr(item,1,length(item)-1),'-',char(item,length(item)),')');
  bool = index(lowcase(Customer), 'mart') > 0;
datalines;
Bartco Corporation  143L
Cost Cutter's  142
Minimart Inc.  188SW
Bosco, LLC  908X321
K-MART.  A122L
Food Unlimited  1898XVWX
Shop and Drop  24L
Shop Marty's  222X
Pet's are Us  W100
Roger's Spirits  1522L
Artful Spirits  407XxxXxx
Up/Down mart Inc.  ABCdefghxxx
;
proc print;
run;

 

Ksharp
Super User

You did not post the output yet .

 

 

data CustomerDB ;
  input Customer  &      $18.
        Item      :      $11. ; 
want=prxchange('s/^(\d+)([a-z]+)$/($1-$2)/i',1,strip(item));
datalines;
Bartco Corporation  143L
Cost Cutter's  142
Minimart Inc.  188SW
Bosco, LLC  908X321
K-MART.  A122L
Food Unlimited  1898XVWX
Shop and Drop  24L
Shop Marty's  222X
Pet's are Us  W100
Roger's Spirits  1522L
Artful Spirits  407XxxXxx
Up/Down mart Inc.  ABCdefghxxx
;
Tom
Super User Tom
Super User

You didn't show what you want the result to look like.  Here is what your words said to do.

data test ;
  length Customer $18 Item $11 Want $14 ;
  input Customer & Item; 
  want = cats('(',substr(item,1,length(item)-1),'-',char(item,length(item)),')');
datalines;
Bartco Corporation  143L
Cost Cutter's  142
Minimart Inc.  188SW
Bosco, LLC  908X321
K-MART.  A122L
Food Unlimited  1898XVWX
Shop and Drop  24L
Shop Marty's  222X
Pet's are Us  W100
Roger's Spirits  1522L
Artful Spirits  407XxxXxx
Up/Down mart Inc.  ABCdefghxxx
;
proc print;
run;
Obs    Customer              Item           Want

  1    Bartco Corporation    143L           (143-L)
  2    Cost Cutter's         142            (14-2)
  3    Minimart Inc.         188SW          (188S-W)
  4    Bosco, LLC            908X321        (908X32-1)
  5    K-MART.               A122L          (A122-L)
  6    Food Unlimited        1898XVWX       (1898XVW-X)
  7    Shop and Drop         24L            (24-L)
  8    Shop Marty's          222X           (222-X)
  9    Pet's are Us          W100           (W10-0)
 10    Roger's Spirits       1522L          (1522-L)
 11    Artful Spirits        407XxxXxx      (407XxxXx-x)
 12    Up/Down mart Inc.     ABCdefghxxx    (ABCdefghxx-x)
PGStats
Opal | Level 21

If you want to avoid problems with short Item strings, tweak @Tom 's idea a bit:

 

data test ;
  length Customer $18 Item $11 Want $14 ;
  input Customer & Item; 
  want = cats('(',catx("-", substrn(item,1,length(item)-1),char(item,length(item))),')');
datalines;
Bartco Corporation  143L
Cost Cutter's  142
Minimart Inc.  188SW
Bosco, LLC  908X321
K-MART.  A122L
Food Unlimited  1898XVWX
Shop and Drop  24L
Shop Marty's  222X
Pet's are Us  W100
Roger's Spirits  1522L
Artful Spirits  407XxxXxx
Up/Down mart Inc.  ABCdefghxxx
Tiny test  a
Nasty test  .
;
proc print;
run;
             Obs    Customer              Item           Want

               1    Bartco Corporation    143L           (143-L)
               2    Cost Cutter's         142            (14-2)
               3    Minimart Inc.         188SW          (188S-W)
               4    Bosco, LLC            908X321        (908X32-1)
               5    K-MART.               A122L          (A122-L)
               6    Food Unlimited        1898XVWX       (1898XVW-X)
               7    Shop and Drop         24L            (24-L)
               8    Shop Marty's          222X           (222-X)
               9    Pet's are Us          W100           (W10-0)
              10    Roger's Spirits       1522L          (1522-L)
              11    Artful Spirits        407XxxXxx      (407XxxXx-x)
              12    Up/Down mart Inc.     ABCdefghxxx    (ABCdefghxx-x)
              13    Tiny test             a              (a)
              14    Nasty test                           ()
PG