BookmarkSubscribeRSS Feed
MILKYLOVE
Calcite | Level 5

Hello guys, 

 

I am creating a new table 'Result', using different columns from a table 'Person'.

I have been provided with different rules to apply on these columns:

 

R16: Delete character "-"

R17: Delete character "*"

R18: Delete character "."

R19: Delete Numerical Letters

R20: Delete character "0"

R21: Delete character "?"

 

These rules have to be applied to several columns. So I cannot use 'Compress' since I will have to do it on each column every time. Is there a proc format that I can apply for deleting characters?

 

MILKYLOVE_1-1664785234929.png

 

 

8 REPLIES 8
sbxkoenk
SAS Super FREQ

Hello,

 

It's not "replace", it's delete !

 

See here :

Deleting a substring from a SAS string
By Leonid Batkhan on SAS Users February 22, 2021
https://blogs.sas.com/content/sgf/2021/02/22/deleting-a-substring-from-a-sas-string/

 

For more of this string-based stuff ,

see https://blogs.sas.com/content/?s=Leonid+Batkhan+string

deleting, inserting, replacing, ...

 

Good luck,

Koen

 
MILKYLOVE
Calcite | Level 5

Actually this does not help me because these are examples where they are using strings, and mostly have only 1/2 lines to treat.

In my case, I have many columns to apply the same treatment and it will not be efficient to do it on each column every time.

That's why I am asking if there is a proc format or another way to automate this step.

 

Thank you.

sbxkoenk
SAS Super FREQ

I think it does help you :

 

/* suppress hyphen        (- ) from ABC vars R1 - R4 */
/* suppress question mark (?) from XYZ vars R8 - R10 */
data work.WANT;
   SUB1 = '-';
   SUB2 = '?';
   set work.HAVE;
   array ABC{4} $ R1 R2 R3 R4;
   array XYZ{3} $ R8 R9 R10  ;
   do i = 1 to dim(ABC);
     ABC(i) = transtrn(ABC(i),SUB1,trimn(''));
   end;
   do i = 1 to dim(XYZ);
     XYZ(i) = transtrn(XYZ(i),SUB2,trimn(''));
   end;
run;
/* end of program */

 

Best,

Koen

MILKYLOVE
Calcite | Level 5

The problem with this solution is that not all the variables of each array list are of the same data type. 

Therefore I am getting errors.

 

The program also does not recognise '-'

Astounding
PROC Star

First, you will need to clarify the request.  Probably, you should be processing character variables only.  But that's "probably", not a guarantee.  For example, numeric variables can take on a value of 0.  And they can appear to be a letter when they take on special numeric values like .A, .B, etc.  So make sure you are supposed to process character variables only.  And finally make sure that you are supposed to remove the offending characters, rather than replace them with blanks.

 

Having done that, it's easy to set up an array containing all the character variables:

 

array just_these {*} _character_;

 

Then move through the array, using the functions of your choice.

sbxkoenk
SAS Super FREQ

I concur with @Astounding .

Also, in your original post, you have this : 
R19: Delete Numerical Letters

 

What are Numerical Letters, is this "digits" in a character string??

 

Suppressing digits can be done with COMPRESS function, like here :
(but there are other modifiers and arguments to the COMPRESS function that can accomplish the same)

 

data one;
   LENGTH x $ 100 y $ 100;
   x='Hearing @ mooing and 3 seeing cows ! tend to occur 77 , together';
   y=compress(x, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ ', 'ki');
   put y=;
run;

Best,

Koen

ballardw
Super User

@MILKYLOVE wrote:

The problem with this solution is that not all the variables of each array list are of the same data type. 

Therefore I am getting errors.

 

The program also does not recognise '-'


Which of your variables is of a "different type" that needs any of those rules applied. All of your rules involve text.

 

What do you mean by "does not recognise '-' "? If you are getting an error then show us the LOG of the actual code you ran. Copy the Log for the step with all the code and all the messages. On the forum open a text box with the </> icon and paste the text.

The text box is important as it will display any of the diagnostics that SAS often supplies correctly where the main message windows will reformat text making them less useful.

 

Also, provide actual example data. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.

 

Reeza
Super User

@MILKYLOVE wrote:

The problem with this solution is that not all the variables of each array list are of the same data type. 

Therefore I am getting errors.

 

The program also does not recognise '-'


That rule doesn't make sense for a numeric variable. In SAS, the variable would not be able to have - characters in the first place. If you need to convert types that's a more complex rule. 

 

Cleaning data is always a somewhat tedious process. You have two options, for each rule loop through the variables, or for each variable loop through the applicable rules. It's easier to do the first one in SAS via arrays. 

 

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1257 views
  • 5 likes
  • 5 in conversation