BookmarkSubscribeRSS Feed
InêsMaximiano
Obsidian | Level 7

I have a column named Cod_Sale, which should only be either null (there's no sale) or the code all in UpperCase Letters.

Example:

Cod_Sale IN (PRIM, SEC, THIRD, UP) ... etc, each month we can receive new Codes

 

I have to remove all data that don't have a sale cod, meaning Cod_Sale <> "".

 

However, when I do this it also selects some value that have Cod_Sale = "". I tried the functions compress, spit and trim to see if it had any blanks, but it still doesn't work.

 

Even if I select compress(Cod_Sale) <> "" or split(Cod_Sale) <> "" or trim(Cod_Sale) <> "" it still selects data with Cod_Sale in blank!

 

Can you please help me?

8 REPLIES 8
Patrick
Opal | Level 21

One possibility for what you observe is that there are other non-print characters in your data - a tab for example. 

You could try below syntax for such a situation:

if missing(compress(Cod_Sale,,'s')) then delete;
InêsMaximiano
Obsidian | Level 7
Thank you so much for your fast answer, but that still didn't work.
Patrick
Opal | Level 21

For a row with a value that you believe should get deleted but doesn't issue below command:

put <variable> hex.;

Then share with us what gets written to the log. 

 

 

Astounding
PROC Star
Show the context in which <> is being used. SQL interprets it differently than a DATA step.
Tom
Super User Tom
Super User

To test if a string has invalid characters use the VERIFY() function.  It will return the position of the first character not in the list of valid characters.  If there are no invalid characters it returns zero.

data want;
  set have;
  if verify(strip(cod_sale),'ABCDEFGHIJKLMNOPQRSTUVWXYZ') then delete;
run;

Also it will not going to work to use the max operator, <>,  when comparing strings.  That only works with numeric values.

1    data test;
2      set sashelp.class(obs=2);
3      if name <> ' ' then put name= 'larger than space.';
NOTE: The "<>" operator is interpreted as "MAX".
4      else put name= 'same as space.';
5    run;

NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
      3:11
NOTE: Invalid numeric data, 'Alfred' , at line 3 column 11.
Name=Alfred same as space.
Name=Alfred Sex=M Age=14 Height=69 Weight=112.5 _ERROR_=1 _N_=1
NOTE: Invalid numeric data, 'Alice' , at line 3 column 11.
Name=Alice same as space.
Name=Alice Sex=F Age=13 Height=56.5 Weight=84 _ERROR_=1 _N_=2
NOTE: There were 2 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.TEST has 2 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

 

InêsMaximiano
Obsidian | Level 7

Thank you so much for all your helps!

The function compress(Cod_Sale, , 'kw') worked to remove all blanks and non-print characters!! 

Tom
Super User Tom
Super User

@InêsMaximiano wrote:

Thank you so much for all your helps!

The function compress(Cod_Sale, , 'kw') worked to remove all blanks and non-print characters!! 


I thought the task was to detect OBSERVATIONS that invalid values in the COD_SALE variable. 

 

Will just removing the invalid characters result in converting the invalid values of COD_SALE into valid values?

alexben
Fluorite | Level 6

@InêsMaximianosurahyaseenpdf wrote:

I have a column named Cod_Sale, which should only be either null (there's no sale) or the code all in UpperCase Letters.

Example:

Cod_Sale IN (PRIM, SEC, THIRD, UP) ... etc, each month we can receive new Codes

 

I have to remove all data that don't have a sale cod, meaning Cod_Sale <> "".

 

However, when I do this it also selects some value that have Cod_Sale = "". I tried the functions compress, spit and trim to see if it had any blanks, but it still doesn't work.

 

Even if I select compress(Cod_Sale) <> "" or split(Cod_Sale) <> "" or trim(Cod_Sale) <> "" it still selects data with Cod_Sale in blank!

 

Can you please help me?


To filter out invalid entries and keep only uppercase Cod_Sale values, try using the UPCASE function to ensure the code is uppercase and not blank.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 815 views
  • 5 likes
  • 5 in conversation