I have a column named Cod_Sale, which should only be either null (there's no sale) or the code all in UpperCase Letters.
Example:
Cod_Sale IN (PRIM, SEC, THIRD, UP) ... etc, each month we can receive new Codes
I have to remove all data that don't have a sale cod, meaning Cod_Sale <> "".
However, when I do this it also selects some value that have Cod_Sale = "". I tried the functions compress, spit and trim to see if it had any blanks, but it still doesn't work.
Even if I select compress(Cod_Sale) <> "" or split(Cod_Sale) <> "" or trim(Cod_Sale) <> "" it still selects data with Cod_Sale in blank!
Can you please help me?
One possibility for what you observe is that there are other non-print characters in your data - a tab for example.
You could try below syntax for such a situation:
if missing(compress(Cod_Sale,,'s')) then delete;
For a row with a value that you believe should get deleted but doesn't issue below command:
put <variable> hex.;
Then share with us what gets written to the log.
To test if a string has invalid characters use the VERIFY() function. It will return the position of the first character not in the list of valid characters. If there are no invalid characters it returns zero.
data want;
set have;
if verify(strip(cod_sale),'ABCDEFGHIJKLMNOPQRSTUVWXYZ') then delete;
run;
Also it will not going to work to use the max operator, <>, when comparing strings. That only works with numeric values.
1 data test; 2 set sashelp.class(obs=2); 3 if name <> ' ' then put name= 'larger than space.'; NOTE: The "<>" operator is interpreted as "MAX". 4 else put name= 'same as space.'; 5 run; NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column). 3:11 NOTE: Invalid numeric data, 'Alfred' , at line 3 column 11. Name=Alfred same as space. Name=Alfred Sex=M Age=14 Height=69 Weight=112.5 _ERROR_=1 _N_=1 NOTE: Invalid numeric data, 'Alice' , at line 3 column 11. Name=Alice same as space. Name=Alice Sex=F Age=13 Height=56.5 Weight=84 _ERROR_=1 _N_=2 NOTE: There were 2 observations read from the data set SASHELP.CLASS. NOTE: The data set WORK.TEST has 2 observations and 5 variables. NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.01 seconds
Thank you so much for all your helps!
The function compress(Cod_Sale, , 'kw') worked to remove all blanks and non-print characters!!
@InêsMaximiano wrote:
Thank you so much for all your helps!
The function compress(Cod_Sale, , 'kw') worked to remove all blanks and non-print characters!!
I thought the task was to detect OBSERVATIONS that invalid values in the COD_SALE variable.
Will just removing the invalid characters result in converting the invalid values of COD_SALE into valid values?
@InêsMaximianosurahyaseenpdf wrote:I have a column named Cod_Sale, which should only be either null (there's no sale) or the code all in UpperCase Letters.
Example:
Cod_Sale IN (PRIM, SEC, THIRD, UP) ... etc, each month we can receive new Codes
I have to remove all data that don't have a sale cod, meaning Cod_Sale <> "".
However, when I do this it also selects some value that have Cod_Sale = "". I tried the functions compress, spit and trim to see if it had any blanks, but it still doesn't work.
Even if I select compress(Cod_Sale) <> "" or split(Cod_Sale) <> "" or trim(Cod_Sale) <> "" it still selects data with Cod_Sale in blank!
Can you please help me?
To filter out invalid entries and keep only uppercase Cod_Sale
values, try using the UPCASE
function to ensure the code is uppercase and not blank.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.