BookmarkSubscribeRSS Feed
Marco_park
Calcite | Level 5

Hi everyone, I am trying to convert a postgresql in proc sas.

I need these main functions available in postgresql least, greatest and regexp_replace.

How do they translate in SAS?

 

Thank you,

Marco

13 REPLIES 13
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Its a good idea to review the manual on SAS's implementation of ANSI SQL.  

Least is most likely to be min()

Greatest is most likely to be max()

regexp_replace could be a number of different things, for example tranwrd() function.  Perl regex is also available in SAS:

https://support.sas.com/rnd/base/datastep/perl_regexp/regexp-tip-sheet.pdf

 

You might also find it simpler coding if you learn some Base SAS and use that rather than SQL, for instance min/max can be done in proc means, summary, or datastep.

Marco_park
Calcite | Level 5

Thank you, I used compress for REGEXP_REPLACE, and I have tried MIN and MAX, however MIN and MAX require numeric inputs, what if I have character strings?

 

Thank you

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Why would you store numbers in character variables?  Anyways:

...
  min(input(char_variable,best.)) as min_result
...

Use the input to convert char to number, and put to convert numeric to char.

Marco_park
Calcite | Level 5

I have tried the following 

min(input(char_variable1,best.),input(char_variable2,best.)) as min_result

However I have errors. The code I am trying to convert is the following

 LEAST(REGEXP_REPLACE(REGEXP_REPLACE(identificativo_veicolo_denuncian,'\*',''),' ',''),REGEXP_REPLACE(REGEXP_REPLACE(identificativo_veicolo_contropar,'\*',''),' ','') ) AS targa1,
 GREATEST(REGEXP_REPLACE(REGEXP_REPLACE(identificativo_veicolo_denuncian,'\*',''),' ',''),REGEXP_REPLACE(REGEXP_REPLACE(identificativo_veicolo_contropar,'\*',''),' ','') ) AS targa2,

Any help would be greatly appreciated.

 

Thank you 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

You are using regex statement to process some character data into a numeric type.  I need to see some example of what the data looks like before.  So what does 

identificativo_veicolo_denuncian

Contain, what does it look like? 

Marco_park
Calcite | Level 5

It is in this from ' *AA123BB' and with the SAS expression compress(identificativo_veicolo_denuncian,'* ') it becomes

'AA123BB'.

Same applies to identificativo_veicolo_contropar. I need the MIN and MAX function to store in two different variables the MIN and MAX (alphabetically) the two strings.

 

Thank you

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Ok, so:

min(input(compress(identificativo_veicolo_denuncian," ","kd"),best.)) as want

So the compress, I remove blanks, and I use the options k = keep, d = digits, to keep only the numbers in the string, then I convert number and min().

Marco_park
Calcite | Level 5

Thank you very much,

the thing is that for each row I need the min of the two variables identificativo_veicolo_contropar and identificativo_veicolo_denuncian. in PostgreSQL this could be done with the instruction LEAST(VAR1,VAR2)

 

How can I do this?

Can I be achieved as?

min(input(compress(identificativo_veicolo_denuncian," ","kd"),best.),input(compress(identificativo_veicolo_contropar," ","kd"),best.)) as targa1,

 

 

Thank you

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Yes, min() function is a SAS function which can take a list of numeric values and return the minimum.  The min() from SQL min's a column of data, so you can use either and the compiled is clever enough to know which one.

Marco_park
Calcite | Level 5

Thank you,

 

My final query looks like this one

 

proc sql;
CREATE TABLE prova AS
SELECT DISTINCT data_sinistro, identificativo_veicolo,
			 min(input(compress(identificativo_veicolo_denuncian," ","kd"),best.),input(compress(identificativo_veicolo_contropar," ","kd"),best.)) as targa1,
             max(input(compress(identificativo_veicolo_denuncian," ","kd"),best.),input(compress(identificativo_veicolo_contropar," ","kd"),best.)) AS targa2,
             CASE WHEN ruolo_veicolo_richiesto = 'D' THEN 'D'
             ELSE 'C' END AS ruolo_veicolo_richiesto,
             'NO_CARD' AS tipo
             FROM Ssrc
			 WHERE calculated targa1 NOT IN ('','ZZ999ZZ',  'ZZ999XX',  'ZZ99999',  'ZZ000ZZ',  'XXXXXX',   'AA00000',  'AA000AA',  'AA000BB',  'AA000XX',  'AA11111', 'AA000ZZ', 'AA111BB', 'XX000XX', 'ELIMINAT', 'XX999XX', 'XX111XX', 'XX123XX',  'MI111111', 'PLPENPLP', 'XX000YY', 'RMZ94732', 'AA111AA', 'AA123BB', 'BS400BZ', 'KASKO', 'RCDIV','VARIE','TRASP','PEDONE','MOTVEL','FURTOT','FURTOP','IGNOTO','INCEND','FURTOR','AGG.TO','ESTERA','FURTOL')
             AND calculated targa2 NOT IN ('','ZZ999ZZ',  'ZZ999XX',  'ZZ99999',  'ZZ000ZZ',  'XXXXXX',   'AA00000',  'AA000AA',  'AA000BB',  'AA000XX',  'AA11111', 'AA000ZZ', 'AA111BB', 'XX000XX', 'ELIMINAT', 'XX999XX', 'XX111XX', 'XX123XX',  'MI111111', 'PLPENPLP', 'XX000YY', 'RMZ94732', 'AA111AA', 'AA123BB', 'BS400BZ', 'KASKO', 'RCDIV','VARIE','TRASP','PEDONE','MOTVEL','FURTOT','FURTOP','IGNOTO','INCEND','FURTOR','AGG.TO','ESTERA','FURTOL')
			 AND data_sinistro is not missing
;
quit;

but, unfortunately it returns the error

 

ERROR: Expression using IN has components that are of different data types.
ERROR: Expression using IN has components that are of different data types.

What am I doing wrong?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

You are comparing apples to bananas:

min(input(compress(identificativo_veicolo_denuncian," ","kd"),best.),input(compress(identificativo_veicolo_contropar," ","kd"),best.)) as targa1,

This returns a number which is the minimum of the two converted text fields.  It will be a number or missing.

 

 

WHERE calculated targa1 NOT IN ('','ZZ999ZZ',  'ZZ999XX',  'ZZ99999',  'ZZ000ZZ',  'XXXXXX',   'AA00000',  'AA000AA',  'AA000BB',  'AA000XX',  'AA11111', 'AA000ZZ', 'AA111BB', 'XX000XX', 'ELIMINAT', 'XX999XX', 'XX111XX', 'XX123XX',  'MI111111', 'PLPENPLP', 'XX000YY', 'RMZ94732', 'AA111AA', 'AA123BB', 'BS400BZ', 'KASKO', 'RCDIV','VARIE','TRASP','PEDONE','MOTVEL','FURTOT','FURTOP','IGNOTO','INCEND','FURTOR','AGG.TO','ESTERA','FURTOL')

This is comparing the numeric value of targa1 as defined previously, with a set of character strings, something that can't be done, and doesn't make sense anyway.  

 

Say you two character variables are:

AA123BB   and  BA345BB 

We calculate targa1 as 123, as remove all the character data, then the minimum is 123 which is < 345.  So the numeric variable targa1=123.  Then we say is 123 in (characters), which is a fail as you don't compare numbers to characters.  It is also a fail as 123 is never going to be in that list of text strings even if convert the number to character again.

 

What you might want (and again, flying completely blind here):

WHERE identificativo_veicolo_denuncian NOT IN ('','ZZ999ZZ',  'ZZ999XX',  'ZZ99999',  'ZZ000ZZ',  'XXXXXX',   'AA00000',  'AA000AA',  'AA000BB',  'AA000XX',  'AA11111', 'AA000ZZ', 'AA111BB', 'XX000XX', 'ELIMINAT', 'XX999XX', 'XX111XX', 'XX123XX',  'MI111111', 'PLPENPLP', 'XX000YY', 'RMZ94732', 'AA111AA', 'AA123BB', 'BS400BZ', 'KASKO', 'RCDIV','VARIE','TRASP','PEDONE','MOTVEL','FURTOT','FURTOP','IGNOTO','INCEND','FURTOR','AGG.TO','ESTERA','FURTOL')
Marco_park
Calcite | Level 5

Thank you very much for your help. I have found a solution that seems to be working as

 

SELECT DISTINCT data_sinistro, identificativo_veicolo,
             ifc(compress(identificativo_veicolo_denuncian,"* ")<compress(identificativo_veicolo_contropar,"* "),compress(identificativo_veicolo_denuncian,"* "),compress(identificativo_veicolo_contropar,"* ")) as targa1,
             ifc(compress(identificativo_veicolo_denuncian,"* ")>compress(identificativo_veicolo_contropar,"* "),compress(identificativo_veicolo_denuncian,"* "),compress(identificativo_veicolo_contropar,"* ")) as targa2,
             CASE WHEN ruolo_veicolo_richiesto = 'D' THEN 'D'
             ELSE 'C' END AS ruolo_veicolo_richiesto,
             'NO_CARD' AS tipo
             FROM Ssrc
             WHERE calculated targa1 NOT IN ('','ZZ999ZZ',  'ZZ999XX',  'ZZ99999',  'ZZ000ZZ',  'XXXXXX',   'AA00000',  'AA000AA',  'AA000BB',  'AA000XX',  'AA11111', 'AA000ZZ', 'AA111BB', 'XX000XX', 'ELIMINAT', 'XX999XX', 'XX111XX', 'XX123XX',  'MI111111', 'PLPENPLP', 'XX000YY', 'RMZ94732', 'AA111AA', 'AA123BB', 'BS400BZ', 'KASKO', 'RCDIV','VARIE','TRASP','PEDONE','MOTVEL','FURTOT','FURTOP','IGNOTO','INCEND','FURTOR','AGG.TO','ESTERA','FURTOL')
             AND calculated targa2 NOT IN ('','ZZ999ZZ',  'ZZ999XX',  'ZZ99999',  'ZZ000ZZ',  'XXXXXX',   'AA00000',  'AA000AA',  'AA000BB',  'AA000XX',  'AA11111', 'AA000ZZ', 'AA111BB', 'XX000XX', 'ELIMINAT', 'XX999XX', 'XX111XX', 'XX123XX',  'MI111111', 'PLPENPLP', 'XX000YY', 'RMZ94732', 'AA111AA', 'AA123BB', 'BS400BZ', 'KASKO', 'RCDIV','VARIE','TRASP','PEDONE','MOTVEL','FURTOT','FURTOP','IGNOTO','INCEND','FURTOR','AGG.TO','ESTERA','FURTOL')
	     AND data_sinistro is not missing);

Now one last step.

In PostgreSQL I would run the following code:

 

CREATE TABLE prova AS
SELECT DISTINCT ON (data_sinistro, identificativo_veicolo, targa1, targa2)
data_sinistro, identificativo_veicolo, targa1, targa2, ruolo_veicolo_richiesto, tipo
  FROM (****)

Where the **** represents the block of code previously written. This codes eliminates duplicates based on this three fields only 

(data_sinistro, identificativo_veicolo, targa1, targa2)

Keeping all the variables

data_sinistro, identificativo_veicolo, targa1, targa2, ruolo_veicolo_richiesto, tipo  

 In the final table.

Is there something similar in SAS?

 

Thank you very much indeed for your help

Reeza
Super User

No, they don't in PROC SQL. Try it.

 


@Marco_park wrote:

Thank you, I used compress for REGEXP_REPLACE, and I have tried MIN and MAX, however MIN and MAX require numeric inputs, what if I have character strings?

 

Thank you


 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 1194 views
  • 0 likes
  • 3 in conversation