BookmarkSubscribeRSS Feed
Babloo
Rhodochrosite | Level 12

Following is not working if I add max function before substr. Group_id is character data type. Any help?

 

MEMBER_NO||''||DEP_ID||'-'||max(SUBSTR(GROUP_ID, 1, 6))||max(SUBSTR(GROUP_ID, 8, 2))||'-'||PATIENT_DOB||'-'||PATIENT_GENDER_ID  as ID
 
 
MEMBER_NO||''||DEP_ID||'-'||max(SUBSTR(GROUP_ID, 1, 6)||SUBSTR(GROUP_ID, 8, 2))||'-'||PATIENT_DOB||'-'||PATIENT_GENDER_ID  as ID
11 REPLIES 11
Babloo
Rhodochrosite | Level 12

@Kurt_Bremser  If I remove the max function, it's working fine otherwise receiving the error message similar to invalid identifier. I don't remember the exact error message now as I logged off already.

PaigeMiller
Diamond | Level 26

@Babloo wrote:

@Kurt_Bremser  If I remove the max function, it's working fine otherwise receiving the error message similar to invalid identifier. I don't remember the exact error message now as I logged off already.


Please from now on, in all your future posts, provide us with the exact error message, in fact we need the log for this data step and not just you saying there is an error. Don't make us ask for the log and the error message, don't withhold this information from us.

 

But if it is working fine, why do you need to add MAX and find the maximum of the character string? What does it even mean to find the maximum of a character string?

--
Paige Miller
PaigeMiller
Diamond | Level 26

You want to find the max of a character string? Why do you feel it is necessary to add the MAX function?

--
Paige Miller
Babloo
Rhodochrosite | Level 12

@PaigeMiller  due to issues with group ids changing between. If I remove the max function it's working.

PaigeMiller
Diamond | Level 26

@Babloo wrote:

@PaigeMiller  due to issues with group ids changing between. If I remove the max function it's working.


Sorry, but I cannot decipher the meaning of this. Please provide examples and further explanation.

--
Paige Miller
Tom
Super User Tom
Super User

Since you have that AS keyword we can assume you are doing this in SQL code instead of normal SAS code.

 

There is big difference between selecting the value of a variable and taking the aggregate value of a variable across observations.  So I suspect the issue is with the context of where this snippet is used and not with SUBSTR() or MAX() per se.

 

Try these two queries and notice that you get different results:

select age as X from sashelp.class;
select max(age) as X from sashelp.class

 

And you are mixing the two by including PATIENT_ID and max(SUBSTR(GROUP_ID, 1, 6)) in the same expression.  It will work in SAS code, but notice the NOTE you get.

316  proc sql;
317  select name,age,max(age) as X from sashelp.class
318  quit;
NOTE: The query requires remerging summary statistics back with the original data.

That same query will cause an ERROR is most implementations of SQL since they will not do the automatic remerge of summary statistics the way that PROC SQL will.

 

 

Babloo
Rhodochrosite | Level 12
How to make this summary statistics work in normal sql?
Tom
Super User Tom
Super User

Do the REMERGE yourself.

Let's take a simple example:

select name,sex,height,mean(height) as gender_height
  from sashelp.class
  group by sex
  order by name
;

So just calculate the mean(height) in a separate sub-query and join it back to the main table by matching the SEX variable values.

select a.name,a.sex,a.height,b.gender_height
  from sashelp.class a
  inner join (select sex,mean(height) as gender_height
              from sashelp.class
              group by sex ) b
  on a.sex=b.sex
  order by a.name
;
Patrick
Opal | Level 21

MEMBER_NO||''||DEP_ID||'-'||max(SUBSTR(GROUP_ID, 1, 6))||max(SUBSTR(GROUP_ID, 8, 2))||'-'||PATIENT_DOB||'-'||PATIENT_GENDER_ID  as ID
 
 
MEMBER_NO||''||DEP_ID||'-'||max(SUBSTR(GROUP_ID, 1, 6)||SUBSTR(GROUP_ID, 8, 2))||'-'||PATIENT_DOB||'-'||PATIENT_GENDER_ID  as ID

On top of what others already wrote: 

Doesn't MEMBER_NO already identify an individual in your data? And if so wouldn't PATIENT_DOB and PATIENT_GENDER_ID always have the same value for a MEMBER_NO and though adding these two variable to the concatenated string wouldn't change a thing?

Can GROUP_ID have different values for the the same MEMBER_NO? ...and even if so if you depending on how you aggregate your values (GROUP BY) also max(SUBSTR(GROUP_ID, 1, 6)) and max(SUBSTR(GROUP_ID, 8, 2)) would always have the same value per MEMBER_NO which in the end means you could use MEMBER_NO directly.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 3398 views
  • 0 likes
  • 5 in conversation