BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
David_Billa
Rhodochrosite | Level 12

@Patrick Please clarify why you've added function in your program.

Patrick
Opal | Level 21

@David_Billa Because based on the sample data and code you've posted I've made the assumption that for the same ID there could be data as I've created in my post. So for same ID a value for CRF that could always contain a sub-string [INS] but then different function names in the remaining string. If so then you would likely want to aggregate on the lowest level of the function given the code you shared.

David_Billa
Rhodochrosite | Level 12

One last question. Can't we achieve it without creating function? Also what is the meaning of -l in PUT function?

Patrick
Opal | Level 21

@David_Billa The -l stands for LeftAligned. Without it the string in value would be right aligned with leading blanks which is something people normally don't want.

 

Yes. with SAS SQL you could also not use variable Function in the group by clause and it would still work. SAS somehow internally still creates groups per aggregate function. But then: The aggregation is on function level so this is just about clean code that someone else can understand and maintain (and you still can understand in one year's time).

ballardw
Super User

@David_Billa wrote:

One last question. Can't we achieve it without creating function? Also what is the meaning of -l in PUT function?


The -L (to make it more legible) in Put means to left justify the result. By default Put with a numeric format will right justify the value and if the value does not "fill" the format there will be leading zeroes.

 

Example:

data example;
    x=3;
    y= quote(put(x,f12.2));
    z= quote(put(x,f12.2 -L));
run;

The Quote places quote marks around the value so you can see just how the value has resolved with the Put function.

The presence of leading spaces can lead to all sorts of interesting behaviors depending what you do with a value like that later. Some are apparent "duplicate" values that are treated as different, failure to convert back to numeric with INPUT, unexpected spaces in concatenation operations and unexpected "sort" order results.

Patrick
Opal | Level 21

@David_Billa Here how to have everything in a single SQL without "cluttering" the code too much.

proc sql;
  create table Want as 
    select distinct
      Id,
      CRF, 
      function,
      put(
          case function 
            when "MIN" then min(Value)
            when "MAX" then max(Value)
            when "AVG" then avg(Value)
            when "STD" then std(Value)
            else missing(value)
          end 
          , 24.6 -l)
        as Value 
    from 
      (
        select 
          id,
          scan(CRF,1,'#') as CRF length=10,
          scan(CRF,-1,,'ka') as function length=3,
          input(Value,32.) as value format=24.6
        from have
      )

    group by Id,CRF,Function
  ;
quit;

 

Tom
Super User Tom
Super User

Are you sure the aggregate functions are running on the values you expect to include?  That is do you need a WHERE clause?  Or perhaps another nested CASE/END inside the MAX(), MIN() etc?

 

Do NOT include decimal number in the INFORMAT unless you know that the periods were purposely removed from the strings to save one character and you want INPUT() to divide by that power of 10.

Is it possible VALUE has leading spaces?  The maximum width the normal numeric informat supports is 32, use that.

input(left(Value),32.)

Why are you pushing the results back into a character result? Why not leave it as a number? Remove the STRIP(PUT()) wrappers.

 

Sajid01
Meteorite | Level 14

Hello @David_Billa 
Your program  logic needs a small correction.
Your source data set has a variable 'CRF'.

In the Proc SQL you are recalculating the variable  'CRF' .
It is this recalculated 'CRF' you want to use.
However the use of 'CRF' in the group by statement cause the PROC SQL to use the original 'CRF' and not the newly calculated one.

Therefore use the calculated keyword before 'CRF' in the group by statement as shown below and you will get the result.

proc Sql;
    create table Want as select distinct * from (
        select Id
               ,scan(CRF,1,'#') as CRF, (
            case 
                when scan(CRF,-1,,'ka')="MIN" then strip(put(min(input(Value,24.6)),24.6))
                when scan(CRF,-1,,'ka')="MAX" then strip(put(max(input(Value,24.6)),24.6))
                when scan(CRF,-1,,'ka')="AVG" then strip(put(avg(input(Value,24.6)),24.6))
                when scan(CRF,-1,,'ka')="STD" then strip(put(std(input(Value,24.6)),24.6))
                else ''
            end )
        as Value
        from INPUT
        group by Id,calculated  CRF);
quit;

You will get the desired output as can be seen below

Sajid01_0-1626200648903.png

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 22 replies
  • 1864 views
  • 12 likes
  • 6 in conversation