@Patrick Please clarify why you've added function in your program.
@David_Billa Because based on the sample data and code you've posted I've made the assumption that for the same ID there could be data as I've created in my post. So for same ID a value for CRF that could always contain a sub-string [INS] but then different function names in the remaining string. If so then you would likely want to aggregate on the lowest level of the function given the code you shared.
One last question. Can't we achieve it without creating function? Also what is the meaning of -l in PUT function?
@David_Billa The -l stands for LeftAligned. Without it the string in value would be right aligned with leading blanks which is something people normally don't want.
Yes. with SAS SQL you could also not use variable Function in the group by clause and it would still work. SAS somehow internally still creates groups per aggregate function. But then: The aggregation is on function level so this is just about clean code that someone else can understand and maintain (and you still can understand in one year's time).
@David_Billa wrote:
One last question. Can't we achieve it without creating function? Also what is the meaning of -l in PUT function?
The -L (to make it more legible) in Put means to left justify the result. By default Put with a numeric format will right justify the value and if the value does not "fill" the format there will be leading zeroes.
Example:
data example; x=3; y= quote(put(x,f12.2)); z= quote(put(x,f12.2 -L)); run;
The Quote places quote marks around the value so you can see just how the value has resolved with the Put function.
The presence of leading spaces can lead to all sorts of interesting behaviors depending what you do with a value like that later. Some are apparent "duplicate" values that are treated as different, failure to convert back to numeric with INPUT, unexpected spaces in concatenation operations and unexpected "sort" order results.
@David_Billa Here how to have everything in a single SQL without "cluttering" the code too much.
proc sql;
create table Want as
select distinct
Id,
CRF,
function,
put(
case function
when "MIN" then min(Value)
when "MAX" then max(Value)
when "AVG" then avg(Value)
when "STD" then std(Value)
else missing(value)
end
, 24.6 -l)
as Value
from
(
select
id,
scan(CRF,1,'#') as CRF length=10,
scan(CRF,-1,,'ka') as function length=3,
input(Value,32.) as value format=24.6
from have
)
group by Id,CRF,Function
;
quit;
Are you sure the aggregate functions are running on the values you expect to include? That is do you need a WHERE clause? Or perhaps another nested CASE/END inside the MAX(), MIN() etc?
Do NOT include decimal number in the INFORMAT unless you know that the periods were purposely removed from the strings to save one character and you want INPUT() to divide by that power of 10.
Is it possible VALUE has leading spaces? The maximum width the normal numeric informat supports is 32, use that.
input(left(Value),32.)
Why are you pushing the results back into a character result? Why not leave it as a number? Remove the STRIP(PUT()) wrappers.
Hello @David_Billa
Your program logic needs a small correction.
Your source data set has a variable 'CRF'.
In the Proc SQL you are recalculating the variable 'CRF' .
It is this recalculated 'CRF' you want to use.
However the use of 'CRF' in the group by statement cause the PROC SQL to use the original 'CRF' and not the newly calculated one.
Therefore use the calculated keyword before 'CRF' in the group by statement as shown below and you will get the result.
proc Sql;
create table Want as select distinct * from (
select Id
,scan(CRF,1,'#') as CRF, (
case
when scan(CRF,-1,,'ka')="MIN" then strip(put(min(input(Value,24.6)),24.6))
when scan(CRF,-1,,'ka')="MAX" then strip(put(max(input(Value,24.6)),24.6))
when scan(CRF,-1,,'ka')="AVG" then strip(put(avg(input(Value,24.6)),24.6))
when scan(CRF,-1,,'ka')="STD" then strip(put(std(input(Value,24.6)),24.6))
else ''
end )
as Value
from INPUT
group by Id,calculated CRF);
quit;
You will get the desired output as can be seen below
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.