Hi. Can anyone help me on the following?
I'm using SAS EG and trying to use Proc SQL to combine row values as follows:
Project_Name Ref_No
A 3
A 4
B 9
B 8
C 1
C 2
D 7
And I want to output a LIST table with the following
Project_Name Ref Nos
A 3/4
B 8/9
C 1/2
D 7
...
I've tried this in computed columns advanced expression.
CASE WHEN first.Project_Name THEN '' END
CATX('/',Ref_No,<this variable>)
CASE WHEN last.Project_Name THEN Output
Based on another similar query. I also need to do this for two more categorical variables.I'm relatively new to coding in SAS in several years.
Thanks!
proc sql is not the best tool to solve the problem, in fact i would not even consider to use proc sql. Have you tried using a data-step?
Assuming that the data is sorted by Project_Name:
data have;
input Project_Name $ Ref_No;
datalines;
A 3
A 4
B 9
B 8
C 1
C 2
D 7
;
run;
data want;
set have;
by Project_Name;
length Ref_Nos $ 50;
retain Ref_Nos;
if first.Project_Name then do;
Ref_Nos = ' ';
end;
Ref_Nos = catx('/', Ref_Nos, Ref_No);
if last.Project_Name then do;
output;
end;
drop Ref_No;
run;
alternatively
data have;
input Project_Name $ Ref_No;
datalines;
A 3
A 4
B 9
B 8
C 1
C 2
D 7
;
run;
data want;
set have;
by Project_Name;
retain Ref_Nos;
if first.Project_Name then Ref_Nos=strip(put(Ref_No,best.));
else Ref_Nos=catx('/',Ref_Nos,put(Ref_No,best.));
if last.Project_Name;
drop Ref_No;
run;
Nice, the only change I would think in your else is
Ref_Nos=catx('/',Ref_Nos,put(Ref_No,best.));
that, the put conversion is redundant if you are using catx.
So,
Ref_Nos=catx('/',Ref_Nos,Ref_No);
will suffice. Try, replacing your original with the above in your else, run again and notice the log.
Also, best. format defaults to 12 bytes of character as length. You might want to specify a longer length or choose the highest range like best32. Even then, It's safer to declare a length at compile time to avoid any unforeseen truncation.
data have;
input Project_Name $ Ref_No;
datalines;
A 3
A 4
B 9
B 8
C 1
C 2
D 7
;
run;
data want;
length want $ 40;
do until(last.Project_Name);
set have;
by Project_Name;
want=catx('/',want,Ref_No);
end;
drop Ref_No;
run;
You want to produce a data value that depends on row order. I.e. while the ratio you are trying to produce does need to look at 2 rows with the same project_name (which PROC SQL can do for you), you want to put the earlier row value over the later row value. PROC SQL can find and use relations between rows, but not the relative row positions . And you have no third variable that would permit distinguishing between the earlier and later rows.
That is why you are being offered DATA step solutions. Because the data step reads a data set in sequential order, it can allow you to reliably utilize observation (i.e. "row") order.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.