BookmarkSubscribeRSS Feed
markcat
Calcite | Level 5

Hi. Can anyone help me on the following?

I'm using SAS EG and trying to use Proc SQL to combine row values as follows:

 

Project_Name Ref_No

A 3

A 4

B 9

B 8

C 1

C 2

D 7

 

And I want to output a LIST table with the following

Project_Name Ref Nos

A 3/4

B 8/9

C 1/2

D 7

...

 

I've tried this in computed columns advanced expression.

 

CASE WHEN first.Project_Name THEN '' END

CATX('/',Ref_No,<this variable>)

CASE WHEN last.Project_Name THEN Output

 

Based on another similar query. I also need to do this for two more categorical variables.I'm relatively new to coding in SAS in several years.

 

Thanks!

5 REPLIES 5
andreas_lds
Jade | Level 19

proc sql is not the best tool to solve the problem, in fact i would not even consider to use proc sql. Have you tried using a data-step?

 

Assuming that the data is sorted by Project_Name:

 

data have;
input Project_Name $ Ref_No;
datalines;
A 3
A 4
B 9
B 8
C 1
C 2
D 7
;
run;


data want;
   set have;
   by Project_Name;

   length Ref_Nos $ 50;
   retain Ref_Nos;

   if first.Project_Name then do;
      Ref_Nos = ' ';
   end;

   Ref_Nos = catx('/', Ref_Nos, Ref_No);

   if last.Project_Name then do;
      output;
   end;

   drop Ref_No;
run;
Jagadishkatam
Amethyst | Level 16

alternatively 

 

data have;
input Project_Name $ Ref_No;
datalines;
A 3
A 4
B 9
B 8
C 1
C 2
D 7
;
run;

data want;
set have;
by Project_Name;
retain Ref_Nos;
if first.Project_Name then Ref_Nos=strip(put(Ref_No,best.));
else  Ref_Nos=catx('/',Ref_Nos,put(Ref_No,best.));
if last.Project_Name;
drop Ref_No;
run;
Thanks,
Jag
novinosrin
Tourmaline | Level 20

Nice, the only change I would think in your else is

 Ref_Nos=catx('/',Ref_Nos,put(Ref_No,best.));

that, the put conversion is redundant if you are using catx. 

 

So,

 Ref_Nos=catx('/',Ref_Nos,Ref_No);

will suffice. Try, replacing your original with the above in your else, run again and notice the log. 

 

Also, best. format defaults to 12 bytes of character as length. You might want to specify a longer length  or choose the highest range like best32.  Even then, It's safer to declare a length at compile time to avoid any unforeseen truncation. 

Ksharp
Super User
data have;
input Project_Name $ Ref_No;
datalines;
A 3
A 4
B 9
B 8
C 1
C 2
D 7
;
run;
data want;
length want $ 40;
do until(last.Project_Name); 
 set have;
 by Project_Name;
 want=catx('/',want,Ref_No);
end;
drop Ref_No;
run;
mkeintz
PROC Star

You want to produce a data value that depends on row order.  I.e. while the ratio you are trying to produce does need to look at 2 rows with the same project_name (which PROC SQL can do for you), you want to put the earlier row value over the later row value.  PROC SQL can find and use relations between rows, but not the relative row positions .  And you have no third variable that would permit distinguishing between the earlier and later rows.

 

That is why you are being offered DATA step solutions.  Because the data step reads a data set in sequential order, it can allow you to reliably utilize observation (i.e. "row") order.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1299 views
  • 2 likes
  • 6 in conversation