BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
nbora
Calcite | Level 5

Hi,

I have the following dataset:

 

data have;
     input col_a col_b :date9. col_c $ col_d $;
     format col_b date9.;
     datalines;
     123 31SEP2014 123a 123456789ABC
     123 31SEP2014 123a 123456789DEF
     123 31SEP2014 123a 123456789GHI
     456 31AUG2014 456f 987654321ABC
     456 31AUG2014 456g 987654321DEF
     456 31AUG2014 456h 987654321GHI
     789 31OCT2014 789a 987654321ABC
     789 31NOV2014 789a 987654321DEF
     789 31NOV2014 789b 987654321GHI 
;
run;

Using this data set and (proc) sql I like to create a new column that holds a string of distinct col_c for each distinct combination of col_an and col_b. Like the following:

 

 

data have;
     input col_a col_b :date9. col_c $ col_d $ col_e;
     format col_b date9.;
     datalines;
     123 31SEP2014 123a 123456789ABC 123_123a
     123 31SEP2014 123a 123456789DEF 123_123a
     123 31SEP2014 123a 123456789GHI 123_123a
     456 31AUG2014 456f 987654321ABC 456_456f_456g_456h
     456 31AUG2014 456g 987654321DEF 456_456f_456g_456h
     456 31AUG2014 456h 987654321GHI 456_456f_456g_456h
     789 31OCT2014 789a 987654321ABC 789_789a
     789 31NOV2014 789a 987654321DEF 789_789a_789b
     789 31NOV2014 789b 987654321GHI 789_789a_789b
;
run;

Thank you in advance,

 

B

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Not with SQL code.  Just use plain old SAS code instead.  Note: 30 days hath November, April... 

 

data have;
     input col_a col_b :date9. col_c $ col_d $;
     format col_b date9.;
datalines;
123 30SEP2014 123a 123456789ABC
123 30SEP2014 123a 123456789DEF
123 30SEP2014 123a 123456789GHI
456 31AUG2014 456f 987654321ABC
456 31AUG2014 456g 987654321DEF
456 31AUG2014 456h 987654321GHI
789 31OCT2014 789a 987654321ABC
789 30NOV2014 789a 987654321DEF
789 30NOV2014 789b 987654321GHI 
;

proc sort; 
  by col_a col_b col_c ;;
run;

data want ;
do _n_=1 by 1 until(last.col_b);
  set have ;
  by col_a col_b col_c ;
  length col_e $50;
  if first.col_c then col_e=catx('_',col_e,col_c);
end;
do _n_=1 to _n_;
  set have;
  output;
end;
run;

Results

Obs    col_a        col_b    col_c     col_d      col_e

 1      123     30SEP2014    123a     12345678    123a
 2      123     30SEP2014    123a     12345678    123a
 3      123     30SEP2014    123a     12345678    123a
 4      456     31AUG2014    456f     98765432    456f_456g_456h
 5      456     31AUG2014    456g     98765432    456f_456g_456h
 6      456     31AUG2014    456h     98765432    456f_456g_456h
 7      789     31OCT2014    789a     98765432    789a
 8      789     30NOV2014    789a     98765432    789a_789b
 9      789     30NOV2014    789b     98765432    789a_789b

View solution in original post

3 REPLIES 3
Reeza
Super User
Why SQL? This will be hard in SAS SQL as it doesn't have an aggregate function for text as some other languages do (e.g. LISTAGG in Oracle).

You could follow the example here to find the values and then merge back with the original data.

https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a
Tom
Super User Tom
Super User

Not with SQL code.  Just use plain old SAS code instead.  Note: 30 days hath November, April... 

 

data have;
     input col_a col_b :date9. col_c $ col_d $;
     format col_b date9.;
datalines;
123 30SEP2014 123a 123456789ABC
123 30SEP2014 123a 123456789DEF
123 30SEP2014 123a 123456789GHI
456 31AUG2014 456f 987654321ABC
456 31AUG2014 456g 987654321DEF
456 31AUG2014 456h 987654321GHI
789 31OCT2014 789a 987654321ABC
789 30NOV2014 789a 987654321DEF
789 30NOV2014 789b 987654321GHI 
;

proc sort; 
  by col_a col_b col_c ;;
run;

data want ;
do _n_=1 by 1 until(last.col_b);
  set have ;
  by col_a col_b col_c ;
  length col_e $50;
  if first.col_c then col_e=catx('_',col_e,col_c);
end;
do _n_=1 to _n_;
  set have;
  output;
end;
run;

Results

Obs    col_a        col_b    col_c     col_d      col_e

 1      123     30SEP2014    123a     12345678    123a
 2      123     30SEP2014    123a     12345678    123a
 3      123     30SEP2014    123a     12345678    123a
 4      456     31AUG2014    456f     98765432    456f_456g_456h
 5      456     31AUG2014    456g     98765432    456f_456g_456h
 6      456     31AUG2014    456h     98765432    456f_456g_456h
 7      789     31OCT2014    789a     98765432    789a
 8      789     30NOV2014    789a     98765432    789a_789b
 9      789     30NOV2014    789b     98765432    789a_789b

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 542 views
  • 0 likes
  • 3 in conversation