Hi,
I want to concatenate values across rows for more than one variable. My dataset has an ID variable and a categorical LOT variable, and I want to concatenate the DD variable by ID and the respective LOT. For example, ID #1 has two rows with a LOT value of 4, so I want the concatenated DD value for that group to be 'Che, End'. The Fa variable needs to be concatenated as well. Notice ID #7: it has multiple categories for the Fa variable, and they need to be concatenated by ID and LOT too.
Overall, DD and Fa variable need to be concatenated by ID and LOT variables.
I am starting with this code...
Data want;
    length DD_before $50.;
    do until(last.ID);
        set have;
        by ID LOT date_diff;
        DD_before = catx(',',DD_before,DD);
    end;
    drop DD;
Run;
data have;
    infile datalines dlm=',';
    input ID Fa :$5. LOT LStart :mmddyy10. LEnd :mmddyy10. DD :$5. Date_K :mmddyy10. Date_diff Be :$5.;
    format LStart mmddyy10.;
    format LEnd mmddyy10.;
    format Date_K mmddyy10.;
datalines;
1,C,3,11/26/2016,1/2/2017,End,8/1/2016,-116.525,Y
1,C,4,1/3/2017,1/29/2017,Che,8/1/2016,-154.525,Y
1,C,4,1/3/2017,1/29/2017,End,8/1/2016,-154.525,Y
2,C,0,5/25/2013,3/2/2017,End,.,.,Y
3,C,2,12/11/2016,12/7/2017,Che,10/17/2016,-54.499,Y
3,C,2,12/11/2016,12/7/2017,End,10/17/2016,-54.499,Y
3,C,3,12/8/2017,3/3/2018,End,10/17/2016,-416.499,Y
4,C,4,1/1/2017,2/11/2017,End,10/21/2016,-71.261,Y
4,C,5,2/12/2017,1/18/2018,End,10/21/2016,-113.261,Y
4,C,6,1/19/2018,1/20/2019,End,10/21/2016,-454.261,Y
5,C,5,1/29/2017,10/10/2017,End,11/7/2016,-82.104,Y
5,C,5,1/29/2017,10/10/2017,Tar,11/7/2016,-82.104,Y
6,C,4,2/14/2017,9/16/2017,Che,12/11/2016,-64.215,Y
6,C,4,2/14/2017,9/16/2017,End,12/11/2016,-64.215,Y
6,C,5,9/17/2017,12/10/2017,End,12/11/2016,-279.215,Y
6,C,5,9/17/2017,12/10/2017,Rad,12/11/2016,-279.215,Y
6,C,6,12/11/2017,1/23/2018,Che,12/11/2016,-364.215,Y
6,C,6,12/11/2017,1/23/2018,Rad,12/11/2016,-364.215,Y
7,C,2,4/7/2017,5/5/2018,End,1/30/2017,-66.311,Y
7,O,2,4/7/2017,5/5/2018,End,1/30/2017,-66.311,Y
7,O,2,4/7/2017,5/5/2018,Rad,1/30/2017,-66.311,Y
7,C,3,5/6/2018,9/23/2018,End,1/30/2017,-460.311,Y
7,O,3,5/6/2018,9/23/2018,End,1/30/2017,-460.311,Y
7,C,4,9/24/2018,10/14/2019,End,1/30/2017,-601.311,Y
7,O,4,9/24/2018,10/14/2019,End,1/30/2017,-601.311,Y
;
Data Want:
ID | Fa | LOT | LStart | LEnd | DD | Date_K | Date_diff | Be |
1 | C | 3 | 11/26/2016 | 1/2/2017 | End | 8/1/2016 | -116.525 | Y |
1 | C | 4 | 1/3/2017 | 1/29/2017 | Che, End | 8/1/2016 | -154.525 | Y |
2 | C | 0 | 5/25/2013 | 3/2/2017 | End | . | . | Y |
3 | C | 2 | 12/11/2016 | 12/7/2017 | Che, End | 10/17/2016 | -54.499 | Y |
3 | C | 3 | 12/8/2017 | 3/3/2018 | End | 10/17/2016 | -416.499 | Y |
4 | C | 4 | 1/1/2017 | 2/11/2017 | End | 10/21/2016 | -71.261 | Y |
4 | C | 5 | 2/12/2017 | 1/18/2018 | End | 10/21/2016 | -113.261 | Y |
4 | C | 6 | 1/19/2018 | 1/20/2019 | End | 10/21/2016 | -454.261 | Y |
5 | C | 5 | 1/29/2017 | 10/10/2017 | End, Tar | 11/7/2016 | -82.104 | Y |
6 | C | 4 | 2/14/2017 | 9/16/2017 | Che, End | 12/11/2016 | -64.215 | Y |
6 | C | 5 | 9/17/2017 | 12/10/2017 | End, Rad | 12/11/2016 | -279.215 | Y |
6 | C | 6 | 12/11/2017 | 1/23/2018 | Che, Rad | 12/11/2016 | -364.215 | Y |
7 | C, O | 2 | 4/7/2017 | 5/5/2018 | End, Rad | 1/30/2017 | -66.311 | Y |
7 | C, O | 3 | 5/6/2018 | 9/23/2018 | End | 1/30/2017 | -460.311 | Y |
7 | C, O | 4 | 9/24/2018 | 10/14/2019 | End | 1/30/2017 | -601.311 | Y |
Hi:
I don't think you want to have a DO UNTIL with your SET inside the DO loop. I think you just need to use FIRST. and LAST. processing while reading sequentially through the input file. Then you only want to output when LAST.DATE_DIFF=1.
My thoughts when seeing this, other than questioning the DO UNTIL loop, were to wonder whether there would always be just 2 adjacent rows with the same ID, LOT and DATE_DIFF, or whether there could be 3 or 4 rows. It would make a difference in how you code the DATA step.
Cynthia
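To make the FIRST./LAST. idea concrete, here is a minimal sketch that just copies the automatic flags into regular variables so they can be printed and inspected. It assumes HAVE is already in ID, LOT, Date_diff order, as in the posted sample. The names FLAGS, first_grp and last_grp are made up for illustration.

```sas
/* Sketch only: expose the FIRST./LAST. flags that BY-group processing creates. */
/* Assumes HAVE is already sorted by ID LOT Date_diff (true for the sample).    */
data flags;
    set have;
    by ID LOT Date_diff;
    first_grp = first.Date_diff;   /* 1 on the first row of each ID/LOT/Date_diff group */
    last_grp  = last.Date_diff;    /* 1 on the last row of that group                   */
    keep ID LOT Date_diff DD Fa first_grp last_grp;
run;

proc print data=flags noobs;
run;
```

Rows with last_grp=1 are the ones where a concatenated value would be written out, however many rows the group contains.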
@Cynthia_sas : Thanks for the input. I have been struggling to get this code working. I am new to this kind of programming.
Each LOT has its own unique date_diff, if that helps.
Does the code below give you what you're after?
data want;
    set have;
    by id lot;
    length DD_before FA_before $30;
    retain DD_before FA_before;
    /* append the current row's values to the running lists */
    DD_before=catx(',',DD_before,DD);
    FA_before=catx(',',FA_before,FA);
    /* write one row per ID/LOT group, then reset the lists */
    if last.lot then
        do;
            output;
            call missing(DD_before, FA_before);
        end;
    drop dd fa;
run;
@Patrick Thanks so much for this. Very close: for ID #7, the DD and FA values are repeated in DD_before and FA_before. It should look like below:
ID | Fa | LOT | LStart | LEnd | DD | Date_K | Date_diff | Be |
7 | C, O | 2 | 4/7/2017 | 5/5/2018 | End, Rad | 1/30/2017 | -66.311 | Y |
7 | C, O | 3 | 5/6/2018 | 9/23/2018 | End | 1/30/2017 | -460.311 | Y |
7 | C, O | 4 | 9/24/2018 | 10/14/2019 | End | 1/30/2017 | -601.311 | Y |
One of the two coding variants below should do.
data want;
    set have;
    by id lot;
    length DD_before FA_before $30;
    retain DD_before FA_before;
    /* only append a value that is not already in the list */
    if find(DD_before,DD)<=0 then
        DD_before=catx(',',DD_before,DD);
    if find(FA_before,FA)<=0 then
        FA_before=catx(',',FA_before,FA);
    /* write one row per ID/LOT group, then reset the lists */
    if last.lot then
        do;
            output;
            call missing(DD_before, FA_before);
        end;
    drop dd fa;
run;
data want2(drop=_:);
    set have;
    by id lot;
    length DD_before FA_before $30;
    retain DD_before FA_before;
    /* pairs of (target list, source variable) */
    array vars {*} DD_before DD FA_before FA;
    do _i=1 to dim(vars) by 2;
        if find(vars[_i],vars[_i+1])<=0 then
            vars[_i]=catx(',',vars[_i],vars[_i+1]);
    end;
    if last.lot then
        do;
            output;
            call missing(of vars[*]);
        end;
    drop dd fa;
run;
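One thing that may be worth guarding against: FIND does plain substring matching, and as far as I know it does not trim trailing blanks from the search value unless you pass the 't' modifier. So a repeat might be missed when the matching value is not the last one in the list, and a short value could match inside a longer one. Below is a hedged sketch of a whole-word check with FINDW instead. It assumes the values never contain commas, and WANT3 is a made-up data set name.

```sas
/* Sketch only: same logic as the first variant, but with a whole-word test. */
data want3;
    set have;
    by ID LOT;
    length DD_before FA_before $30;
    retain DD_before FA_before;
    /* comma and blank are both treated as word delimiters, so trailing
       padding in the $30 variables does not interfere with the match */
    if findw(DD_before, strip(DD), ', ') = 0 then
        DD_before = catx(',', DD_before, DD);
    if findw(FA_before, strip(Fa), ', ') = 0 then
        FA_before = catx(',', FA_before, Fa);
    if last.LOT then do;
        output;
        call missing(DD_before, FA_before);
    end;
    drop DD Fa;
run;
```

On the posted sample this should produce the same rows as the variants above. The difference would only show up with data such as C, O, C within one ID/LOT group.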
@Patrick : this works great! Thank you so much.