- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have a data set of about 5k records that I need to roll-up values on and concatenate previous names and ID's to unique fields. Here is a sample of the data I have.
Contract_ID | P_ID | YEAR | NAME | PREVIOUS_P_ID | PREVIOUS_NAME | DESC | H | P | PACE | D | C | M | F | OS | SN | DE |
HMA_ABC | 001 | 2016 | HMA ABC-001 | 001 | HMA ABC-001 | Consolidated | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
HMA_ABC | 001 | 2016 | HMA ABC-001 | 002 | HMA ABC-002 | Consolidated | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
HMA_ABC | 001 | 2016 | HMA ABC-001 | 003 | HMA ABC-003 | Consolidated | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
I need to roll-up by the Contract_ID and P_ID. I'd like to concatenate the PREVIOUS_P_ID and PREVIOUS_NAME so that they roll-up to one field. It is also important to note that this only applies to fields with a DESC = 'Consolidated'. Here is an example of what I want my end results to look like
Contract_ID | P_ID | YEAR | NAME | PREVIOUS_P_ID | PREVIOUS_NAME | DESC | H | P | PACE | D | C | M | F | OS | SN | DE |
HMA_ABC | 001 | 2016 | HMA ABC-001 | 001,002,003 | HMA ABC-001,HMA ABC-002, HMA ABC-003 | Consolidated | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @bknitch
Here is an attempt to achieve this.
I have added a fourth row to row data to have a case where DESC is not equal to "Consolidated".
Is the output correct in this case?
data have;
infile datalines dlm="09"x;
input Contract_ID $ P_ID $ YEAR NAME:$20. PREVIOUS_P_ID $ PREVIOUS_NAME:$20. DESC:$20. H P PACE D C M F OS SN DE;
datalines;
HMA_ABC 001 2016 HMA ABC-001 001 HMA ABC-001 Consolidated 1 0 0 0 0 0 0 0 0 0
HMA_ABC 001 2016 HMA ABC-001 002 HMA ABC-002 Consolidated 1 0 0 0 0 0 0 0 0 0
HMA_ABC 001 2016 HMA ABC-001 004 HMA ABC-004 Not consolidated 1 0 0 0 0 0 0 0 0 0
HMA_ABC 001 2016 HMA ABC-001 003 HMA ABC-003 Consolidated 1 0 0 0 0 0 0 0 0 0
;
run;
proc sort data=have out=have_sorted;
by Contract_ID P_ID DESC YEAR NAME H P PACE D C M F OS SN DE;
run;
proc transpose data=have_sorted out=have_tr1 (drop=_name_);
var PREVIOUS_P_ID ;
by Contract_ID P_ID DESC YEAR NAME H P PACE D C M F OS SN DE;
run;
data want1;
set have_tr1;
length PREVIOUS_P_ID $100.;
PREVIOUS_P_ID = catx(", ", of col:);
drop col:;
run;
proc transpose data=have_sorted out=have_tr2 (drop=_name_);
var PREVIOUS_NAME;
by Contract_ID P_ID DESC YEAR NAME H P PACE D C M F OS SN DE;
run;
data want2;
set have_tr2;
length PREVIOUS_NAME $100.;
PREVIOUS_NAME = catx(", ", of col:);
drop col:;
run;
data want;
retain Contract_ID P_ID YEAR NAME PREVIOUS_P_ID PREVIOUS_NAME DESC H P PACE D C M F OS SN DE;
merge want1 want2;
by Contract_ID P_ID DESC;
run;
Best,
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @ed_sas_member thank you for the suggestion, unfortunately a transpose isn't plausible with the data set i'm working with, this is only a snippet of the fields that are contained within this data set. There are other changing variables per previous_plan_id that cause the transpose not to function correctly. I'm looking for more of a rollup or retain function where desc= 'Consolidated' and then concatenating the fields.With this step i would have to build several transposed tables excluding several fields and then merging.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@bknitch wrote:
Hi @ed_sas_member thank you for the suggestion, unfortunately a transpose isn't plausible with the data set i'm working with, this is only a snippet of the fields that are contained within this data set. There are other changing variables per previous_plan_id that cause the transpose not to function correctly. I'm looking for more of a rollup or retain function where desc= 'Consolidated' and then concatenating the fields.With this step i would have to build several transposed tables excluding several fields and then merging.
KEEP on the variables needed for Proc transpose. The Data set option KEEP (or drop) is always available to reduce a data set:
Proc transpose data=have (keep= <list variables needed in the transpose> where=(desc='Consolidated'))
<rest of transpose code>.
Without rules as to how to select other variable values the result of the data would likely have to be merged back to your original.