Desktop productivity for business analysts and programmers

Merge multiple records into one based on ID field

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 15
Accepted Solution

Merge multiple records into one based on ID field

Really a newbie here to SAS and just getting familiar with SAS Enterprise Guide. I am trying to merge multiple records into one record based on an ID field that is in a text format. Along with the ID variables included in the csv file I imported I also have a STATE field. The STATE field provided the 2 digit state abbreviation for most of the records, but some of the records are missing this value.

 

I'd like to merge the records to create a new dataset. For example, the current table looks like this:

 current table.jpg

 

However, I would like to change this so that no record is duplicated. I would like to assign values as follows:

0 - All of the STATE values associated with the ID are blank.

1 - If the STATE value(s) associated with the record equals TX and no other value is present for the STATE field.

2 - If the STATE field includes one or more STATE values and none of the the STATE value(s) associated with the record equals TX.

3 - If the STATE field includes values for both TX and at least one other state.

 

The new table would look like this:

desired table.jpg

 

Any guidance and help would be appreciated. I don't see a way to do this using the EG wizards, but could be missing it.

 

Thanks!


Accepted Solutions
Solution
‎07-12-2018 10:52 AM
Super User
Posts: 2,049

Re: Merge multiple records into one based on ID field

Posted in reply to runningjay
data have;
infile cards dlm=',';
input (ID STATE) (:$100.);
cards;
KREE990001,TX
KREE990001,TX
KREE990001,   
GHTR880001,   
GHTR880001,   
JHRT330001,NC
JHRT330001,   
WQUT110001,TX
WQUT110001,OH
;
proc sort data=have;
by id;
run;
data want;
c=0;
do n=1 by 1 until(last.id);
set have;
by id ;
if missing(state) then c+1;
if state='TX' then tx=1;
else if not missing(state) then no_tx=1;
end;
do until(last.id);
set have;
by id;
if n=c then _state=0;
else if tx and not no_tx then _state=1;
else if not tx and no_tx then _state=2;
else if tx and no_tx then _state=3;
end;
keep id _state;
run;

proc print;run;

SAS Output

Obs ID _state1234
GHTR8800010
JHRT3300012
KREE9900011
WQUT1100013

View solution in original post


All Replies
Super User
Posts: 2,049

Re: Merge multiple records into one based on ID field

Posted in reply to runningjay

Please post the sample as plain text for community members to easily copy paste 

Occasional Contributor
Posts: 15

Re: Merge multiple records into one based on ID field

Posted in reply to novinosrin

I'd like to merge the records to create a new dataset. For example, the current data looks like this:

ID,STATE
KREE990001,TX
KREE990001,TX
KREE990001,   
GHTR880001,   
GHTR880001,   
JHRT330001,NC
JHRT330001,   
WQUT110001,TX
WQUT110001,OH

However, I would like to change this so that no record is duplicated. I would like to assign values as follows:

0 - All of the STATE values associated with the ID are blank.
1 - If the STATE value(s) associated with the record equals TX and no other value is present for the STATE field.
2 - If the STATE field includes one or more STATE values and none of the the STATE value(s) associated with the record equals TX.
3 - If the STATE field includes values for both TX and at least one other state.

The new data would look like this:

ID,STATE
KREE990001,1
GHTR880001,0
JHRT330001,2
WQUT110001,3

Solution
‎07-12-2018 10:52 AM
Super User
Posts: 2,049

Re: Merge multiple records into one based on ID field

Posted in reply to runningjay
data have;
infile cards dlm=',';
input (ID STATE) (:$100.);
cards;
KREE990001,TX
KREE990001,TX
KREE990001,   
GHTR880001,   
GHTR880001,   
JHRT330001,NC
JHRT330001,   
WQUT110001,TX
WQUT110001,OH
;
proc sort data=have;
by id;
run;
data want;
c=0;
do n=1 by 1 until(last.id);
set have;
by id ;
if missing(state) then c+1;
if state='TX' then tx=1;
else if not missing(state) then no_tx=1;
end;
do until(last.id);
set have;
by id;
if n=c then _state=0;
else if tx and not no_tx then _state=1;
else if not tx and no_tx then _state=2;
else if tx and no_tx then _state=3;
end;
keep id _state;
run;

proc print;run;

SAS Output

Obs ID _state1234
GHTR8800010
JHRT3300012
KREE9900011
WQUT1100013
Occasional Contributor
Posts: 15

Re: Merge multiple records into one based on ID field

Posted in reply to novinosrin

Solution worked perfectly - thank you!

 

Not sure if I need to post this as a new, separate question, but since it is related to the prior solution I thought I'd add it here.

 

Using the problem identified, but adding one additional variable I'd like to do the same thing but unduplicate the records factoring in the additional variable. For example, the current data looks like this:

 

ID,STATE,YEAR

KREE990001,TX,2017

KREE990001,TX,2017

KREE990001,TX,2017

KREE990001,TX,2017

KREE990001,TX,2016

KREE990001,TX,2016

KREE990001,  

GHTR880001,  

GHTR880001,  

JHRT330001,NC,2016

JHRT330001,  

WQUT110001,TX,2017

WQUT110001,OH,2017

WQUT110001,OH,2017

WQUT110001,OH,2017

WQUT110001,TX,2016

WQUT110001,TX,2016

 

However, I would like to change this so that no record is duplicated in a single YEAR. I would like to assign values as follows taking the YEAR variable into consideration:

 

0 - All of the STATE values associated with the ID are blank.

1 - If the STATE value(s) associated with the record equals TX and no other value is present for the STATE field in that YEAR.

2 - If the STATE field includes one or more STATE values and none of the STATE value(s) associated with the record equals TX in that YEAR.

3 - If the STATE field includes values for both TX and at least one other state in that YEAR.

 

The new data would look like this:

 

ID,STATE,YEAR

KREE990001,1,2017

KREE990001,1,2016

KREE990001,0

GHTR880001,0

JHRT330001,2,2016

JHRT330001,0

WQUT110001,3,2017

WQUT110001,1,2016

 

Could the previous solution be modified to accomplish this or is a different method recommended?

Super User
Posts: 2,049

Re: Merge multiple records into one based on ID field

Posted in reply to runningjay

@runningjay Sure, will give you the solution. However, I am a little busy today as I am have an exam at my college. If it;s not urgent, can you hang in there for a day plz

Occasional Contributor
Posts: 15

Re: Merge multiple records into one based on ID field

Posted in reply to novinosrin

Sure - no problem. Appreciate your expertise and insight. Thanks!

Super User
Posts: 2,049

Re: Merge multiple records into one based on ID field

[ Edited ]
Posted in reply to runningjay

@runningjay First off, sorry for the delay as I was feeling very down on account of my poor performance on my college exam last night. My apologies. 

Ok,Anyways back to question

Just a minor change to include year in by group processing. See if this works and let me know. Also, I wonder why you have missed values in year column. That makes it inconceivable to do any great analysis in the future using the dataset. Well!

 

data have;
infile cards dlm=',' truncover;
input (ID STATE) (:$50.) year ;
cards;
KREE990001,TX,2017
KREE990001,TX,2017
KREE990001,TX,2017
KREE990001,TX,2017
KREE990001,TX,2016
KREE990001,TX,2016
KREE990001,  
GHTR880001,  
GHTR880001,  
JHRT330001,NC,2016
JHRT330001,  
WQUT110001,TX,2017
WQUT110001,OH,2017
WQUT110001,OH,2017
WQUT110001,OH,2017
WQUT110001,TX,2016
WQUT110001,TX,2016
;

proc sort data=have;
by id year;
run;
data want;
c=0;
do n=1 by 1 until(last.year);
set have;
by id year;
if missing(state) then c+1;
if state='TX' then tx=1;
else if not missing(state) then no_tx=1;
end;
do until(last.year);
set have;
by id year;
if n=c then _state=0;
else if tx and not no_tx then _state=1;
else if not tx and no_tx then _state=2;
else if tx and no_tx then _state=3;
end;
keep id _state year;
run;

 

Occasional Contributor
Posts: 15

Re: Merge multiple records into one based on ID field

Posted in reply to novinosrin

Perfect - many thanks!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 224 views
  • 1 like
  • 2 in conversation