DATA Step, Macro, Functions and more

DATA MERGE same name columns are not updated

Accepted Solution Solved
Reply
Super Contributor
Posts: 374
Accepted Solution

DATA MERGE same name columns are not updated

[ Edited ]

I'm merging DATA 1 and DATA 2 on the "CODE" variable. I want DATA 2 information update the same name columns in DATA 1. It works fine in demo below. Resulting data WANT has all the columns updated from both datasets merged. However, it doesn't work like this using my actual datasets. Please see screen shot where flag_icd9pc flag takes value 1 indicating MERGE happened but DESC_CODE AND CLASS_CODE columns are not updated from the second data which has information for these two columns. I made sure all same column characteristics in both datasets have same length, format and informat while all being "characters". I formatted them and fixed the variables to the same length before data merge, as below. They all looked fine in Column Attributes in data view by each variables. They're all consistently identical to their lengths, format and informat before merge.

 

What's going wrong here? My sincere appreciation in advance for your time. Thanks a lot!

 

length variable $5;

format variable $5.;

informat variable $5.;

 

 

data merge not updated.png

 

DATA DATA1; 
INPUT CODE_SYSTEM $	CODE_DESC $ CODE;
CARDS;
.	.	1
.	.	2
.	.	3
.	.	4
.	.	5
.	.	6
HIPPS	DISTANT	7
HIPPS	LOCAL	8
BILLING	DISTANT	9
; 

DATA DATA2; 
INPUT CODE_SYSTEM $	CODE_DESC $ CODE;
CARDS;
ICD10	LOCAL	1
ICD9	REGIONAL	2
CPT	DISTANT	3
ICD10	LOCAL	4
ICD9	DISTANT	5
CPT	REGIONAL 	6
;

DATA WANT; 
INPUT CODE_SYSTEM $	CODE_DESC $ CODE;
CARDS;
ICD10	LOCAL	1
ICD9	REGIONAL	2
CPT	DISTANT	3
ICD10	LOCAL	4
ICD9	DISTANT	5
CPT	REGIONAL 	6
HIPPS	DISTANT	7
HIPPS	LOCAL	8
BILLING	DISTANT	9
;

data WANT1;
merge DATA1 (in=A) 
	  DATA2 (in=B);
by CODE;
if A and B then FLAG_LINKED=1; else FLAG_LINKED=1=0; 
if A;
run;

 


Accepted Solutions
Solution
‎06-20-2018 04:26 PM
PROC Star
Posts: 8,167

Re: DATA MERGE same name columns are not updated

You didn't provide an example of when it doesn't work, so we can only guess. One possibility is that you have some values in data1, but want to replace them with data two. If so, you may want to use modify rather than merge. e.g.:

DATA DATA1; 
INPUT CODE_SYSTEM $ CODE_DESC $ CODE;
CARDS;
X   Y   1
X   Y   2
X   Y   3
.   .   4
.   .   5
.   .   6
HIPPS   DISTANT 7
HIPPS   LOCAL   8
BILLING DISTANT 9
; 

DATA DATA2; 
INPUT CODE_SYSTEM $ CODE_DESC $ CODE;
CARDS;
ICD10   LOCAL   1
ICD9    REGIONAL    2
CPT DISTANT 3
ICD10   LOCAL   4
ICD9    DISTANT 5
CPT REGIONAL    6
;

data want;
  set data1;
run;

data want;
  modify want data2 updatemode=nomissingcheck;
  by code;
  if _iorc_=0 then replace;
  else do;
    _error_=0;
    output;
  end;
run;

Art, CEO, AnalystFinder.com

 

View solution in original post


All Replies
Super User
Posts: 13,583

Re: DATA MERGE same name columns are not updated


@Cruise wrote:

I'm merging DATA 1 and DATA 2 on the "CODE" variable. I want DATA 2 information update the same name columns in DATA 1. It works fine in demo below. Resulting data WANT has all the columns updated from both datasets merged. However, it doesn't work like this using my actual datasets. Please see screen shot where flag_icd9pc flag takes value 1 indicating MERGE happened but DESC_CODE AND CLASS_CODE columns are not updated from the second data which has information for these two columns. I made sure all same column characteristics in both datasets have same length, format and informat while all being "characters". I formatted them and fixed the variables to the same length before data merge, as below. They all looked fine in Column Attributes in data view by each variables. They're all consistently identical to their lengths, format and informat before merge.

 

What's going wrong here? My sincere appreciation in advance for your time. Thanks a lot!

 

length variable $5;

format variable $5.;

informat variable $5.;

 

 

data merge not updated.png

 

 


With a statement like "However, it doesn't work like this using my actual datasets." the first question is "Are the example values in your data1 and data2 the ones from your actual datasets that do not merge correctly? If not then provide them.

Note that if you have multiple records with the same value of CODE in both data sets you can get unexpected results. If CODE is duplicated in only one set then different results can happen depending on which set has the multiples.

 

Is Code duplicated in DATA1 in your actual data? If so, are all of the Code values supposed to get the same values from data2?

Solution
‎06-20-2018 04:26 PM
PROC Star
Posts: 8,167

Re: DATA MERGE same name columns are not updated

You didn't provide an example of when it doesn't work, so we can only guess. One possibility is that you have some values in data1, but want to replace them with data two. If so, you may want to use modify rather than merge. e.g.:

DATA DATA1; 
INPUT CODE_SYSTEM $ CODE_DESC $ CODE;
CARDS;
X   Y   1
X   Y   2
X   Y   3
.   .   4
.   .   5
.   .   6
HIPPS   DISTANT 7
HIPPS   LOCAL   8
BILLING DISTANT 9
; 

DATA DATA2; 
INPUT CODE_SYSTEM $ CODE_DESC $ CODE;
CARDS;
ICD10   LOCAL   1
ICD9    REGIONAL    2
CPT DISTANT 3
ICD10   LOCAL   4
ICD9    DISTANT 5
CPT REGIONAL    6
;

data want;
  set data1;
run;

data want;
  modify want data2 updatemode=nomissingcheck;
  by code;
  if _iorc_=0 then replace;
  else do;
    _error_=0;
    output;
  end;
run;

Art, CEO, AnalystFinder.com

 

Super Contributor
Posts: 374

Re: DATA MERGE same name columns are not updated

Great. But why did you need this part in the code?

data N.want;
set N.data1;
run;
PROC Star
Posts: 8,167

Re: DATA MERGE same name columns are not updated

You don't need that step UNLESS you're unsure that the process will do what you want. I included it so that the original file would be left intact.

 

Art, CEO, AnalystFinder.com

 

Super Contributor
Posts: 340

Re: DATA MERGE same name columns are not updated

Hi,

 

Does class_code have any blank values on data2?

 

Please post the log of the merge that is not working, showing the code and any messages.

 

 

Regards,

Amir.

Super Contributor
Posts: 374

Re: DATA MERGE same name columns are not updated

[ Edited ]

@art297

Awesome. Now I can see my parent data is modified with all the information needed. How to flag the rows modified though? All the rows modified are all linked on the linkage variable? My input information is not unique to a data2. Modified dataset had included all rows from both datasets. How to keep data aligned to one of datasets like it's done in data merge?

 

awesome.png

Super Contributor
Posts: 374

Re: DATA MERGE same name columns are not updated

Trying to flag the matched rows from data modify. below didn't create a flag for me. Any idea how to flag matched rows in "modify mode"? thanks in advance.

data n.want;
  modify n.want(in=w) 
		 n.Icd9_pc(in=icd) updatemode=nomissingcheck;
  by omm_proc_cd1;
  if _iorc_=0 then replace;
  else do;
    _error_=0;
    output;
  end;
if w and icd then flag=1; 
run;
PROC Star
Posts: 8,167

Re: DATA MERGE same name columns are not updated

Read up on the modify statement (e.g., http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000173361.htm ).

 

Your answer will depend upon the combination of what your new file looks like and the updatemode you select.

 

Without seeing your actual datasets it's difficult to say what will do exactly what you want. You could use merge, update or modify. And, for each, you use options to control what will happen.

 

Art, CEO, AnalystFinder.com

 

Super User
Super User
Posts: 8,125

Re: DATA MERGE same name columns are not updated

I don't understand the question. Maybe because you provided a bad example?

If I just merge the two datasets you provided by CODE then I get exactly what you said you wanted.

 

data want;
  merge data1 data2 ;
  by code;
run;

Note that your example case both datasets have only one observation per value of CODE.

The merge will also work if DATA1 has multiple observations per value of CODE and DATA2 is unique.

 

But if DATA2 has multiple observation per value of CODE then you could get replication of the observations from DATA1.

 

Or if both have multiple observations per value of CODE then only partial replacement of the values from DATA1 will occur.  SAS could run out of records to read from DATA2 for that specific value of CODE. and so not replace the values of the common variables read from DATA1.

 

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 123 views
  • 1 like
  • 5 in conversation