BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Cruise
Ammonite | Level 13

I'm merging DATA 1 and DATA 2 on the "CODE" variable. I want DATA 2 information update the same name columns in DATA 1. It works fine in demo below. Resulting data WANT has all the columns updated from both datasets merged. However, it doesn't work like this using my actual datasets. Please see screen shot where flag_icd9pc flag takes value 1 indicating MERGE happened but DESC_CODE AND CLASS_CODE columns are not updated from the second data which has information for these two columns. I made sure all same column characteristics in both datasets have same length, format and informat while all being "characters". I formatted them and fixed the variables to the same length before data merge, as below. They all looked fine in Column Attributes in data view by each variables. They're all consistently identical to their lengths, format and informat before merge.

 

What's going wrong here? My sincere appreciation in advance for your time. Thanks a lot!

 

length variable $5;

format variable $5.;

informat variable $5.;

 

 

data merge not updated.png

 

DATA DATA1; 
INPUT CODE_SYSTEM $	CODE_DESC $ CODE;
CARDS;
.	.	1
.	.	2
.	.	3
.	.	4
.	.	5
.	.	6
HIPPS	DISTANT	7
HIPPS	LOCAL	8
BILLING	DISTANT	9
; 

DATA DATA2; 
INPUT CODE_SYSTEM $	CODE_DESC $ CODE;
CARDS;
ICD10	LOCAL	1
ICD9	REGIONAL	2
CPT	DISTANT	3
ICD10	LOCAL	4
ICD9	DISTANT	5
CPT	REGIONAL 	6
;

DATA WANT; 
INPUT CODE_SYSTEM $	CODE_DESC $ CODE;
CARDS;
ICD10	LOCAL	1
ICD9	REGIONAL	2
CPT	DISTANT	3
ICD10	LOCAL	4
ICD9	DISTANT	5
CPT	REGIONAL 	6
HIPPS	DISTANT	7
HIPPS	LOCAL	8
BILLING	DISTANT	9
;

data WANT1;
merge DATA1 (in=A) 
	  DATA2 (in=B);
by CODE;
if A and B then FLAG_LINKED=1; else FLAG_LINKED=1=0; 
if A;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

You didn't provide an example of when it doesn't work, so we can only guess. One possibility is that you have some values in data1, but want to replace them with data two. If so, you may want to use modify rather than merge. e.g.:

DATA DATA1; 
INPUT CODE_SYSTEM $ CODE_DESC $ CODE;
CARDS;
X   Y   1
X   Y   2
X   Y   3
.   .   4
.   .   5
.   .   6
HIPPS   DISTANT 7
HIPPS   LOCAL   8
BILLING DISTANT 9
; 

DATA DATA2; 
INPUT CODE_SYSTEM $ CODE_DESC $ CODE;
CARDS;
ICD10   LOCAL   1
ICD9    REGIONAL    2
CPT DISTANT 3
ICD10   LOCAL   4
ICD9    DISTANT 5
CPT REGIONAL    6
;

data want;
  set data1;
run;

data want;
  modify want data2 updatemode=nomissingcheck;
  by code;
  if _iorc_=0 then replace;
  else do;
    _error_=0;
    output;
  end;
run;

Art, CEO, AnalystFinder.com

 

View solution in original post

9 REPLIES 9
ballardw
Super User

@Cruise wrote:

I'm merging DATA 1 and DATA 2 on the "CODE" variable. I want DATA 2 information update the same name columns in DATA 1. It works fine in demo below. Resulting data WANT has all the columns updated from both datasets merged. However, it doesn't work like this using my actual datasets. Please see screen shot where flag_icd9pc flag takes value 1 indicating MERGE happened but DESC_CODE AND CLASS_CODE columns are not updated from the second data which has information for these two columns. I made sure all same column characteristics in both datasets have same length, format and informat while all being "characters". I formatted them and fixed the variables to the same length before data merge, as below. They all looked fine in Column Attributes in data view by each variables. They're all consistently identical to their lengths, format and informat before merge.

 

What's going wrong here? My sincere appreciation in advance for your time. Thanks a lot!

 

length variable $5;

format variable $5.;

informat variable $5.;

 

 

data merge not updated.png

 

 


With a statement like "However, it doesn't work like this using my actual datasets." the first question is "Are the example values in your data1 and data2 the ones from your actual datasets that do not merge correctly? If not then provide them.

Note that if you have multiple records with the same value of CODE in both data sets you can get unexpected results. If CODE is duplicated in only one set then different results can happen depending on which set has the multiples.

 

Is Code duplicated in DATA1 in your actual data? If so, are all of the Code values supposed to get the same values from data2?

art297
Opal | Level 21

You didn't provide an example of when it doesn't work, so we can only guess. One possibility is that you have some values in data1, but want to replace them with data two. If so, you may want to use modify rather than merge. e.g.:

DATA DATA1; 
INPUT CODE_SYSTEM $ CODE_DESC $ CODE;
CARDS;
X   Y   1
X   Y   2
X   Y   3
.   .   4
.   .   5
.   .   6
HIPPS   DISTANT 7
HIPPS   LOCAL   8
BILLING DISTANT 9
; 

DATA DATA2; 
INPUT CODE_SYSTEM $ CODE_DESC $ CODE;
CARDS;
ICD10   LOCAL   1
ICD9    REGIONAL    2
CPT DISTANT 3
ICD10   LOCAL   4
ICD9    DISTANT 5
CPT REGIONAL    6
;

data want;
  set data1;
run;

data want;
  modify want data2 updatemode=nomissingcheck;
  by code;
  if _iorc_=0 then replace;
  else do;
    _error_=0;
    output;
  end;
run;

Art, CEO, AnalystFinder.com

 

Cruise
Ammonite | Level 13
Great. But why did you need this part in the code?

data N.want;
set N.data1;
run;
art297
Opal | Level 21

You don't need that step UNLESS you're unsure that the process will do what you want. I included it so that the original file would be left intact.

 

Art, CEO, AnalystFinder.com

 

Amir
PROC Star

Hi,

 

Does class_code have any blank values on data2?

 

Please post the log of the merge that is not working, showing the code and any messages.

 

 

Regards,

Amir.

Cruise
Ammonite | Level 13

@art297

Awesome. Now I can see my parent data is modified with all the information needed. How to flag the rows modified though? All the rows modified are all linked on the linkage variable? My input information is not unique to a data2. Modified dataset had included all rows from both datasets. How to keep data aligned to one of datasets like it's done in data merge?

 

awesome.png

Cruise
Ammonite | Level 13

Trying to flag the matched rows from data modify. below didn't create a flag for me. Any idea how to flag matched rows in "modify mode"? thanks in advance.

data n.want;
  modify n.want(in=w) 
		 n.Icd9_pc(in=icd) updatemode=nomissingcheck;
  by omm_proc_cd1;
  if _iorc_=0 then replace;
  else do;
    _error_=0;
    output;
  end;
if w and icd then flag=1; 
run;
art297
Opal | Level 21

Read up on the modify statement (e.g., http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000173361.htm ).

 

Your answer will depend upon the combination of what your new file looks like and the updatemode you select.

 

Without seeing your actual datasets it's difficult to say what will do exactly what you want. You could use merge, update or modify. And, for each, you use options to control what will happen.

 

Art, CEO, AnalystFinder.com

 

Tom
Super User Tom
Super User

I don't understand the question. Maybe because you provided a bad example?

If I just merge the two datasets you provided by CODE then I get exactly what you said you wanted.

 

data want;
  merge data1 data2 ;
  by code;
run;

Note that your example case both datasets have only one observation per value of CODE.

The merge will also work if DATA1 has multiple observations per value of CODE and DATA2 is unique.

 

But if DATA2 has multiple observation per value of CODE then you could get replication of the observations from DATA1.

 

Or if both have multiple observations per value of CODE then only partial replacement of the values from DATA1 will occur.  SAS could run out of records to read from DATA2 for that specific value of CODE. and so not replace the values of the common variables read from DATA1.

 

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 2992 views
  • 1 like
  • 5 in conversation