jacksonan123
Lapis Lazuli | Level 10

I am using the following code to merge data sets newl4 and newd4 after they have been sorted by wsubj:

data all;
merge newl4(in=inp) newd4(in=ine);
by wsubj ;
if inp and ine;
run;

 

newd4

subj time cmt conc

1 0.25 1 0
1 0.25 2 0
1 0.25 3 0
1 0.25 4 0
1 0.25 5 0
1 0.25 6 0
1 0.25 7 0
1 0.25 8 0
1 0.25 9 0
1 0.25 10 0
1 0.25 11 0.2242009
1 0.25 12 0
1 0.25 13 0
1 0.25 14 0
1 0.25 15 0
1 0.25 16 0
1 0.25 17 0
1 0.25 18 0
1 0.25 19 0
1 0.25 20 0
1 0.25 21 0
1 0.25 22 0
1 0.25 23 0
1 0.25 24 0

 

newl4

subj time cmt conc

1 0.25 1 0
1 0.25 2 0
1 0.25 3 0
1 0.25 4 0
1 0.25 5 0
1 0.25 6 0
1 0.25 7 0
1 0.25 8 0
1 0.25 9 0
1 0.25 10 0
1 0.25 11 0
1 0.25 12 0
1 0.25 13 0
1 0.25 14 0
1 0.25 15 0
1 0.25 16 0
1 0.25 17 0
1 0.25 18 0
1 0.25 19 0
1 0.25 20 0
1 0.25 21 0
1 0.25 22 0
1 0.25 23 0.0960861
1 0.25 24 0

 

The only difference between them is that newd4 and newl4 have a nonzero conc in different cmt rows (i.e., cmt 11 in newd4 and cmt 23 in newl4).

When I try to merge them, I only get a value for conc in cmt 11 (0.22), while cmt 23 = 0. Can someone give me code that will result in both cmt 11 and cmt 23 showing the values from newd4 and newl4 in the merged data set all?

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

If conc is always >= 0 in both datasets, then you can simply do:

 

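data all;
  /* Rename conc from newd4 so it does not overwrite conc from newl4 */
  merge newl4(in=inp) newd4(in=ine rename=(conc=conc2));
  by subj;
  /* Keep only subjects present in both data sets */
  if inp and ine;
  /* Both values are >= 0, so the larger one is the nonzero value */
  conc = max(conc, conc2);
  drop conc2;
run;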
PG


3 REPLIES 3
Patrick
Opal | Level 21

Your variable conc exists in both datasets but is not part of the by statement for the merge. In such a case the result dataset will contain the value of conc of the last dataset where it exists (using the order as listed in the merge statement).
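The overwrite behavior described above can be reproduced with a small sketch. The data sets a, b, and ab and the variables id and x are hypothetical names used only for illustration; they are not part of the original question.

```sas
/* Two tiny data sets that share the BY variable id and the non-BY variable x */
data a;
  input id x;
  datalines;
1 10
2 20
;
run;

data b;
  input id x;
  datalines;
1 0
2 99
;
run;

/* x is read from a first and then from b, so the value from b is what remains */
data ab;
  merge a b;
  by id;
run;

proc print data=ab noobs;
run;
```

In this one-to-one match, ab holds x = 0 for id 1 and x = 99 for id 2, the values from b, because b is listed last in the merge statement.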

The easiest way to get around this is to rename the variables so that the names are unique, e.g., conc_newd4 and conc_newl4.

 

If you want the non-missing value from either dataset, you can then use the coalesce() function in the data step where you merge, e.g.:

conc=coalesce(conc_newd4, conc_newl4);

You will still have to decide what to do when the same row has a non-missing value for both conc_newd4 and conc_newl4.
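Put together, the rename-and-coalesce suggestion might look like the sketch below. It assumes that both data sets are already sorted by subj, time, and cmt, and it matches on all three so that each row pairs with its exact counterpart; the thread itself only merges by subj.

```sas
/* Assumption: newl4 and newd4 are both sorted by subj, time, and cmt */
data all;
  merge newl4(in=inp rename=(conc=conc_newl4))
        newd4(in=ine rename=(conc=conc_newd4));
  by subj time cmt;
  /* Keep only rows that are present in both data sets */
  if inp and ine;
  /* coalesce() returns the first non-missing argument */
  conc = coalesce(conc_newd4, conc_newl4);
run;
```

Both renamed columns stay in the output, so the two sources can be compared side by side. Note that a stored 0 counts as non-missing. With the sample data shown in the question, coalesce() would therefore always return conc_newd4, and the cmt 23 value from newl4 would only be visible in conc_newl4. This is the case where a rule for two non-missing values is needed.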

jacksonan123
Lapis Lazuli | Level 10

The code worked very well.

 

Thanks
