BookmarkSubscribeRSS Feed
azt5173
Obsidian | Level 7

I have three data sets (A,B,C) all of them have missing values but if i combine them together it should output a complete data set. All of the data set contains the variables(g1,g2,g3) and the data set kind of looks like this:

 

      g1 g2 g3               g1 g2 g3             g1 g2 g3

1      1.1 2.2              1 .    .   .              1 .    .   .

2       2 3  4               2  .   .   .               2.   .   .

3       5 6 7               3  .    .   .              3 .    .   .

4      1  2 3               4  .  .   .                 4 .   .  .

5      .   .  .                5 1 2 3                 5  .  .  .

6     1 2  3                 6 .   . .                  6.   .  .

7     .   .   .                                              7  5 6 7

 

I need the output data set to look like this:

 

       g1 g2 g3

1     1.1 2  2

2     2  3   4

3     5 6 7

4     1  2 3

5    1  2  3

6    1  2 3

7    5   6 7

I have tried the code below:

 

data m;
merge a b  c;
by g1 g2 g3;
retain tempg1 tempg2 tempg3;
if first.g1 then tempg1=.;
else if first.g2 then tempg2=.;
else if first.g3 then tempg3=.;
if g1=. then g1=tempg1;
else if g2=. then g2=tempg2;
else if g3=. then g3=tempg3;
if tempg1=g1;
if tempg2=g2;
if tempg3=g3;
run;

 

SAS says that the data sets have to be sorted but then the one data's missing values do not correspond non missing values in the other data set. Any help will be appreciated,

7 REPLIES 7
Astounding
PROC Star

What's the name of the first variable, the one that numbers the observations 1 through 7?

azt5173
Obsidian | Level 7

Hi, it doesn't have a name I guess its just the observations. 

Astounding
PROC Star

WIthout such a variable combining the data sets is unreliable.  You have different numbers of observations in each data set.  How do you know which ones are the right matches?

 

Assuming you do have a variable named ID that holds the contents of the first column ...

 

data combined;

set c b a;

run;

 

proc sort data=combined;

by id;

run;

 

data want;

update combined (obs=0) combined;

by id;

ballardw
Super User

Anywhere you use a BY statement the data set(s) must be sorted by all of the variables on the BY statement. From your example I don't believe that you want to use a by statement.

 

You need to provide some rules as to WHICH specific value you want when the variable has values in more than one data set.

 

If you need to compare in the existing order and want to pick the value from a if it exist, from b if it exists but not in a or from c if not in a or b then perhaps:

 

data want; 
   set
      a (rename=(g1= ag1 g2=ag2 g3=ag3) )
      b (rename=(g1= bg1 g2=bg2 g3=bg3) )
      c (rename=(g1= cg1 g2=cg2 g3=cg3) )
   ;
   g1= coalesce(ag1, bg1, cg1);
   g2= coalesce(ag2, bg2, cg2);
   g3= coalesce(ag3, bg3, cg3);
   keep g1 g2 g3;
run;

if the variables are actually character values then the function would be coalescec

 

Patrick
Opal | Level 21

@ballardw

I believe you need a MERGE not a SET for your approach to work as intended.

pau13rown
Lapis Lazuli | Level 10

first create an id variable in each data set and merge on that

 

data data1_;

retain id 0;

set data1;

id=id+1;

run;

 

and then merge all 3 datasets "by id". Youll need to rename variables as @Astounding says, and then the code is easy to replace the missings

Ksharp
Super User
data want; 
   merge
      a (rename=(g1= ag1 g2=ag2 g3=ag3) )
      b (rename=(g1= bg1 g2=bg2 g3=bg3) )
      c (rename=(g1= cg1 g2=cg2 g3=cg3) )
   ;
   g1= coalesce(ag1, bg1, cg1);
   g2= coalesce(ag2, bg2, cg2);
   g3= coalesce(ag3, bg3, cg3);
   keep g1 g2 g3;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 916 views
  • 0 likes
  • 6 in conversation