Solved: Re: merge multiple table by retaining value of flag in all columns

jeetendersinghc · Posted 03-21-2019 12:14 PM

hi,

I need help while merging multiple datasets. I have 6 datasets as following:

Dataset a
id	age	in_2012
1	30	Y
Dataset b
id	age	in_2013

Dataset c
id	age	in_2014
1	32	Y
Dataset d
id	age	in_2015
1	33	Y
Dataset e
id	age	in_2016
1	34	Y
Dataset f
id	age	in_2017
1	35	Y
Dataset g
id	age	in_2018
1	36	Y

Final output
id	age	in_2012	in_2013	in_2014	in_2015	in_2016	in_2017	in_2018
1	30	Y
1	32			Y
1	33				Y
1	34					Y
1	35						Y
1	36							Y

Subject Id 1 has record present from 2012 to 2018 hence we have variable in_2012... to in_2018.

When i am merging the above 6 datasets by id and age , i am getting the above output.

Can anyone pl guide me how can i get the output with the variable in_2012 to in_2018 has "Y" in all the rows(if records are available in all datasets) and not in selective rows as presented above.

This is the ouput which i want:

id	age	in_2012	in_2013	in_2014	in_2015	in_2016	in_2017	in_2018
1	30	Y		Y	Y	Y	Y	Y
1	32	Y		Y	Y	Y	Y	Y
1	33	Y		Y	Y	Y	Y	Y
1	34	Y		Y	Y	Y	Y	Y
1	35	Y		Y	Y	Y	Y	Y
1	36	Y		Y	Y	Y	Y	Y

Thanks,

Jeetender

sotojcr · Posted 03-21-2019 05:12 PM

data A;
input ID AGE IN_2012 $1.;
datalines;
1 30 Y
;
run;

data B;
input ID AGE IN_2013 $1.;
datalines;
;
run;

data C;
input ID AGE IN_2014 $1.;
datalines;
1 32 Y
;
run;

data D;
input ID AGE IN_2015 $1.;
datalines;
1 33 Y
;
run;

data E;
input ID AGE IN_2016 $1.;
datalines;
1 34 Y
;
run;

data F;
input ID AGE IN_2017 $1.;
datalines;
1 35 Y
;
run;

data G;
input ID AGE IN_2018 $1.;
datalines;
1 36 Y
;
run;
data  A_TO_G;
set  A B C D E F G;
keep ID AGE;
run;

proc sort data=A_TO_G nodupkey;
by ID AGE;
run;

proc sort data=A;
by ID AGE;
run;

proc sort data=B;
by ID AGE;
run;

proc sort data=C;
by ID AGE;
run;

proc sort data=D;
by ID AGE;
run;

proc sort data=E;
by ID AGE;
run;

proc sort data=F;
by ID AGE;
run;

proc sort data=G;
by ID AGE;
run;

data ALL_TABLE;
merge A_TO_G(in=a)
	  A(drop=age)
	  B(drop=age)
	  C(drop=age)
	  D(drop=age)
	  E(drop=age)
	  F(drop=age)
	  G(drop=age);
by ID;
if a;
run;

View solution in original post

sotojcr · Posted 03-21-2019 05:12 PM

data A;
input ID AGE IN_2012 $1.;
datalines;
1 30 Y
;
run;

data B;
input ID AGE IN_2013 $1.;
datalines;
;
run;

data C;
input ID AGE IN_2014 $1.;
datalines;
1 32 Y
;
run;

data D;
input ID AGE IN_2015 $1.;
datalines;
1 33 Y
;
run;

data E;
input ID AGE IN_2016 $1.;
datalines;
1 34 Y
;
run;

data F;
input ID AGE IN_2017 $1.;
datalines;
1 35 Y
;
run;

data G;
input ID AGE IN_2018 $1.;
datalines;
1 36 Y
;
run;
data  A_TO_G;
set  A B C D E F G;
keep ID AGE;
run;

proc sort data=A_TO_G nodupkey;
by ID AGE;
run;

proc sort data=A;
by ID AGE;
run;

proc sort data=B;
by ID AGE;
run;

proc sort data=C;
by ID AGE;
run;

proc sort data=D;
by ID AGE;
run;

proc sort data=E;
by ID AGE;
run;

proc sort data=F;
by ID AGE;
run;

proc sort data=G;
by ID AGE;
run;

data ALL_TABLE;
merge A_TO_G(in=a)
	  A(drop=age)
	  B(drop=age)
	  C(drop=age)
	  D(drop=age)
	  E(drop=age)
	  F(drop=age)
	  G(drop=age);
by ID;
if a;
run;

ballardw · Posted 03-21-2019 05:53 PM

I'm sort of interested in how you are going to use that output file.

The structure and variable names makes me think that sometime in 2019 or 2020 you will be adding more columns. Which means any code processing this data set will need to account for the added column a may become cumbersome.

This type of structure to a data set is pretty common to someone attempting to replicate a process that had been done in a spreadsheet and frequently is suboptimal when using other software.

Astounding · Posted 03-21-2019 07:39 PM

Assuming your data sets are all sorted:

data combined;
   set a b c d e f g;
   by id;
run;

data years;
   update combined (obs=0 drop=age) combined (drop=age);
   by id;
run;

data want;
   merge combined (keep=id age) years;
   by id;
run;

merge multiple table by retaining value of flag in all columns

Re: merge multiple table by retaining value of flag in all columns

Re: merge multiple table by retaining value of flag in all columns

Re: merge multiple table by retaining value of flag in all columns

Re: merge multiple table by retaining value of flag in all columns

Registration is open