BookmarkSubscribeRSS Feed
breebreebree
Calcite | Level 5

I am required to merge 5 data sets into one. Not all data sets had the same variables so I have linked the datasets as best as I can. I am now having issues with a couple of the observations, for example: 

 

Bahamas       609 
Bahamas (the)BSBHS   2.9367168   
Bahamas, The BHS29930High Income      

 

This is clearly all data relating to the same country, however it is not in one row. 

 

How can I make the data look like this?

 

BahamasBSBHS29930High Income 2.9367168609 

 

Thanks

3 REPLIES 3
VENKATAMAHESH
Calcite | Level 5

If you have different countries, convert the following code into a macro.

data original;
input country$14. val1$ val2$ val3 val4$11. val5$ val6$ val7$ val8$ val9$ val10$;
infile datalines missover dsd dlm="";
lines;
Bahamas                        60 9 
Bahamas (the) BS BHS              2.9 367168
Bahamas, The   BHS 29930 High Income
;
run;

data _NULL_;
set original;
if not missing(val1) then call symputx("val1",val1);
else if not missing(val2) then call symputx("val2",val2);
else if not missing(val3) then call symputx("val3",val3);
else if not missing(val4) then call symputx("val4",val4);
else if not missing(val5) then call symputx("val5",val5);
else if not missing(val6) then call symputx("val6",val6);
else if not missing(val7) then call symputx("val7",val7);
else if not missing(val8) then call symputx("val8",val8);
else if not missing(val9) then call symputx("val9",val9);
else if not missing(val10) then call symputx("val10",val10);
run;

data final;
country = "Bahamas";
val1 = "&val1";
val2 = "&val2";
val3 = "&val3";
val4 = "&val4";
val5 = "&val5";
val6 = "&val6";
val7 = "&val7";
val8 = "&val8";
val9 = "&val9";
val10 = "&val10";
run;
Patrick
Opal | Level 21

You need to standardize the columns you're using to merge the data. SAS got products for this (DataFlux/Data Quality Server) but for your case it's likely good enough if you just analyse the result after your first go (as you already did) and then implement standardization as below. Then use the standardized column for merging.

proc format;
  invalue $country_stdz(upcase)
    'BAHAMAS'       = 'Bahamas'     
    'BAHAMAS (THE)' = 'Bahamas'
    'BAHAMAS, THE'  = 'Bahamas'
  ;
run;

data demo;
  infile datalines truncover;
  input country $20.;
  length country_stdz $20.;
  country_stdz=input(country, $country_stdz.);
  datalines;
Bahamas      
Bahamas (the)
Bahamas, The 
;

proc print;
run;

 

If you're reading your data from an external file then you can use the informat also directly in the input statement.

proc format;
  invalue $country_stdz(upcase)
    'BAHAMAS'       = 'Bahamas'     
    'BAHAMAS (THE)' = 'Bahamas'
    'BAHAMAS, THE'  = 'Bahamas'
  ;
run;

data demo;
  infile datalines truncover;
  input country $country_stdz20.;
  datalines;
Bahamas      
Bahamas (the)
Bahamas, The 
Other country
;

proc print;
run;

Patrick_0-1591579642172.png

 

Ksharp
Super User
data original;
input country$14. val1$ val2$ val3 val4$11. val5$ val6$ val7$ val8$ val9$ val10$;
infile datalines missover dsd dlm="";
cards;
Bahamas                        60 9 
Bahamas (the) BS BHS              2.9 367168
Bahamas, The   BHS 29930 High Income
;
run;
data temp;
set original;
id=scan(country,1,,'ka');
run;
data want;
 update temp(obs=0) temp;
 by id;
run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 585 views
  • 3 likes
  • 4 in conversation