BookmarkSubscribeRSS Feed
bncoxuk
Obsidian | Level 7
I am trying to merge two datasets. Simply, data1 has variables of school, group, and sex1, and data2 has variables of school, group and sex2. Afer the following DATA step, the values for sex2 become 'O' if they are missing, not the 'Other' I want to get.

DATA work.data;
MERGE work.data1 work.data2;
BY school group;
LENGTH sex2 $5.;
FORMAT sex2 $5.;
IF MISSING(sex2) THEN sex2='Other';
RUN;
4 REPLIES 4
Patrick
Opal | Level 21
Hi

Have the variable declaration statements for sex2 before the merge.

DATA work.data;
LENGTH sex2 $5.;
FORMAT sex2 $5.;
MERGE work.data1 work.data2;
BY school group;
IF MISSING(sex2) THEN sex2='Other';
RUN;


Reason: Variable sex2 gets initiated based on the first occurrence in the code (which in your case is from the source data set where sex2 has a length of $1).

In your log there should also be a truncation warning.

HTH
Patrick
bncoxuk
Obsidian | Level 7
Hi Patrick,

I tried your way, but the variable sex2 no longer exists in the new work.data.
art297
Opal | Level 21
Could it just have been moved to being the left most variable? If not, your data isn't similar to Patrick's example as his code does appear to work correctly.

Art
---------
> Hi Patrick,
>
> I tried your way, but the variable sex2 no longer
> exists in the new work.data.
DF
Fluorite | Level 6 DF
Fluorite | Level 6
SAS will pick up the first reference to a variable, and set the format and length from there. In your case that comes from the merge statement, hence the length of sex2 comes from the dataset work.data2.

If you move your length/format statements to before the merge, then your code should work (albeit the column order will have changed).

data data1;
format school $20.;
format group $20.;
format sex1 $1.;
infile datalines dsd;
input School $ Group $ Sex1 $;
datalines;
School1,Group1,M
School1,Group2,
School2,Group3,M
;
run;

data data2;
format school $20.;
format group $20.;
format sex2 $1.;
infile datalines dsd;
input School $ Group $ Sex2 $;
datalines;
School1,Group1,M
School1,Group2,M
School2,Group3,
;
run;

proc sort data=data1;
by school group;
run;

proc sort data=data2;
by school group;
run;

DATA work.data;
LENGTH sex2 $5.;
FORMAT sex2 $5.;
MERGE work.data1 work.data2;
BY school group;
IF MISSING(sex2) THEN sex2='Other';
RUN;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 760 views
  • 0 likes
  • 4 in conversation