BookmarkSubscribeRSS Feed
kyath_sas
Calcite | Level 5

Hi could you please help on this 

 

data one;
input x y;
datalines ;
1 2
1 3

2 4

2 5

2 9
;
run;
data two;
input x z; 
cards;
1 4
1 5
1 6

2 3

2 7


;
run;

 

 

need output like this

x y z

1 2 4

1 3 5

1 .  6

2 4 3

2 5 7

2 9 .

 

11 REPLIES 11
PeterClemmensen
Tourmaline | Level 20

Do like this

 

data want;
   call missing(y,z);
   merge one two;
   by x;
run;
kyath_sas
Calcite | Level 5

I have a huge data 

varible values are 'ab-05-abc-$200'

 i am getting below error could you suggest on this

 

ERROR: Variable y has been defined as both character and numeric.
ERROR: Variable z has been defined as both character and numeric.

PeterClemmensen
Tourmaline | Level 20

The error message means exactly what it says: That the variable y is defined as character in one place and as numeric in another.

 

Do you only have two data sources as in your example? If so run a PROC CONTENTS on both data sets to find the error

kyath_sas
Calcite | Level 5
all are char only
PeterClemmensen
Tourmaline | Level 20

The SAS log does not lie to you. Somewhere you have a version of the variable y that is numeric.

 

Just to be clear: You want the variables to be character, correct? Because in the example data you post, all variables including y and z are numeric?

Kurt_Bremser
Super User

@kyath_sas wrote:
all are char only

Can't be. In the example data in your initial post, ALL variables are numeric. If you are unsure how to get correct data steps from your real data, use the macro from my footnote to convert a part of your real datasets to data steps.

Ksharp
Super User
data want;
   merge one two;
   by x;
   output;
   call missing(of _all_);
run;
janus58
Fluorite | Level 6

Ksharp, can you please explain how

output;
   call missing(of _all_);

gets rid of

ERROR: Variable y has been defined as both character and numeric.
ERROR: Variable z has been defined as both character and numeric.

 

is variable y and z character or numeric in data=want?

 

Thanks

Margaret

 

PeterClemmensen
Tourmaline | Level 20

@janus58 it does not. 

 

It simply ensures that the variable values are missing when no table contributes to the merge. if you remove the CALL MISSING part, no missing values will appear. Instead the variables are retained 

art297
Opal | Level 21

If your two files are already sorted by x I would use the following method:

 

data one;
  input x $ y $;
  datalines ;
1 2
1 3
2 4
2 5
2 9
;

data two;
  input x $ z $; 
  cards;
1 4
1 5
1 6
2 3
2 7
;

data one_a;
  set one;
  by x;
  if first.x then seq=1;
  else seq+1;
run;

data two_a;
  set two;
  by x;
  if first.x then seq=1;
  else seq+1;
run;

data want (drop=seq);
  merge one_a two_a;
  by x seq;
run;

@janus58: the placement of the call missing statement is critical. If it is the first assignment of those variables, it will make the fields numeric. That is why @Ksharp's code wouldn't produce a mixed case error, as the call missing will assign them as character (as they've already been assigned that way).

 

However, in the present case, it isn't needed. Since it is a many to many merge, that is the problem that has to be overcome. I did that by assigning seq numbers and including that variable in the by statement.

 

Art, CEO, AnalystFinder.com

 

hashman
Ammonite | Level 13

 

This is the way it's supposed to be done. Moreover, the files with the surrogate key created in this manner can be used in a full SQL join by those who prefer it to MERGE. Note, however, that using views rather than physical data sets is better in terms of performance and space utilization.

 

The first time I saw this method proposed was 20 years ago, by Ian Whitlock on SAS-L. At the time, he noted that negating the normal behavior of a many-to-many MERGE statement by forcing non-key variables to missing values is bad programming practice. MISSING (of _ALL_) is an even more deadly sin since it's easy to perceive what kind of havoc it can wreak on unsuspecting variables other than those targeted for nullification.      

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 2036 views
  • 2 likes
  • 7 in conversation