Hi could you please help on this
data one;
input x y;
datalines ;
1 2
1 3
2 4
2 5
2 9
;
run;
data two;
input x z;
cards;
1 4
1 5
1 6
2 3
2 7
;
run;
need output like this
x y z
1 2 4
1 3 5
1 . 6
2 4 3
2 5 7
2 9 .
Do like this
data want;
call missing(y,z);
merge one two;
by x;
run;
I have a huge data
varible values are 'ab-05-abc-$200'
i am getting below error could you suggest on this
ERROR: Variable y has been defined as both character and numeric.
ERROR: Variable z has been defined as both character and numeric.
The error message means exactly what it says: That the variable y is defined as character in one place and as numeric in another.
Do you only have two data sources as in your example? If so run a PROC CONTENTS on both data sets to find the error
The SAS log does not lie to you. Somewhere you have a version of the variable y that is numeric.
Just to be clear: You want the variables to be character, correct? Because in the example data you post, all variables including y and z are numeric?
@kyath_sas wrote:
all are char only
Can't be. In the example data in your initial post, ALL variables are numeric. If you are unsure how to get correct data steps from your real data, use the macro from my footnote to convert a part of your real datasets to data steps.
data want;
merge one two;
by x;
output;
call missing(of _all_);
run;
Ksharp, can you please explain how
output;
call missing(of _all_);
gets rid of
ERROR: Variable y has been defined as both character and numeric.
ERROR: Variable z has been defined as both character and numeric.
is variable y and z character or numeric in data=want?
Thanks
Margaret
@janus58 it does not.
It simply ensures that the variable values are missing when no table contributes to the merge. if you remove the CALL MISSING part, no missing values will appear. Instead the variables are retained
If your two files are already sorted by x I would use the following method:
data one;
input x $ y $;
datalines ;
1 2
1 3
2 4
2 5
2 9
;
data two;
input x $ z $;
cards;
1 4
1 5
1 6
2 3
2 7
;
data one_a;
set one;
by x;
if first.x then seq=1;
else seq+1;
run;
data two_a;
set two;
by x;
if first.x then seq=1;
else seq+1;
run;
data want (drop=seq);
merge one_a two_a;
by x seq;
run;
@janus58: the placement of the call missing statement is critical. If it is the first assignment of those variables, it will make the fields numeric. That is why @Ksharp's code wouldn't produce a mixed case error, as the call missing will assign them as character (as they've already been assigned that way).
However, in the present case, it isn't needed. Since it is a many to many merge, that is the problem that has to be overcome. I did that by assigning seq numbers and including that variable in the by statement.
Art, CEO, AnalystFinder.com
This is the way it's supposed to be done. Moreover, the files with the surrogate key created in this manner can be used in a full SQL join by those who prefer it to MERGE. Note, however, that using views rather than physical data sets is better in terms of performance and space utilization.
The first time I saw this method proposed was 20 years ago, by Ian Whitlock on SAS-L. At the time, he noted that negating the normal behavior of a many-to-many MERGE statement by forcing non-key variables to missing values is bad programming practice. MISSING (of _ALL_) is an even more deadly sin since it's easy to perceive what kind of havoc it can wreak on unsuspecting variables other than those targeted for nullification.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.