Hello,
I am trying to run code that sorts several data sets by the STUDYID variable and then merges them. Unfortunately, the merge cannot run because STUDYID has been identified as both character and numeric.
I have tried to create a new variable, STUDYIDnumeric, that converts STUDYID to numeric with the code
STUDYIDnumeric = STUDYID*1.0. I have also tried STUDYIDnumeric = INPUT(STUDYID*1.0)...
Here is the code I used:
proc sort data = ITCH.CrossSection1; by STUDYID;
proc sort data = ITCH.CROSS_ICD9_Pre; by STUDYID;
proc sort data = ITCH.CROSS_ICD9_Post; by STUDYID;
data ITCH.CROSS_MERGED;
STUDYIDnumeric = STUDYID*1.0;
merge ITCH.CrossSection1(in=A) ITCH.CROSS_ICD9_Pre ITCH.CROSS_ICD9_Post;
by STUDYIDnumeric;
if A;
run;
However, when I run that code, I still get an error.
What should I do next? I've been using SAS for about 1 week.
Accepted Solutions
The first question you should really ask is how you ended up with STUDYID defined differently in the datasets. If you fix the problem at the source, you will not have these types of problems later on.
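One way to see where the mismatch comes from is to look up how STUDYID is stored in each data set. This is only a sketch, assuming all of the data sets live in the ITCH library as in the question; it uses the standard DICTIONARY.COLUMNS view, where the library name is stored in uppercase.

```sas
/* List the type ("char" or "num") and length of STUDYID */
/* for every data set in the ITCH library.               */
proc sql;
  select memname, name, type, length
    from dictionary.columns
    where libname = 'ITCH'
      and upcase(name) = 'STUDYID';
quit;
```

Any data set reported as "char" needs the INPUT() conversion before the merge; the ones reported as "num" can be used as they are.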
If the values can be numbers then you will have fewer problems if you convert them all to numbers. The catch with ID values that are converted to numbers is that leading zeros disappear. For numbers, leading zeros don't matter: 00123 is the same as 123. But for character strings, '00123' does not equal '123'.
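To make the leading-zero point concrete, here is a small standalone sketch. The variable names and the width of 5 are made up for illustration and are not taken from the ITCH data.

```sas
/* Hypothetical example: a 5-character id with leading zeros. */
data _null_;
  length id_char $5;
  id_char = '00123';
  id_num  = input(id_char, 32.);  /* numeric 123, leading zeros are gone   */
  id_back = put(id_num, z5.);     /* Z5. pads with zeros again: '00123'    */
  put id_char= id_num= id_back=;  /* writes the three values to the log    */
run;
```

If the leading zeros carry meaning in your ids, going the other way (converting the numeric variable to character with PUT and a Zw. format) may be the safer choice, provided every id has the same width.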
You will need to convert the variable before trying to use it in a merge. Convert first and then sort, because numbers and character strings will likely sort into different orders.
* STUDYID is character ;
data fix1 ;
set ITCH.CrossSection1 ;
studyid_num = input(studyid,32.);
drop studyid ;
run;
* STUDYID is character ;
data fix2 ;
set ITCH.CROSS_ICD9_Post;
studyid_num = input(studyid,32.);
drop studyid ;
run;
* STUDYID is already numeric ;
data fix3 ;
set ITCH.CROSS_ICD9_Pre;
studyid_num = studyid;
drop studyid ;
run;
proc sort data=fix1 ; by studyid_num ; run;
proc sort data=fix2 ; by studyid_num ; run;
proc sort data=fix3 ; by studyid_num ; run;
data ITCH.CROSS_MERGED;
merge fix1 (in=in1) fix2 fix3 ;
by studyid_num ;
if in1;
run;
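Before relying on the merged result, it may be worth checking whether any STUDYID values failed to convert. The following is a sketch under the same assumption as above, namely that STUDYID is character in ITCH.CrossSection1; `bad_ids` is a hypothetical data set name. The `??` modifier tells INPUT() to return a missing value quietly instead of writing an error to the log.

```sas
/* Keep only rows whose STUDYID is non-blank but not a valid number. */
data bad_ids;
  set ITCH.CrossSection1;
  studyid_num = input(studyid, ?? 32.);
  if missing(studyid_num) and not missing(studyid);
  keep studyid;
run;

/* Show the offending values, if there are any. */
proc print data=bad_ids;
run;
```

If `bad_ids` comes back empty, every id in that data set converted cleanly; the same check can be repeated for the other character-typed data set.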
StudyID_Num = input(studyID, best12.);
Hi, I tried making the simple code change you suggested; unfortunately, it did not work.