BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mcmaxwell
Fluorite | Level 6

Hi, I'm new to SAS Studio and I appreciate any help! 

I have two excel data files that I am trying to merge. The first contains survey data from students in a course (45 variables and 491 observations). The second one contains exam scores from all students within the same course (3 variables, 619 observations).

I am trying to one-to-one left merge the data using student ID number as the identifier, so that my new dataset contains all the information from the survey file and the exam scores only for the students who took my survey. 

I import the files using the following SAS code: 

/** Import survey file.  **/  
PROC IMPORT DATAFILE="/folders/myfolders/sasuser.v94/SMI PARTIALLY Coded Data.xlsx"  
		    OUT=L_DATA  
		    DBMS=XLSX  
		    REPLACE;  
RUN;  
  
/** Import grades file.  **/  
PROC IMPORT DATAFILE="/folders/myfolders/sasuser.v94/Exam 1 Scores.xlsx"  
		    OUT=R_DATA  
		    DBMS=XLSX  
		    REPLACE;  
RUN; 

Then, I create new ID columns that are characters rather than numbers: 

/** Create a new ID column as a character **/  
DATA LEFT; /* Names the new data LEFT */  
   set L_DATA (rename=(SUID=ID)); /* Says to use L_DATA data, and to rename variable SUID as ID */   
   SUID= put(ID,f5. -L); /* Not sure what this does. */  
   drop SUID; /* Drops the variable SUID */  
run; 

/** Create a new SUID column as Character in GRADES **/   
DATA RIGHT;    
   set R_DATA (rename=(SUID=ID));    
   SUID= put(ID,f5. -L);    
   drop SUID;   
run;  

Then, I sort the datasets by ID 

/** Sort datasets by ID **/  
PROC SORT DATA = LEFT;  
	BY ID;   
RUN;  
PROC SORT DATA = RIGHT;  
	BY ID;   
RUN;  

And finally, I try to merge the data sets: 

/** Merge datasets **/ 
DATA ALL; 
	MERGE LEFT (IN=LEFT) RIGHT (IN=RIGHT); 
	BY ID; 
	IF LEFT;  
RUN; 

Everything seems to work until this last step. I get the error message "ERROR: Variable ID has been defined as both character and numeric." 

 

How do I fix this? Is there a better way to import my data so I don't have to define the variables as characters each time? 

Thanks!

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

With the use of Excel files (the WORST file format for data transfer), you have to live with the guesses that SAS has to make about variable types.

Save your data to text (e.g. csv) files, which you can read with data steps where you have control.

Otherwise, you have to clean up the mess cause by the *#@%!? Excel files.

A step like this:

DATA RIGHT;    
   set R_DATA (rename=(SUID=ID));    
   SUID= put(ID,f5. -L);    
   drop SUID;   
run;  

does not really convert anything, as you immediately throw away the newly created character variable.

View solution in original post

2 REPLIES 2
Reeza
Super User

When you import from Excel you do not have control over the type of the data. You either need to convert it after the fact or you need to save the XLSX files as CSV and read them in and specify the correct types that way. 

 

To convert a number to a character use PUT - but note that you need to change the variable name. 

The format here should represent what you want the data to look like. You may also want to align it with the optional alignment parameter (-l).

 

Num2Char = put(varNum, 8. -l);

To convert a character to a number use INPUT(). 

 

Char2Num = input(varChar, 8.);

The informat here should represent what the data currently looks like. 

 


@mcmaxwell wrote:

Hi, I'm new to SAS Studio and I appreciate any help! 

I have two excel data files that I am trying to merge. The first contains survey data from students in a course (45 variables and 491 observations). The second one contains exam scores from all students within the same course (3 variables, 619 observations).

I am trying to one-to-one left merge the data using student ID number as the identifier, so that my new dataset contains all the information from the survey file and the exam scores only for the students who took my survey. 

I import the files using the following SAS code: 

/** Import survey file.  **/  
PROC IMPORT DATAFILE="/folders/myfolders/sasuser.v94/SMI PARTIALLY Coded Data.xlsx"  
		    OUT=L_DATA  
		    DBMS=XLSX  
		    REPLACE;  
RUN;  
  
/** Import grades file.  **/  
PROC IMPORT DATAFILE="/folders/myfolders/sasuser.v94/Exam 1 Scores.xlsx"  
		    OUT=R_DATA  
		    DBMS=XLSX  
		    REPLACE;  
RUN; 

Then, I create new ID columns that are characters rather than numbers: 

/** Create a new ID column as a character **/  
DATA LEFT; /* Names the new data LEFT */  
   set L_DATA (rename=(SUID=ID)); /* Says to use L_DATA data, and to rename variable SUID as ID */   
   SUID= put(ID,f5. -L); /* Not sure what this does. */  
   drop SUID; /* Drops the variable SUID */  
run; 

/** Create a new SUID column as Character in GRADES **/   
DATA RIGHT;    
   set R_DATA (rename=(SUID=ID));    
   SUID= put(ID,f5. -L);    
   drop SUID;   
run;  

Then, I sort the datasets by ID 

/** Sort datasets by ID **/  
PROC SORT DATA = LEFT;  
	BY ID;   
RUN;  
PROC SORT DATA = RIGHT;  
	BY ID;   
RUN;  

And finally, I try to merge the data sets: 

/** Merge datasets **/ 
DATA ALL; 
	MERGE LEFT (IN=LEFT) RIGHT (IN=RIGHT); 
	BY ID; 
	IF LEFT;  
RUN; 

Everything seems to work until this last step. I get the error message "ERROR: Variable ID has been defined as both character and numeric." 

 

How do I fix this? Is there a better way to import my data so I don't have to define the variables as characters each time? 

Thanks!

 

 

 


 

Kurt_Bremser
Super User

With the use of Excel files (the WORST file format for data transfer), you have to live with the guesses that SAS has to make about variable types.

Save your data to text (e.g. csv) files, which you can read with data steps where you have control.

Otherwise, you have to clean up the mess cause by the *#@%!? Excel files.

A step like this:

DATA RIGHT;    
   set R_DATA (rename=(SUID=ID));    
   SUID= put(ID,f5. -L);    
   drop SUID;   
run;  

does not really convert anything, as you immediately throw away the newly created character variable.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 4159 views
  • 0 likes
  • 3 in conversation