BookmarkSubscribeRSS Feed
InspectahDex
Obsidian | Level 7

I am creating a series of new variables into a new dataset from a previous dataset proc-imported from a xlsx file. My current code is currently creating new variables based off of the variables in the imported dataset.

 

So for example:

 

proc import datafile="&Excelfile"
out=work.excelfile
dbms=xlsx
replace;
run;

 

data NewDS;
set excelfile;
NewVar1=ExcelVar1;

NewVar2=ExcelVar2;

NewVar3=ExcelVar3;

NewVar4=ExcelVar3; 

run;

 

(Some variables in the NewDS dataset will be sourced from the same variable as seen with NewVar3 and NewVar4.)

 

I was told when I do this that the proper length of the variables from my excelfile dataset will not carry over correctly and will be shortened. And to correct this, that I would need to use the rename statement instead of creating new variables. And because I have some variables sourced from the same variable, I would have to create a new dataset to account for these. Is this method correct? 

3 REPLIES 3
PaigeMiller
Diamond | Level 26

Creating new variables makes the data set bigger, and thus takes longer to process. Renaming variables does not make the data set bigger, and so it does not take longer to process.

 

I was told when I do this that the proper length of the variables from my excelfile dataset will not carry over correctly and will be shortened. 

I don't think this is true. Of course, you could test this out yourself. For example

 

data cars;
	set sashelp.cars;
	newmodel=model;
run;

Variable MODEL has length 40. What is the length of NEWMODEL?

 

And to correct this, that I would need to use the rename statement instead of creating new variables. 

 

Rename doesn't change length, it only change the name.

 

And because I have some variables sourced from the same variable, I would have to create a new dataset to account for these.

I don't really know what the above means. Rename does not change the length of the variables.

--
Paige Miller
Reeza
Super User

@InspectahDex wrote:

 

I was told when I do this that the proper length of the variables from my excelfile dataset will not carry over correctly and will be shortened. 


No, and that's easy enough to check quickly in your data. 


@InspectahDex wrote:

And to correct this, that I would need to use the rename statement instead of creating new variables. 


No, but it makes more sense to use RENAME in general if you're just doing an assignment. But if you're also using the variable multiple times as in NewVar3/4 that wouldn't work.

 


@InspectahDex wrote:

I am creating a series of new variables into a new dataset from a previous dataset proc-imported from a xlsx file. My current code is currently creating new variables based off of the variables in the imported dataset.

 

So for example:

 

proc import datafile="&Excelfile"
out=work.excelfile
dbms=xlsx
replace;
run;

 

data NewDS;
set excelfile;
NewVar1=ExcelVar1;

NewVar2=ExcelVar2;

NewVar3=ExcelVar3;

NewVar4=ExcelVar3; 

run;

 

(Some variables in the NewDS dataset will be sourced from the same variable as seen with NewVar3 and NewVar4.)

 

I was told when I do this that the proper length of the variables from my excelfile dataset will not carry over correctly and will be shortened. And to correct this, that I would need to use the rename statement instead of creating new variables. And because I have some variables sourced from the same variable, I would have to create a new dataset to account for these. Is this method correct? 


No, but it is best practice to create a new data set when adding/renaming variables. If you make a mistake and are doing it in 'place' then you destroy your original data set. Then you have to re-import it or restore it somehow. It's also harder to debug when you code in this fashion and is very much not recommended. 

ballardw
Super User

Display formats and assigned labels do not "copy" when creating a new variable. An example:

data example;
   x='01JUL2020'd;
   format x mmddyy10.;
run;

data renametest;
   set example;
   rename x=newx;
run;

proc print data=renametest;
run;

data newvartest;
   set example;
   y = x;
run;

proc print data=newvartest;
run;

The formats would be most noticeable with date, time or datetime type variables.

So  if one of your variables were a date the new variable would default to appearing as something like 22097.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 5881 views
  • 1 like
  • 4 in conversation