- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I am creating a series of new variables into a new dataset from a previous dataset proc-imported from a xlsx file. My current code is currently creating new variables based off of the variables in the imported dataset.
So for example:
proc import datafile="&Excelfile"
out=work.excelfile
dbms=xlsx
replace;
run;
data NewDS;
set excelfile;
NewVar1=ExcelVar1;
NewVar2=ExcelVar2;
NewVar3=ExcelVar3;
NewVar4=ExcelVar3;
run;
(Some variables in the NewDS dataset will be sourced from the same variable as seen with NewVar3 and NewVar4.)
I was told when I do this that the proper length of the variables from my excelfile dataset will not carry over correctly and will be shortened. And to correct this, that I would need to use the rename statement instead of creating new variables. And because I have some variables sourced from the same variable, I would have to create a new dataset to account for these. Is this method correct?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Creating new variables makes the data set bigger, and thus takes longer to process. Renaming variables does not make the data set bigger, and so it does not take longer to process.
I was told when I do this that the proper length of the variables from my excelfile dataset will not carry over correctly and will be shortened.
I don't think this is true. Of course, you could test this out yourself. For example
data cars;
set sashelp.cars;
newmodel=model;
run;
Variable MODEL has length 40. What is the length of NEWMODEL?
And to correct this, that I would need to use the rename statement instead of creating new variables.
Rename doesn't change length, it only change the name.
And because I have some variables sourced from the same variable, I would have to create a new dataset to account for these.
I don't really know what the above means. Rename does not change the length of the variables.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@InspectahDex wrote:
I was told when I do this that the proper length of the variables from my excelfile dataset will not carry over correctly and will be shortened.
No, and that's easy enough to check quickly in your data.
@InspectahDex wrote:
And to correct this, that I would need to use the rename statement instead of creating new variables.
No, but it makes more sense to use RENAME in general if you're just doing an assignment. But if you're also using the variable multiple times as in NewVar3/4 that wouldn't work.
@InspectahDex wrote:
I am creating a series of new variables into a new dataset from a previous dataset proc-imported from a xlsx file. My current code is currently creating new variables based off of the variables in the imported dataset.
So for example:
proc import datafile="&Excelfile"
out=work.excelfile
dbms=xlsx
replace;
run;
data NewDS;
set excelfile;
NewVar1=ExcelVar1;NewVar2=ExcelVar2;
NewVar3=ExcelVar3;
NewVar4=ExcelVar3;
run;
(Some variables in the NewDS dataset will be sourced from the same variable as seen with NewVar3 and NewVar4.)
I was told when I do this that the proper length of the variables from my excelfile dataset will not carry over correctly and will be shortened. And to correct this, that I would need to use the rename statement instead of creating new variables. And because I have some variables sourced from the same variable, I would have to create a new dataset to account for these. Is this method correct?
No, but it is best practice to create a new data set when adding/renaming variables. If you make a mistake and are doing it in 'place' then you destroy your original data set. Then you have to re-import it or restore it somehow. It's also harder to debug when you code in this fashion and is very much not recommended.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Display formats and assigned labels do not "copy" when creating a new variable. An example:
data example; x='01JUL2020'd; format x mmddyy10.; run; data renametest; set example; rename x=newx; run; proc print data=renametest; run; data newvartest; set example; y = x; run; proc print data=newvartest; run;
The formats would be most noticeable with date, time or datetime type variables.
So if one of your variables were a date the new variable would default to appearing as something like 22097.