varatt90
Obsidian | Level 7

Hi,

I have a dataset containing 100 variables, some of which I derived from the original variables. Because there are so many variables, I decided to export a subset of about 20 of them into a smaller dataset for streamlined analysis.

When I exported my new dataset as a .csv file and imported it again, I observed that the format of the newly created variables differs from that of the original ones.

For example, in the large dataset, Gender shows BEST32. under the column "Informat":

Screenshot 2024-01-15 at 10.24.18 PM.png

After exporting and importing the smaller dataset, the numeric variables become character and the format and informat columns differ.

Screenshot 2024-01-15 at 10.22.20 PM.png

How do I prevent this from happening?

Thanks in advance!

9 REPLIES
SASKiwi
PROC Star

Why can't you just read the required columns without exporting and importing?

data Want;
  set Have (keep = Gender Grade Province);
run; 
varatt90
Obsidian | Level 7
I can and have - but my code book is becoming quite long so I would like to create a new dataset for organizational purposes. Not sure if this is the best idea but I thought it would help keep things tidy.
SASKiwi
PROC Star

I don't follow why exporting and importing gives you any advantage - please explain. As you have found, it has changed some column types, something that doesn't happen if you just read your source dataset.

varatt90
Obsidian | Level 7
Sure! It's not an advantage in terms of analyzing the data, but more in keeping my SAS program files organized. I wanted to keep my original file for coding and cleaning the data, and use a second file with the selected variables as my analytic file. I will also need to use other statistical software in the future to analyze my data, and I believe it would be easier to export my smaller dataset.
Ksharp
Super User

1) If you are exporting the data as CSV, you could try the GUESSINGROWS=MAX option of PROC IMPORT, so that all rows are scanned before each column's type is chosen.

proc import datafile='c:\temp\a\a.csv' out=a dbms=csv replace;
guessingrows=max;
run;



2) You could export to a format other than CSV, such as Access or SAV (SPSS), for other statistical software to use:
For SPSS:
proc export data=sashelp.class outfile='c:\temp\a\a.sav' dbms=sav replace;
run;
For Access:

/* Create an empty Access file "c:\temp\a\a.accdb" first, before running the following code */
libname a access 'c:\temp\a\a.accdb';
proc copy in=sashelp out=a noclone;
select class;
run;
libname a clear;

 

Patrick
Opal | Level 21

.csv is a text file format that doesn't preserve variable attributes. 

If you just want to split your process into multiple programs that you can run separately, then consider storing the table in a permanent location (a folder that you define via a LIBNAME statement). This can still be a SAS table (SAS file, .sas7bdat).
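
A minimal sketch of that approach, assuming a work dataset named Have as in the earlier reply; the folder path and the libref analytic are hypothetical, and the folder must already exist:

```sas
/* Program 1: cleaning and derivation.                      */
/* The folder path and libref are hypothetical placeholders. */
libname analytic 'c:\temp\analytic';

/* Write the subset as a permanent SAS table (want.sas7bdat). */
/* Types, lengths, formats and informats are carried over.    */
data analytic.want;
  set have (keep = Gender Grade Province);
run;

/* Program 2: analysis, run later in a separate session. */
libname analytic 'c:\temp\analytic';

/* Check that the variable attributes match the source dataset. */
proc contents data=analytic.want;
run;
```

Because no text file is involved, nothing has to be guessed when the table is read back.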

 

If you are using SAS 9.4 M8 or a "reasonably" recent Viya version, then you could also store the data in Parquet format. Parquet also has the advantage that it stores data in a more compressed form.

 

Even with .sas7bdat there are still quite a few other software packages that can read the data - Python for example.

ballardw
Super User

@varatt90 wrote:
I can and have - but my code book is becoming quite long so I would like to create a new dataset for organizational purposes. Not sure if this is the best idea but I thought it would help keep things tidy.

I have a very hard time understanding how ADDING a data set REDUCES a code book. You now have to document two data sets because the second one depends on the first, especially if any of the variables used to create the new ones were dropped from the new set.

andreas_lds
Jade | Level 19

To avoid changes in types, lengths and formats, don't use PROC IMPORT to read a CSV file; write a data step instead.
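
As a rough sketch of such a data step: the file path is borrowed from the PROC IMPORT example above, and the variable list, types and lengths are assumptions that would need to match the real file (here Gender and Grade are taken to be numeric and Province character):

```sas
/* Read the CSV with explicitly declared attributes instead of guessing. */
/* Path, variable names, types and lengths are assumptions.              */
data want;
  infile 'c:\temp\a\a.csv' dsd dlm=',' firstobs=2 truncover;

  /* Declare type and length up front so they never depend on the data. */
  length Gender 8 Grade 8 Province $ 30;

  /* Same informat the original dataset reported for Gender. */
  informat Gender Grade best32.;
  format   Gender Grade best12.;

  /* List input: one value per comma-separated field, in file order. */
  input Gender Grade Province;
run;
```

FIRSTOBS=2 skips the header row, DSD handles quoted values and consecutive delimiters, and TRUNCOVER keeps short lines from spilling into the next record.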
