Hi everyone,
I have a server I am importing my data from as it was initially created with SQL Server. I usually import the data like this but I have run across an issue with the formats and lengths of some of the character variables being extremely long or the wrong format. the data is yearly data from subjects enrolled. I am cleaning up the data and converting it into sas friendly files for all the years of data we have but this has turned into an issue as the files are way too large. This is the only way i know how to import the data with an OBDC connection to the server:
proc sql;
connect to odbc (dsn=Data);
create table DM.data12 as select * from connection to odbc
(
select * from [DW].[dbo].[fams12]
);
Some of the variables I am importing show up as being 8000 characters long and it's making my dataset impossibly large (50gb) and I cannot even get it to finish running. Is there any code i can try to change the length and/or format of the variables without having to first import and then altering the formats/length within this proc sql step? I would like to do this at the same time that I am importing in order to reduce the amount of memory/time required to change the length.
if there is an easier way outside of proc sql, I am willing to try it but I have not learned how else to connect to my data server/import my data that does not require proc sql. i am willing to learn other ways if it is much easier as i will be doing this regularly to convert SQL server files into sas files. PS i am using SAS 9.3. Thanks in advance!
Add the SAS option COMPRESS = YES or better COMPRESS = BINARY on your queries. This will remove all of the extra space from your long columns and reduce disk storage sizes a lot. You can also apply this option in an OPTIONS statement as well if you want it to apply to all datasets. This is what we do for all our SAS sessions to save space and avoid having to set column lengths explicitly.
proc sql;
connect to odbc (dsn=Data);
create table DM.data12 (compress = binary) as select * from connection to odbc
(
select * from [DW].[dbo].[fams12]
);
quit;
Add the SAS option COMPRESS = YES or better COMPRESS = BINARY on your queries. This will remove all of the extra space from your long columns and reduce disk storage sizes a lot. You can also apply this option in an OPTIONS statement as well if you want it to apply to all datasets. This is what we do for all our SAS sessions to save space and avoid having to set column lengths explicitly.
proc sql;
connect to odbc (dsn=Data);
create table DM.data12 (compress = binary) as select * from connection to odbc
(
select * from [DW].[dbo].[fams12]
);
quit;
This was exactly what i was looking for, thank you!
Glad to help!
First thing you need to find is what data types they exist in your source system. SAS applies the default formats based on the source data types. You can convert in-database using database specific function like CAST() or else you can convert them once returned to SAS environment in the first select clause as shown below.
proc sql;
connect to odbc (dsn=Data);
create table DM.data12 as
select var1 format=$100. length=100, var2 format=$25. length=25
from connection to odbc
(
select * from [DW].[dbo].[fams12]
);
If you have 100's of variables then you can get the data types of source table from (sys.cloumns or dbc.columns) as a table in sas and write a logic to apply sas formats based on source format and put them in macro using proc sql INTO and just call the macro in select clause.
This approach is helpful only if don't wish the default formats applied by SAS and custom format.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.