I have exported an .xpt file from R using the haven package. My goal is to be able to create files that SAS users are able to import and use. I am able to import the file back into R without problem using the haven package. However, the SAS users are not able to successfully import the .xpt file into SAS. Is there something that can be done when exporting the .xpt that would make it easier to open in SAS as an .xpt. Conversely, is there anything that can be done in SAS to more easily open the .xpt file? I am an R user but have little knowledge of SAS. As a second challenge, this data is part of government work and has to be saved on a shared drive. When trying to import .xpt datasets from the shared drive, it takes a very long time. In addition to my initial question, is there any suggestion on what method would be the quickest?
@jlt10f wrote:
I have exported an .xpt file from R using the haven package. My goal is to be able to create files that SAS users are able to import and use. I am able to import the file back into R without problem using the haven package. However, the SAS users are not able to successfully import the .xpt file into SAS. Is there something that can be done when exporting the .xpt that would make it easier to open in SAS as an .xpt. Conversely, is there anything that can be done in SAS to more easily open the .xpt file? I am an R user but have little knowledge of SAS. As a second challenge, this data is part of government work and has to be saved on a shared drive. When trying to import .xpt datasets from the shared drive, it takes a very long time. In addition to my initial question, is there any suggestion on what method would be the quickest?
Why not? Do they get an error? Do they not know how to import an xpt file?
Can you show use the code the users are attempting to use with the XPT file?
There is a step to read the file back into a SAS data set. If the resulting data set is not stored in a permanent library then the steps to read it will have to be repeated each time to use the file.
XPT files are not native data sets. They are a specifically formatted text file that provides the instructions to SAS to rebuild a data set so the current version/operating system can use it.
This is the code that gave us the "File is not a SAS dataset" error (with filename and locations replaced)
libname filename xport 'file location\filename.xpt';
data yourlibrary.dataset;
set filename.filename;
run;
Then when that didn't work, we ran this and it was running for an hour before we cut it off:
%xpt2loc(libref=work, filespec='c:\temp\trans.xpt') (again, with the appropriate filenames added)
I think it may be important to mention that we are working on a shared drive and it is possible that is what is making it take so long, although other things were imported in a reasonable time period, so I am not sure.
@jlt10f wrote:
Okay. They said they tried this and it results in taking an hour plus trying to read in the file, at which point they quit trying. Somone else suggested trying COMPRESS. I think it is some issue with the shared drive making everything take a very long time to import.
What size file are you talking about? If you are moving gigabytes of data it will take time.
How many observations? How many variables? How long are the variables?
How large is the resulting XPT file? Note that XPT files are NOT compressed so they can get very large if there are many long character variables.
@jlt10f wrote:
In a related question... we were able to successfully import the file under Version 5, but the problem is now dealing with the missing data. Since R and SAS having different means of recording missings, what is the best method for retaining numeric variables from R when imported into SAS while having the missing properly coded in SAS?
What happens now with your XPT files? What does it do wrong with missing values?
Do you want to switch to writing CSV files? When you write them out from R make sure to use the na="" option on the write.csv() call so it does not write that goofy NA string into the CSV file.
Note if you do send them CSV files you will need to also send along DOCUMENTATION about the variables so they know how to read the CSV file properly to get back the right type of data.
Have them LOOK at the file and see if R actually created something that a least looks like an XPT file. Just look at the first 240 bytes.
data _null_ ;
infile 'c:\temp\trans.xpt' lrecl=80 recfm=f obs=3;
input;
list;
run;
(or use some R code that will let you see the first 240 bytes and make sure it looks valid).
A SAS V5 transport file will look like this:
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 1 HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!000000000000000000000000000000 2 SAS SAS SASLIB 9.4 X64_10PR 09OCT23:15:04:02 3 09OCT23:15:04:02
Something that will require the use of the %XPT2LOC() macro will have LIBV8 or LIBV9 instead of LIBRARY in that first line.
Describe in more detail the data you are trying to transfer? How many variables? How many observations? Are all of the variable names valid SAS names? Are any of the names longer then 8 characters (required for a V5 transport file)? Longer than 32 characters (SAS names cannot be longer than 32 characters)? Are any of the character variables longer than 200 bytes?
If you do have many long character variables then make sure they are running the code to import the XPT file with the COMPRESS option set to YES so it does not try to generate a giant SAS dataset. If you have variable that is defined to hold up to 200 characters but the maximum actual length is only 50 characters you are wasting 150 bytes of storage for every observation.
Having both R and SAS decided to test this out.
What version of Haven do you have?
EDIT: What is the length of the name of the data set? I shortened the file name to under 8 characters and it worked (replicated this process with it named SASfile and it worked ok for me )
I updated haven package in R and it worked.
I exported mpg (sashelp.cars equivalent) and am getting issues. Using @Tom code I get the following in my log for first three lines.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
1 HEADER RECORD*******LIBV8 HEADER RECORD!!!!!!!000000000000000000000000000000
2 SAS SAS SASLIB 6.06 bsd4.2 10OCT23:16:13:55
3 10OCT23:16:13:55
So attempting to use XPT2LOC, but then I get a different series of errors related to formats not being read correctly. Sample XPT file attached as zip for anyone else playing around.
78 %xpt2loc(libref=work, filespec=&xpt_file_loc.);
NOTE: Format XPRTFLT is already on the library WORK.FORMATS.
NOTE: Format XPRTFLT has been output.
NOTE: Informat XPRTFLT is already on the library WORK.FORMATS.
NOTE: Informat XPRTFLT has been output.
NOTE: PROCEDURE FORMAT used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 123.50k
OS Memory 20640.00k
Timestamp 10/10/2023 10:24:59 PM
Step Count 137 Switch Count 0
Page Faults 0
Page Reclaims 15
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 40
NOTE: The data set WORK.CNTLIN has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 534.12k
OS Memory 20900.00k
Timestamp 10/10/2023 10:24:59 PM
Step Count 138 Switch Count 2
Page Faults 0
Page Reclaims 100
Page Swaps 0
Voluntary Context Switches 11
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264
NOTE: There were 1 observations read from the data set WORK.CNTLIN.
NOTE: The data set WORK.CNTLIN has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 800.21k
OS Memory 21160.00k
Timestamp 10/10/2023 10:24:59 PM
Step Count 139 Switch Count 2
Page Faults 0
Page Reclaims 135
Page Swaps 0
Voluntary Context Switches 10
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264
NOTE: There were 1 observations read from the data set WORK.CNTLIN.
NOTE: SAS sort was used.
NOTE: 0 observations with duplicate key values were deleted.
NOTE: The data set WORK.CNTLIN has 1 observations and 4 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.00 seconds
user cpu time 0.01 seconds
system cpu time 0.00 seconds
memory 780.15k
OS Memory 21160.00k
Timestamp 10/10/2023 10:24:59 PM
Step Count 140 Switch Count 2
Page Faults 0
Page Reclaims 127
Page Swaps 0
Voluntary Context Switches 10
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264
NOTE: Format $MEMWANT is already on the library WORK.FORMATS.
NOTE: Format $MEMWANT has been output.
NOTE: PROCEDURE FORMAT used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 445.78k
OS Memory 20900.00k
Timestamp 10/10/2023 10:24:59 PM
Step Count 141 Switch Count 0
Page Faults 0
Page Reclaims 51
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 8
NOTE: There were 1 observations read from the data set WORK.CNTLIN.
NOTE: Deleting WORK.CNTLIN (memtype=DATA).
NOTE: PROCEDURE DELETE used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 36.21k
OS Memory 20640.00k
Timestamp 10/10/2023 10:24:59 PM
Step Count 142 Switch Count 2
Page Faults 0
Page Reclaims 15
Page Swaps 0
Voluntary Context Switches 9
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 0
NOTE: UNBUFFERED is the default with RECFM=N.
NOTE: The infile '/home/fkhurshed/Demo1/SAS_export_file.xpt' is:
Filename=/home/fkhurshed/Demo1/SAS_export_file.xpt,
Owner Name=fkhurshed,Group Name=oda,
Access Permission=-rw-r--r--,
Last Modified=10Oct2023:16:14:35,
File Size (bytes)=24320
NOTE: The file METADATA is:
Filename=/saswork/SAS_work77290001FCEC_odaws01-usw2.oda.sas.com/#LN00133,
Owner Name=fkhurshed,Group Name=oda,
Access Permission=-rw-r--r--,
Last Modified=10Oct2023:16:24:59
NOTE: The file INPTSTMT is:
Filename=/saswork/SAS_work77290001FCEC_odaws01-usw2.oda.sas.com/#LN00132,
Owner Name=fkhurshed,Group Name=oda,
Access Permission=-rw-r--r--,
Last Modified=10Oct2023:16:24:59
NOTE: 38 records were written to the file METADATA.
The minimum record length was 1.
The maximum record length was 71.
NOTE: 15 records were written to the file INPTSTMT.
The minimum record length was 3.
The maximum record length was 43.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.01 seconds
system cpu time 0.00 seconds
memory 937.50k
OS Memory 21344.00k
Timestamp 10/10/2023 10:24:59 PM
Step Count 143 Switch Count 0
Page Faults 0
Page Reclaims 148
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 16
NOTE: The file SASCODE is:
Filename=/saswork/SAS_work77290001FCEC_odaws01-usw2.oda.sas.com/#LN00135,
Owner Name=fkhurshed,Group Name=oda,
Access Permission=-rw-r--r--,
Last Modified=10Oct2023:16:24:59
NOTE: The infile INPTSTMT is:
Filename=/saswork/SAS_work77290001FCEC_odaws01-usw2.oda.sas.com/#LN00132,
Owner Name=fkhurshed,Group Name=oda,
Access Permission=-rw-r--r--,
Last Modified=10Oct2023:16:24:59,
File Size (bytes)=248
NOTE: The infile METADATA is:
Filename=/saswork/SAS_work77290001FCEC_odaws01-usw2.oda.sas.com/#LN00133,
Owner Name=fkhurshed,Group Name=oda,
Access Permission=-rw-r--r--,
Last Modified=10Oct2023:16:24:59,
File Size (bytes)=990
NOTE: 52 records were written to the file SASCODE.
The minimum record length was 1.
The maximum record length was 71.
NOTE: 15 records were read from the infile INPTSTMT.
The minimum record length was 3.
The maximum record length was 43.
NOTE: 38 records were read from the infile METADATA.
The minimum record length was 1.
The maximum record length was 71.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.01 seconds
memory 751.62k
OS Memory 20896.00k
Timestamp 10/10/2023 10:24:59 PM
Step Count 144 Switch Count 0
Page Faults 0
Page Reclaims 47
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 16
NOTE: Fileref METADATA has been deassigned.
NOTE: Fileref INPTSTMT has been deassigned.
84 +format manufacturer $. ;
_
85
76
85 +informat manufacturer $. ;
_
85
76
87 +format model $. ;
_
85
76
88 +informat model $. ;
_
85
76
90 +format displ . ;
_
85
200
76
91 +informat displ . ;
_
85
200
76
93 +format year . ;
_
85
200
76
94 +informat year . ;
_
85
200
76
96 +format cyl . ;
_
85
200
76
97 +informat cyl . ;
_
85
200
76
99 +format trans $. ;
_
85
76
100 +informat trans $. ;
_
85
76
102 +format drv $. ;
_
85
76
103 +informat drv $. ;
_
85
76
105 +format cty . ;
_
85
200
76
106 +informat cty . ;
_
85
200
76
108 +format hwy . ;
_
85
200
76
109 +informat hwy . ;
_
85
200
76
111 +format fl $. ;
_
85
76
112 +informat fl $. ;
_
85
76
114 +format class $. ;
_
85
76
115 +informat class $. ;
_
85
76
ERROR 85-322: Expecting a format name.
ERROR 76-322: Syntax error, statement will be ignored.
ERROR 200-322: The symbol is not recognized and will be ignored.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.SAS_EXPORT_FILE may be incomplete. When this step was stopped there were 0 observations and 11
variables.
WARNING: Data set WORK.SAS_EXPORT_FILE was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 503.03k
OS Memory 21156.00k
Timestamp 10/10/2023 10:24:59 PM
Step Count 145 Switch Count 0
Page Faults 0
Page Reclaims 59
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 16
NOTE: Fileref SASCODE has been deassigned.
131
132
133 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
143
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.