BookmarkSubscribeRSS Feed
jlt10f
Fluorite | Level 6

I have exported an .xpt file from R using the haven package. My goal is to be able to create files that SAS users are able to import and use. I am able to import the file back into R without problem using the haven package. However, the SAS users are not able to successfully import the .xpt file into SAS. Is there something that can be done when exporting the .xpt that would make it easier to open in SAS as an .xpt. Conversely, is there anything that can be done in SAS to more easily open the .xpt file? I am an R user but have little knowledge of SAS.  As a second challenge, this data is part of government work and has to be saved on a shared drive.  When trying to import .xpt datasets from the shared drive, it takes a very long time.  In addition to my initial question, is there any suggestion on what method would be the quickest? 

19 REPLIES 19
Reeza
Super User

@jlt10f wrote:

I have exported an .xpt file from R using the haven package. My goal is to be able to create files that SAS users are able to import and use. I am able to import the file back into R without problem using the haven package. However, the SAS users are not able to successfully import the .xpt file into SAS. Is there something that can be done when exporting the .xpt that would make it easier to open in SAS as an .xpt. Conversely, is there anything that can be done in SAS to more easily open the .xpt file? I am an R user but have little knowledge of SAS.  As a second challenge, this data is part of government work and has to be saved on a shared drive.  When trying to import .xpt datasets from the shared drive, it takes a very long time.  In addition to my initial question, is there any suggestion on what method would be the quickest? 


Why not? Do they get an error? Do they not know how to import an xpt file?

ballardw
Super User

Can you show use the code the users are attempting to use with the XPT file?

 

There is a step to read the file back into a SAS data set. If the resulting data set is not stored in a permanent library then the steps to read it will have to be repeated each time to use the file.

 

XPT files are not native data sets. They are a specifically formatted text file that provides the instructions to SAS to rebuild a data set so the current version/operating system can use it.

jlt10f
Fluorite | Level 6

This is the code that gave us the "File is not a SAS dataset" error (with filename and locations replaced)
libname filename xport 'file location\filename.xpt';

 

data yourlibrary.dataset;
set filename.filename; 
run; 

 

Then when that didn't work, we ran this and it was running for an hour before we cut it off: 

%xpt2loc(libref=work, filespec='c:\temp\trans.xpt') (again, with the appropriate filenames added)

 

I think it may be important to mention that we are working on a shared drive and it is possible that is what is making it take so long, although other things were imported in a reasonable time period, so I am not sure.

 

Reeza
Super User
Use FCOPY to copy the XPT file from the shared drive to the work folder and then try and read it in using PROC COPY.

https://communities.sas.com/t5/SAS-Programming/Convert-XPT-to-SAS-Datasets/td-p/97872

jlt10f
Fluorite | Level 6
Unfortunately they will not allow anything to be copied off of the shared drive. I think the issue may be that it just takes a very long time to import anything off of the shared drive but I am not sure.
Reeza
Super User
That's a very silly take, as reading it into SAS puts it in the work library anyways. So there's no technical difference. And if SAS can read it from that location it can copy it using FCOPY. If it can't copy, then it can't likely read it either, so you have an access issue.

I would also ensure your haven package is the most recent as I tested above.
jlt10f
Fluorite | Level 6
Okay. I will suggest this. Thank you for this suggestion and I will provide an update on this.
jlt10f
Fluorite | Level 6
Okay. They said they tried this and it results in taking an hour plus trying to read in the file, at which point they quit trying. Somone else suggested trying COMPRESS. I think it is some issue with the shared drive making everything take a very long time to import.
Tom
Super User Tom
Super User

@jlt10f wrote:
Okay. They said they tried this and it results in taking an hour plus trying to read in the file, at which point they quit trying. Somone else suggested trying COMPRESS. I think it is some issue with the shared drive making everything take a very long time to import.

What size file are you talking about?  If you are moving gigabytes of data it will take time.

 

How many observations?  How many variables?  How long are the variables? 

 

How large is the resulting XPT file?  Note that XPT files are NOT compressed so they can get very large if there are many long character variables.

jlt10f
Fluorite | Level 6
It's not that large. It is only about half a MB. There are some long text responses which may be slowing it down, combined with the it being on the shared drive may be contributing to the long period of time.
jlt10f
Fluorite | Level 6
In a related question... we were able to successfully import the file under Version 5, but the problem is now dealing with the missing data. Since R and SAS having different means of recording missings, what is the best method for retaining numeric variables from R when imported into SAS while having the missing properly coded in SAS?
Tom
Super User Tom
Super User

@jlt10f wrote:
In a related question... we were able to successfully import the file under Version 5, but the problem is now dealing with the missing data. Since R and SAS having different means of recording missings, what is the best method for retaining numeric variables from R when imported into SAS while having the missing properly coded in SAS?

What happens now with your XPT files?  What does it do wrong with missing values?

 

Do you want to switch to writing CSV files?  When you write them out from R make sure to use the na="" option on the write.csv() call so it does not write that goofy NA string into the CSV file.

 

Note if you do send them CSV files you will need to also send along DOCUMENTATION about the variables so they know how to read the CSV file properly to get back the right type of data.

Tom
Super User Tom
Super User

Have them LOOK at the file and see if R actually created something that a least looks like an XPT file.  Just look at the first 240 bytes.

data _null_ ;
  infile 'c:\temp\trans.xpt' lrecl=80 recfm=f obs=3;
  input;
  list;
run;

(or use some R code that will let you see the first 240 bytes and make sure it looks valid).

 

A SAS V5 transport file will look like this:

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
1         HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!000000000000000000000000000000
2         SAS     SAS     SASLIB  9.4     X64_10PR                        09OCT23:15:04:02
3         09OCT23:15:04:02

Something that will require the use of the %XPT2LOC() macro will have LIBV8 or LIBV9 instead of LIBRARY in that first line.

 

Describe in more detail the data you are trying to transfer?  How many variables? How many observations?  Are all of the variable names valid SAS names?  Are any of the names longer then 8 characters (required for a V5 transport file)?  Longer than 32 characters (SAS names cannot be longer than 32 characters)?  Are any of the character variables longer than 200 bytes?

 

If you do have many long character variables then make sure they are running the code to import the XPT file with the COMPRESS option set to YES so it does not try to generate a giant SAS dataset.  If you have variable that is defined to hold up to 200 characters but the maximum actual length is only 50 characters you are wasting 150 bytes of storage for every observation.

Reeza
Super User

Having both R and SAS decided to test this out. 

What version of Haven do you have?

 

EDIT: What is the length of the name of the data set? I shortened the file name to under 8 characters and it worked (replicated this process with it named SASfile and it worked ok for me )

I updated haven package in R and it worked.

 

 

I exported mpg (sashelp.cars equivalent) and am getting issues. Using @Tom code I get the following in my log for first three lines.

 RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0                      
 1         HEADER RECORD*******LIBV8   HEADER RECORD!!!!!!!000000000000000000000000000000  
 2         SAS     SAS     SASLIB  6.06    bsd4.2                          10OCT23:16:13:55
 3         10OCT23:16:13:55                                                                

So attempting to use XPT2LOC, but then I get a different series of errors related to formats not being read correctly.  Sample XPT file attached as zip for anyone else playing around. 

78         %xpt2loc(libref=work, filespec=&xpt_file_loc.);
 NOTE: Format XPRTFLT is already on the library WORK.FORMATS.
 NOTE: Format XPRTFLT has been output.
 NOTE: Informat XPRTFLT is already on the library WORK.FORMATS.
 NOTE: Informat XPRTFLT has been output.
 
 NOTE: PROCEDURE FORMAT used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              123.50k
       OS Memory           20640.00k
       Timestamp           10/10/2023 10:24:59 PM
       Step Count                        137  Switch Count  0
       Page Faults                       0
       Page Reclaims                     15
       Page Swaps                        0
       Voluntary Context Switches        0
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           40
       
 
 
 NOTE: The data set WORK.CNTLIN has 1 observations and 4 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              534.12k
       OS Memory           20900.00k
       Timestamp           10/10/2023 10:24:59 PM
       Step Count                        138  Switch Count  2
       Page Faults                       0
       Page Reclaims                     100
       Page Swaps                        0
       Voluntary Context Switches        11
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           264
       
 
 NOTE: There were 1 observations read from the data set WORK.CNTLIN.
 NOTE: The data set WORK.CNTLIN has 1 observations and 4 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              800.21k
       OS Memory           21160.00k
       Timestamp           10/10/2023 10:24:59 PM
       Step Count                        139  Switch Count  2
       Page Faults                       0
       Page Reclaims                     135
       Page Swaps                        0
       Voluntary Context Switches        10
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           264
       
 
 NOTE: There were 1 observations read from the data set WORK.CNTLIN.
 NOTE: SAS sort was used.
 NOTE: 0 observations with duplicate key values were deleted.
 NOTE: The data set WORK.CNTLIN has 1 observations and 4 variables.
 NOTE: PROCEDURE SORT used (Total process time):
       real time           0.00 seconds
       user cpu time       0.01 seconds
       system cpu time     0.00 seconds
       memory              780.15k
       OS Memory           21160.00k
       Timestamp           10/10/2023 10:24:59 PM
       Step Count                        140  Switch Count  2
       Page Faults                       0
       Page Reclaims                     127
       Page Swaps                        0
       Voluntary Context Switches        10
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           264
       
 
 NOTE: Format $MEMWANT is already on the library WORK.FORMATS.
 NOTE: Format $MEMWANT has been output.
 NOTE: PROCEDURE FORMAT used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              445.78k
       OS Memory           20900.00k
       Timestamp           10/10/2023 10:24:59 PM
       Step Count                        141  Switch Count  0
       Page Faults                       0
       Page Reclaims                     51
       Page Swaps                        0
       Voluntary Context Switches        0
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           8
       
 NOTE: There were 1 observations read from the data set WORK.CNTLIN.
 
 NOTE: Deleting WORK.CNTLIN (memtype=DATA).
 NOTE: PROCEDURE DELETE used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              36.21k
       OS Memory           20640.00k
       Timestamp           10/10/2023 10:24:59 PM
       Step Count                        142  Switch Count  2
       Page Faults                       0
       Page Reclaims                     15
       Page Swaps                        0
       Voluntary Context Switches        9
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           0
       
 
 
 NOTE: UNBUFFERED is the default with RECFM=N.
 NOTE: The infile '/home/fkhurshed/Demo1/SAS_export_file.xpt' is:
       Filename=/home/fkhurshed/Demo1/SAS_export_file.xpt,
       Owner Name=fkhurshed,Group Name=oda,
       Access Permission=-rw-r--r--,
       Last Modified=10Oct2023:16:14:35,
       File Size (bytes)=24320
 
 NOTE: The file METADATA is:
       Filename=/saswork/SAS_work77290001FCEC_odaws01-usw2.oda.sas.com/#LN00133,
       Owner Name=fkhurshed,Group Name=oda,
       Access Permission=-rw-r--r--,
       Last Modified=10Oct2023:16:24:59
 
 NOTE: The file INPTSTMT is:
       Filename=/saswork/SAS_work77290001FCEC_odaws01-usw2.oda.sas.com/#LN00132,
       Owner Name=fkhurshed,Group Name=oda,
       Access Permission=-rw-r--r--,
       Last Modified=10Oct2023:16:24:59
 
 NOTE: 38 records were written to the file METADATA.
       The minimum record length was 1.
       The maximum record length was 71.
 NOTE: 15 records were written to the file INPTSTMT.
       The minimum record length was 3.
       The maximum record length was 43.
 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       user cpu time       0.01 seconds
       system cpu time     0.00 seconds
       memory              937.50k
       OS Memory           21344.00k
       Timestamp           10/10/2023 10:24:59 PM
       Step Count                        143  Switch Count  0
       Page Faults                       0
       Page Reclaims                     148
       Page Swaps                        0
       Voluntary Context Switches        0
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           16
       
 
 
 NOTE: The file SASCODE is:
       Filename=/saswork/SAS_work77290001FCEC_odaws01-usw2.oda.sas.com/#LN00135,
       Owner Name=fkhurshed,Group Name=oda,
       Access Permission=-rw-r--r--,
       Last Modified=10Oct2023:16:24:59
 
 NOTE: The infile INPTSTMT is:
       Filename=/saswork/SAS_work77290001FCEC_odaws01-usw2.oda.sas.com/#LN00132,
       Owner Name=fkhurshed,Group Name=oda,
       Access Permission=-rw-r--r--,
       Last Modified=10Oct2023:16:24:59,
       File Size (bytes)=248
 
 NOTE: The infile METADATA is:
       Filename=/saswork/SAS_work77290001FCEC_odaws01-usw2.oda.sas.com/#LN00133,
       Owner Name=fkhurshed,Group Name=oda,
       Access Permission=-rw-r--r--,
       Last Modified=10Oct2023:16:24:59,
       File Size (bytes)=990
 
 NOTE: 52 records were written to the file SASCODE.
       The minimum record length was 1.
       The maximum record length was 71.
 NOTE: 15 records were read from the infile INPTSTMT.
       The minimum record length was 3.
       The maximum record length was 43.
 NOTE: 38 records were read from the infile METADATA.
       The minimum record length was 1.
       The maximum record length was 71.
 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.01 seconds
       memory              751.62k
       OS Memory           20896.00k
       Timestamp           10/10/2023 10:24:59 PM
       Step Count                        144  Switch Count  0
       Page Faults                       0
       Page Reclaims                     47
       Page Swaps                        0
       Voluntary Context Switches        0
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           16
       
 
 NOTE: Fileref METADATA has been deassigned.
 NOTE: Fileref INPTSTMT has been deassigned.
 84        +format manufacturer  $. ;
                                          _
                                          85
                                          76
 85        +informat manufacturer  $. ;
                                            _
                                            85
                                            76
 87        +format model  $. ;
                                   _
                                   85
                                   76
 88        +informat model  $. ;
                                     _
                                     85
                                     76
 90        +format displ  . ;
                                  _
                                  85
                                  200
                                  76
 91        +informat displ  . ;
                                    _
                                    85
                                    200
                                    76
 93        +format year  . ;
                                 _
                                 85
                                 200
                                 76
 94        +informat year  . ;
                                   _
                                   85
                                   200
                                   76
 96        +format cyl  . ;
                                _
                                85
                                200
                                76
 97        +informat cyl  . ;
                                  _
                                  85
                                  200
                                  76
 99        +format trans  $. ;
                                   _
                                   85
                                   76
 100       +informat trans  $. ;
                                     _
                                     85
                                     76
 102       +format drv  $. ;
                                 _
                                 85
                                 76
 103       +informat drv  $. ;
                                   _
                                   85
                                   76
 105       +format cty  . ;
                                _
                                85
                                200
                                76
 106       +informat cty  . ;
                                  _
                                  85
                                  200
                                  76
 108       +format hwy  . ;
                                _
                                85
                                200
                                76
 109       +informat hwy  . ;
                                  _
                                  85
                                  200
                                  76
 111       +format fl  $. ;
                                _
                                85
                                76
 112       +informat fl  $. ;
                                  _
                                  85
                                  76
 114       +format class  $. ;
                                   _
                                   85
                                   76
 115       +informat class  $. ;
                                     _
                                     85
                                     76
 
 ERROR 85-322: Expecting a format name.
 
 ERROR 76-322: Syntax error, statement will be ignored.
 
 ERROR 200-322: The symbol is not recognized and will be ignored.
 
 NOTE: The SAS System stopped processing this step because of errors.
 WARNING: The data set WORK.SAS_EXPORT_FILE may be incomplete.  When this step was stopped there were 0 observations and 11 
          variables.
 WARNING: Data set WORK.SAS_EXPORT_FILE was not replaced because this step was stopped.
 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              503.03k
       OS Memory           21156.00k
       Timestamp           10/10/2023 10:24:59 PM
       Step Count                        145  Switch Count  0
       Page Faults                       0
       Page Reclaims                     59
       Page Swaps                        0
       Voluntary Context Switches        0
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           16
       
 
 NOTE: Fileref SASCODE has been deassigned.
 131        
 132        
 133        OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 143        

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 19 replies
  • 4176 views
  • 7 likes
  • 5 in conversation