I have 4 treatments. Two of the treatments are replicated 4 times and two of them are replicated 3 times. There are about 40 variables (I know its a lot). When I am trying to simply read the data into SAS and proc print it....SAS keeps giving me no more than 7 observations and is skipping every other line from my excel file (comma delimited and infiled). how do I get sas to read all my treatments?
this is my code
data transabro5;
infile 'C:\Users\pc\Documents\tranposedtop20.csv' firstobs = 2 dlm = ',' obs = 50;
input trt $ Aliivibriofischeri Aliivibriosalmonicida Alteromonasmacleodii Arcobacterbutzleri Campylobacterconcisus Campylobactercurvus Campylobacterfetus Campylobacterhominis Campylobacterjejuni Colwelliapsychrerythraea Escherichiacoli Helicobacterhepaticus Helicobacterpylori NitratiruptorspSB Photobacteriumangustum Photobacteriumprofundum Pseudoalteromonasatlantica Pseudoalteromonashaloplanktis Pseudoalteromonastunicata Pseudomonasaeruginosa Pseudomonasfluorescens Pseudomonasputida Salmonellaenterica Shewanellabaltica Shewanellafrigidimarina Shewanellapealeana ShewanellaspW Sulfurimonasdenitrificans Sulfurospirillumdeleyianum SulfurovumspNBC Vibrio alginolyticus Vibriocholerae Vibrioharveyi Vibrioparahaemolyticus VibriospEx VibriospMED Vibriosplendidus Vibriovulnificus Wolinellasuccinogenes;
proc print data = transabro5;
run;
this is my output (I renumbered the treatments 1 through 14 to see if it was an issue with SAS merging my treatments that were numbered the same....that does not seem to be the issue)
Obs trt Aliivibriofischeri Aliivibriosalmonicida Alteromonasmacleodii Arcobacterbutzleri
1 1 1.34308 0.46310 0.16862 21.5209
2 3 2.31566 0.67714 0.19690 12.1740
3 5 2.72847 0.81170 0.46672 7.3411
4 7 3.48375 0.99314 0.21128 0.2287
5 9 3.21311 0.92660 0.15830 5.3156
6 11 3.71651 1.06593 0.35935 0.0904
7 13 3.27903 0.97390 0.45478 2.3671
Try adding LRECL=32000 TRUNCOVER to your INFILE statement:
infile 'C:\Users\pc\Documents\tranposedtop20.csv' firstobs = 2 dlm = ',' obs = 50 LRECL=32000 TRUNCOVER;
See if that helps.
I'll bet you have some interesting messages in the SAS LOG. It should help you understand the problem.
I'm still learning how to read the error logs with help from SAS user information and google. Hoping to master SAS but its going to take me some time...lol
This is what I got and I didn't know how to interpret it.
128
129 data transabro8;
130 infile 'C:\Users\pc\Documents\tranposedtop20.csv' firstobs = 2 dlm = ',' obs = 50;
131 input trt $ Aliivibriofischeri Aliivibriosalmonicida Alteromonasmacleodii
131! Arcobacterbutzleri Campylobacterconcisus Campylobactercurvus Campylobacterfetus
131! Campylobacterhominis Campylobacterjejuni Colwelliapsychrerythraea Escherichiacoli
131! Helicobacterhepaticus Helicobacterpylori NitratiruptorspSB Photobacteriumangustum
131! Photobacteriumprofundum Pseudoalteromonasatlantica Pseudoalteromonashaloplanktis
131! Pseudoalteromonastunicata Pseudomonasaeruginosa Pseudomonasfluorescens
131! Pseudomonasputida Salmonellaenterica Shewanellabaltica Shewanellafrigidimarina
131! Shewanellapealeana ShewanellaspW Sulfurimonasdenitrificans Sulfurospirillumdeleyianum
131! SulfurovumspNBC Vibrio alginolyticus Vibriocholerae Vibrioharveyi
131! Vibrioparahaemolyticus VibriospEx VibriospMED Vibriosplendidus Vibriovulnificus
131! Wolinellasuccinogenes;
NOTE: The infile 'C:\Users\pc\Documents\tranposedtop20.csv' is:
Filename=C:\Users\pc\Documents\tranposedtop20.csv,
RECFM=V,LRECL=256,File Size (bytes)=7319,
Last Modified=29May2012:12:23:16,
Create Time=25May2012:09:15:53
NOTE: 14 records were read from the infile 'C:\Users\pc\Documents\tranposedtop20.csv'.
The minimum record length was 256.
The maximum record length was 256.
One or more lines were truncated.
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.TRANSABRO8 has 7 observations and 41 variables.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.06 seconds
132 proc print data = transabro8;
133 run;
NOTE: There were 7 observations read from the data set WORK.TRANSABRO8.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
NOTE: 14 records were read from the infile 'C:\Users\pc\Documents\tranposedtop20.csv'.
The minimum record length was 256.
The maximum record length was 256.
One or more lines were truncated.
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
This is indication that LRECL is too small.
Try adding LRECL=32000 TRUNCOVER to your INFILE statement:
infile 'C:\Users\pc\Documents\tranposedtop20.csv' firstobs = 2 dlm = ',' obs = 50 LRECL=32000 TRUNCOVER;
See if that helps.
Thank you VERY much for your suggestion. It has allowed me to read all of the treatments. Generally speaking does SAS prefer to read files from excel in a particular format?
Well, technically you are not reading an Excel file; you are reading a comma-separated values (CSV) file that you might have created from Excel. If you did a "Save as" from Excel, this is just a plain old text file that is a copy of the rows and columns that were in your spreadsheet. Just remember that it is just a copy; if you change the data in the spreadsheet it will not be changed in the CSV file.
Of course, it can be confusing because Excel will open the CSV file and display it as a worksheet. That's just Excel trying ot be helpful. If all you have is a file with this "csv" file type, it's just a plain text file. It would only be a real Excel workbook if you did a "Save as" and chose to save it as an Excel workbook. As before, this would just make a separate copy into a completely different file (with a file type of "xls" or "xlsx" depending on the version of Excel you saved it as..
And SAS doesn't really have a preference. If you have SAS Access to PC File Formats licensed and installed on your PC, SAS can read the Excel spreadsheet directly.
The particular problem you ran into is very common. The "default" record length for external files is only 256 characters. I wish SAS would change it generally to avoid this problem but I suppose they have some reason not to do so. The other thing you could do is add this statement to your "autoexec.sas" file:
options lrecl=32000;
That will change the option for all your SAS sessions, in case you forget to add it to your INFILE statement. I do that myself.
While you're on the right road, I would suggest one further tweak. At all costs, avoid the options TRUNCOVER and MISSOVER. Most of the time, they will mask problems with your data. Instead, use this option: DSD
The issue you will be addressing is what happens when one of the columns from the source data file is blank? There will be two commas in a row in the .csv file. The DSD option tells SAS to treat them as separate delimiters, each marking the end of one of the fields you are trying to read.
Good luck, and keep reading the log!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!