- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
SAS Programmers:
I spent an inordinate amount of time troubleshooting a “simple” problem of changing the length of a character variable in SAS. Even though I found a solution, I’m still befuddled and hoping someone here can explain.
A data set was imported from a CSV file via Proc Import per the following:
%Macro ReadCSV (infile , outfile );
PROC IMPORT OUT= &outfile
DATAFILE= &infile
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
GUESSINGROWS=200;
run;
%Mend ReadCSV;
Note that GUESSINGROWS was initially set to 75. Maximum character length of the values for the variable Company was 12, which appeared after line 75 in the table. SAS truncated the variable names to 8 characters, as reported by Proc Contents:
I tried changing the length in a subsequent data step:
data a;
length Company $12;
format Company $12.;
set a;
run;
Proc Contents returned the following attributes:
That didn’t help. When printed, variable names for Company were still truncated at 8 characters. Then I tried the following Proc SQL code:
proc sql;
alter table work.a
modify Company char(12) format=$12. informat=$12.;
quit;
Proc Contents returned the following attributes:
Even though this shows Length, Format, and Informat attributes with 12 characters, printed variable names for Company were still truncated @ 8 characters.
After some hair pulling, I went back and changed the GUESSINGROW value from 75 to 200 in Proc Import and re-ran the code. Proc Contents returned the following attributes:
This changed the character value from 8 to 12, printing just as I wanted. However, except for the variable number, these attributes are identical to that obtained via Proc SQL! Even when the GUESSINGROWS statement was deleted, I couldn’t effect any change in printed character length using the above data step or Proc SQL code regardless of what Proc Contents was showing.
Can anyone explain what’s going on here? Are there different syntax rules for changing the length of a character variable in a user programmed data step or in Proc SQL subsequent to Proc Import vs. reading data via the Input statement?
Thank you in advance.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
1. You read in the data with PROC IMPORT and get a truncated variable stored with a length of 8. That means only 8 characters are stored and the remainder are discarded from the data set.
2. You try and increase the variable length to 12 to get the full variable, except it's already been truncated when it was read in.When SAS reads it in as truncated it has no idea that "New York City" isn't just "New York C" so changing the length after the import doesn't fix the issue because it's already truncated in the data. You need to fix it in the import step.
3. You know how to fix it in the IMPORT step, setting GUESSINGROWS to a higher number, the default is higher than 75 is my guess.
4. PROC IMPORT is a guessing procedure, if you want to specify types and get them the same each time you should write a data step to import your data.
@Kip1 wrote:
SAS Programmers:
I spent an inordinate amount of time troubleshooting a “simple” problem of changing the length of a character variable in SAS. Even though I found a solution, I’m still befuddled and hoping someone here can explain.
A data set was imported from a CSV file via Proc Import per the following:
%Macro ReadCSV (infile , outfile ); PROC IMPORT OUT= &outfile DATAFILE= &infile DBMS=CSV REPLACE; GETNAMES=YES; DATAROW=2; GUESSINGROWS=200; run; %Mend ReadCSV;
Note that GUESSINGROWS was initially set to 75. Maximum character length of the values for the variable Company was 12, which appeared after line 75 in the table. SAS truncated the variable names to 8 characters, as reported by Proc Contents:
I tried changing the length in a subsequent data step:
data a; length Company $12; format Company $12.; set a; run;
Proc Contents returned the following attributes:
That didn’t help. When printed, variable names for Company were still truncated at 8 characters. Then I tried the following Proc SQL code:
proc sql; alter table work.a modify Company char(12) format=$12. informat=$12.; quit;
Proc Contents returned the following attributes:
Even though this shows Length, Format, and Informat attributes with 12 characters, printed variable names for Company were still truncated @ 8 characters.
After some hair pulling, I went back and changed the GUESSINGROW value from 75 to 200 in Proc Import and re-ran the code. Proc Contents returned the following attributes:
This changed the character value from 8 to 12, printing just as I wanted. However, except for the variable number, these attributes are identical to that obtained via Proc SQL! Even when the GUESSINGROWS statement was deleted, I couldn’t effect any change in printed character length using the above data step or Proc SQL code regardless of what Proc Contents was showing.
Can anyone explain what’s going on here? Are there different syntax rules for changing the length of a character variable in a user programmed data step or in Proc SQL subsequent to Proc Import vs. reading data via the Input statement?
Thank you in advance.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
1. You read in the data with PROC IMPORT and get a truncated variable stored with a length of 8. That means only 8 characters are stored and the remainder are discarded from the data set.
2. You try and increase the variable length to 12 to get the full variable, except it's already been truncated when it was read in.When SAS reads it in as truncated it has no idea that "New York City" isn't just "New York C" so changing the length after the import doesn't fix the issue because it's already truncated in the data. You need to fix it in the import step.
3. You know how to fix it in the IMPORT step, setting GUESSINGROWS to a higher number, the default is higher than 75 is my guess.
4. PROC IMPORT is a guessing procedure, if you want to specify types and get them the same each time you should write a data step to import your data.
@Kip1 wrote:
SAS Programmers:
I spent an inordinate amount of time troubleshooting a “simple” problem of changing the length of a character variable in SAS. Even though I found a solution, I’m still befuddled and hoping someone here can explain.
A data set was imported from a CSV file via Proc Import per the following:
%Macro ReadCSV (infile , outfile ); PROC IMPORT OUT= &outfile DATAFILE= &infile DBMS=CSV REPLACE; GETNAMES=YES; DATAROW=2; GUESSINGROWS=200; run; %Mend ReadCSV;
Note that GUESSINGROWS was initially set to 75. Maximum character length of the values for the variable Company was 12, which appeared after line 75 in the table. SAS truncated the variable names to 8 characters, as reported by Proc Contents:
I tried changing the length in a subsequent data step:
data a; length Company $12; format Company $12.; set a; run;
Proc Contents returned the following attributes:
That didn’t help. When printed, variable names for Company were still truncated at 8 characters. Then I tried the following Proc SQL code:
proc sql; alter table work.a modify Company char(12) format=$12. informat=$12.; quit;
Proc Contents returned the following attributes:
Even though this shows Length, Format, and Informat attributes with 12 characters, printed variable names for Company were still truncated @ 8 characters.
After some hair pulling, I went back and changed the GUESSINGROW value from 75 to 200 in Proc Import and re-ran the code. Proc Contents returned the following attributes:
This changed the character value from 8 to 12, printing just as I wanted. However, except for the variable number, these attributes are identical to that obtained via Proc SQL! Even when the GUESSINGROWS statement was deleted, I couldn’t effect any change in printed character length using the above data step or Proc SQL code regardless of what Proc Contents was showing.
Can anyone explain what’s going on here? Are there different syntax rules for changing the length of a character variable in a user programmed data step or in Proc SQL subsequent to Proc Import vs. reading data via the Input statement?
Thank you in advance.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
So what you're saying is there's no way to change the length of a character variable after it's been truncated and stored with a length of 8. I didn't know this. I assume you could rename the variable with a length of 12 in the data step above because the column length is in fact 12 characters wide as reported by Proc Contents.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you are going to use PROC IMPORT to read a text file instead of reading it yourself you should at least use GUESSINGROWS=MAX.
Or use a better tool for guessing how to read a CSV file.