Is there a way to read in a csv data set that has "." in some cells and replace those with "*"? The things inside the quote marks, e.g., period and star. This is the import statement, I'm sure I can't do it here, probably in a data step.
proc import datafile = "\\folder\data.csv"
out=workingdata
dbms=csv
replace;
getnames=yes;
run;
The data set came in with "." I use * to represent suppressed data, and have a footnote explaining the symbol.
If you know how to do this, that would be very helpful.
FYI, I do NOT was to talk about why I'm doing this, why I should not do this, why doing this is never the right thing to do. Those points are not at all helpful.
You can use the TRANWRD function in a data step after the proc import.
Don't use PROC IMPORT for csv files. Write the data step yourself (or modify the one created by IMPORT), and you avoid all the hassle fixing things that IMPORT gets wrong.
Especially if that column is supposed to be numeric. In that case, a custom informat that reads the asterisk as missing is the way to go.
Edit: I see you're not talking numeric.
Use IFC:
var = ifc(var = ".","*",var);
Um, I forgot to include a bit of info. Sorry.
The columns where these "." appear are numeric, and, of course, many of the numbers also have "." in them, for example, "23.7", so I don't know if I can search and replace "." with "*".
A missing numeric value is represented by a dot in SAS. To use a different character, set the MISSING option:
options missing = "*";
To do this, you need an informat that reads the asterisk to a special missing, and a fornat that displays the special missing as asterisk:
proc format;
invalue at_best
"*" = .a
other = [best.]
;
value at_best
.a = "*"
other = [best.]
;
run;
Then you read the data and use both:
data want;
infile "\\folder\data.csv" dlm="," dsd truncover;
input
.......
numvar :at_best.
....
;
format numvar at_best.;
run;
@geneshackman wrote:
Kurt, thanks. The data set came with some cells being blank, which means no instances, and ".", which means suppressed. I only want to change the "." to *.
How would I use the custom informat? I'd want it to be when missing is "." then change it to *. But if missing is blank, then do -not- change to *.
What is the current format for the values that appear as blank?
I strongly suspect these are not numeric variables but character.
@geneshackman wrote:
Kurt, thanks. The data set came with some cells being blank, which means no instances, and ".", which means suppressed. I only want to change the "." to *.
How would I use the custom informat? I'd want it to be when missing is "." then change it to *. But if missing is blank, then do -not- change to *.
SAS has special missing values .A to .Z and ._ to allow you to have different types of missing data. So if you are reading a CSV file and you want to treat numeric variables that have an actual period in the CSV file differently than nothing at all (that is two adjacent commas) you will need to replace either the periods or the null strings with either a letter or an underscore. You might also need to issue the MISSING statement (not the MISSING option) to make sure that SAS treats the letter or underscore as meaning the special missing and not an invalid value for number.
So changing a line like:
123,.,,012
to
123,_,,012
would be interpreted as having one value of the special missing ._ and regular missing value.
You could even make the change in the same step that is reading the file by manipulating the _INFILE_ buffer. (Note: This will only work if no line in the file has more than 32K bytes since that is maximum length where you can modify the _INFILE_ buffer variable.)
Example:
missing _;
data test ;
infile cards dsd firstobs=2 truncover ;
input @ ;
_infile_=tranwrd(cats(',',_infile_,','),',.,',',_,');
input @2 var1-var3 ;
cards;
var1,var2,var3
123,,456
.,789,012
;
proc print;
run;
Obs var1 var2 var3 1 123 . 456 2 _ 789 12
If you want the ._ values to print as * instead of _ then make your own format.
proc format ;
value star ._ = '*' ;
run;
proc print data=test;
format var1-var3 star5. ;
run;
@geneshackman wrote:
Is there a way to read in a csv data set that has "." in some cells and replace those with "*"? The things inside the quote marks, e.g., period and star. This is the import statement, I'm sure I can't do it here, probably in a data step.
proc import datafile = "\\folder\data.csv"
out=workingdata
dbms=csv
replace;
getnames=yes;
run;
The data set came in with "." I use * to represent suppressed data, and have a footnote explaining the symbol.
If you know how to do this, that would be very helpful.
FYI, I do NOT was to talk about why I'm doing this, why I should not do this, why doing this is never the right thing to do. Those points are not at all helpful.
How about sharing some details.
Such as the data set variable type for the values you are talking about.
If the CSV file has consecutive commas then SAS interprets that as a missing value and by default displays a . if the variable is NUMERIC.
So if the variable(s) are numeric you cannot assign a value of "*" because that would be attempting to assign a character value to a numeric variable.
If this is the case then you can set an option that will display the missing as an asterisk:
options missing='*';
Or you can create custom formats to do the same.
Example.
data example; infile datalines dlm=',' dsd; input a $ x y z; datalines; abc,123,,678 pdq,111,222,333 ; proc print data=example; title 'Default missing option'; run; options missing='*'; proc print data=example; title 'missing option *'; run;
SAS reads . from CSV if the only character between commas as missing.
If the variable is character you would not see the dot though. Which is why I believe you have numeric values.
Your responses have only served to confuse me about what you actually have and what you want to do. Please create and post a simple example that shows the different types of data you want to read and the resulting dataset your want.
Let's set some baselines here.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.