Hi,
I have a question.
I have this data, which I am using as text file input:
data ORION.LOCATIONRAWFILE;
infile datalines dsd truncover;
input price:$15. Address:$20. Style:$8. Zip:32. Bedroom:32. Baths:32.;
datalines4;
"64,000",sheppard Avenue,Ranch,1250,2,1
"65,850",Rand Street,Split,1190,1,1
"80,050",Market Street,Condo,1400,2,1.5
"107,250",Garris Street,TwoStory,1810,4,3
"86,650",Kemble Avenue,Ranch,1500,3,3
"94,450",West Drive,Split,1615,4,3
"73,650",Graham Avenue,Split,1305,3,1.5
;;;;
I am reading this text file, and it gives me the result I want.
data orion.locationrawfile;
infile "/folders/myfolders/pg2/location.txt" dlm=',' dsd ;
length Address $20;
input Style $ Zip Bedroom Baths Address $ Price : dollar8. ;
proc print data = orion.locationrawfile noobs;
var Style Zip Bedroom Baths Address Price;
format price dollar8.;
format baths best.;
run;
Question: I am using the colon modifier with Price, and I am using the LENGTH statement for Address.
If I remove the LENGTH statement for Address and try to use the & modifier with the informat $20., I get an error.
Is it possible to remove the LENGTH statement and use the code below?
data orion.locationrawfile;
infile "/folders/myfolders/pg2/location.txt" dlm=',' dsd ;
input Style $ Zip Bedroom Baths Address & $20. Price : dollar8. ;
I got an error in the output.
Please advise whether & works like this, or whether some other modification is required to avoid the LENGTH statement.
Thanks,
TK
It is not clear what you mean by the & sign.
The INPUT statement accepts variable names and their informats.
So maybe you want to code:
input Style $ Zip Bedroom Baths Address $20. Price : dollar8. ;
When you get an error in the log, it is most efficient to post the log with the error message.
Hi Shmuel,
Thanks for your reply. If I use your code without the LENGTH statement, I get mismatched output; I am attaching the notes
and the error. I was trying to use the ampersand because there is a space within the address field.
So with this code I didn't get the desired output:
input Style $ Zip Bedroom Baths Address $20. Price : dollar8. ;
In an INPUT statement, & means to keep on going until you hit two delimiters in a row. That would throw off the value of the variables that follow, since your data never has two delimiters in a row.
Hi Astounding,
Thanks for your reply. I tried the ampersand because I saw it used in the example below:
The ampersand (&) modifier allows you to read character values that contain embedded blanks.
As you can see, there is a space in New York, and the example uses $12. instead of a LENGTH statement. But it didn't work in
my case when I tried a similar statement without the LENGTH statement:
input Style $ Zip Bedroom Baths Address & $20. Price : dollar8. ;
DATA citypops;
infile DATALINES FIRSTOBS = 2;
input city & $12. pop2000;
DATALINES;
City          Yr2000Popn
New York      8008278
Los Angeles   3694820
Chicago       2896016
Houston       1953631
Philadelphia  1517550
Phoenix       1321045
San Antonio   1144646
San Diego     1223400
Dallas        1188580
San Jose      894943
;
RUN;
PROC PRINT data = citypops;
title 'The citypops data set';
format pop2000 comma10.;
RUN;
Pay attention:
Using & worked well as long as CITY contained one string (like Philadelphia) or two strings (like San Diego),
but SAS issued a LOST CARD message when CITY contained more strings (tested: San Diego CA).
I would prefer using either a delimiter (other than a space) or fixed-format input.
This example is perfectly fine. The & instructs SAS to keep reading until it finds two consecutive delimiters. (Just for the record, that does mean that you could read multiple fields, not just two, as long as there is only a single blank between each word.)
A LENGTH statement might be needed because the default length using list input is $8. So compare the results here:
data test;
length city1 $ 40;
input city1 & @1 city2 $ &;
cards;
San Diego CA
San Francisco CA
Two Blanks Somewhere
;
Your original example is complicated by the fact that you are using commas instead of blanks as delimiters.
Thanks @Astounding and @bondtk,
I am lucky: I intended to help and found myself learning something new.
@bondtk, my note is not the solution. It doesn't answer your question.
The best way to learn is to try running the code with several combinations and compare the results.
That is in addition to reading the documentation.
Your first question was:
If I remove length statement for address and try to use & sign with the character length $20, it gives an error. Is that possible to remove the length statement and use code as below:
data orion.locationrawfile;
infile "/folders/myfolders/pg2/location.txt" dlm=',' dsd ;
input Style $ Zip Bedroom Baths Address & $20. Price : dollar8. ;
I got an error in the output.
So if you got an error, you should not remove the LENGTH statement.
@Astounding gave the right answer:
A LENGTH statement might be needed because the default length using list input is $8.
And @Tom detailed the rules for using & in the INPUT statement.
If you are using DSD option then there should be no need for the & modifier. That DSD option means that SAS should treat two adjacent delimiters as representing a null value. The & says to look for two (or more) delimiters between "words" in the input line. You would NOT use & when using the DSD option. Also note that if you are not using the DSD option then you should make sure to represent missing values (numeric or character) by a period in the data line.
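As a minimal sketch of the point above, the step below reads comma-delimited rows with DSD and uses the colon modifier, not &, for Address. The data set name work.location_demo and the two in-line rows are made up for illustration. The real layout of location.txt is not shown in this thread, so the column order is assumed to match the INPUT statement from the question.

```sas
/* Hypothetical sample rows; column order assumed from the INPUT statement in the question */
data work.location_demo;
infile datalines dsd truncover;
/* The colon modifier reads up to the next comma, and $20. sets the length of Address */
input Style $ Zip Bedroom Baths Address : $20. Price : dollar8. ;
datalines;
Ranch,1250,2,1,Sheppard Avenue,"$64,000"
Split,1190,1,1,Rand Street,"$65,850"
;
run;

proc print data=work.location_demo noobs;
format Price dollar8.;
run;
```

Because the comma is the delimiter, the blank inside the address is ordinary data, so nothing special is needed to read it. The informat after the colon should be enough to define Address as $20 without a separate LENGTH statement.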
If the first place that you reference a variable is in the INPUT statement then SAS must guess at what type of variable you mean by looking at how you used it in the INPUT statement. So if you use an INFORMAT in the INPUT statement SAS will define a variable that is appropriate for that informat. Or if you just use the bare $ then SAS will know you want to define a character variable. Note that if you have already defined the variable there is no need to add a bare $ in the input statement since SAS already knows the variable is character. If SAS cannot tell what length to use then it will assign new character variables a length of $8.
Note that SAS will similarly make a guess about how to define a variable whenever it is used before it is defined. You can define a variable by sourcing it from another dataset or using the LENGTH or ATTRIB statement to define it.
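To see the default length in action, here is a small sketch that reads the same field twice: once with a bare $ and once with an informat. The data set name work.length_demo and the variable names AddrShort and AddrLong are hypothetical, chosen only for this illustration.

```sas
/* Hypothetical demo: the same field is read twice to compare lengths */
data work.length_demo;
infile datalines dsd truncover;
/* Bare $ and no earlier definition: AddrShort gets the default length of 8 */
/* @1 moves back to column 1 so AddrLong re-reads the field with : $20. */
input AddrShort $ @1 AddrLong : $20. ;
datalines;
Sheppard Avenue
Market Street
;
run;

/* PROC CONTENTS lists the length SAS assigned to each variable */
proc contents data=work.length_demo;
run;
```

AddrShort should come out truncated to 8 characters, while AddrLong keeps the whole address.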
In general you will have fewer headaches if you just get in the habit of defining your variables before using them. For example your first data step might look like this:
data ORION.LOCATIONRAWFILE;
length price $15 Address $20 Style $8 Zip 8 Bedroom 8 Baths 8;
infile datalines dsd truncover;
input price Address Style Zip Bedroom Baths ;
datalines4;
"64,000",sheppard Avenue,Ranch,1250,2,1
"65,850",Rand Street,Split,1190,1,1
"80,050",Market Street,Condo,1400,2,1.5
"107,250",Garris Street,TwoStory,1810,4,3
"86,650",Kemble Avenue,Ranch,1500,3,3
"94,450",West Drive,Split,1615,4,3
"73,650",Graham Avenue,Split,1305,3,1.5
;;;;
But really that first column looks like a number. Don't let the quotes fool you into thinking it is a character variable. The quotes are just there because they are required whenever a value contains the delimiter. To read a number with a comma in it you will need to use the COMMA informat. (or the DOLLAR informat, but they are really the same thing).
data ORION.LOCATIONRAWFILE;
length price 8 Address $20 Style $8 Zip 8 Bedroom 8 Baths 8;
infile datalines dsd truncover;
input price Address Style Zip Bedroom Baths ;
informat price comma.;
datalines4;
"64,000",sheppard Avenue,Ranch,1250,2,1
"65,850",Rand Street,Split,1190,1,1
"80,050",Market Street,Condo,1400,2,1.5
"107,250",Garris Street,TwoStory,1810,4,3
"86,650",Kemble Avenue,Ranch,1500,3,3
"94,450",West Drive,Split,1615,4,3
"73,650",Graham Avenue,Split,1305,3,1.5
;;;;
Hi Tom,
I have slightly changed the code: instead of using the INFORMAT statement, I put the informat in the INPUT statement, and it worked as well.
Please advise whether this is the right thing to do:
input price : comma. Address Style Zip Bedroom Baths ;
data ORION.LOCATIONRAWFILE;
length price 8 Address $20 Style $8 Zip 8 Bedroom 8 Baths 8;
infile datalines dsd truncover;
input price : dollar10. Address Style Zip Bedroom Baths ;
datalines4;
"64,000",sheppard Avenue,Ranch,1250,2,1
"65,850",Rand Street,Split,1190,1,1
"80,050",Market Street,Condo,1400,2,1.5
"107,250",Garris Street,TwoStory,1810,4,3
"86,650",Kemble Avenue,Ranch,1500,3,3
"94,450",West Drive,Split,1615,4,3
"73,650",Graham Avenue,Split,1305,3,1.5
;;;;
proc print data =orion.locationrawfile;
format price dollar10.;
format baths best.;
run;
That works as long as you make sure to include the : modifier before the informat in the INPUT statement.
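Here is a hedged sketch of why the colon matters, using two rows trimmed from the data above. The data set names work.colon_demo and work.nocolon_demo are hypothetical. Without the colon, DOLLAR10. is treated as formatted input and reads exactly 10 columns, so it is expected to run past the closing quote and the comma.

```sas
/* With the colon: list input stops at the delimiter, then DOLLAR10. is applied */
data work.colon_demo;
length price 8 Address $20;
infile datalines dsd truncover;
input price : dollar10. Address ;
datalines;
"64,000",sheppard Avenue
"107,250",Garris Street
;
run;

/* Without the colon: formatted input reads a fixed 10 columns, ignoring the delimiter */
data work.nocolon_demo;
length price 8 Address $20;
infile datalines dsd truncover;
input price dollar10. Address ;
datalines;
"64,000",sheppard Avenue
"107,250",Garris Street
;
run;
```

Comparing the two logs should show invalid-data notes for price only in the second step.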
There is no need to add the statement
format baths best.;
to the proc print step. SAS will already use the BEST12. format for numeric variables that do not have an attached format.