BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PoojaP1
Fluorite | Level 6

Hello,

 

I have been trying the following program to create the dataset avgcelcius by inputting the records mentioned in datalines section. But, since it contains embedded spaces I tried using "&" format modifier of list input but it doesn't work. 

 

When I manually adjusted the records in fixed columns, it worked! But, I want to know how can I do it by using any other way ?

 

Thank you!

 

 

DATA avgcelsius;
input City &$17. jan feb mar apr may jun
jul aug sep oct nov dec;
DATALINES;
State College, PA -2 -2 2 8 14 19 21 20 16 10 4 -1
Miami, FL 20 20 22 23 26 27 28 28 27 26 23 20
St. Louis, MO -1 1 6 13 18 23 26 25 21 15 7 1
New Orleans, LA 11 13 16 20 23 27 27 27 26 21 16 12
Madison, WI -8 -5 0 7 14 19 22 20 16 10 2 -5
Houston, TX 10 12 16 20 23 27 28 28 26 21 16 12
Phoenix, AZ 12 14 16 21 26 31 33 32 30 23 16 12
Seattle, WA 5 6 7 10 13 16 18 18 16 12 8 6
San Francisco, CA 10 12 12 13 14 15 15 16 17 16 14 11
San Diego, CA 13 14 15 16 17 19 21 22 21 19 16 14
;
RUN;

 

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

The & will only do what you want if there are at least two spaces separating the fields. You can force an extra space as follows:

data avgcelsius (drop=x);
  informat City $17.;
  input @;
  x=anyalpha(_infile_,-50);
  _infile_=substr(_infile_,1,x)||' '||substr(_infile_,x+1);
  input City & jan feb mar apr may jun jul aug sep oct nov dec;
  DATALINES;
State College, PA -2 -2 2 8 14 19 21 20 16 10 4 -1
Miami, FL 20 20 22 23 26 27 28 28 27 26 23 20
St. Louis, MO -1 1 6 13 18 23 26 25 21 15 7 1
New Orleans, LA 11 13 16 20 23 27 27 27 26 21 16 12
Madison, WI -8 -5 0 7 14 19 22 20 16 10 2 -5
Houston, TX 10 12 16 20 23 27 28 28 26 21 16 12
Phoenix, AZ 12 14 16 21 26 31 33 32 30 23 16 12
Seattle, WA 5 6 7 10 13 16 18 18 16 12 8 6
San Francisco, CA 10 12 12 13 14 15 15 16 17 16 14 11
San Diego, CA 13 14 15 16 17 19 21 22 21 19 16 14
;
RUN;

Art, CEO, AnalystFinder.com

 

View solution in original post

16 REPLIES 16
novinosrin
Tourmaline | Level 20

fun stuff, just play:

 

DATA avgcelsius;

infile datalines ;

length city $30;

input @;

city=strip(substr(_infile_,1,anydigit(_infile_)-2)) ;

input @(length(city)+1) jan feb mar apr may jun

jul aug sep oct nov dec;

DATALINES;

State College, PA -2 -2 2 8 14 19 21 20 16 10 4 -1

Miami, FL 20 20 22 23 26 27 28 28 27 26 23 20

St. Louis, MO -1 1 6 13 18 23 26 25 21 15 7 1

New Orleans, LA 11 13 16 20 23 27 27 27 26 21 16 12

Madison, WI -8 -5 0 7 14 19 22 20 16 10 2 -5

Houston, TX 10 12 16 20 23 27 28 28 26 21 16 12

Phoenix, AZ 12 14 16 21 26 31 33 32 30 23 16 12

Seattle, WA 5 6 7 10 13 16 18 18 16 12 8 6

San Francisco, CA 10 12 12 13 14 15 15 16 17 16 14 11

San Diego, CA 13 14 15 16 17 19 21 22 21 19 16 14

;

RUN;

art297
Opal | Level 21

The & will only do what you want if there are at least two spaces separating the fields. You can force an extra space as follows:

data avgcelsius (drop=x);
  informat City $17.;
  input @;
  x=anyalpha(_infile_,-50);
  _infile_=substr(_infile_,1,x)||' '||substr(_infile_,x+1);
  input City & jan feb mar apr may jun jul aug sep oct nov dec;
  DATALINES;
State College, PA -2 -2 2 8 14 19 21 20 16 10 4 -1
Miami, FL 20 20 22 23 26 27 28 28 27 26 23 20
St. Louis, MO -1 1 6 13 18 23 26 25 21 15 7 1
New Orleans, LA 11 13 16 20 23 27 27 27 26 21 16 12
Madison, WI -8 -5 0 7 14 19 22 20 16 10 2 -5
Houston, TX 10 12 16 20 23 27 28 28 26 21 16 12
Phoenix, AZ 12 14 16 21 26 31 33 32 30 23 16 12
Seattle, WA 5 6 7 10 13 16 18 18 16 12 8 6
San Francisco, CA 10 12 12 13 14 15 15 16 17 16 14 11
San Diego, CA 13 14 15 16 17 19 21 22 21 19 16 14
;
RUN;

Art, CEO, AnalystFinder.com

 

PoojaP1
Fluorite | Level 6
Thank you so much for your help. It works fine now.
PoojaP1
Fluorite | Level 6
I am facing same difficulty again -

I want to input following data into a dataset. The values have embedded spaces but in this case I am unable to apply a solution similar to what you suggested. Please help.

data multiple;
input city $30. state $2. zip $7.;
datalines;
Flemington NJ 08822
North City NY 11518
Atlantic city NJ 08823
Rancho Santa Margarita CA 78556
;
run;
art297
Opal | Level 21

Same suggestion, but just change where you're adding the extra space. e.g.:

data multiple (drop=x);
  input @;
  x=anyalpha(_infile_,-50)-2;
  _infile_=substr(_infile_,1,x)||' '||substr(_infile_,x+1);
  input city & $30. state $2. zip $7.;
datalines;
Flemington NJ 08822
North City NY 11518
Atlantic city NJ 08823
Rancho Santa Margarita CA 78556
;
run;

Art, CEO, AnalystFinder.com

 

PoojaP1
Fluorite | Level 6

Thank you sir!
With the help of what you told I was trying the following -

 

data multiple(drop=x);
input #1 @1 name $50.
#2 @1 address $50.;
input @;
x=anyalpha(_infile_,-50)-2;
_infile_=substr(_infile_,1,x)||' '||substr(_infile_,x+1);
input #3 @1 city $50.
state $3.
zip $7.;
name=compbl(name);
address=compbl(address);
city=compbl(city);
datalines;
Ron Cody
89 Lazy Brook Road
Flemington NJ 08822
Tim Brown
35 Peak Dr
New York City NY 19454
;
run;

 

The city, state, zip variables are not coming up correctly.

Could you guide me.

Thank you!

art297
Opal | Level 21

Here is one way:

data multiple(drop=x rawin);
  infile cards truncover;
  informat name address city rawin $50.;
  informat state $2.;
  informat zip $5.;
  input name & /
      address & /
      rawin & @;
  x=anyalpha(_infile_,-50)-2;
  _infile_=substr(_infile_,1,x)||' '||substr(_infile_,x+1);
  input @1 city & state zip;
  datalines;
Ron Cody
89 Lazy Brook Road
Flemington NJ 08822
Tim Brown
35 Peak Dr
New York City NY 19454
;
run;

Art, CEO, AnalystFinder.com

 

PoojaP1
Fluorite | Level 6
Sorry, I forgot to mention that there were additional spaces in between values as below -

data multiple(drop=x);
input #1 @1 name $50.
#2 @1 address $50.;
input @;
x=anyalpha(_infile_,-50)-2;
_infile_=substr(_infile_,1,x)||' '||substr(_infile_,x+1);
input #3 @1 city $50.
state $3.
zip $7.;
name=compbl(name);
address=compbl(address);
city=compbl(city);
datalines;
Ron Cody
89 Lazy Brook Road
Flemington NJ 08822
Tim Brown
35 Peak Dr
New York City NY 19454
;
run;

How to deal with these extra blanks in city value?
art297
Opal | Level 21

Your extra spaces were and again are lost in your post because you didn't past them in a code box (i.e., the {i} icon).

 

Regardless, here is one way:

data multiple(drop=x);
  infile cards truncover;
  informat name address city $50.;
  informat state $2.;
  informat zip $5.;
  retain name address;
  input @;
  _infile_=compbl(_infile_);
  input name &;
  input @;
  _infile_=compbl(_infile_);
  input address &;
  input @;
  _infile_=compbl(_infile_);
  x=anyalpha(_infile_,-50)-2;
  _infile_=substr(_infile_,1,x)||' '||substr(_infile_,x+1);
  input @1 city & state zip;
  output;
  datalines;
Ron     Cody
89   Lazy   Brook   Road
Flemington     NJ    08822
Tim   Brown
35     Peak Dr
New   York    City   NY    19454
;
run;

Art, CEO, AnalystFinder.com

 

PoojaP1
Fluorite | Level 6
Thank you so much sir!
FriedEgg
SAS Employee
DATA avgcelsius;
length city $ 17;
infile cards col=col;
input @'2c'x + 4 @;
city = substr(_infile_,1,col-2);
input jan feb mar apr may jun jul aug sep oct nov dec;
DATALINES;
State College, PA -2 -2 2 8 14 19 21 20 16 10 4 -1
Miami, FL 20 20 22 23 26 27 28 28 27 26 23 20
St. Louis, MO -1 1 6 13 18 23 26 25 21 15 7 1
New Orleans, LA 11 13 16 20 23 27 27 27 26 21 16 12
Madison, WI -8 -5 0 7 14 19 22 20 16 10 2 -5
Houston, TX 10 12 16 20 23 27 28 28 26 21 16 12
Phoenix, AZ 12 14 16 21 26 31 33 32 30 23 16 12
Seattle, WA 5 6 7 10 13 16 18 18 16 12 8 6
San Francisco, CA 10 12 12 13 14 15 15 16 17 16 14 11
San Diego, CA 13 14 15 16 17 19 21 22 21 19 16 14
;
RUN;
Ksharp
Super User
data avgcelsius ;
infile datalines dlm=' ' truncover;
  input @;
  _infile_=prxchange('s/(\s[a-z][a-z]\s)/$1 /i',1,_infile_);
  input City & $20. jan feb mar apr may jun jul aug sep oct nov dec;
DATALINES;
State College, PA -2 -2 2 8 14 19 21 20 16 10 4 -1
Miami, FL 20 20 22 23 26 27 28 28 27 26 23 20
St. Louis, MO -1 1 6 13 18 23 26 25 21 15 7 1
New Orleans, LA 11 13 16 20 23 27 27 27 26 21 16 12
Madison, WI -8 -5 0 7 14 19 22 20 16 10 2 -5
Houston, TX 10 12 16 20 23 27 28 28 26 21 16 12
Phoenix, AZ 12 14 16 21 26 31 33 32 30 23 16 12
Seattle, WA 5 6 7 10 13 16 18 18 16 12 8 6
San Francisco, CA 10 12 12 13 14 15 15 16 17 16 14 11
San Diego, CA 13 14 15 16 17 19 21 22 21 19 16 14
;
RUN;
PoojaP1
Fluorite | Level 6
Hi,

Thank you for your help. I have never used this function prxchange. I learnt about it a bit, but I am not able to understand what you have done here. Could you please explain this ?

Thank you!
art297
Opal | Level 21

Regular expressions are good to learn, but a bit complex (at first) to understand. @Ksharp's for example:

  _infile_=prxchange('s/(\s[a-z][a-z]\s)/$1 /i',1,_infile_);

reading from left to right, the function uses s/ to indicate that it is a search/replace instance

(\s[a-z][a-z]\s)

is looking for a string that begins with a space, followed by a character between a and z, followed by another character between a and z, followed by a space

/$1 

says that, when found add one character, namely a space

/i'

says to ignore case (thus search for either a thru z and/or A thru Z)

 

Finally, the 1 instructs the function to only do the operation once, and then the _infile_ says to do it on the _infile_ variable.

 

Art, CEO, AnalystFinder.com

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 16 replies
  • 2547 views
  • 2 likes
  • 5 in conversation