Damon1
Obsidian | Level 7

Hi everyone,

I'm new to SAS, and I am trying to enter a data set in which some of the character values contain spaces, apostrophes, and a dash. I'm not sure how to do this, and I keep getting strange output. I have tried both the $CHAR and $VARYING informats, but I could be using them incorrectly. Both the code I tried and the output I'm getting are below. I am required to do this using DATALINES.

DATA cancer;
 INPUT type $char. new_cases yearly_deaths survival percent3.;
 DATALINES;
   breast 271270 42260 90%
   lung 228150 142670 23%
   prostate 164690 29430 98%
   colorectal 145600 51020 64%
   melanoma 96480 7230 92%
   bladder 80470 17670 77%
   non-hodgkin's lymphoma 74200 19970 71%
   kidney 73820 14770 75%
   endometrial 61880 12160 84%
   leukemia 61780 22840 61%
   pancreatic 56770 45750 9%
   thyroid 52070 2170 99%
   liver 42030 31780 18%
 ;
RUN;

PROC PRINT data=cancer;
RUN;

 

gives me 

 

The SAS System

Obs  type     new_cases  yearly_deaths  survival
  1  breast      271270          42260      0.90
  2  lung 22       8150         142670      0.23
  3  prostat          .         164690    294.00
  4  colorec          .         145600    510.00
  5  melanom          .          96480    723.00
  6  bladder      80470          17670      0.77
  7  non-hod          .              .    742.00
  8  kidney       73820          14770      0.75
  9  endomet          .          61880    121.00
 10  leukemi          .          61780    228.00
 11  pancrea          .          56770    457.00
 12  thyroid      52070           2170      0.99
 13  liver 4       2030          31780      0.18

 

I've also tried "$VARYING.", as well as setting DLM=',' and separating the data with commas, which made it worse.

If anyone knows what I should enter or has any advice on how to handle this, it would be greatly appreciated.

 

Thanks

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hi @Damon1, great effort. I'm impressed by how close your attempt already was. Welcome to SAS Communities. You were nearly there, and the fix is a piece of cake with a little help from us. Have fun!

data want;
/* Load the current line into _INFILE_ and hold it for the next INPUT */
input @;
/* Number of characters that come before the first digit on the line */
_n_=anydigit(_infile_)-1;
/* $VARYING reads exactly _n_ characters into type; the rest is list input */
INPUT type $varying32. _n_
 new_cases yearly_deaths survival :percent3.;
format survival percent.;
 DATALINES;
   breast 271270 42260 90%
   lung 228150 142670 23%
   prostate 164690 29430 98%
   colorectal 145600 51020 64%
   melanoma 96480 7230 92%
   bladder 80470 17670 77%
   non-hodgkin's lymphoma 74200 19970 71%
   kidney 73820 14770 75%
   endometrial 61880 12160 84%
   leukemia 61780 22840 61%
   pancreatic 56770 45750 9%
   thyroid 52070 2170 99%
   liver 42030 31780 18%
 ;

RUN;


4 REPLIES
Reeza
Super User
Is this the exact data you need to input, or do you need to generalize this to a real data set? If spaces occur both inside a field and as the delimiter, how can you tell the computer where one field ends and the next begins? In general, it's not possible.

There are other ways, including specifying a different delimiter such as a comma. You say you tried that but didn't show it. That is the approach I'd use.


Add this line to your code:

INFILE cards dlm=',' DSD TRUNCOVER;

Put that before your INPUT statement, after you have added the commas to the data, and you should be fine.
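
As a rough sketch of the comma-delimited approach, the step below re-types a few of the original data lines with commas between the fields. The LENGTH of 32 for type, the PERCENT4. informat width, and the PERCENT8. display format are assumptions chosen to fit this sample; they do not come from the thread. DATALINES is used as the fileref here, which is interchangeable with CARDS.

```sas
data cancer;
  /* Comma is the only delimiter, so blanks inside type are kept as data */
  infile datalines dlm=',' dsd truncover;
  length type $32;                 /* assumed maximum width for type */
  input type $ new_cases yearly_deaths survival :percent4.;
  format survival percent8.;       /* display 0.9 as 90% */
  datalines;
breast,271270,42260,90%
lung,228150,142670,23%
non-hodgkin's lymphoma,74200,19970,71%
pancreatic,56770,45750,9%
;
run;

proc print data=cancer;
run;
```

Because the delimiter is no longer a blank, list input reads each type value up to the next comma, embedded spaces and apostrophe included.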
Damon1
Obsidian | Level 7

This worked perfectly 🙂 Thanks so much!!!

mkeintz
PROC Star

If

  1. TYPE always has at least one word, i.e. it is never empty,
  2. none of those words contains a digit, and
  3. none of the variables that follow contains an internal blank,

then you can find the position (FN) of the first digit, copy characters 1 through FN-1 into TYPE, and (starting at position FN) use an INPUT statement to read all the other variables:

 

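DATA cancer (drop=fn);
 input @;                       /* load the line into _INFILE_ and hold it */
 fn=anydigit(_infile_);         /* position of the first digit */
 length type $40;
 type=substr(_infile_,1,fn-1);  /* everything before the first digit */
 INPUT @fn new_cases yearly_deaths survival percent3.;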
 DATALINES;
   breast 271270 42260 90%
   lung 228150 142670 23%
   prostate 164690 29430 98%
   colorectal 145600 51020 64%
   melanoma 96480 7230 92%
   bladder 80470 17670 77%
   non-hodgkin's lymphoma 74200 19970 71%
   kidney 73820 14770 75%
   endometrial 61880 12160 84%
   leukemia 61780 22840 61%
   pancreatic 56770 45750 9%
   thyroid 52070 2170 99%
   liver 42030 31780 18%
 ;
RUN;

The "trick" here is the bare INPUT statement, which does nothing except transfer the input data line to the automatic variable _INFILE_.  Then the ANYDIGIT function finds the position of the first digit.  The SUBSTR function copies the first FN-1 characters to TYPE.  Then the second INPUT statement, starting at position FN, reads the rest.

 

The trailing "@" in the first INPUT statement is essential.  Otherwise the next INPUT would read from the next line, instead of from the current line (i.e. from the current _INFILE_ content).
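
To see the line-hold behaviour in isolation, here is a minimal sketch with two made-up data lines (the variables a and b and their values are purely illustrative). The first INPUT only loads and holds the line; the second INPUT then reads from that same line.

```sas
data _null_;
  input @;          /* load the current line into _INFILE_ and hold it */
  put _infile_;     /* write the held line to the log */
  input a b;        /* reads from the held line, then releases it */
  put a= b=;
  datalines;
1 2
3 4
;
run;
```

If the trailing "@" is dropped from the first INPUT, the second INPUT moves on to the next data line, so each pass through the DATA step consumes two lines instead of one.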

 

I assume that TYPE is no more than 40 characters long.
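
If those assumptions might not hold for every line, a guarded variant could look like the sketch below. The warning text, the decision to skip unusable lines, and the use of STRIP to drop the blanks around TYPE are my own additions for illustration, not part of the code above; the data lines are a subset of the original.

```sas
data cancer (drop=fn);
  length type $40;
  input @;                              /* hold the line in _INFILE_ */
  fn = anydigit(_infile_);              /* 0 when the line has no digit */
  if fn > 1 then type = strip(substr(_infile_, 1, fn-1));
  if fn <= 1 or missing(type) then do;  /* assumption 1 or 2 is violated */
    put "WARNING: skipping line that cannot be split: " _infile_;
    input;                              /* release the held line */
    delete;
  end;
  input @fn new_cases yearly_deaths survival :percent4.;
  format survival percent8.;
  datalines;
   breast 271270 42260 90%
   non-hodgkin's lymphoma 74200 19970 71%
   pancreatic 56770 45750 9%
 ;
run;
```

A TYPE value longer than 40 characters would still be truncated silently, so the LENGTH should be set with the longest expected name in mind.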
