Help using Base SAS procedures

Urgent question

Occasional Contributor
Posts: 9

Urgent question

Can someone please let me know the output of the following program (that is, the values of the variables TYPE and COLOR)? As I understand it, TYPE should be DAISY and COLOR should be YELLOW.

Thanks for the help.

DATA TEST1;
LENGTH TYPE $5 COLOR $ 11;
INFILE DATALINES TRUNCOVER;
INPUT TYPE $ COLOR $;
DATALINES;
DAISYYELLOW
;

Super Contributor
Posts: 418

Re: Urgent question

Posted in reply to noviceinsas

This will return one row, with the variable TYPE equal to DAISY and the variable COLOR blank. You are reading the single value "DAISYYELLOW" into the variable TYPE, but you have specified that its length is 5, so the value is truncated to 5 characters. Nothing is left on the line after that, so COLOR is set to missing (blank).

If you want both variables to populate "correctly", put a space between the two values.
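
To make the suggestion concrete, here is a minimal sketch of the same program with a space in the data line. The dataset name TEST_SPACE and the PUT statement are added for illustration only; they are not part of the original program.

```sas
data test_space;                 /* hypothetical dataset name */
  length type $5 color $11;
  infile datalines truncover;
  input type $ color $;          /* simple list input: values are split on blanks */
  put type= color=;              /* writes the two values to the log */
datalines;
DAISY YELLOW
;
run;
```

With the blank in place, list input sees two values, so the log line should read type=DAISY color=YELLOW.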

Occasional Contributor
Posts: 9

Re: Urgent question

Posted in reply to Anotherdream

Thanks for the reply. I don't have access to the SAS software today, but I ran a similar program a few days back:

data survey;
     input (Ques1 - Ques5) ( : $1.);
datalines;
12345
;

The output of this program was Ques1 = 1, Ques2 = 2, Ques3 = 3, Ques4 = 4, Ques5 = 5.

But if I follow your logic, the expected output would be Ques1 = 1, with Ques2 through Ques5 holding missing character values.

I read in a paper that, with list input, SAS reads data until it encounters a delimiter, reaches the end of the record, or has read the length specified by the informat, whichever comes first. Since the length of TYPE in the first example is 5, I thought the pointer would be at Y in the raw data after reading TYPE, so COLOR would also be read correctly. Could you please clarify or, if possible, run this program once to check the output? Thanks again for your help.

Trusted Advisor
Posts: 2,116

Re: Urgent question

Posted in reply to noviceinsas

The answer from Anotherdream and your last statement,

"or for the length specified by informat",

are consistent. TYPE has a "format" and Ques1 has an informat. They are used differently, but they share the same tools.

Doc Muhlbaier

Duke

Occasional Contributor
Posts: 9

Re: Urgent question

Thanks for the reply. So if I modify the program as follows, would the output be TYPE = DAISY and COLOR = YELLOW? Here TYPE and COLOR have informats.

DATA TEST1;
INFILE DATALINES TRUNCOVER;
INPUT TYPE : $5. COLOR : $11.;
DATALINES;
DAISYYELLOW
;

Super Contributor
Posts: 418

Re: Urgent question

Posted in reply to noviceinsas

Sorry, I'm not fully following you. Are you saying you have data that is a mix of delimited and fixed-column values, and you want to be able to split them accordingly? That is, you have data like the below.

Value 

KevinGreen

New,Blue

And you want your final dataset to be...

Value                    Value2

Kevin                    Green

New                      Blue

Let me know if I am reading your comments correctly.

Super Contributor
Posts: 418

Re: Urgent question

Posted in reply to noviceinsas

No, this would give the same results as the program above. The colon modifier tells the system to read UNTIL it finds a delimiter (which you don't have) or until it reaches the maximum length specified.

So in your case, it would read in the word "DAISYYELLOW" as one value, and then it would stop reading when it got to a length of 5. However, you have no separator between your words, so it would not know to assign YELLOW to your COLOR variable. You CANNOT split words up using list input statements, because list input assumes a delimiter. You have to use column input, which is different.

If your goal is to read in the data, always break it after the 5th character, and put everything after that into the next variable, this is what you could do.

DATA TEST1;
INFILE DATALINES TRUNCOVER;
INPUT TYPE $1-5 COLOR  $6-11;
DATALINES;
DAISYYELLOW
;

Note that your second variable would stop at column 11, and anything after that would not be read in (unless you increase its length). Is this what you wanted? I am still not clear on exactly what you need!

Occasional Contributor
Posts: 9

Re: Urgent question

Posted in reply to Anotherdream

I got the following statement from this paper - http://www.nesug.org/proceedings/nesug03/bt/bt003.pdf

I am trying to understand the output of the program based on the above definition. My question is not related to column input.

Your Statement  - "The colon modifier tells the system to read UNTIL it finds a delimiter (which you don't have) or until it reaches the maximum length specified."

does not match with what is in the attached paper.

Super Contributor
Posts: 418

Re: Urgent question

Posted in reply to noviceinsas

My statement is almost an exact copy of the paper's statement, so I am not sure what you mean by it not matching.

When you use the : modifier, it will scan through your record until ANY of the following happens, after which it stops. First, it reads the length specified in the input statement, in your case 5 characters. Second, it comes to a delimiter as defined by the delimiter option (the default is a space). Third, it reaches the end of the value (the variable has length 5, but only 3 characters are found).

In your example you have only 1 value, since you do not have a delimiter. Therefore, using ANY form of input statement besides column input is going to leave the COLOR variable blank. All you are doing by specifying lengths the way you are is telling SAS when to stop reading the first variable.
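
For comparison, here is a minimal sketch of the same colon-modified list input when a delimiter is actually present. The comma delimiter, the DLM= setting, the PUT statement and the dataset name TEST_DLM are assumptions made for illustration; none of them come from the original program.

```sas
data test_dlm;                        /* hypothetical dataset name */
  infile datalines dlm=',' truncover; /* assumed: comma is the delimiter */
  input type : $5. color : $11.;      /* modified list input with informats */
  put type= color=;                   /* writes the two values to the log */
datalines;
DAISY,YELLOW
;
run;
```

Here the comma marks where the first value ends, so both variables get populated and the log line should read type=DAISY color=YELLOW.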

I actually specified what sas was doing earlier in my post dated "May 7, 2013 1:31 PM".  I will repost it below with a little more explanation, but again let me know if I am not answering your question.

Your original post used input to say "Grab 2 variables from the following data source (datalines), that are separated by a space delimiter". "For the first variable, I read in DAISYYELLOW, however you tell me its length is only 5, therefore I remove everything after the fifth byte, aka everything after DAISY". "I then move on to the next data point, as determined by the space delimiter..." "You do not have another data point, therefore I set COLOR to blank, and finish".

Does that make sense and answer your question?

Occasional Contributor
Posts: 9

Re: Urgent question

Posted in reply to Anotherdream

I understand what you are saying, but I am not convinced yet. Doc@Duke mentioned another point about format and informat. Could you do me a favor and run these programs for me? I don't have access to SAS. I will probably have a better idea once I know the output. I am new to SAS. Thanks for all your replies.

DATA TEST1;
INFILE DATALINES TRUNCOVER;
INPUT TYPE : $5. COLOR : $11.;
DATALINES;
DAISYYELLOW
;

DATA TEST2;
LENGTH TYPE $5 COLOR $ 11;
INFILE DATALINES TRUNCOVER;
INPUT TYPE $ COLOR $;
DATALINES;
DAISYYELLOW
;

Super User
Posts: 7,074

Re: Urgent question

Posted in reply to noviceinsas

The thing I learned from this posting is that you should NOT use the : (colon) modifier on list input if it is possible that the delimiter between the fields could be missing.

data _null_;
  length type $20 color $20 ;
  infile cards truncover ;
  input @1 type color  @;
  list;
  put 'No informats at all : ' type = color= ;

  input @1 type $5. color $10. @;
  put 'Explicit informats  : ' type = color= ;

  input @1 type : color :  @;
  put 'Colon modifier only : ' type = color= ;

  input @1 type : $5. color :$10. @;
  put 'Informats with colon: ' type = color= ;
cards;
DAISYYELLOW
;

No informats at all : type=DAISYYELLOW color=
Explicit informats  : type=DAISY color=YELLOW
Colon modifier only : type=DAISYYELLOW color=
Informats with colon: type=DAISYYELLOW color=
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
           DAISYYELLOW
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds
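
One more variation that may be worth trying (a sketch only, not part of the run above): drop the LENGTH statement so that the variables take their lengths from the informats. In that case I would expect TYPE to come back as DAISY simply because the variable is only 5 bytes long, while COLOR should still be blank, since list input has already consumed the whole token.

```sas
data _null_;
  infile cards truncover ;
  /* no LENGTH statement here: TYPE is defined as $5 and COLOR as $11 */
  /* by the informats, so the stored value of TYPE is cut to 5 bytes  */
  input type : $5. color : $11. ;
  put 'Colon, no LENGTH    : ' type = color= ;
cards;
DAISYYELLOW
;
```

That would account for the answers earlier in the thread and the log above both being right: the colon modifier does not stop the read at 5 characters, but the variable's length still limits what is kept.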

Occasional Contributor
Posts: 9

Re: Urgent question

Thanks Tom. This was really helpful.

Super Contributor
Posts: 418

Re: Urgent question

Posted in reply to noviceinsas

These both return exactly the same thing, and it is what I have listed several times above: TYPE would be DAISY, and COLOR would be blank.

You are using LIST input, which assumes a delimiter, and you have no delimiter, so your second variable ends up blank.

An informat is the way to read in the data, and a format is the way to display data. For character data the distinction is not that relevant, as a format and an informat just set the length of the values to be read in and then displayed (which you are inherently doing in your input and length statements).
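
Where the difference does show up is with numeric values such as dates. Here is a minimal sketch, assuming a made-up variable D and a single made-up data line; neither comes from your programs.

```sas
data _null_;
  infile datalines truncover;
  input d : date9.;        /* informat: how the raw text 07MAY2013 is read  */
  format d yymmdd10.;      /* format: how the stored value is written out   */
  put d=;                  /* PUT uses the format attached to the variable  */
datalines;
07MAY2013
;
run;
```

The value is stored as a number (a SAS date), and the log should show d=2013-05-07: the informat controlled how the text was read, and the format controlled how it was displayed.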

Read through the document you yourself linked for list input, and hopefully that will help you understand how this is working. Also, there are numerous good books on learning SAS that have programming examples in them; I would suggest one of these as well (that's how I learned).

Super User
Posts: 10,041

Re: Urgent question

Posted in reply to noviceinsas

No. Yours use the list input method.

For your situation, you should use the formatted input method.

DATA TEST1;
INPUT TYPE $5. COLOR $6.;
DATALINES;
DAISYYELLOW
;
run;

Ksharp

Super Contributor
Posts: 418

Re: Urgent question

I was trying to say exactly the same thing as Ksharp, however I guess he said it much more "clearly" than I did.

You could also do the following

data test2;
infile datalines truncover;
input type $5. color $6.;
datalines;
DAISYYELLOW
;
RUN;

Is your question what the difference is between what Ksharp said and what you have written? And again, another way to do this is column input, which is why I also suggested it.

Brandon
