Nietzsche
Lapis Lazuli | Level 10

The semicolon seems to be optional for the data after the DATALINES statement. Honestly, this is the first time I have encountered this in SAS programming, and I wonder if there are other situations where the semicolon is optional. I thought the semicolon was compulsory in SAS statements.

 

data setA;
input Num VarA $;
datalines;
1 A1
2 A2
3 A3
;
run;

proc print data=setA;run;

data setB;
input Num VarB $;
datalines;
1 B1
2 B2
3 B3
run;

proc print data=setB;run;


 

And if you put the semicolon not on a new line but right after the last data line, like

3 A3;

That obs will not be read.


 

SAS Base Programming (2022 Dec), Preparing for SAS Advanced Programming (Cancelled).
1 ACCEPTED SOLUTION
ballardw
Super User

The run is more or less optional if there is a ;

The behavior of your code

 

data setB;
input Num VarB $;
datalines;
1 B1
2 B2
3 B3
run;

is basically the same as

 

 

data setB;
input Num VarB $;
datalines;
1 B1
2 B2
3 B3
;

However, if you do not use one of those forms (or ;;;; when using DATALINES4 or CARDS4), you are likely to generate invalid data messages unless the next thing encountered is an implied step boundary such as a PROC statement. Consider:

 

 

data setB;
input Num VarB $;
datalines;
1 B1
2 B2
3 B3

/* some comment */
proc print;
run;

No semicolon, invalid data message:

 

673  data setB;
674  input Num VarB $;
675  datalines;

NOTE: Invalid data for Num in line 680 1-2.
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+--
680        /* some comment */
Num=. VarB=some _ERROR_=1 _N_=4
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.SETB has 4 observations and 2 variables.

and a result of

 
Obs  Num  VarB
1    1    B1
2    2    B2
3    3    B3
4    .    some

 

So you may want to reconsider whether it is the semicolon or the RUN that is optional. Decide what you think the result of this code should be:

data setB;
input Num VarB $;
datalines;
1 B1
2 B2
3 B3
run
;

then run it.


6 REPLIES
ChrisHemedinger
Community Manager

I would not characterize the semicolon as optional here, even though you observe that things work in your test case with or without it.

 

DATALINES is a declarative statement, which means it is processed during DATA step compilation. The compiler here might seem to be forgiving, but I think any of us would advise you to include the semicolon because it makes clear that you are ending the data block there, and it's always good to provide clear indicators to the compiler.

 

DATALINES4 (you might have read) is an alternative to use when your input data might contain semicolons. In that case you end the data block with a line of four consecutive semicolons (;;;;).
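
As a minimal sketch of that DATALINES4 case (the data set name codes and the variables Id and Code are made up for illustration), the following reads values that themselves contain semicolons. The block should end only at the line of four semicolons, so all three data lines should be read as observations:

data codes;
  /* Id is numeric; Code is character and may contain a semicolon */
  input Id Code $;
datalines4;
1 A;1
2 B;2
3 C;3
;;;;
run;

proc print data=codes;
run;

With plain DATALINES, the first of those data lines would already contain a semicolon and so would be taken as the end of the data.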

 

DATALINES is an alias for the CARDS statement, which harkens back to the time when data was fed into SAS programs using punch cards. And that system had very little room for ambiguity!

Kurt_Bremser
Super User

I would like to see consistent behavior of the DATALINES statement and its termination.

run;

should be treated as data followed by a terminating semicolon, resulting in an ERROR or a LOST CARD in most cases.

Astounding
PROC Star

There is a consistency here, buried in the syntax.  After DATALINES, the first line that contains a semicolon is a programming statement.  Everything before that is data.  This program works perfectly fine (or at least it did the last time I checked):

data setB;
input Num VarB $;
datalines;
1 B1
2 B2
3 B3
proc means;
run;

It treats PROC MEANS as a programming statement, and computes statistics based on NUM. 

ballardw
Super User

PROC statements and other step boundaries work only IF the line of code that starts the PROC (or other step keyword) also contains the semicolon.

If not, as in this example, PROC PRINT (or another step keyword such as DATA) is treated as data until the line with the ; is encountered:

data junk;
  input a b $;
datalines;
1 a
2 b
3 c
proc print
   data=junk;
run;

Which treats "proc print" as data.

733  data junk;
734    input a b $;
735  datalines;

NOTE: Invalid data for a in line 739 1-4.
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+--
739        proc print
a=. b=print _ERROR_=1 _N_=4
NOTE: The data set WORK.JUNK has 4 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


740     data=junk;
        ----
        180

ERROR 180-322: Statement is not valid or it is used out of proper order.

741  run;

Or try:

data junk;
  input a b $;
datalines;
1 a
2 b
3 c
data junk2
     junk3;
   set junk;
   if a=2 then output junk2;
   else output junk3;
run;

"Normally" a DATA statement might end the previous step, but if the DATA statement continues onto one or more additional lines, the lines before the one containing the semicolon are read as data.

 

So just simplify your life: end your data lines block with a semicolon (or four) on a line of its own, and don't rely on the special treatment of statements that happen to have their semicolon on the same line where they start.

 

Tom
Super User

@Kurt_Bremser wrote:

I would like to see consistent behavior of the DATALINES statement and its termination.

run;

should be treated as data followed by a terminating semicolon, resulting in an ERROR or a LOST CARD in most cases.


But that would be inconsistent with how it handles this code.

data want;
  input a b c ;
datalines;
1 2 3
4 5 6
7 8 9;

proc print;
run;

 How many observations does WANT have?
