BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
YasodaJayaweera
Obsidian | Level 7

When I execute the following code, value of State is set to missing for each observation in the output dataset. Can someone please explain why the State is set to missing?

 

The output I am expecting is as follows:

City State Region

Tulsa OK Western

Los Angales CA Western

Bangor <missing>  Western

data WORK.GEO;
	infile datalines;
	input City $20.;
	if City='Tulsa' then 
	State = 'OK';
	Region='Central';
	if City='Los Angeles' then 
	State = 'CA';
	Region='Western'; 
	datalines;
	Tulsa
	Los Angeles
	Bangor
	;
run;
proc print data=WORK.GEO; run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
jimbarbour
Meteorite | Level 14

You have an indentation in your Datalines statement.  Typically, a Datalines statement must be left justified.  Try moving over all of your data  lines so that they are flush with the left boundary of the window and re-running.

 

Jim

View solution in original post

9 REPLIES 9
jimbarbour
Meteorite | Level 14

You have an indentation in your Datalines statement.  Typically, a Datalines statement must be left justified.  Try moving over all of your data  lines so that they are flush with the left boundary of the window and re-running.

 

Jim

YasodaJayaweera
Obsidian | Level 7

Thank you very much. Left intending the datalines did the trick. I am running the code in SAS OnDemand. Has this always been the case? I was under the impression that SAS statements are independent of indentation.

Tom
Super User Tom
Super User

@YasodaJayaweera wrote:

Thank you very much. Left intending the datalines did the trick. I am running the code in SAS OnDemand. Has this always been the case? I was under the impression that SAS statements are independent of indentation.


SAS statements do not care about extra spaces between words in the code.  But you appear to be having issues with your lines of DATA, not your CODE.

 

If you are using SAS OnDemand then you are using SAS/Studio.  Which means any TAB characters that have accidentally been inserted into your program will become part of the code and part of the data lines. 

 

The program editor will autoindent when you press enter to match the indentation of the previous line.  So before you type the DATALINES (or CARDS) statement make sure to tab back to the first column.  Then as you type the actual lines of data they will start in column 1 also.

 

You can tell SAS/Studio to convert the tabs into spaces so you don't get those tab characters inserted into your program files by changing your preferences to enable the Substitute spaces for tabs option.

Tom_0-1655828651342.png

 

That will prevent tabs from accidentally being added to your program files.  And since they are not part of the file then SAS/Studio cannot accidentally send them to SAS as part of the data.

 

You also need to check how you are seeing REGION. Currently you are always setting REGION to western.  Indenting the Code properly can help with noticing that also.

Either keep the IF/THEN statemen ton a single line.

data WORK.GEO;
  infile datalines;
  input City $20.;
  if City='Tulsa' then State = 'OK';
  Region='Central';
  if City='Los Angeles' then State = 'CA';
  Region='Western'; 
datalines;
Tulsa
Los Angeles
Bangor
;

Of if you want to split it to multiple lines then move the semicolon that marks the end to a new line.

data WORK.GEO;
  infile datalines;
  input City $20.;
  if City='Tulsa' then
    State = 'OK'
  ;
  Region='Central';
  if City='Los Angeles' then
   State = 'CA'
  ;
  Region='Western'; 
datalines;
Tulsa
Los Angeles
Bangor
;

To have the REGION depend on the CITY then you might want to use DO/END block as the statement that the THEN condition runs.

data WORK.GEO;
  infile datalines;
  input City $20.;
  if City='Tulsa' then do;
    State = 'OK';
    Region='Central';
  end;
  if City='Los Angeles' then do;
    State = 'CA';
    Region='Western'; 
  end;
datalines;
Tulsa
Los Angeles
Bangor
;

 

 

 

ballardw
Super User

Show your LOG from running that code. Copy the code from the data step and all the messages for that step from the log, open a text box on the forum with the </> icon and paste the text.

 

I copied and ran what you show and get values of State for Tulsa and Los Angeles. Actually it matches your "expected" outcome. So I suspect your missing state will have something in the log to clarify possible causes.

 

 

YasodaJayaweera
Obsidian | Level 7
 
 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 72         
 73         data WORK.GEO;
 74         infile datalines;
 75         input City $20.;
 76         putlog _all_;
 77         if City='Tulsa' then
 78         State = 'OK';
 79         Region='Central';
 80         putlog _all_;
 81         if City='Los Angeles' then
 82         State = 'CA';
 83         Region='Western';
 84         putlog _all_;
 85         datalines;
 
 City=Tulsa State=  Region=  _ERROR_=0 _N_=1
 City=Tulsa State=  Region=Central _ERROR_=0 _N_=1
 City=Tulsa State=  Region=Western _ERROR_=0 _N_=1
 City=Los Angeles State=  Region=  _ERROR_=0 _N_=2
 City=Los Angeles State=  Region=Central _ERROR_=0 _N_=2
 City=Los Angeles State=  Region=Western _ERROR_=0 _N_=2
 City=Bangor State=  Region=  _ERROR_=0 _N_=3
 City=Bangor State=  Region=Central _ERROR_=0 _N_=3
 City=Bangor State=  Region=Western _ERROR_=0 _N_=3
 NOTE: The data set WORK.GEO has 3 observations and 3 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       user cpu time       0.01 seconds
       system cpu time     0.00 seconds
       memory              677.09k
       OS Memory           30376.00k
       Timestamp           26/05/2021 08:04:26 AM
       Step Count                        31  Switch Count  2
       Page Faults                       0
       Page Reclaims                     144
       Page Swaps                        0
       Voluntary Context Switches        11
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           264
       
 
 89         ;
 90         run;
 91         
 92         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 104        

I added few putlog statements to see the values. However, left indenting the data lines did the trick.

jimbarbour
Meteorite | Level 14

I left justified the Datalines, but I suspect these results, see below, may not be what you want.  Remember, if you want multiple statements to be subject to the conditions of an IF statement, those statements must be grouped together inside of a DO;  ...statements...  END; construct.  Take a look below, and you'll see how I've modified your code, and below that you'll see the modified results.  I hope this is helpful.

 

Jim

 

Results as coded:

jimbarbour_0-1621984787172.png

 

Modified code:

data WORK.GEO;
	infile datalines;
	input City $20.;

	if City='Tulsa' then 
		DO;
			State = 'OK';
			Region='Central';
		END;
	ELSE
	if City='Los Angeles' then 
		DO;
			State = 'CA';
			Region='Western'; 
		END;
	ELSE
	if City='Bangor' then 
		DO;
			State = 'WA';
			Region='Western'; 
		END;

datalines;
Tulsa
Los Angeles
Bangor
;
run;

Results from modified code:

jimbarbour_1-1621984893472.png

 

 

 

YasodaJayaweera
Obsidian | Level 7
Thank you for the reply. I was aiming for the first outcome purposely. The aim is to understand do-end should be used if multiple statements were to be executed for an if-then statement.
ShubhamMakar
Calcite | Level 5
I came across the same question while preparing for base SAS. While the question here deals with something else, I was struggling to understand why all the values of var Region were Western. Thanks for the extra explanation!
Astounding
PROC Star
The $20. informat will left hand justify whatever it finds automatically. In fact, if you want to preserve leading blanks you would have to switch to the $char20. informat instead.

Most likely you used the tab key when typing in the data, and tab characters distort the data values by becoming part of the data.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1952 views
  • 4 likes
  • 6 in conversation