BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PekingDuck
Fluorite | Level 6

I got this coding exercise to extract certain length of characters from a given data set:

    Data Example;
    Input abc $12.;
    Datalines;
    083154564684

    783480645621

    384563604852
    ;
    Run;

The task is to extract the # from 2nd to 6th digits. So I put num = SUBSTR(abc,2,5);. But instead it returned an extraction from 1st to 5th digits, aka "08315, 78348, 38456". Why is it???

 

Then I looked into this Example data set and found out all the last digits of those numbers were missing... I'm confused now and not sure if the above problem is somewhat related to it. Robot Sad

 

(BTW I'm a beginner and I use the SAS University Edition for Windows.)

1 ACCEPTED SOLUTION

Accepted Solutions
ghosh
Barite | Level 11

your input stmt is reading the tab character you entered while typing the data.  With your code I got the same results as you. When I removed it (see below) I got the right answer.

Data Example;
	Input abc $12.;
	Datalines;
083154564684
783480645621
384563604852
	;
Run;

data test;
	set example;
	num = SUBSTR(abc,2,5);
run;

proc print;run;

View solution in original post

13 REPLIES 13
novinosrin
Tourmaline | Level 20
 Data Example;
    Input abc $12.;
    Datalines;
    083154564684
    783480645621
    384563604852
    ;
    Run;

data want;
 set example;
 num = SUBSTR(abc,2,5);
run;

proc print noobs;run;

abc num

08315456 83154
78348064 83480
38456360 84563

 

What's the problem with your test?

PekingDuck
Fluorite | Level 6

Thank you for your reply 😄

Mine gave me the result like the following -

 08315456468 --> 08315
 78348064562 --> 78348
 38456360485 --> 38456

I just tried and set the second parameter as 3 instead of 2, this way could get the correct output...(???)

novinosrin
Tourmaline | Level 20

Plz post your 

1. full code

2 log

3 output

 

so the community members can give you an informed suggestion

PekingDuck
Fluorite | Level 6
Data Example;
	Input abc $12.;
	Datalines;
	083154564684
	783480645621
	384563604852
	;
Run;

data test;
	set example;
	num = SUBSTR(abc,2,5);
run;
 
Output:
abc num
08315456468 08315
78348064562 78348
38456360485 38456
 
Log:
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 Data Example;
74 Input abc $12.;
75 Datalines;
 
NOTE: The data set WORK.EXAMPLE has 3 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
 
 
79 ;
80 Run;
81
82 data test;
83 set example;
84 num = SUBSTR(abc,2,5);
85 run;
 
NOTE: There were 3 observations read from the data set WORK.EXAMPLE.
NOTE: The data set WORK.TEST has 3 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
 
 
86
87 proc print noobs;
88 run;
 
NOTE: There were 3 observations read from the data set WORK.TEST.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.02 seconds
cpu time 0.03 seconds
novinosrin
Tourmaline | Level 20

Jeez, that's strange. Okay, is there any chance of an unlikely occurence of a leading blank or control characters?

 

If at all there is, can you try STRIPPING the blanks?

	num = SUBSTR(strip(abc),2,5);
PekingDuck
Fluorite | Level 6

Nah. No luck with your solution.

But you did give me the idea of what's wrong. I realized the leading tab I entered before the data caused the problem... So the position of those digits shifted in the output. 

Forgive me of this stupid mistake (facepalm) 

And thanks for your help! Robot wink

Reeza
Super User
Post your log.
PekingDuck
Fluorite | Level 6

Hello. Please refer to my reply to @novinosrin 

Thank you! Robot Happy

ghosh
Barite | Level 11

your input stmt is reading the tab character you entered while typing the data.  With your code I got the same results as you. When I removed it (see below) I got the right answer.

Data Example;
	Input abc $12.;
	Datalines;
083154564684
783480645621
384563604852
	;
Run;

data test;
	set example;
	num = SUBSTR(abc,2,5);
run;

proc print;run;
unison
Lapis Lazuli | Level 10

good eye

-unison
PekingDuck
Fluorite | Level 6

Hello. Yes, I found out about it just a minute ago... Such a rookie mistake @_@.

Thanks very much!

Tom
Super User Tom
Super User

Don't let the bugs in the auto-indent function of the SAS editors cause you to generate difficult to maintain SAS code.

Always place the DATALINES (aka CARDS) statement in the FIRST column.  

Once you have done that the auto-indent should put the cursor in the first column for the lines of data.

data Example;
  Input abc $12.;
datalines;
083154564684
783480645621
384563604852
;

Also do not enter tabs into program files. There are options in the various SAS editors to replace tabs with spaces. You can still use the TAB key while typing to help keep your program lines indented consistently, but the presence of tabs in the actual file just leads to this type of confusion.

PekingDuck
Fluorite | Level 6

Hello Tom.  I appreciate the information and advice you've shared.

People here are so helpful and I've learned a lot from you guys just on this one simple/basic problem, which is very encouraging for me as a beginner. Robot Very Happy 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 1708 views
  • 6 likes
  • 6 in conversation