dsd versus no dsd

New Contributor
Posts: 4

If I have this code:

data test;
infile datalines ;
input a b c;
datalines;
1 2 3 
 4 5
6 7
;

I get two observations:

1 2 3

4 5 6

 

But if I use this code:

data test;
infile datalines dsd dlm=' ';
input a b c ;
datalines;
1 2 3
 4 5
6 7 
;

Here it seems to treat every space as a missing value.

The result is:

1 2 3

. 4 5

6 7 .

 

DSD treats two consecutive delimiters as a missing value, whereas in this data there is only one space before the 4. It is still treated as a delimiter, and a missing value appears before the 4. Why?

 


Accepted Solutions
Solution
Monday
Super User
Posts: 9,890

Re: dsd versus no dsd

Posted in reply to riyaaora275

With dsd, every single delimiter character is counted (the delimiter is a blank here because you specified dlm=' '; with dsd alone it would default to a comma), so the second row in the datalines is read as

<empty><blank>4<blank>5

and therefore supplies three values.

The third row is

6<blank>7<blank><empty>

and supplies another three values.

Without dsd, the leading blank in the second row is simply discarded. That row then holds only two values, so the INPUT statement moves on to the third row to fill c (reading 4/5/6 instead of ./4/5). Because the data step also moves to a new line at the end of each iteration, and no more lines are present, the value 7 is never read and only two observations are output.
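
If what you actually want is one observation per data line without dsd, one possible approach is the TRUNCOVER option on the INFILE statement. This is only a sketch based on the data in the question (the dataset name test_truncover is made up for illustration): TRUNCOVER keeps INPUT from moving to the next line when a line runs out of values, and sets the remaining variables to missing instead.

```sas
/* Sketch: list input without DSD, but with TRUNCOVER so that      */
/* INPUT stays on the current line when it runs out of values.     */
/* The dataset name test_truncover is illustrative only.           */
data test_truncover;
  infile datalines truncover;
  input a b c;
datalines;
1 2 3
 4 5
6 7
;

/* Expected: three observations (1 2 3), (4 5 .), (6 7 .) */
proc print data=test_truncover;
run;
```

Leading blanks are still skipped here, so the second row gives 4 5 . rather than . 4 5. In the original code without TRUNCOVER, the log should also contain the note "SAS went to a new line when INPUT statement reached past the end of a line", which is a useful hint that values are being pulled from the following line.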

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code


All Replies
New Contributor
Posts: 4

Re: dsd versus no dsd

Posted in reply to KurtBremser

In the second line, you said the values are in this format:

<empty><blank>4<blank>5

but isn't it in the format below, because we have only one space before 4 and one space before 5?

<blank>4<blank>5

Where did <empty> come from?

Super User
Posts: 9,890

Re: dsd versus no dsd

Posted in reply to riyaaora275

@riyaaora275 wrote:

In the second line, you said the values are in this format:

<empty><blank>4<blank>5

but isn't it in the format below, because we have only one space before 4 and one space before 5?

<blank>4<blank>5

Where did <empty> come from?


Because of the dsd option, the data step assumes that there has to be a value before the first delimiter (blank). If there's nothing there (consider it a string of length 0), you get a missing value. Without dsd, leading delimiters are skipped until a value is found.
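
The empty leading field may be easier to see with a visible delimiter. Here is a small sketch with the same values, using commas instead of blanks (the dataset name test_comma is made up for illustration; dsd without dlm= uses a comma as the delimiter):

```sas
/* Sketch: same values as in the question, comma-delimited, so     */
/* the empty fields are visible. With DSD alone, the delimiter     */
/* defaults to a comma. The dataset name is illustrative only.     */
data test_comma;
  infile datalines dsd;
  input a b c;
datalines;
1,2,3
,4,5
6,7,
;

/* Expected: three observations (1 2 3), (. 4 5), (6 7 .) */
proc print data=test_comma;
run;
```

In the second row, nothing precedes the first comma, so a is missing. Your blank-delimited row behaves the same way: the single blank before the 4 plays the role of that first comma.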

☑ This topic is solved.
