BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jbscholten
Obsidian | Level 7

Hello,

 

I have a dataset in SAS that I am exporting to a delmited file (not a CSV). Is it possible to specify which lines to delimit? There are some lines which I do not want to delimit, as the file is being read incorrectly by another program and the delimiter (a space) may cause the files to be read in wrong by the other program.

 

Here's an example of the dataset:

Indicator Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9
0001                  
1116 1 2 3 4 5 6 7 8 9
1116 1 2 3 4 5 6 7 8 9
1216 1 2 3 4 5 6 7 8 9
0117 1 2 3 4 5 6 7 8 9
0217 1 2 3 4 5 6 7 8 9
0317 1 2 3 4 5 6 7 8 9
END                  
0002                  
1116 1 2 3 4 5 6 7 8 9
1216 1 2 3 4 5 6 7 8 9
0117 1 2 3 4 5 6 7 8 9
0217 1 2 3 4 5 6 7 8 9
0317 1 2 3 4 5 6 7 8 9
END                  
0003                  
1116 1 2 3 4 5 6 7 8 9
1216 1 2 3 4 5 6 7 8 9
0117 1 2 3 4 5 6 7 8 9
0217 1 2 3 4 5 6 7 8 9
0317 1 2 3 4 5 6 7 8 9

 

 

The 'END' and '0001, 0002, 0003, etc' specify the end of a grouping and the start of the next grouping. '0001, 0002, etc' ranges from 1-1001, but lets pretend these increase infinitely. the "1116, 1116, 1216, 0117, 0217, 0317" are dates (mmyy) and run up to 1166, but this could also change. When exporting to a txt file, I need the spaces that are being inserted as delimiters between var1-var9 to NOT be placed in the observations that have "end" or grouping number present. 

 

Any help would be greatly appreciated. 

 

Thanks

-Jbscholten

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

<Rant>

First, having done a fair amount of work around Y2K issues once upon a time I would suggest to whoever is using that file structure that they have a significant problem with 2 digit years. Coupled with your 0001  = Jan 2000 possibly confusion this is heading for a significant failure at some point.

</Rant>

 

Because of the aforementioned 0001 issue you really cannot rely on the value.

You also did not specify what the delimiter should be. And do you need INDENTS or is that an artifact or what ever you are copying from?

 

Add a FILE statement to data _null_ below, set the length for the STR variable, indicate the delimiter and this may work if indents are needed.

data have;
   infile datalines truncover;
   input Indicator $ Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 ;
datalines;
0001                   
1116 1 2 3 4 5 6 7 8 9 
1116 1 2 3 4 5 6 7 8 9 
1216 1 2 3 4 5 6 7 8 9 
0117 1 2 3 4 5 6 7 8 9 
0217 1 2 3 4 5 6 7 8 9 
0317 1 2 3 4 5 6 7 8 9 
END                   
0002                   
1116 1 2 3 4 5 6 7 8 9 
1216 1 2 3 4 5 6 7 8 9 
0117 1 2 3 4 5 6 7 8 9 
0217 1 2 3 4 5 6 7 8 9 
0317 1 2 3 4 5 6 7 8 9 
END                   
0003                   
1116 1 2 3 4 5 6 7 8 9 
1216 1 2 3 4 5 6 7 8 9 
0117 1 2 3 4 5 6 7 8 9 
0217 1 2 3 4 5 6 7 8 9 
0317 1 2 3 4 5 6 7 8 9 
END
;
run;

data _null_;
   set have;
   length str $100; /* long enough to hold all of the variables plus the delimiters*/
   if _n_=1 or Indicator='END' or lag(indicator)='END' then put indicator;
   else do;
      str= catx('|',indicator,of var:);
      put str;
   end;
run;

   
 

View solution in original post

6 REPLIES 6
Reeza
Super User

You need to use a manual data step,  and then you have full control of the file with the PUT statements.


BY groups will allow you to identify the groups you need. 

 

Post sample input data, I'm assuming this is output, and sample output that align. You can include only 3 or 4 variables to make it easier. Then we can mock up some code to show you what htis may look like.

 

Untested sketch of what this may look like:

 

data _null_;
file 'path to output file';

set dataset;
by group;

if _n_ = 1 then do;
put 'indicator' 'var1' 'var2' 'var3' 'var4' ... 'var9';
end;

if first.group then do;
put group;
end;

put var1-var9;

if last.group then do;
put 'END';
end;

run;
ballardw
Super User

<Rant>

First, having done a fair amount of work around Y2K issues once upon a time I would suggest to whoever is using that file structure that they have a significant problem with 2 digit years. Coupled with your 0001  = Jan 2000 possibly confusion this is heading for a significant failure at some point.

</Rant>

 

Because of the aforementioned 0001 issue you really cannot rely on the value.

You also did not specify what the delimiter should be. And do you need INDENTS or is that an artifact or what ever you are copying from?

 

Add a FILE statement to data _null_ below, set the length for the STR variable, indicate the delimiter and this may work if indents are needed.

data have;
   infile datalines truncover;
   input Indicator $ Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 ;
datalines;
0001                   
1116 1 2 3 4 5 6 7 8 9 
1116 1 2 3 4 5 6 7 8 9 
1216 1 2 3 4 5 6 7 8 9 
0117 1 2 3 4 5 6 7 8 9 
0217 1 2 3 4 5 6 7 8 9 
0317 1 2 3 4 5 6 7 8 9 
END                   
0002                   
1116 1 2 3 4 5 6 7 8 9 
1216 1 2 3 4 5 6 7 8 9 
0117 1 2 3 4 5 6 7 8 9 
0217 1 2 3 4 5 6 7 8 9 
0317 1 2 3 4 5 6 7 8 9 
END                   
0003                   
1116 1 2 3 4 5 6 7 8 9 
1216 1 2 3 4 5 6 7 8 9 
0117 1 2 3 4 5 6 7 8 9 
0217 1 2 3 4 5 6 7 8 9 
0317 1 2 3 4 5 6 7 8 9 
END
;
run;

data _null_;
   set have;
   length str $100; /* long enough to hold all of the variables plus the delimiters*/
   if _n_=1 or Indicator='END' or lag(indicator)='END' then put indicator;
   else do;
      str= catx('|',indicator,of var:);
      put str;
   end;
run;

   
 
jbscholten
Obsidian | Level 7

This seems to work; however, in my data set all of the values in var1-9 are in the format 12.34. In my previous code (with the delimiter column), I specified var1-9 as z5.2 to keep leading and trailing zeros, so that 1.2 is written as 01.20 in the export file.

 

How can I keep this when using the data _null_ statement?

jbscholten
Obsidian | Level 7
Also, thanks for the note about y2k. The groupings (0001 ...) are being read in by another program which identifies these as sets. I do see the problem though.
jbscholten
Obsidian | Level 7

Used this code to keep 0's and only delimit appropriate lines:

 

Set work.clearbreak;
    Length str $100; /* long enough to hold all of the variables plus the delimiters*/
    If _n_=1 or Indicator='END' or lag(indicator)='END' then put indicator;
    Else do;
        str = catx(' ',indicator, vvalue(VAR4), vvalue(VAR5), vvalue(VAR6), vvalue(VAR7), vvalue(VAR8), vvalue(VAR9), vvalue(VAR10),             vvalue(VAR11), vvalue(VAR12));
    Put str;
    End;
Run;

Ksharp
Super User
ODS CSVALL is good for this scenario.


ods csvall file='/folders/myfolders/xx.csv';
options nobyline;
title '#byval1';
footnote 'END';
proc report data=sashelp.class noheader;
by sex notsorted;
run;
ods csvall close;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 1002 views
  • 0 likes
  • 4 in conversation