BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
NathanB
Calcite | Level 5

Hello All, 

I am fairly new to SAS and am trying to export a dataset to multiple CSVs. 

My dataset has 5 variables:

Var1 represents a line identifier (can be either header, volume or cost),

Var2 represents a bloc identifier

 

The format of the data for Var3 Var4 Var5 is dependent on the value of Var1. 

My goal is to export my dataset into different csv (grouping by Var2). 
I sorted my table based on Var2 and ran the following code: 

data _null_;
set MyDataSet;
by Var2 ;
fv=cats("MyDictory\",Var2,'.csv');
file csv filevar=fv dsd dlm=';' lrecl=32000 ;
put (_all_) (+0);
run;

 

Doing so gives me the correct content for my CSV files

 

Here comes my issue: I need a specific naming convention for the files. 
The first line of each csv is the header and I need to have this information in the filename as well. 

Any idea how I can rename the files as they are being exported, or in a second step, using information from the first line of each csv ? 

Thanks a lot !

Nathan

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Not sure I am following, but it sounds like you want to include on the FIRST value of VAR1 in the name of the file?

 

If that is the case then make this change so that FV is updated only once per value of VAR2.

data _null_;
  set Mydataset ;
  by VAR2;
  if first.var2 then fv=cats("/MyDirectory/",catx('_',of Var1--Var5),'.csv');
  retain fv;
  file csv filevar=fv dsd dlm=';' lrecl=32000 ;
  put (_all_) (+0);
run;

View solution in original post

8 REPLIES 8
Kurt_Bremser
Super User

The naming convention for your filenames must be implemented here:

fv = cats("MyDictory\",Var2,'.csv');

in whichever way you can derive the wanted filename from the content of var2.

Header lines can be inserted with

if first.var2 then put .....;

 If the "header" observation is in fact the first of each var2 group, the information for building the filename will be available at the same time.

NathanB
Calcite | Level 5

Thanks for the reply. 

The issue is that if I modify the filename in that location, then the content of the file changes as well. 
As the code is, the content is perfectly as wanted, I just need more flexibility in the naming of the file and was wondering if there was an easy way to separate the two (i.e. select the content of the file as it is now, and then define then file name based on the first line of the file)


Kurt_Bremser
Super User

@NathanB wrote:

Thanks for the reply. 

The issue is that if I modify the filename in that location, then the content of the file changes as well. 
As the code is, the content is perfectly as wanted, I just need more flexibility in the naming of the file and was wondering if there was an easy way to separate the two (i.e. select the content of the file as it is now, and then define then file name based on the first line of the file)


?????

 

As long as you do not include the fv variable in the PUT statement, changing it will have NO effect on the file content. None at all.

Tom
Super User Tom
Super User

@NathanB wrote:

Thanks for the reply. 

The issue is that if I modify the filename in that location, then the content of the file changes as well. 
As the code is, the content is perfectly as wanted, I just need more flexibility in the naming of the file and was wondering if there was an easy way to separate the two (i.e. select the content of the file as it is now, and then define then file name based on the first line of the file)



So you are writing the value of FV into the file? 

If so I would recommend making that variable first, so it is just one of the variables in the file.

And then use a different variable as the one you use with FILEVAR= option of the FILE statement.

 

For example here is a sketch based on the last method of my previous answer:

proc transpose data=mydataset(obs=0) out=names;
  var _all_; 
run;
data _null_;
  set MyDataSet;
  by Var2 ;
  fv=cats("MyDictory\",Var2,'.csv');
  if first.var2 then link header;
  put (_all_) (+0);
return;
header:
  otherfv= < some other name >;
  file csv filevar=otherfv dsd dlm=';' lrecl=32000 ;
  do _n_=1 to _nobs;
    set names point=_n_ nobs=_nobs;
    put _name_ @;
  end;
  put 'header for FV variable';
return;
run;
Tom
Super User Tom
Super User

So this is the line that is setting the filename.

fv=cats("MyDictory\",Var2,'.csv');

So just change that to generate the filenames you actually want to create.

 

Since you are processing the data in file order you can just use BY group processing to know when to write the header line.

If you know the variable names (and what header line you want for those variables) you can just list them instead of using the _ALL_ variable list.

data _null_;
  set MyDataSet;
  by Var2 ;
  fv=cats("MyDictory\",Var2,'.csv');
  file csv filevar=fv dsd dlm=';' lrecl=32000 ;
  if first.var2 then put 'var1;var2;var3;var4;var5';
  put var1 var2 var3 var4 var5 ;
run;

If you don't know the variable names then you will need to either first build the list of column headers into a macro variable.

proc sql noprint;
select nliteral(name),quote(trim(coalesce(label,name))) 
  into :names separated by ' '
     , :headers separated by ';'
  from dictionary.columns
  where libname='WORK'
    and memname='MYDATSET'
  order by varnum
;
quit;
data _null_;
  set MyDataSet;
  by Var2 ;
  fv=cats("MyDictory\",Var2,'.csv');
  file csv filevar=fv dsd dlm=';' lrecl=32000 ;
  if first.var2 then put %sysfunc(quote(%superq(headers),%str(%'))) ;
  put &names ;
run;

Or put the list of names into a dataset and use that.  You can use PROC CONTENTS or query of DICTIONARY.COLUMNS (like above) to get the list of names.  But you can also use PROC TRANSPOSE and the dataset option OBS=0 and VAR _ALL_ to make a dataset with names in order.  Then you can loop over that dataset to write the header row.

proc transpose data=mydataset(obs=0) out=names;
  var _all_; 
run;
data _null_;
  set MyDataSet;
  by Var2 ;
  fv=cats("MyDictory\",Var2,'.csv');
  file csv filevar=fv dsd dlm=';' lrecl=32000 ;
  if first.var2 then link header;
  put (_all_) (+0);
return;
header:
  do _n_=1 to _nobs;
    set names point=_n_ nobs=_nobs;
    put _name_ @;
  end;
  put;
return;
run;

 

 

 

NathanB
Calcite | Level 5

Thank you both. 

Unfortunately I can get the desired result. 

I'll try to explain my issue better !
Using the data set below: 

Var1Var2Var3Var4Var5
HEADERID_A2023050920231205correction
VOLLINESID_A2023050908MAY23:22:00:000.0000
VOLLINESID_A2023050908MAY23:22:15:000.0000
FIXLINEID_A202305092023120550
HEADERID_B2022013120221231correction
VOLLINESID_B2022013101JAN22:22:00:000.0000
VOLLINESID_B2022013101JAN22:22:15:000.0000
FIXLINEID_B2022013120221231150

 

If I run my initial code

 

data _null_;
set Mydataset ;
by column2 ;
fv=cats("MyDirectory",Var2,'.csv');
file csv filevar=fv dsd dlm=';' lrecl=32000 ;
put (_all_) (+0);
run;

I get two csv's:

Filename: ID_A

HEADERID_A2023050920231205correction
VOLLINESID_A2023050908MAY23:22:00:000.0000
VOLLINESID_A2023050908MAY23:22:15:000.0000
FIXLINEID_A202305092023120550

 

Filename: ID_B

HEADERID_B2022013120221231correction
VOLLINESID_B2022013101JAN22:22:00:000.0000
VOLLINESID_B2022013101JAN22:22:15:000.0000
FIXLINEID_B2022013120221231150

 

Content wise, this is exactly the output I want, but my issue is in the naming of the file: 

The first one should be named ID_A_Header (or whatever other concatenation of the first line of the csv eg: ID_A_20230509_20231205_correction)

Same goes for the second file. 

 

Any idea how I can achieve this result ? 
I tried changing the following line:
fv=cats("MyDirectory",Var2,'.csv');

==> fv=cats("MyDirectory",Var2,Var1,'.csv');

My csv the get split differently content wise. In this case I would have 6 files, one for each combination of Header, Vollines, Fixline and ID_A, ID_B. 

At this point I am a bit lost. 

Thank you again for your help, 
Nathan

Tom
Super User Tom
Super User

Not sure I am following, but it sounds like you want to include on the FIRST value of VAR1 in the name of the file?

 

If that is the case then make this change so that FV is updated only once per value of VAR2.

data _null_;
  set Mydataset ;
  by VAR2;
  if first.var2 then fv=cats("/MyDirectory/",catx('_',of Var1--Var5),'.csv');
  retain fv;
  file csv filevar=fv dsd dlm=';' lrecl=32000 ;
  put (_all_) (+0);
run;
NathanB
Calcite | Level 5

This is it !!
Thank you so much !

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 581 views
  • 2 likes
  • 3 in conversation