DATA Step, Macro, Functions and more

How do dynamically change the number of column in a dataset?

Reply
N/A
Posts: 0

How do dynamically change the number of column in a dataset?

Hi,

I have a question about dynamically generate dataset and also from dataset to generate table/report dynamically.

Says I have an ever-changing source file, maybe in an excel format as below:

| Date | Field 1 | Field 2 | Field 3 |
------------------------------------------------
| JUL01 | 100 | 101 | 88 |
| JUL02 | 331 | 72 | 111 |
| ... | ... | ... | ... |

But from time to time, the number of "Field X" might increase, like:


| Date | Field 1 | Field 2 | Field 3 | Field 4 |
------------------------------------------------------------
| AUG01 | 20 | 61 | 98 | 172 |
| AUG02 | 341 | 27 | 121 | 12 |
| ... | ... | ... | ... | .... |

(**Note: the total number of "Field N" could be either increased or decreased)

Is there a way I can get the information from the fields/columns dynamically without hardcoding the names? As if I use keep=field_1 field_2 ... field_n then everytime there is a change in the source I'd need to update the code as well, but I'd like to avoid this.

I heard from people saying proc tabulate might help, but I don't see how it could help, could someone kindly give me a direction to look into? Many thanks!
Super Contributor
Super Contributor
Posts: 3,174

Re: How do dynamically change the number of column in a dataset?

It depends on the SAS code you want to maintain and where it is maintained.

After a SAS member has been generated, you always have DICTIONARY.COLUMNS to reference with PROC SQL -- or SAS/View SASHELP.VCOLUMN. Either of these facilities can be used to keep track and can be used in SAS application code, where needed. Search the SAS.COM support website for references.

Or, SAS coding techniques can also be implemented depending on variable name structure, as in using a SAS variable prefix like FIELD_: in a KEEP= (dataset option) or KEEP statement.

For other, more specific reference / usage examples, you will want to share your SAS application code in a forum post reply.

Scott Barry
SBBWorks, Inc.
N/A
Posts: 0

Re: How do dynamically change the number of column in a dataset?

Hi Scott,

Thanks for your reply.

I have looked at dictionary and sashelp, but unless I am mistaken they seem to be retrieving the column information from dataset instead of source files?

As mentioned in my former post, the change would be in the source file, so I'd like to find a way to dynamically generate a dataset from it without updating the keep/drop within the code every time.

Sorry if I didn't make things clear, as I am pretty new to the SAS coding, thanks a lot!
Super Contributor
Super Contributor
Posts: 3,174

Re: How do dynamically change the number of column in a dataset?

Exactly my one option which is using the "colon modifier" as its called -- that being a variable prefix in your KEEP= or KEEP statement.

Or, depending on the layout/format of your "source file", you might look at using PROC IMPORT, depending on your OS platform.

Scott Barry
SBBWorks, Inc.

Google advanced search, this topic / post:

data step colon modifier site:sas.com
Valued Guide
Posts: 632

Re: How do dynamically change the number of column in a dataset?

Raymond,
As was mentioned IMPORT does not require preknowledge of the number of incoming fields. In a DATA step the following code places each of the fields on its own observation so that any number of fields are acceptable (the number can even vary within a incoming table).

[pre]
data a(keep=date num field);
infile datalines missover;
input date date9. @;
format date date9.;
num=1;
do while(num);
input field @;
if field = . then num=0;
else do;
output a;
num+1;
end;
end;
datalines;
23jul2010 1 2 3 4 5 6
24jul2010 11 22 33 44
run;
proc transpose data=a
out=b(keep=date fieldSmiley Happy
prefix=field;
by date;
id num;
run;
proc print data=b;
run;
[/pre]
SAS Super FREQ
Posts: 8,743

Re: How do dynamically change the number of column in a dataset?

As Scott says, PROC IMPORT would be most useful for reading in from Excel.

Not sure how using PROC TABULATE will help in anything but a report, however. Your question (How to dynamically change the number of columns in a dataset) seems to imply that one month, your dataset (let's call it PERM.WOMBAT) could have 3 FIELD: variables and the next month, when you read in your SOURCE or EXCEL file, your dataset (PERM.WOMBAT) could now have 5 FIELD: variables because the Excel file had a column added to it.

If you built programs based on the 3 FIELD: variable version of the dataset, and if your dataset PERM.WOMBAT suddenly turns out to have 5 FIELD: variables in a new month, your existing PROGRAMS would have no visibility of the 2 new FIELD: variables. Sometimes folks use the DICTIONARY files to see what variables are in a file before they overwrite that file with new data (and possible new columns). And then sometimes, people will write macro programs to allow for a bit more flexibility when the number of columns or the names of columns change.

At any rate, just some more to think about. I'm not sure whether you want a dataset or a report, though. I still don't understand how PROC TABULATE would help. Although, you CAN use a colon modifier in TABULATE syntax, in order to design your table, for more complicated tables, eventually, you may need to use the specific variable names. But in the meantime, the program below makes some fake data -- one file with 3 variables named FIELD1-FIELD3 and the other file with 5 variables FIELD1-FIELD5 and when you use the colon modifier in the VAR and TABLE statement, the correct number of FIELD variables appear in either table.

cynthia
[pre]
data class1(keep=name sex field1 field2 field3)
class2(keep=name sex field1-field5);
set sashelp.class;
field1 = age;
field2 = weight;
field3 = height;
field4 = age * weight;
field5 = age * height;
run;

ods html file='c:\temp\show_colon.html' style=sasweb;
proc tabulate data=class1;
title 'should see 3 columns under min and 3 columns under mean';
var field:;
class sex;
table sex,
(min mean)* field:;
run;

proc tabulate data=class2;
title 'should see 5 columns under min and 5 columns under mean';
var field:;
class sex;
table sex,
(min mean)* field:;
run;
ods html close;
[/pre]
Contributor
Posts: 48

Re: How do dynamically change the number of column in a dataset?

Like SBB and Cynthia say its proc import is most useful.
You can get variable names using directly opening table. In this example you can see that.

I make macroVar, that can be used in other datasets.

Here is example import code:

/* Make some excel data */
PROC EXPORT DATA= SASHELP.CLASS
    OUTFILE= "C:\temp\test2.xls"
    DBMS=EXCEL REPLACE;
    SHEET="class";
RUN;

/* Import this data to work.zzzz */

PROC IMPORT
   OUT= WORK.zzzz
   DATAFILE= "C:\temp\test2.xls"
   DBMS=EXCEL REPLACE;
   SHEET="class";
   GETNAMES=YES;
   MIXED=NO;
   SCANTEXT=YES;
   USEDATE=YES;
   SCANTIME=YES;
RUN;

/* Get variable names */
/* and create macroVar */
data columns_off_zzz;
   keep varname;
    format ALLVARS $1000.;
    ALLVARS = "";
    dsid = open("zzzz");
    nvars = attrn(dsid,"nvars");
    do i = 1 to nvars;
       varname = varname(dsid,i);
       ALLVARS = strip(ALLVARS)|| " " ||varname;
      output;
    end;
    call symput("ALLVARS",ALLVARS);
run;
%put &allVARS;

data test;
   keep &allVars;
    set zzzz;
run;
Valued Guide
Posts: 2,175

Re: How do dynamically change the number of column in a dataset?

I don't think the issue is so much about how to handle the dynamic data structure when the number of columns is unknown, as much as deciding what to do with these extra columns!
How can you decide that an extra column headed 'New_Balance" is the same information each time it appears - one day it might contain money, another time it might be a flag to indicate a new account.
Easily columns can be loaded as text. There are plenty of examples already in the Forum archives. What defeats my programming here is knowing what to do with "unknown" columns.

Raymond,
please can you define your data a little more: -
can you offer some more definition of what the additional columns might contain - numbers, messages, timestamps, boolean-flags, filenames
or perhaps something simpler ?
Please can you define what you might want to do with the extra columns?

hth
peterC
N/A
Posts: 0

Re: How do dynamically change the number of column in a dataset?

Thanks a lot to all the kind replies,

I have given a try on the proc import and indeed I successfully retrieved the columns I need from my dummy .txt file. Many thanks to SBB and Cythia for the suggestion.

As for questions about the extra columns, they would be containing just simple number values.

Besides, I think I was mis-understood my task, though knowing to get data from source files do help me somehow, I am also looking a way to generate reports on WRS from datasets dynamically.

Like the stories I have told, I will be dealing with datasets that varies in number of columns from time to time, and it seems (currently with my knowledge) when generating the WRS reports, when there is a change in the dataset structure, something has to be done manually to update the report.

And thats what I want to avoid.

I learnt that stored process seems might be helpful, while currently I am studying on how the whole thing works (from dataset to mapping then to report etc).

I just started learning SAS for a month, hope my questions won't cause too much troubles, and thank you very much for all the kind supports!
SAS Super FREQ
Posts: 8,743

Re: How do dynamically change the number of column in a dataset?

Hi:
When you are dealing with Web Report Studio (WRS), things are a bit different. The reports that you build in WRS can only use SAS Information Maps as the data source to build reports. Or, alternately, you can build a WRS report from a Stored Process. So, some of the techniques that we recommended (such as using PROC IMPORT and the colon modifier) will work inside a stored process, however, will not work with an Information Map. In some circumstances, it is possible to populate an Information Map from a Stored Process, but the number of columns in the Information Map really can't be dynamic and changeable.

At this point, I would recommend that you talk to your SAS Administrator or the person who has been building Information Maps for your company to discuss the best approach for your WRS report needs.

cynthia
Valued Guide
Posts: 2,175

Re: How do dynamically change the number of column in a dataset?

>
> As for questions about the extra columns, they would
> be containing just simple number values.

Since the information structure is so well defined, it seems the only variation would be the number of "extra", numeric columns.
I would recommend transposing this varying part into rows (each row also having the "common" part) and perhaps supplement these rows with a counter indicating the column position which that row of data represents.
Then the structure is fixed and the variation from input file to input file will only be the number of rows.
That rearrangement creates a structure that is far more "in tune" or "aligned" to the way that SAS works. I think (hope) it might be a standard transform in Information mapper.

Peter

>
> Besides, I think I was mis-understood my task
> .....
> somehow, I am also looking a way to generate reports
> on WRS from datasets dynamically.
>
Ask a Question
Discussion stats
  • 10 replies
  • 396 views
  • 0 likes
  • 6 in conversation