DATA Step, Macro, Functions and more

Keep field names dynamically

Reply
Super Contributor
Posts: 259

Keep field names dynamically

 The dataset below is an example of a dataset I'm using only the real dataset has 100s of fields and about 700 rows with different max/min values. 

 

How do i drop the field names using the max/min values i.e.

 

data help(keep = Price_&min--Price_max);

set want;

run; 

 

MinMaxPrice_1Price_2Price_3Price_4Price_5Price_6Price_7Price_8Price_9
291.011.021.8383992.6097432.9322463.156433.4217834.0838774.246248

 

My full code is below but it's not working. I think a solution to the simple example above would resolve the issue. 

 

Thanks

 

data _NULL_;
if 0 then set test1 nobs=n;
call symputx('nrows',n);
stop;
run;

proc sort data = test1 ; by descending diff; run;

%put &nrows;
%put &m_diff;

data test2;
input coin $50.;
datalines;
run;

%macro temp;

%do j = 1 %to &nrows;

data temp&j(keep = diff coin--maxdate_no Price_USD_&min--Price_USD_&max);
set test1;
if _n_ = &j;
call symputx('min',mindate_no);
call symputx('max',maxdate_no);
run;


%end;

%mend;

 

 

Super User
Posts: 19,861

Re: Keep field names dynamically

The timing in your logic is incorrect, you're trying to use macro variables before they're created. 

 

To to be clear, you're trying to keep between price_2 and price_9? The min/max are constants on each row or are they dynamic? You should expand your test case. 

Super Contributor
Posts: 259

Re: Keep field names dynamically

HI Reeza

 

Yes, in the example above I'm trying to keep price_2-price_9. The max and min will be different for each row I want to create a new dataset for each row, keep the rows between the price_min and price_max defined by the min and max field

Super User
Posts: 19,861

Re: Keep field names dynamically


brophymj wrote:

HI Reeza

 

Yes, in the example above I'm trying to keep price_2-price_9. The max and min will be different for each row I want to create a new dataset for each row, keep the rows between the price_min and price_max defined by the min and max field


Post a bigger example including sample output. It's inclear how the output will be formatted. You can't KEEP different variables on each row. . 

Super User
Super User
Posts: 7,988

Re: Keep field names dynamically

You misunderstand the concept of rows and columns.  Columns are structure - this is what allows you to program with them, they are fixed throughout the data, same type etc.  In the case you give, this may be possible as you only have one row, but if you have more than one, what happes if row 1 wants to drop column 1 and row 2 wants to keep it.  

What you should be doing is, find what value is the min and what is the max, put these in variables and then keep those variables and example:

data want (keep=keep_min keep_max);
  set have;
  array vals{*} price_:;
  keep_min=vals{min};
  keep_max=vals{max};
run;

In fact, you can drop the whole code you have doing there and do it far simpler:

data want (keep=min max);
  set have;
  min=min(of price_:);
  max=max(of price_:);
run;

Do try to avoid overcomplicating things by thinking like Excel.

Super Contributor
Posts: 259

Re: Keep field names dynamically

HI RW9

 

That won't work for my dataset. I want to keep all the fields between the min and max. The min and max will be diffeent for each row so i need to create a different dataset for each row and then deploy the code to keep the fields specified by the min and max fields. 

 

I tried this but it didn't work. If i define the min and max, then I have just two fields that are kept... I need all the fields between the max and min (and this will be different for each row in the original dataset)

 

%macro temp;

%do j = 1 %to &nrows;

data temp&j (keep=diff coin--maxdate_no vname(vals{mindate_no})--vname(vals{maxdate_no}));
set test1;
if _n_ = &j;
array vals{*} price_:;
run;


%end;

%mend;

%temp;

Super User
Super User
Posts: 7,988

Re: Keep field names dynamically

Then normalise your data (put the data going down rather than across) as you are looking for data rows which fall between min and max: from

Min Max Price_1 Price_2 Price_3 Price_4 Price_5 Price_6 Price_7 Price_8 Price_9
2 9 1.01 1.02 1.838399 2.609743 2.932246

 

To:

Price   Value

1        1.01

2        1.02

...

 

Then its simple select * from have where value between min(value) and max(value).  I would question the steps before which have left you with multiple datasets for the same data, and columns defining datarows.  

Super User
Posts: 5,516

Re: Keep field names dynamically

Your easiest method might be to use variable lists.  For example:

 

(keep = diff coin--maxdate_no Price_USD_: )

 

The colon means all variable names that begin with "Price_USD_" which probably gives you only the ones that you want.

Ask a Question
Discussion stats
  • 7 replies
  • 140 views
  • 0 likes
  • 4 in conversation