BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
JMagenta
Obsidian | Level 7

Hello,

I have a problem. I have 60 variables:

 

Apples1--Apples20 

Carrots1--Carrots20

Oranges1--Oranges20

 

What I want to do is group variables with the same suffix together in the SAS dataset:

 

Apples1 Carrots1 Oranges1 ... Apples20 Carrots20 Oranges20

 

How can I do this?

 

Next, depending on the value of Apples1--Apples20 ( using formula Min(Apples1--Apples20) to find the minimum value, I would like to create new variables with their values.

For instance:

Apples2 is less than all in group Apples1--Apples20, so :

 

Earlyfruit=Apples2;

Afruit=Carrot2;

Bfruit=Orange2;

 

Is this possible? Can I use "proc sql"  to accomplish this somehow?

 

Any advice would be helpful.

 

Thank you,

J Magenta

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
data have2;
retain &varlist;
set have;
Array Apples_ {*} apples1-apples20;
Array Carrot_{*} Carrots1-carrots20;
Array oranges_{*} oranges1-oranges20;

Earlyfruit=min(of apples_[*]);

index_earlyfruit = whichn(earlyfruit, of apples_(*));
afruit = carrot_(index_earlyfruit);
bfruit = oranges_(index_earlyfruit);
run;

Use WHICHN to find the index of the minimum value. Note that if you have ties, the first one is returned.

 


@JMagenta wrote:

Awesome!

Thank you so much!

But for the last part, where I create new variables, it gets tricky. 

Could this work?

 

 

data have2;
retain &varlist;
set have;
Array Apples_ {*} apples1-apples20;
Array Carrot_{*} Carrots1-carrots20;
Array oranges_{*} oranges1-oranges20;
do i=1 to dim(apples_);
Earlyfruit=min(of apples_[*]);
Afruit=Carrot_[min(of apples_[*])];   /* I want the carrot that is chosen to match the suffix of Apple without changing the value of Carrot*/
Bfruit=Oranges_[min(of apples_[*])]; /* same here with Oranges*/
end;
run;


 

View solution in original post

11 REPLIES 11
Patrick
Opal | Level 21

Here you go. 

I assume some of the SAS syntax used will be new to you. I suggest you first consult the SAS docu and then ask questions for the bits that remain unclear.

/***
I have 60 variables:
Apples1--Apples20 
Carrots1--Carrots20
Oranges1--Oranges20
**/
data work.have;
  array Apples  {20} 8;
  array Carrots {20} 8;
  array Oranges {20} 8;
  do i=1 to dim(apples);
    apples[i]=mod(i*10,dim(apples))+1;
    carrots[i]=apples[i]*5;
    oranges[i]=apples[i]*8;
  end;
  drop i;
run;

/* 1. What I want to do is group variables with the same suffix together in the SAS dataset */
/* Comment: The order of variables in a table is of no real relevance. Such ordering requests are normally reporting */ 
/*          requirements to be addressed as part of report generation                                                */

/* sort selected variables by digits in the variable name and populate a macro variable with the result */
proc sql noprint;
  select name into :varlist separated by ' '
  from dictionary.columns
  where 
    libname='WORK' 
    and memname='HAVE'
    and (
             upcase(name) like 'APPLES%'
          or upcase(name) like 'CARROTS%'
          or upcase(name) like 'ORANGES%'
        )
  order by compress(name,,'kd'), name
  ;
quit;

/* use macro variable to print data */
proc print data=work.have;
  var &varlist;
run;

/* 2. Next, depending on the value of Apples1--Apples20 ( using formula Min(Apples1--Apples20) */
/*    to find the minimum value, I would like to create new variables with their values.       */

/* below 3 coding options for doing this.                                                                                     */
/* - option one - of Apples: - will include all variables where the name starts with Apple - would also include AppleComputer */
data want;
  set have;
  array _oranges{*} Oranges1 - Oranges20;
  min_apple=min(of Apples:);
  min_carrots=min(of Carrots1 - Carrots20);
  min_oranges=min(of _oranges[*]);
run;

proc print data=want;
  var min_:;
run;

Patrick_0-1700451514914.png

 

Here the chatGPT explanation of above code which isn't too bad

Spoiler
  1. Creating the Initial Dataset (work.have😞

    sas
    data work.have; array Apples {20} 8; array Carrots {20} 8; array Oranges {20} 8; do i=1 to dim(apples); apples[i]=mod(i*10,dim(apples))+1; carrots[i]=apples[i]*5; oranges[i]=apples[i]*8; end; drop i; run;
    • This part of the code creates a dataset named work.have with 60 variables: Apples1 through Apples20, Carrots1 through Carrots20, and Oranges1 through Oranges20.
    • Three arrays (Apples, Carrots, and Oranges) of size 20 are defined.
    • A loop is used to populate these arrays with values based on a formula (apples[i]=mod(i*10,dim(apples))+1) and similar formulas for carrots and oranges.
    • The drop i; statement removes the loop index variable i.
  2. Sorting Variables and Creating a Macro Variable (varlist😞

    sas
    proc sql noprint; select name into :varlist separated by ' ' from dictionary.columns where libname='WORK' and memname='HAVE' and ( upcase(name) like 'APPLES%' or upcase(name) like 'CARROTS%' or upcase(name) like 'ORANGES%' ) order by compress(name,,'kd'), name ; quit;
    • This part of the code uses PROC SQL to query the dictionary.columns table to get the variable names in the work.have dataset that meet certain conditions.
    • The conditions are specified in the where clause, filtering for variables starting with 'APPLES', 'CARROTS', or 'ORANGES'.
    • The result is sorted by the numerical part of the variable names (order by compress(name,,'kd'), name).
    • The selected variable names are stored in a macro variable named varlist, separated by spaces.
  3. Printing Selected Variables Using Macro Variable:

    sas
    proc print data=work.have; var &varlist; run;
    • This part of the code prints the work.have dataset, selecting the variables specified in the macro variable varlist.
    • The &varlist is resolved at runtime, and it includes the list of variables retrieved and sorted in the previous step.
  4. Calculating Minimum Values and Creating a New Dataset (want😞

    sas
    data want; set have; array _oranges{*} Oranges1 - Oranges20; min_apple=min(of Apples:); min_carrots=min(of Carrots1 - Carrots20); min_oranges=min(of _oranges[*]); run;
    • This part of the code creates a new dataset named want by using the set statement to copy the observations from the have dataset.
    • Three new variables (min_apple, min_carrots, and min_oranges) are calculated using the min function and the of operator.
    • For min_apple, it calculates the minimum value of all variables starting with 'Apples'.
    • For min_carrots, it calculates the minimum value of variables from Carrots1 to Carrots20.
    • For min_oranges, it calculates the minimum value of the array _oranges that includes Oranges1 through Oranges20.
  5. Printing the New Dataset (want😞

    sas
    proc print data=want; var min_:; run;
    • This part of the code prints the want dataset, selecting the variables that start with 'min_'.

In summary, the code generates a dataset with 60 variables, sorts and selects specific variables based on their names, calculates the minimum values for selected variable groups, and prints the results.

 

JMagenta
Obsidian | Level 7

Awesome!

Thank you so much!

But for the last part, where I create new variables, it gets tricky. 

Could this work?

 

 

data have2;
retain &varlist;
set have;
Array Apples_ {*} apples1-apples20;
Array Carrot_{*} Carrots1-carrots20;
Array oranges_{*} oranges1-oranges20;
do i=1 to dim(apples_);
Earlyfruit=min(of apples_[*]);
Afruit=Carrot_[min(of apples_[*])];   /* I want the carrot that is chosen to match the suffix of Apple without changing the value of Carrot*/
Bfruit=Oranges_[min(of apples_[*])]; /* same here with Oranges*/
end;
run;

Reeza
Super User
data have2;
retain &varlist;
set have;
Array Apples_ {*} apples1-apples20;
Array Carrot_{*} Carrots1-carrots20;
Array oranges_{*} oranges1-oranges20;

Earlyfruit=min(of apples_[*]);

index_earlyfruit = whichn(earlyfruit, of apples_(*));
afruit = carrot_(index_earlyfruit);
bfruit = oranges_(index_earlyfruit);
run;

Use WHICHN to find the index of the minimum value. Note that if you have ties, the first one is returned.

 


@JMagenta wrote:

Awesome!

Thank you so much!

But for the last part, where I create new variables, it gets tricky. 

Could this work?

 

 

data have2;
retain &varlist;
set have;
Array Apples_ {*} apples1-apples20;
Array Carrot_{*} Carrots1-carrots20;
Array oranges_{*} oranges1-oranges20;
do i=1 to dim(apples_);
Earlyfruit=min(of apples_[*]);
Afruit=Carrot_[min(of apples_[*])];   /* I want the carrot that is chosen to match the suffix of Apple without changing the value of Carrot*/
Bfruit=Oranges_[min(of apples_[*])]; /* same here with Oranges*/
end;
run;


 

JMagenta
Obsidian | Level 7

To Everyone on this project!!

 

THANK YOU!!!

I am learning day by day, and you are each helping me do that!

 

JMagenta

LinusH
Tourmaline | Level 20

This table structure may serve a purpose for reporting or specific analysis use cases, but it's not very practical for data storage.

It would be better to store your data in a more flexible (normalised) manner, and then create the desired structure when needed.

Also, I'm missing some kind id variable in your example...

I'm suggesting something like

id date/period fruit fruit_no fruit_value

1 jan2023 orange 1 45

 

Then you can use formats or other means to group you values, and other kind of aggregation.  and then PROC TRANSPOSE to create your target layout.

Data never sleeps
Quentin
Super User

I agree with the suggestion to transpose this data into a more vertical / normalized format to make it easier to work with.

 

@JMagenta , could you post some example data?  Ideally, a DATA step that uses the CARDS statement with just a few rows and a few variables, maybe ID Apple1-Apple3 Carrot1-Carrot3 Orange1-Orange3?  Also please show the output data you would want for this example data.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
JMagenta
Obsidian | Level 7


data work.have;
  array Apples  {20} 8;
  array Carrots {20} 8;
  array Oranges {20} 8;
  do i=1 to dim(apples);
    apples[i]=i+1; /* I did "i+5" and so forth so that the numbers were unique values*/
    carrots[i]=i+5;
    oranges[i]=i+8;
  end;
  drop i;
run;


proc sql noprint;
  select name into :varlist separated by ' '
  from dictionary.columns
  where 
    libname='WORK' 
    and memname='HAVE'
    and (
             upcase(name) like 'APPLES%'
          or upcase(name) like 'CARROTS%'
          or upcase(name) like 'ORANGES%'
        )
  order by compress(name,,'kd'), name
  ;
quit;
data have2;
  retain &varlist;
	set have;
Array Apples_ {*} apples1-apples20;
Array Carrot_{*} Carrots1-carrots20;
Array oranges_{*} oranges1-oranges20;
 do i=1 to dim(apples_);
 	Earlyfruit=min(of apples_[*]);
	Afruit=Carrot_[min(of apples_[*])];
	Bfruit=Oranges_{min(of apples_[*])};

	end;

run;

The result value for Easyfruit is correct and equals Apples1 which is 2.

The result for Afruit should equal Carrots1 which is 6 instead it equals 7 (likely the value of Apples6).

The result for Bfruit should equal Oranges 1 which is 9 instead it equals 10 (likely the value of Apples9).

 

 

Kurt_Bremser
Super User

Whenever I see such structures, I see that there is data (fruit/vegetable types and sequence numbers) hidden in structure (variable names). Data belongs in variables, not in their names, so you should transpose your dataset and extract the relevant data from the _NAME_ variable you get from PROC TRANSPOSE.

Then you can do it in SQL, which has the concept of sets (of observations), but not of arrays.

JMagenta
Obsidian | Level 7

Hello,

I get it, but to be clear,

 

This was the only way I could find to group the data. 

Each variable group:

Apples1 Oranges 1 Carrots1 belongs to basket1. 

 

Apples2 Oranges2 Carrots2 belongs to basket2 and so on.

 

So if Apples2 is the variable with the minimum value for that line for Basket1;

Then basically what is Carrots2 and Oranges2:

FruitA=Oranges2;

FruitB=Carrots2;

 

This is what I want.

 

Sorry if it's not too clear.

Kurt_Bremser
Super User

Rid yourself of your Excel-thinking. In Excel, data is always stored in report-form, which is unsuited for computation (unless you have a spreadsheet).

In real computing, data is stored in relational, sequential tables, with one variable for one fact.

So let's transpose you dumb data to intelligent data:

data dumb;
input id $ apples1 apples2 apples3 carrots1 carrots2 carrots3 oranges1 oranges2 oranges3;
datalines;
1 219 285 185 463 192 316 219 352 196
;

proc transpose data=dumb out=intelligent1 (rename=(col1=value));
by id;
var apples: carrots: oranges:;
run;

data intelligent2;
set intelligent1;
i = anydigit(_name_);
fruit = substr(_name_,1,i-1);
seq = input(substr(_name_,i),best.);
drop i _name_;
run;

Then you can use SQL:

proc sql;
create table want as
select
  a.id,
  a.seq,
  a.fruit,
  a.value
from intelligent2 a
where a.seq in (
  select seq 
  from intelligent2 b 
  where b.id = a.id and b.fruit = "apples"
  group by b.id
  having b.value = max(b.value)
)
;
quit;

And create a wide report without hassle:

proc report data=want;
column id seq value,fruit;
define id / group;
define seq / group;
define value / "" analysis;
define fruit / "" across;
run;

Result:

id	seq	apples	carrots	oranges
1	2	285	192	352
Kurt_Bremser
Super User

PS the long structure also allows easy analysis over different groupings, as you can use the values in fruit, or the sequence, as the main index. With the wide structure, this is hard to impossible.

 

See Maxim 19. 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1971 views
  • 4 likes
  • 6 in conversation