BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Maisha_Huq
Quartz | Level 8

Hi there,

 

I have an input data set, which is structured something like the sample data set below. Would someone have a suggestion for the best way to identify the food item with the least amount of each nutrient type at each school-meal-day combination If the data set is food-level, has  up to 400 food items at a school throughout the week, and almost 2000 schools ?

 

Some details:

 

food item - food-level file where each different food item anew for each school where it's offered and each meal within that school it's offfered. So if a food item, an apple, is offered at 100 schools, for both breakfast and lunch, it'll be listed 200 times in the fole, one row for each school and meal combo where it's offered.

 

meal- breakfast, lunch

day - weekday

nutrient- the amount of a nutrient in that food item

 

Fooditem    School       Meal      Day        Nutrient1   Nutrient2   

Apple           School1    B            Mon         5 cals        10 cals

Apple           School1    M            Mon         5 cals        10 cals

Apple           School1    B            Tues        5 cals        10 cals

Apple           School1    M            Tues       5 cals        10 cals

Apple           School1    B            Wed        5 cals        10 cals

Apple           School1    M            Wed        5 cals        10 cals

Apple           School1    B            Thurs      5 cals        10 cals

Apple           School1    M            Thurs      5 cals        10 cals

Apple           School2    B            Mon         5 cals        10 cals

Apple           School2    M            Mon         5 cals        10 cals

Apple           School2    B            Tues        5 cals        10 cals

Apple           School2    M            Tues       5 cals        10 cals

Apple           School2    B            Wed        5 cals        10 cals

Apple           School2    M            Wed        5 cals        10 cals

Apple           School2    B            Thurs      5 cals        10 cals

Apple           School2    M            Thurs      5 cals        10 cals

Milk           School1    B            Mon         5 cals        10 cals

Milk           School1    M            Mon         5 cals        10 cals

Milk           School1    B            Tues        5 cals        10 cals

Milk           School1    M            Tues       5 cals        10 cals

Milk           School1    B            Wed        5 cals        10 cals

Milk           School1    M            Wed        5 cals        10 cals

Milk           School1    B            Thurs      5 cals        10 cals

Milk           School1    M            Thurs      5 cals        10 cals

Milk           School2    B            Mon         5 cals        10 cals

Milk           School2    M            Mon         5 cals        10 cals

Milk           School2    B            Tues        5 cals        10 cals

Milk           School2    M            Tues       5 cals        10 cals

Milk           School2    B            Wed        5 cals        10 cals

Milk           School2    M            Wed        5 cals        10 cals

Milk           School2    B            Thurs      5 cals        10 cals

Milk           School2    M            Thurs      5 cals        10 cals

 

Woulld there be another way aside from

- assigning a macro and then callin a proc sort process for each school/day/meal/nutrient/food combo?

- and aside from sorting the dataset by a list of BY vars then creating a concatenated var for each school/day/meal/food nutrient combo then identifying each unique first combo as the one with the least of each nutrient level?

Thank you in advance

 

I use SAS 9.4

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Note, just seen your reply, still the below applies, just need to change the sum to min, then transpose back up.  I am leaving for the day now, so will see if I can get code later.

 

At a guess, it would help if you first normalised your data, i.e. have:

 

Fooditem    School       Meal      Day        Nutrient_Number   Cals  

Apple           School1    B            Mon       1                            5        

Apple           School1    B            Mon        2                          10

...

 

Then do you sums, example:

data have;
  length fooditem school meal day nutrient1 nutrient2 $50;
  input fooditem $ school $ meal $ day $ nutrient1 $ nutrient2 $;
datalines;
Apple           School1    B            Mon         5cals        10cals
Apple           School1    M            Mon         5cals        10cals
Apple           School1    B            Tues        5cals        10cals
Apple           School1    M            Tues       5cals        10cals
Apple           School1    B            Wed        5cals        10cals
;
run;

data inter (drop=nutrient1-nutrient2);
  set have;
  nutrient_number=1;
  nutrient=input(compress(nutrient1,,"kd"),best.);
  output;
  nutrient_number=2;
  nutrient=input(compress(nutrient2,,"kd"),best.);
  output;
run;

proc sql;
  create table STATS as
  select  FOODITEM,
          SCHOOL,
          MEAL,
          DAY,
          NUTRIENT_NUMBER,
          sum(NUTRIENT) as RES
  from    INTER
  group by FOODITEM,
            SCHOOL,
            MEAL,
            DAY,
            NUTRIENT_NUMBER;
quit;

 

View solution in original post

5 REPLIES 5
ballardw
Super User

Could you provide an example of the output you want?

 

Maisha_Huq
Quartz | Level 8

HI ballardw,

 

The output dataset would be school-level, one record per school ID and it would look something like this:

 

School_ID     Lunch_Day1_Lowestfat_fruit   Lunch_Day2_Lowestfat_fruit  Lunch_Day3_Lowestfat_fruit   ...

SCHID1          apple                                           pear                                              apple

SCHID2           apple                                           pear                                              pear

SCHID3           apple                                           apple                                             apple

 

 

 

where the other variables would be:

Lunch_Day1_Lowestfat_fruit

Lunch_Day2_Lowestfat_fruit

Lunch_Day3_Lowestfat_fruit  

Lunch_Day4_Lowestfat_fruit  

Lunch_Day5_Lowestfat_fruit  

 

Lunch_Day1_Lowestfat_grain

Lunch_Day2_Lowestfat_grain

Lunch_Day3_Lowestfat_grain

Lunch_Day4_Lowestfat_grain

Lunch_Day5_Lowestfat_grain

 

Lunch_Day1_Lowestfat_meat

Lunch_Day2_Lowestfat_meat

Lunch_Day3_Lowestfat_meat

Lunch_Day4_Lowestfat_meat

Lunch_Day5_Lowestfat_meat

 

 Let me know if I can clarify and thank you!

ballardw
Super User

To get column headings such as LowestFat_fruit the input data set will require something that identifies each food as to its category: fruite, vegetable, meat, dairy or what ever categories you may be using AND identification as to which nutrient is being considered. Since your example data mentioned nutrient1 and nutrient2 - which is Fat? or other category?

 

If there are ties, since ALL of your example data had the exact same values does it matter which gets reported as the lowest?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Note, just seen your reply, still the below applies, just need to change the sum to min, then transpose back up.  I am leaving for the day now, so will see if I can get code later.

 

At a guess, it would help if you first normalised your data, i.e. have:

 

Fooditem    School       Meal      Day        Nutrient_Number   Cals  

Apple           School1    B            Mon       1                            5        

Apple           School1    B            Mon        2                          10

...

 

Then do you sums, example:

data have;
  length fooditem school meal day nutrient1 nutrient2 $50;
  input fooditem $ school $ meal $ day $ nutrient1 $ nutrient2 $;
datalines;
Apple           School1    B            Mon         5cals        10cals
Apple           School1    M            Mon         5cals        10cals
Apple           School1    B            Tues        5cals        10cals
Apple           School1    M            Tues       5cals        10cals
Apple           School1    B            Wed        5cals        10cals
;
run;

data inter (drop=nutrient1-nutrient2);
  set have;
  nutrient_number=1;
  nutrient=input(compress(nutrient1,,"kd"),best.);
  output;
  nutrient_number=2;
  nutrient=input(compress(nutrient2,,"kd"),best.);
  output;
run;

proc sql;
  create table STATS as
  select  FOODITEM,
          SCHOOL,
          MEAL,
          DAY,
          NUTRIENT_NUMBER,
          sum(NUTRIENT) as RES
  from    INTER
  group by FOODITEM,
            SCHOOL,
            MEAL,
            DAY,
            NUTRIENT_NUMBER;
quit;

 

Ksharp
Super User

How you define " the least amount of each nutrient type" ?

I think you need transpose your data like this:

 

    School     Meal      Day        AppleNutrient1   AppleNutrient2   MilkNutrient1   MilkNutrient2   

    School1    B            Mon         5 cals        10 cals     5 cals        10 cals 

     School1    M            Mon         5 cals        10 cals     5 cals        10 cals 

 

 

and then GRAYCODE() may do that trick .

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1030 views
  • 0 likes
  • 4 in conversation