DATA Step, Macro, Functions and more

Reorganizing input data set

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 131
Accepted Solution

Reorganizing input data set

Hi there,

 

I have an input data set, which is structured something like the sample data set below. Would someone have a suggestion for the best way to identify the food item with the least amount of each nutrient type at each school-meal-day combination If the data set is food-level, has  up to 400 food items at a school throughout the week, and almost 2000 schools ?

 

Some details:

 

food item - food-level file where each different food item anew for each school where it's offered and each meal within that school it's offfered. So if a food item, an apple, is offered at 100 schools, for both breakfast and lunch, it'll be listed 200 times in the fole, one row for each school and meal combo where it's offered.

 

meal- breakfast, lunch

day - weekday

nutrient- the amount of a nutrient in that food item

 

Fooditem    School       Meal      Day        Nutrient1   Nutrient2   

Apple           School1    B            Mon         5 cals        10 cals

Apple           School1    M            Mon         5 cals        10 cals

Apple           School1    B            Tues        5 cals        10 cals

Apple           School1    M            Tues       5 cals        10 cals

Apple           School1    B            Wed        5 cals        10 cals

Apple           School1    M            Wed        5 cals        10 cals

Apple           School1    B            Thurs      5 cals        10 cals

Apple           School1    M            Thurs      5 cals        10 cals

Apple           School2    B            Mon         5 cals        10 cals

Apple           School2    M            Mon         5 cals        10 cals

Apple           School2    B            Tues        5 cals        10 cals

Apple           School2    M            Tues       5 cals        10 cals

Apple           School2    B            Wed        5 cals        10 cals

Apple           School2    M            Wed        5 cals        10 cals

Apple           School2    B            Thurs      5 cals        10 cals

Apple           School2    M            Thurs      5 cals        10 cals

Milk           School1    B            Mon         5 cals        10 cals

Milk           School1    M            Mon         5 cals        10 cals

Milk           School1    B            Tues        5 cals        10 cals

Milk           School1    M            Tues       5 cals        10 cals

Milk           School1    B            Wed        5 cals        10 cals

Milk           School1    M            Wed        5 cals        10 cals

Milk           School1    B            Thurs      5 cals        10 cals

Milk           School1    M            Thurs      5 cals        10 cals

Milk           School2    B            Mon         5 cals        10 cals

Milk           School2    M            Mon         5 cals        10 cals

Milk           School2    B            Tues        5 cals        10 cals

Milk           School2    M            Tues       5 cals        10 cals

Milk           School2    B            Wed        5 cals        10 cals

Milk           School2    M            Wed        5 cals        10 cals

Milk           School2    B            Thurs      5 cals        10 cals

Milk           School2    M            Thurs      5 cals        10 cals

 

Woulld there be another way aside from

- assigning a macro and then callin a proc sort process for each school/day/meal/nutrient/food combo?

- and aside from sorting the dataset by a list of BY vars then creating a concatenated var for each school/day/meal/food nutrient combo then identifying each unique first combo as the one with the least of each nutrient level?

Thank you in advance

 

I use SAS 9.4


Accepted Solutions
Solution
‎03-10-2016 08:13 AM
Super User
Super User
Posts: 7,988

Re: Reorganizing input data set

[ Edited ]
Posted in reply to Maisha_Huq

Note, just seen your reply, still the below applies, just need to change the sum to min, then transpose back up.  I am leaving for the day now, so will see if I can get code later.

 

At a guess, it would help if you first normalised your data, i.e. have:

 

Fooditem    School       Meal      Day        Nutrient_Number   Cals  

Apple           School1    B            Mon       1                            5        

Apple           School1    B            Mon        2                          10

...

 

Then do you sums, example:

data have;
  length fooditem school meal day nutrient1 nutrient2 $50;
  input fooditem $ school $ meal $ day $ nutrient1 $ nutrient2 $;
datalines;
Apple           School1    B            Mon         5cals        10cals
Apple           School1    M            Mon         5cals        10cals
Apple           School1    B            Tues        5cals        10cals
Apple           School1    M            Tues       5cals        10cals
Apple           School1    B            Wed        5cals        10cals
;
run;

data inter (drop=nutrient1-nutrient2);
  set have;
  nutrient_number=1;
  nutrient=input(compress(nutrient1,,"kd"),best.);
  output;
  nutrient_number=2;
  nutrient=input(compress(nutrient2,,"kd"),best.);
  output;
run;

proc sql;
  create table STATS as
  select  FOODITEM,
          SCHOOL,
          MEAL,
          DAY,
          NUTRIENT_NUMBER,
          sum(NUTRIENT) as RES
  from    INTER
  group by FOODITEM,
            SCHOOL,
            MEAL,
            DAY,
            NUTRIENT_NUMBER;
quit;

 

View solution in original post


All Replies
Super User
Posts: 11,343

Re: Reorganizing input data set

Posted in reply to Maisha_Huq

Could you provide an example of the output you want?

 

Frequent Contributor
Posts: 131

Re: Reorganizing input data set

[ Edited ]

HI ballardw,

 

The output dataset would be school-level, one record per school ID and it would look something like this:

 

School_ID     Lunch_Day1_Lowestfat_fruit   Lunch_Day2_Lowestfat_fruit  Lunch_Day3_Lowestfat_fruit   ...

SCHID1          apple                                           pear                                              apple

SCHID2           apple                                           pear                                              pear

SCHID3           apple                                           apple                                             apple

 

 

 

where the other variables would be:

Lunch_Day1_Lowestfat_fruit

Lunch_Day2_Lowestfat_fruit

Lunch_Day3_Lowestfat_fruit  

Lunch_Day4_Lowestfat_fruit  

Lunch_Day5_Lowestfat_fruit  

 

Lunch_Day1_Lowestfat_grain

Lunch_Day2_Lowestfat_grain

Lunch_Day3_Lowestfat_grain

Lunch_Day4_Lowestfat_grain

Lunch_Day5_Lowestfat_grain

 

Lunch_Day1_Lowestfat_meat

Lunch_Day2_Lowestfat_meat

Lunch_Day3_Lowestfat_meat

Lunch_Day4_Lowestfat_meat

Lunch_Day5_Lowestfat_meat

 

 Let me know if I can clarify and thank you!

Super User
Posts: 11,343

Re: Reorganizing input data set

Posted in reply to Maisha_Huq

To get column headings such as LowestFat_fruit the input data set will require something that identifies each food as to its category: fruite, vegetable, meat, dairy or what ever categories you may be using AND identification as to which nutrient is being considered. Since your example data mentioned nutrient1 and nutrient2 - which is Fat? or other category?

 

If there are ties, since ALL of your example data had the exact same values does it matter which gets reported as the lowest?

Solution
‎03-10-2016 08:13 AM
Super User
Super User
Posts: 7,988

Re: Reorganizing input data set

[ Edited ]
Posted in reply to Maisha_Huq

Note, just seen your reply, still the below applies, just need to change the sum to min, then transpose back up.  I am leaving for the day now, so will see if I can get code later.

 

At a guess, it would help if you first normalised your data, i.e. have:

 

Fooditem    School       Meal      Day        Nutrient_Number   Cals  

Apple           School1    B            Mon       1                            5        

Apple           School1    B            Mon        2                          10

...

 

Then do you sums, example:

data have;
  length fooditem school meal day nutrient1 nutrient2 $50;
  input fooditem $ school $ meal $ day $ nutrient1 $ nutrient2 $;
datalines;
Apple           School1    B            Mon         5cals        10cals
Apple           School1    M            Mon         5cals        10cals
Apple           School1    B            Tues        5cals        10cals
Apple           School1    M            Tues       5cals        10cals
Apple           School1    B            Wed        5cals        10cals
;
run;

data inter (drop=nutrient1-nutrient2);
  set have;
  nutrient_number=1;
  nutrient=input(compress(nutrient1,,"kd"),best.);
  output;
  nutrient_number=2;
  nutrient=input(compress(nutrient2,,"kd"),best.);
  output;
run;

proc sql;
  create table STATS as
  select  FOODITEM,
          SCHOOL,
          MEAL,
          DAY,
          NUTRIENT_NUMBER,
          sum(NUTRIENT) as RES
  from    INTER
  group by FOODITEM,
            SCHOOL,
            MEAL,
            DAY,
            NUTRIENT_NUMBER;
quit;

 

Super User
Posts: 10,044

Re: Reorganizing input data set

Posted in reply to Maisha_Huq

How you define " the least amount of each nutrient type" ?

I think you need transpose your data like this:

 

    School     Meal      Day        AppleNutrient1   AppleNutrient2   MilkNutrient1   MilkNutrient2   

    School1    B            Mon         5 cals        10 cals     5 cals        10 cals 

     School1    M            Mon         5 cals        10 cals     5 cals        10 cals 

 

 

and then GRAYCODE() may do that trick .

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 258 views
  • 0 likes
  • 4 in conversation