BookmarkSubscribeRSS Feed
Gladis6680
Obsidian | Level 7

Hi, I have a data with Produce as my BY Variable /unique variable

My goal is to give potential Amounts to Produce that does not have an amount.

 

I started off by creating another data set with Produce that have an amount and then later thought of merging with the main data, but my dilemma is I have 3 Chicken with 2 different Manufacturer and 2 brands giving 3 amounts, similarly for MILK.

 

Ideally I would like to show amounts of Chicken with the corresponding Manufacturer, Brand and Amount. 

 

Produce Manufacturer Brand Amount
Beans CLICxxx Clicxxxx $0
Kidneys Pantry Pantry Shelf $0
Brussel Sprouts Limson Limson $0
Cabbage Bear Bear Creek $0
Chicken Grannys  Grannys Poultry Cooperative $0
Milk Global Global Direct $0
Beans Ital Ital-Pasta $4
Kidneys Primo Primo $5
Brussel Sprouts Alasko Alasko $6
Cabbage Keybrand Laurie's $7
Chicken Galco Flamingo $2
Chicken Maple   $3
Chicken Maple Kitchen Essential $6
Milk Agropur Sealtest $5
Milk Saputo Neilson $6

 

7 REPLIES 7
Reeza
Super User
Imputation for missing values - you do need to specify the rules. Personally, If the brand or manufacture were the same I'd take one of those, likely Brand first, then manufacturer. And then if it was still missing I'd fill in with the Median value for that item instead.
Gladis6680
Obsidian | Level 7

Thank you for your response

Can you please explain what you mean with an example

 

Gladis6680
Obsidian | Level 7

Hi, I feel this subject may not be the appropriate title.sorry

I thought I would explain my concern once again. I have changed the data slightly. Produce is my unique variable/by variable

ProduceManufacturerBrandAmount
BeansCLICxxxClicxxxx$0
KidneysPantryPantry Shelf$0
Brussel SproutsLimsonLimson$0
CabbageBearBear Creek$0
ChickenGrannys Grannys Poultry Cooperative$0
MilkGlobalGlobal Direct$0
BeansItalItal-Pasta$4
KidneysPrimoPrimo$5
Brussel SproutsAlaskoAlasko$6
CabbageKeybrandLaurie's$7
ChickenGalcoFlamingo$2
ChickenMapleSofina$3
ChickenMapleKitchen Essential$6
MilkAgropurSealtest$5
MilkSaputoNeilson$6

What I would like is the table below:

 

ProduceManufacturerBrandAmountManufacturer_1New_Brand_1Potential_Amount_1Manufacturer_2New_Brand_2Potential_Amount_2Manufacturer_3New_Brand_3Potential_Amount_3
BeansCLICxxxClicxxxx$0ItalItal-Pasta$4      
KidneysPantryPantry Shelf$0PrimoPrimo$5      
Brussel SproutsLimsonLimson$0AlaskoAlasko$6      
CabbageBearBear Creek$0KeybrandLaurie's$7      
ChickenGrannys Grannys Poultry Cooperative$0GalcoFlamingo$2MapleSofina$3MapleKitchen Essential$6
MilkGlobalGlobal Direct$0AgropurSealtest$5SaputoNeilson$6   

Basically for the same Produce variable that does not have an Amount, I would like to give Potential amounts with the corresponding Manufacturer and Brand and if there are multiple Manufactuers, Brands and Amounts for the Produce variable...in this case Chicken and Milk, I would like to see those potential Manufacturers, Brands, Amounts.

Reeza
Super User
You'll have to merge in your data. I'd probably recommend doing all the merges at once and then capture the values into a row. Then using the logic to determine which values you want to use to fill it in.

If you're having trouble with this, rest assured it's not a beginner level problem and tbh the biggest difficulty is setting the logic, not necessarily the programming.
Gladis6680
Obsidian | Level 7

Thank you, is it possible for you to help me with a sample code. I think it will be easier for me to understand

Reeza
Super User

Untested but this should get you started. You'll end up with a data set, step 3 that you can then decide how to impute the missing data. 

Note that you can end up with duplicate matches so you'll need to decide how to filter/limit when you have multiple matches. You may want to add a key to your 'have' table to uniquely identify the rows. 

 

DATA HAVE;
INFILE CARDS DLM=',' TRUNCOVER;
LENGTH PRODUCE MANUFACTURER BRAND $20;
INPUT Produce $ Manufacturer  $  Brand $  Amount;
CARDS;
Beans,   CLICxxx, Clicxxxx,    0
Kidneys, Pantry,  Pantry Shelf,    0
Brussel Sprouts, Limson,  Limson,  0
Cabbage, Bear,    Bear Creek,  0
Chicken, Grannys,     Grannys Poultry Cooperative, 0
Milk,    Global,  Global Direct,   0
Beans,   Ital,    Ital-Pasta,  4
Kidneys, Primo,   Primo,   5
Brussel Sprouts, Alasko,  Alasko,  6
Cabbage, Keybrand,    Laurie's,    7
Chicken, Galco,   Flamingo,    2
Chicken, Maple,    ,   3
Chicken, Maple,   Kitchen Essential,   6
Milk,    Agropur, Sealtest,    5
Milk,    Saputo,  Neilson, 6
;;;;


*MERGE IN BY MANUFACTURER AND BRAND;
proc sql;
create table step1 as
select t1.*, t2.amount as amount1
from have as t1
left join have as t2
on t1.produce = t2.produce 
and t1.manufacturer=t2.manufacturer
and t1.brand=t2.brand;
quit;

*MERGE IN BY BRAND;
proc sql;
create table step2 as
select t1.*, t2.amount as amount2
from step1 as t1
left join have as t2
on t1.produce = t2.produce 
and t1.brand=t2.brand;
quit;

*MERGE IN BY Manufacturer;
proc sql;
create table step3 as
select t1.*, t2.amount as amount3
from step2 as t1
left join have as t2
on t1.produce = t2.produce 
and t1.manufacturer=t2.manufacturer;
quit;


/*insert data step to fill in logic here*/

data want;
set step3;

if amount = 0 the coalesce(amount1, amount2, amount3);

run;

@Gladis6680 wrote:

Thank you, is it possible for you to help me with a sample code. I think it will be easier for me to understand


 

Gladis6680
Obsidian | Level 7

Thank you very much, I forgot to mention, that ..yes i am still figuring out which logic to use first(I may use the min AMOUNT, or max AMOUNT) for duplicates (like for Chicken,which Manufacture and which Brand I should use) .....but for the time being I would like the output to have all the the Chicken - Manufacture,  Brand and Amount side by side like Manufacture_1 Brand _1 Manufacture_2 Brand_2.

Also I am so sorry is it also possible to give this to me in a DATA STEP as I not familiar with SQL.

Thank for your help

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 930 views
  • 0 likes
  • 2 in conversation