Help using Base SAS procedures

Creating Rows w/ null values based on multi-year relationships

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 11
Accepted Solution

Creating Rows w/ null values based on multi-year relationships

[ Edited ]

Hello,

 

I have a dataset of about 50,000 datapoints that can be summarized as 

 

Year         ID           ITEM         Value

2015        223          cars            5

2015        223          hats            10

2015        224          cars            12

2015        224          bikes            2

2014        223          cars             4

2014        223          bikes            1

2014        224          cars              5

2014        224         sneakers       3

 

 

I would like to do comparisons of 2015 values to 2014 values for each ID and Item. So for ID #223 compare the number of cars in 2015 and 2014, easy enough because they both exist. But to compare the number of hats in 2015 to hats in 2014 (when there were none) is giving me problems.

 

I need to create a 2014 row for ID#223 for hats and place a 0 value with it. Essentially I need a set of code that will go through all of the IDs for 2014 and 2015 and create rows with 0 values for all the items they do not share between years.

 

It is not feasible (in terms of time it would take) to create an entire dataset with null values + all the item names and then merge them without duplicates. The time it would take to create that null value dataset would pretty much be as long as it would take me to do this whole thing by hand one at a time.

 

Thank you for your help in advance.


Accepted Solutions
Solution
‎01-06-2017 01:13 PM
Respected Advisor
Posts: 3,777

Re: Creating Rows w/ null values based on multi-year relationships

[ Edited ]

This keeps the item expansion within each ID.

 

data have;
   input Year ID $ ITEM $ Value;
   cards;
2015    223     cars      5
2015    223     hats      10
2015    224     cars      12
2015    224     bikes      2
2014    223     cars       4
2014    223     bikes      1
2014    224     cars       5
2014    224     sneakers    3
;;;;
   run;
proc sort data=have;
   by id;
   run;
proc print;
   run;
proc summary data=have nway completetypes;
   by id;
   class year item;
   freq value;
   output out=expanded;
   run;
proc print;
   run;

Capture.PNG 

View solution in original post


All Replies
Solution
‎01-06-2017 01:13 PM
Respected Advisor
Posts: 3,777

Re: Creating Rows w/ null values based on multi-year relationships

[ Edited ]

This keeps the item expansion within each ID.

 

data have;
   input Year ID $ ITEM $ Value;
   cards;
2015    223     cars      5
2015    223     hats      10
2015    224     cars      12
2015    224     bikes      2
2014    223     cars       4
2014    223     bikes      1
2014    224     cars       5
2014    224     sneakers    3
;;;;
   run;
proc sort data=have;
   by id;
   run;
proc print;
   run;
proc summary data=have nway completetypes;
   by id;
   class year item;
   freq value;
   output out=expanded;
   run;
proc print;
   run;

Capture.PNG 

Occasional Contributor
Posts: 11

Re: Creating Rows w/ null values based on multi-year relationships

Thank you so much for your help.

 

After running into some problems with zero values and negatives in the original dataset (which i ended up removing), it has all come together perfectly!

 

Thanks again!

 

 

Super User
Posts: 5,081

Re: Creating Rows w/ null values based on multi-year relationships

50,000 observations is small enough that you can process it a couple of times.  For example:

 

proc sort data=have out=just_2_years;

   by id item year;

  where year in (2014, 2015);

run;

 

data want;

set have;

by id item;

output;

if first.item and year=2015 then do;

   value=0;

   year=2014;

   output;

end;

else if last.item and year=2014 then do;

   value=0

   year=2015;

   output;

end;

run;

 

The observations might need to be sorted again ... it depends on how you will conduct the analysis from that point.

Super User
Posts: 17,801

Re: Creating Rows w/ null values based on multi-year relationships

If you only have two years why bother. 

 

You don't show how you want the output.

data have;
input Year         ID    $       ITEM   $       Value;
cards;
2015        223          cars            5
2015        223          hats            10
2015        224          cars            12
2015        224          bikes            2
2014        223          cars             4
2014        223          bikes            1
2014        224          cars              5
2014        224         sneakers       3
;
run;

proc sort data=have; by id item year; run;

data want;
set have;
by id item;
diff=dif(value);
if first.item or (first.item and last.item) then diff=.;
run;

Or flip the data to a wide format, usually not recommended, but in this case it makes sense to me.

 

proc transpose data=have out=flipped prefix=Y;
by id item;
var value;
id year;
run;

data want2;
set flipped;
diff= Y2015-Y2014;
run;

 

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 228 views
  • 1 like
  • 4 in conversation