BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JNWong
Calcite | Level 5

Hello,

    

 I have a panel data set with nearly 30 variables in which code indicating the stock code and year indicating the time period from 2009 to 2015. I wanna make some adjustment about it. 28 of them are dummy variable which only take the value of 0 and 1. Now i have to change the definition for the dummy variables and make new 28 variables. The new rule  is that  for one stock code, if the variable take the value of 1 in any of year before 2012 then it would be 1 for all years (from  2009 to 2015).

   

the data set like 

 

data q1;
input a b c d ;
datalines;
20002 2009 0 1
20002 2010 1 0
20002 2013 0 1
20002 2014 0 0
30002 2009 0 1
30002 2010 0 0
30002 2013 1 1
30002 2014 1 0
;
run; 

 

and I want to get a data set like that 

Snipaste_2018-11-08_19-56-42.png

 

I have tried the code below 

data c4;
set c3;
if c = 1 and year < 2012;
run;
proc sort data = c4 nodupkey; by a ; run;
data c4; set c4; rename c = c_n; run;
proc sql; create table c5 as select c3.*,c4.c_n from c3 left join c4 on c3.a = c4.a; quit;
data c3; set c3; if c_n = . then c_n = 0; else c_n = c_n; run;

It works, but I have to deal with 28 variables using the same code and I am wondering  if I can get the data set without wirting those duplicate codes.

 

Thanks,

Jane

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

UNTESTED CODE

proc summary data=have(where=(year<2012)) nway;
    class stock_code;
    var var-var28;
    output out=_max_ max=max1-max28;
run;

data want;
    merge _max_ Have;
    by stock_code;
    array max max1-max28;
    array var var1-var28;
    do i=1 to 28;
        if max(i)=1 and year>=2009 then var(i)=1;
    end;
    drop max1-max28;
run;
--
Paige Miller

View solution in original post

7 REPLIES 7
PaigeMiller
Diamond | Level 26

So, first of all, we would all benefit from meaningful variable names in you data set. So, instead of variable A, you could call it STOCK_CODE and instead of variable B you could call it YEAR. That would help greatly.

 

The new rule  is that  for one stock code, if the variable take the value of 1 in any of year before 2012 then it would be 1 for all years (from  2009 to 2015).

 

This doesn't appear to be what I see in the desired output data set. When A=20002, and B=2009, I see the value of 1 in the input data for D. Then, according to your rule, you want to change D to be equal to 1 for all years from 2009 to 2015, but that is not what i see in the output data set, so there would need to be additional explanation on this matter.

 

Eventually, you are probably going to need to use PROC SUMMARY to determine if there is a 1 in any year before 2012 for each of your 28 variables, and then a data step using arrays (not macros) to change the values. To do this, if you name your variables as CODE1 through CODE28 (or something like that), then your coding becomes much much simpler.

--
Paige Miller
JNWong
Calcite | Level 5

Thanks and I am sorry for the strange variable names and some typo mistakes.

I would change the example data like this;

data p1;
input stock_code year var1 var2 var3;
datalines;
20002 2009 1 0 1
20002 2010 0 1 0
20002 2011 0 0 0
20002 2013 0 0 0
20002 2014 0 0 0
20003 2009 0 0 0
20003 2010 0 0 0
20003 2013 1 1 1
20003 2014 1 1 1
;
run; 

 I have tried the code using array like this

data want;
set have;
array b(28) var1-var28;
array a(28) n_var1-n_var28;
do i = 1 to 28;
if b(i) = 1 and year < 2012 then a(i) = 1;
else a(i) = 0;
end;
run;

seems I have made some mistake with the if conditions. 

 

and I  want to get the data set like that:

Snipaste_2018-11-08_22-18-10.png

PaigeMiller
Diamond | Level 26

UNTESTED CODE

proc summary data=have(where=(year<2012)) nway;
    class stock_code;
    var var-var28;
    output out=_max_ max=max1-max28;
run;

data want;
    merge _max_ Have;
    by stock_code;
    array max max1-max28;
    array var var1-var28;
    do i=1 to 28;
        if max(i)=1 and year>=2009 then var(i)=1;
    end;
    drop max1-max28;
run;
--
Paige Miller
Astounding
PROC Star

Let's try something similar to what you already programmed:

 

data want;
array b {28} var1-var28;
array a {28} n_var1-n_var28;
do k=1 to 28;
   a{k}=0;
end;
do until (last.stock_code);
   set have;
   by stock_code;
   if year < 2012 then do k=1 to 28;
      if b{k}=1 then a{k}=1;
   end;
end;
do until (last.stock_code);
   set have;
   by stock_code;
   output;
end;
run;

The top loop sets the new variables.  Then the bottom loop reads all observations for the stock codes and outputs them (along with the new variables).

JNWong
Calcite | Level 5

Thanks for your code and explanation, I will try it latter : )

Cynthia_sas
SAS Super FREQ

Hi:

  Also confusing is why you are creating table c5 in the PROC SQL step, but then are using C3 in the final DATA step.

Cynthia

JNWong
Calcite | Level 5
Thanks and I should modify the data set name to C5 in the final DATA step.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 922 views
  • 0 likes
  • 4 in conversation