BookmarkSubscribeRSS Feed
znhnm
Quartz | Level 8

I have a SAS datasets. I have 18 month columns:

    Jul21  Aug21 Sep21 Oct21 Nov21 Dec21 Jan22 Feb22 Mar22 May22 Apr22 Jun22 Jul22 Aug22 Sep22 Oct22 Nov22

    Dec22

Each of them has values between 1-100 for each record.

I also have a reference_month column which has month values in numbers between 1 to 12.

I want to create 7 new columns score0 to score6. For example score0, score1, etc... They will get their values between the column's values starting from Jan22 to Dec22, depending on the reference_month.

For example, 
If reference_month is 1, score0 column will take the value of Jan22 column.
If reference_month is 2, score0 column will take the value of Feb22 column.
If reference_month is 3, score0 column will take the value of Mar22 column.
...
If reference_month is 12, score0 column will take the value of Dec22 column.

Then the other 6 score columns (score1, score2, ..., score6) will take he value of the previous month's.

For example,
If reference_month is 1,
score1 column will take the value of Dec21 column
score2 column will take the value of Nov21 column
score3 column will take the value of Oct21 column
score4 column will take the value of Sep21 column
score5 column will take the value of Aug21 column
score6 column will take the value of Jul21 column

How can I do this? Thank you!

8 REPLIES 8
PaigeMiller
Diamond | Level 26

Please show us a portion of this data. Please provide the data as working SAS data step code, which you can type in yourself, or follow these instructions. Screen captures and file attachments are not acceptable.

--
Paige Miller
Astounding
PROC Star

To write any kind of program, we need to know the names of the variables.  When you say column 01-22, what is the variable name for that column?

znhnm
Quartz | Level 8
Hi, they are the actual column names. For example, column "01-22" referring to the first month of 2022...
PaigeMiller
Diamond | Level 26

As I requested, SHOW US THE DATA.

 

Please follow the instructions I gave earlier.

 

Hi, they are the actual column names. For example, column "01-22" referring to the first month of 2022...

 

Are those the column names in Excel?

--
Paige Miller
znhnm
Quartz | Level 8

Hi, sure!

This should create a similar structure data I have. I edited my question so that the column names in the question and in the code are consistent.


data dataset;
do i = 1 to 10;
reference_month = int(12*ranuni(123));
Jul21 = int(100*ranuni(123));
Aug21 = int(100*ranuni(123));
Sep21 = int(100*ranuni(123));
Oct21 = int(100*ranuni(123));
Nov21 = int(100*ranuni(123));
Dec21 = int(100*ranuni(123));
Jan22 = int(100*ranuni(123));
Feb22 = int(100*ranuni(123));
Mar22 = int(100*ranuni(123));
May22 = int(100*ranuni(123));
Apr22 = int(100*ranuni(123));
Jun22 = int(100*ranuni(123));
Jul22 = int(100*ranuni(123));
Aug22 = int(100*ranuni(123));
Sep22 = int(100*ranuni(123));
Oct22 = int(100*ranuni(123));
Nov22 = int(100*ranuni(123));
Dec22 = int(100*ranuni(123));
output;
end;
run;
ballardw
Super User

If you want to really reference things by ORDER , i.e. "Previous" column, then save yourself a lot of headaches and name the variables in order. Col1, Col2, Col3. Or Month1, Month2, Month3.

 

Does this mean that your data will continue to add columns in the future? If so, it is time to quit thinking like a spreadsheet user as that is real inefficient use of SAS.

To "reference" a variable of a group in order than an ARRAY is typically the solution. ARRAY statement defines a list of variables in an order that you can then address which ever variable (not column, variable) that you want.

data dummy;
  set dataset;
  array M (*) Tem21 Aug21; /*continue listing the variables in order
     I'm too lazy to use bad names*/
  want = m[reference_month];
run; 

An array is structure that allows using an index value(the proper name of the value you are calling "reference_month" as that is a specific use a range of values). The name M is arbitrary, it could be any valid name that does not repeat that of any actual variable. All of the values in an array must be of the same type. The value inside the parentheses or brackets after the name must be an integer to be useful and correspond the the number of variables in the array. There are some details involved. Look up the documentation as this a common programming tool and has a lots of options that get involved that I am not going to repeat here.

 

Actually the array above should be able to use a list of the form with -- , yes two dashes, for adjacent variables in a data set.

But you have not actually provided data in a usable form so I am not actually sure it will work with your data.

Array m(*) Tem21 -- Dec22;

 

 

PaigeMiller
Diamond | Level 26

I personally refuse to work with data in such a horrible (spreadsheet) format. I recommend you do that too. Apparently, @ballardw also recommends you do that.

 

I would transpose the data so that calendar information is now contained in a variable.

 

data a;
do i = 1 to 10;
reference_month = int(12*ranuni(123));
Jul21 = int(100*ranuni(123));
Aug21 = int(100*ranuni(123));
Sep21 = int(100*ranuni(123));
Oct21 = int(100*ranuni(123));
Nov21 = int(100*ranuni(123));
Dec21 = int(100*ranuni(123));
Jan22 = int(100*ranuni(123));
Feb22 = int(100*ranuni(123));
Mar22 = int(100*ranuni(123));
May22 = int(100*ranuni(123));
Apr22 = int(100*ranuni(123));
Jun22 = int(100*ranuni(123));
Jul22 = int(100*ranuni(123));
Aug22 = int(100*ranuni(123));
Sep22 = int(100*ranuni(123));
Oct22 = int(100*ranuni(123));
Nov22 = int(100*ranuni(123));
Dec22 = int(100*ranuni(123));
output;
end;
run;

proc transpose data=a out=b prefix=value;
    by i reference_month;
    var jul21--dec22;
run;
data c;
    set b;
    which_month=input(_name_,monyy.);
    prev_col1=lag(value1);
    prev_col2=lag2(value1);
    prev_col3=lag3(value1);
    prev_col4=lag4(value1);
    prev_col5=lag5(value1);
    prev_col6=lag6(value1);
    if month(which_month)=reference_month then do;
    	score0=value1;
    	score1=prev_col1;
    	score2=prev_col2;
    	score3=prev_col3;
    	score4=prev_col4;
    	score5=prev_col5;
    	score6=prev_col6;
	end;
	if n(of score:)>0 then output;
    format which_month monyy5.;
    drop jul21--dec22 _name_ prev_:;
run;

 

Since your original data has Apr22 and May 22 in the wrong order, I haven't tried to fix that at all, but then of course the results have wrong ordering. I leave it as an assignment for you to fix.

 

Also, @znhnm , please help us (which helps you get faster and better answers) by always providing example data, as working SAS data step code. Don't wait for us to ask, just do it. Thank you.

--
Paige Miller
Tom
Super User Tom
Super User

To reference variables by position use ARRAYs.  That is what they are for.  You can reference a variable by its position in the ARRAY.

data want ;
  set have;
  array values Jul21 Aug21 Sep21 Oct21 Nov21 Dec21 Jan22 Feb22 Mar22 
               May22 Apr22 Jun22 Jul22 Aug22 Sep22 Oct22 Nov22 Dec22 ;
  array score[0:6] score0-score6 ;
  do offset = 0 to 6 while (reference_month-offset>0);
    score[offset]=value[reference_month-offset];
  end;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 741 views
  • 3 likes
  • 5 in conversation