BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Aman4SAS
Obsidian | Level 7

i have a dataset

yearmonth areacode barcode tot_qty tot_value

where year months are like : 201401

                                              201402

                                              201403

i need a output like

areacode barcode _201401_tot_qty _201401_tot_value and so on for all months

unique is [yearmonth+ areacode +barcode]

is there any possibilities to do it with proc transpose..

thanks in advance

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, yes you could do that by using id and idlabel:

proc transpose data =xyz;

     by abc;

     id year;

     idlabel year;

run;

I wouldn't recommend that approach however.  Its something I have seen a lot here on the forums, putting data in the variable name.  Its not good practice for several reasons:

- You lose the direct arraying functionality, ee.g col1 col2 col3 etc. can be referenced as array col{*}. 

- Also not having the number at the end means no list approaches, e.g. col1-col10.

- Your code will become big and unweildy very quickly as you try to remember all the variables, and your table will stretch out to unreadable lengths.

I would advise you re-assess what you are trying to do, if its for a report, then will anyone really look at a table with more than xx columns?  I personally wouldn't be scrolling across to review in Excel for instance.  If it is for further coding then there are better ways of doing it, i.e. keeping the normalised structure and using by group processing.  If for something else, consider re-doing the output or structure to better fit the purpose, maybe a yearly based report so you can fix 12 months.  Etc.

View solution in original post

15 REPLIES 15
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, yes you could do that by using id and idlabel:

proc transpose data =xyz;

     by abc;

     id year;

     idlabel year;

run;

I wouldn't recommend that approach however.  Its something I have seen a lot here on the forums, putting data in the variable name.  Its not good practice for several reasons:

- You lose the direct arraying functionality, ee.g col1 col2 col3 etc. can be referenced as array col{*}. 

- Also not having the number at the end means no list approaches, e.g. col1-col10.

- Your code will become big and unweildy very quickly as you try to remember all the variables, and your table will stretch out to unreadable lengths.

I would advise you re-assess what you are trying to do, if its for a report, then will anyone really look at a table with more than xx columns?  I personally wouldn't be scrolling across to review in Excel for instance.  If it is for further coding then there are better ways of doing it, i.e. keeping the normalised structure and using by group processing.  If for something else, consider re-doing the output or structure to better fit the purpose, maybe a yearly based report so you can fix 12 months.  Etc.

Aman4SAS
Obsidian | Level 7

Its not working as my date are like:

yearmonth areacode barcode tot_qty tot_value

201402 AR01 1101 67 123

201402 AR01 1201 67 123

201402 AR01 1301 67 123

201402 AR01 1401 67 123

201402 AR01 1501 67 123

201402 AR01 1601 67 123

201402 AR01 1701 67 123

201403 AR01 1101 67 123

201404 AR01 1201 67 123

201405 AR01 1301 67 123

201406 AR01 1401 67 123

201407 AR01 1501 67 123

201408 AR01 1601 67 123

201409 AR01 1701 67 123

my Desired output like :

areacode barcode _201402_tot_qty _201402_tot_value ...  so on for year

I have done it by array,,

is there any possiblities in proc transpose to do the same???

Tom
Super User Tom
Super User

I don't understand your desired structure because the way you describe it cannot work since the first 8 records all have the same values for YEARMONTH variable.

Aman4SAS
Obsidian | Level 7

Original data:(input data)

yearmonth areacode barcode tot_qty tot_value

201402 AR01 1101 67 123

201402 AR01 1201 67 123

201402 AR01 1301 67 123

201402 AR01 1401 67 123

201402 AR01 1501 67 123

201402 AR01 1601 67 123

201402 AR01 1701 67 123

201403 AR01 1101 67 123

201403 AR01 1201 67 123

201403 AR01 1301 67 123

201403 AR01 1401 67 123

201403 AR01 1501 67 123

201403 AR01 1601 67 123

201403 AR01 1701 67 123

....

till 201501

Desired ouput:

areacode barcode _201402_tot_qty _201402_tot_value _201403_tot_qty _201403_tot_value and so on .. till  _201501_tot_qty _201501_tot_

AR01 1101                67                      123                          67                             123

AR01 1201                67                     123                          67                              123

AR01 1301                67                      123                         67                              123

AR01 1401                67                     123                          67                              123

AR01 1501                67                     123                          67                              123

AR01 1601                67                     123                          67                              123

AR01 1701                67                     123                          67                              123

... many area and many barcode..

* number are not fixed.

art297
Opal | Level 21

I think you are only asking how to do something like:

proc sort data=have out=need;

  by areacode barcode yearmonth;

run;

proc transpose data=need out=need2 (drop=_:) prefix=tot_qty_;

  by areacode barcode;

  var tot_qty;

  id yearmonth;

run;

proc transpose data=need out=need3 (drop=_:) prefix=tot_value_;

  by areacode barcode;

  var tot_value;

  id yearmonth;

run;

data want;

  merge need2 need3;

  by areacode barcode;

run;

Aman4SAS
Obsidian | Level 7

This is what i have done, but its showing error that:  The ID value "_tot_qty_201402" occurs twice in the same by group

                                                                                    Too many bad by groups

Reeza
Super User

Then you have duplicates in your BY groups.

What do you want to happen when you have multiple records for an areacode/barcode combination?

Data duplicates;

set have;

by areacod barcode;

if not (first.barcode and last.barcode);

run;

Aman4SAS
Obsidian | Level 7

plz have a look on data:

Original data:(input data)

yearmonth areacode barcode tot_qty tot_value

201402 AR01 1101 67 123

201402 AR01 1201 67 123

201402 AR01 1301 67 123

201402 AR01 1401 67 123

201402 AR01 1501 67 123

201402 AR01 1601 67 123

201402 AR01 1701 67 123

201403 AR01 1101 67 123

201403 AR01 1201 67 123

201403 AR01 1301 67 123

201403 AR01 1401 67 123

201403 AR01 1501 67 123

201403 AR01 1601 67 123

201403 AR01 1701 67 123

....

till 201501

Desired ouput:

areacode barcode _201402_tot_qty _201402_tot_value _201403_tot_qty _201403_tot_value and so on .. till  _201501_tot_qty _201501_tot_

AR01 1101                67                      123                          67                             123

AR01 1201                67                     123                          67                              123

AR01 1301                67                      123                         67                              123

AR01 1401                67                     123                          67                              123

AR01 1501                67                     123                          67                              123

AR01 1601                67                     123                          67                              123

AR01 1701                67                     123                          67                              123

... many area and many barcode..

* number are not fixed.

Tom
Super User Tom
Super User

Try this.  But if you have duplicates you might need to make adjustments.

data have ;

  input yearmonth areacode $ barcode $ tot_qty tot_value ;

cards;

201402 AR01 1101 67 123

201402 AR01 1201 67 123

201402 AR01 1301 67 123

201402 AR01 1401 67 123

201402 AR01 1501 67 123

201402 AR01 1601 67 123

201402 AR01 1701 67 123

201403 AR01 1101 67 123

201403 AR01 1201 67 123

201403 AR01 1301 67 123

201403 AR01 1401 67 123

201403 AR01 1501 67 123

201403 AR01 1601 67 123

201403 AR01 1701 67 123

;;;;

proc transpose data=have out=middle ;

  by areacode barcode yearmonth notsorted;

  var tot_qty tot_value ;

run;

proc sort ;

  by areacode barcode yearmonth _name_ ;

run;

proc transpose data=middle out=want(drop=_name_) delim=_;

  by areacode barcode ;

  id yearmonth _name_;

  var col1 ;

run;

Aman4SAS
Obsidian | Level 7

here will miss col2 values (in case).

code still not working with my data.

Reeza
Super User

Because your sample data does not match your actual data.

Please run the code I posted earlier. Post how many records are in that table.

Aman4SAS
Obsidian | Level 7

near about 20m. , its just a sample of actual data.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Sorry, missed most of this conversation.  You are going to tranpose a table of 20million records, using year month as an identifier * 2 per year month, you do realise even if only one year is done that is 24 variables, have you considered how you are going to work with this further down the line?

Reeza
Super User

This is what your data actually looks like according to the errors you're mentioning.

How do you want to deal with the records in RED that have duplicate areacode/barcode combinations?

I agree with @RW9 though, this is a very bad way to store data.

See the Tidy Data paper from Hadley Wickam, though it references R, the concepts are the same.

http://vita.had.co.nz/papers/tidy-data.pdf


data have ;

  input yearmonth areacode $ barcode $ tot_qty tot_value ;

cards;

201402 AR01 1101 67 123

201402 AR01 1201 67 123

201402 AR01 1301 67 123

201402 AR01 1401 67 123

201402 AR01 1501 67 123

201402 AR01 1501 67 123

201402 AR01 1601 67 123

201402 AR01 1701 67 123

201403 AR01 1101 67 123

201403 AR01 1201 67 123

201403 AR01 1301 67 123

201403 AR01 1401 67 123

201403 AR01 1501 67 123

201403 AR01 1601 67 123

201403 AR01 1701 67 123

201403 AR01 1701 67 123

;;;;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 15 replies
  • 2322 views
  • 0 likes
  • 6 in conversation