BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
LL5
Pyrite | Level 9 LL5
Pyrite | Level 9

Hi. I have a dataset step1 shown as below. There are seven different time periods for unit1, unit2 and unit3, respectively.

I used Proc Transpose to show dates in columns and units in row. However, in the final table (step2), the dates were not arranged in order - 3/31/16 was showing in the last column. I personally suspect it might have something to do with the indicator.

 

Does anyone know why and how to fix it?

Thanks for any advices.

 

DATA STEP1;

INFILE CARDS MISSOVER;

INPUT DATE$ UNIT$ AMT IND$;

CARDS;

3/31/15 UNIT1 150 Y

3/31/16 UNIT1 200

4/30/16 UNIT1 300 Y

9/30/16 UNIT1 350 Y

12/31/16 UNIT1 400 Y

3/31/17 UNIT1 450 Y

4/30/17 UNIT1 500 Y

3/31/15 UNIT2 250

3/31/16 UNIT2 200 Y

4/30/16 UNIT2 400 Y

9/30/16 UNIT2 150 Y

12/31/16 UNIT2 200

3/31/17 UNIT2 450 Y

4/30/17 UNIT2 500 Y

3/31/15 UNIT3 550

3/31/16 UNIT3 200

4/30/16 UNIT3 300

9/30/16 UNIT3 350 Y

12/31/16 UNIT3 400 Y

3/31/17 UNIT3 450 Y

4/30/17 UNIT3 500 Y

;

RUN;

 

PROC SORT DATA=STEP1;

BY UNIT;

RUN;

 

PROC TRANSPOSE DATA=STEP1 OUT=STEP2;

WHERE IND = 'Y';

BY UNIT;

ID DATE;

VAR AMT;

RUN;

PROC PRINT DATA=STEP2;

RUN;

 

ObsUNIT_NAME_3/31/20154/30/20169/30/201612/31/20163/31/20174/30/20173/31/2016
1UNIT1AMT150300350400450500.
2UNIT2AMT.400150.450500200
3UNIT3AMT..350400450500.
1 ACCEPTED SOLUTION

Accepted Solutions
data_null__
Jade | Level 19

Create an ID with observations with all possible dates, for example.

 

DATA STEP1;
INFILE CARDS MISSOVER;
INPUT DATE:mmddyy. UNIT$ AMT IND$;
format date mmddyy10.;
CARDS;
3/31/15 UNIT1 150 Y
3/31/16 UNIT1 200
4/30/16 UNIT1 300 Y
9/30/16 UNIT1 350 Y
12/31/16 UNIT1 400 Y
3/31/17 UNIT1 450 Y
4/30/17 UNIT1 500 Y
3/31/15 UNIT2 250
3/31/16 UNIT2 200 Y
4/30/16 UNIT2 400 Y
9/30/16 UNIT2 150 Y
12/31/16 UNIT2 200
3/31/17 UNIT2 450 Y
4/30/17 UNIT2 500 Y
3/31/15 UNIT3 550
3/31/16 UNIT3 200
4/30/16 UNIT3 300
9/30/16 UNIT3 350 Y
12/31/16 UNIT3 400 Y
3/31/17 UNIT3 450 Y
4/30/17 UNIT3 500 Y
;
RUN;
proc sort data=step1;
   by unit;
   run;
proc print;
   run;
proc sort data=step1(keep=date) out=dates nodupkey;
   by date;
   run;
data step2;
   set dates(in=in1) step1;
   if in1 then ind = 'Y';
   run;
proc print;
   run;
proc transpose data=step2 out=step3(where=(not missing(unit))) prefix=D;
   by unit;
   where ind eq 'Y';
   var amt;
   id date;
   run;
proc print;
   run;

2017-06-01_16-30-55.png

 

View solution in original post

15 REPLIES 15
art297
Opal | Level 21

I would think that you're easiest solution, at this point, would be to add a data step like;

data step2 (drop=_name_);
  retain unit _3_31_15 _3_31_16 _4_30_16 _9_30_16 _12_31_16
                 _3_31_17 _4_30_17;
  set step2;
run;

Art, CEO, AnalystFinder.com

ballardw
Super User

DATES as CHARACTER values will only sort correctly when you use yyyymmdd values.

Consider:

DATA STEP1;
INFILE CARDS MISSOVER;
informat date mmddyy8.;
format date mmddyy8.;
INPUT DATE  UNIT$ AMT IND$;
CARDS;
3/31/15 UNIT1 150 Y
3/31/16 UNIT1 200
4/30/16 UNIT1 300 Y
9/30/16 UNIT1 350 Y
12/31/16 UNIT1 400 Y
3/31/17 UNIT1 450 Y
4/30/17 UNIT1 500 Y
3/31/15 UNIT2 250
3/31/16 UNIT2 200 Y
4/30/16 UNIT2 400 Y
9/30/16 UNIT2 150 Y
12/31/16 UNIT2 200
3/31/17 UNIT2 450 Y
4/30/17 UNIT2 500 Y
3/31/15 UNIT3 550
3/31/16 UNIT3 200
4/30/16 UNIT3 300
9/30/16 UNIT3 350 Y
12/31/16 UNIT3 400 Y
3/31/17 UNIT3 450 Y
4/30/17 UNIT3 500 Y
;
RUN;

proc tabulate data=step1;
  where ind='Y';
  class unit date;
  format date mmddyy8.;
  var amt;
  tables unit='',
         date=''*amt=''*max=''*f=best6.
         /misstext= ' ';
run;
data_null__
Jade | Level 19

Create an ID with observations with all possible dates, for example.

 

DATA STEP1;
INFILE CARDS MISSOVER;
INPUT DATE:mmddyy. UNIT$ AMT IND$;
format date mmddyy10.;
CARDS;
3/31/15 UNIT1 150 Y
3/31/16 UNIT1 200
4/30/16 UNIT1 300 Y
9/30/16 UNIT1 350 Y
12/31/16 UNIT1 400 Y
3/31/17 UNIT1 450 Y
4/30/17 UNIT1 500 Y
3/31/15 UNIT2 250
3/31/16 UNIT2 200 Y
4/30/16 UNIT2 400 Y
9/30/16 UNIT2 150 Y
12/31/16 UNIT2 200
3/31/17 UNIT2 450 Y
4/30/17 UNIT2 500 Y
3/31/15 UNIT3 550
3/31/16 UNIT3 200
4/30/16 UNIT3 300
9/30/16 UNIT3 350 Y
12/31/16 UNIT3 400 Y
3/31/17 UNIT3 450 Y
4/30/17 UNIT3 500 Y
;
RUN;
proc sort data=step1;
   by unit;
   run;
proc print;
   run;
proc sort data=step1(keep=date) out=dates nodupkey;
   by date;
   run;
data step2;
   set dates(in=in1) step1;
   if in1 then ind = 'Y';
   run;
proc print;
   run;
proc transpose data=step2 out=step3(where=(not missing(unit))) prefix=D;
   by unit;
   where ind eq 'Y';
   var amt;
   id date;
   run;
proc print;
   run;

2017-06-01_16-30-55.png

 

LL5
Pyrite | Level 9 LL5
Pyrite | Level 9
Thanks art297, should I run this data step at last? Seems I still have the same result
art297
Opal | Level 21

Yes! If you run it after ALL of your other steps, and they run without error, by definition it will re-order your variables. If it doesn't, you would have to post your log in order for any of us to identify why it didn't work.

 

Art, CEO, AnalystFinder.com

 

LL5
Pyrite | Level 9 LL5
Pyrite | Level 9

Hi Art297. Thanks for your method. I run the data step after all the procedures, but seems I got warning in the lob. see below - it says the date variables are uninitialized. Perhaps I should have added a prefix in the Proc Transpose?

 

 

1 ;*';*";*/;quit;run;

2 OPTIONS PAGENO=MIN;

3 %LET _CLIENTTASKLABEL='Program';

4 %LET _CLIENTPROJECTPATH='';

5 %LET _CLIENTPROJECTNAME='';

6 %LET _SASPROGRAMFILE=;

7

8 ODS _ALL_ CLOSE;

9 OPTIONS DEV=ACTIVEX;

10 GOPTIONS XPIXELS=0 YPIXELS=0;

11 FILENAME EGSR TEMP;

12 ODS tagsets.sasreport13(ID=EGSR) FILE=EGSR

13 STYLE=HtmlBlue

14 STYLESHEET=(URL="file:///C:/Program%20Files%20(x86)/SAS94/x86/SASEnterpriseGuide/7.1/Styles/HtmlBlue.css")

15 NOGTITLE

16 NOGFOOTNOTE

17 GPATH=&sasworklocation

18 ENCODING=UTF8

19 options(rolap="on")

20 ;

NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR

21

22 GOPTIONS ACCESSIBLE;

23

24 data step2 (drop=_:);

25 retain unit _3_31_15 _3_31_16 _4_30_16 _9_30_16 _12_31_16

26 _3_31_17 _4_30_17;

27 set STEP2;

28 run;

NOTE: Variable _3_31_15 is uninitialized.

NOTE: Variable _3_31_16 is uninitialized.

NOTE: Variable _4_30_16 is uninitialized.

NOTE: Variable _9_30_16 is uninitialized.

NOTE: Variable _12_31_16 is uninitialized.

NOTE: Variable _3_31_17 is uninitialized.

NOTE: Variable _4_30_17 is uninitialized.

WARNING: The variable '_:'n in the DROP, KEEP, or RENAME list has never been referenced.

NOTE: There were 3 observations read from the data set WORK.STEP2.

NOTE: The data set WORK.STEP2 has 3 observations and 8 variables.

NOTE: Compressing data set WORK.STEP2 increased size by 100.00 percent.

Compressed is 2 pages; un-compressed would require 1 pages.

NOTE: DATA statement used (Total process time):

real time 0.01 seconds

user cpu time 0.01 seconds

system cpu time 0.00 seconds

memory 934.37k

OS Memory 20640.00k

Timestamp 06/02/2017 02:42:03 PM

Step Count 9 Switch Count 60

Page Faults 0

Page Reclaims 164

Page Swaps 0

Voluntary Context Switches 188

Involuntary Context Switches 0

Block Input Operations 0

Block Output Operations 272

 

2 The SAS System 14:38 Friday, June 2, 2017

 

29

30

31 GOPTIONS NOACCESSIBLE;

32 %LET _CLIENTTASKLABEL=;

33 %LET _CLIENTPROJECTPATH=;

34 %LET _CLIENTPROJECTNAME=;

35 %LET _SASPROGRAMFILE=;

36

37 ;*';*";*/;quit;run;

38 ODS _ALL_ CLOSE;

39

40

41 QUIT; RUN;

42

 

art297
Opal | Level 21

There was an error in the code I posted that I've since corrected. Try the following:

DATA STEP1;
  INFILE CARDS MISSOVER;
  INPUT DATE$ UNIT$ AMT IND$;
  CARDS;
3/31/15 UNIT1 150 Y
3/31/16 UNIT1 200
4/30/16 UNIT1 300 Y
9/30/16 UNIT1 350 Y
12/31/16 UNIT1 400 Y
3/31/17 UNIT1 450 Y
4/30/17 UNIT1 500 Y
3/31/15 UNIT2 250
3/31/16 UNIT2 200 Y
4/30/16 UNIT2 400 Y
9/30/16 UNIT2 150 Y
12/31/16 UNIT2 200
3/31/17 UNIT2 450 Y
4/30/17 UNIT2 500 Y
3/31/15 UNIT3 550
3/31/16 UNIT3 200
4/30/16 UNIT3 300
9/30/16 UNIT3 350 Y
12/31/16 UNIT3 400 Y
3/31/17 UNIT3 450 Y
4/30/17 UNIT3 500 Y
;
RUN;

PROC SORT DATA=STEP1;
  BY UNIT;
RUN;

PROC TRANSPOSE DATA=STEP1 OUT=STEP2;
  WHERE IND = 'Y';
  BY UNIT;
  ID DATE;
  VAR AMT;
RUN;

data step2 (drop=_name_);
  retain unit _3_31_15 _3_31_16 _4_30_16 _9_30_16 _12_31_16
                 _3_31_17 _4_30_17;
  set step2;
run;

Art, CEO, AnalystFinder.com

LL5
Pyrite | Level 9 LL5
Pyrite | Level 9
Thanks data_null_ for this method. I'm trying to understand why Proc transpose has this weird result. Do you have any advices?
data_null__
Jade | Level 19

@LL5 wrote:
Thanks data_null_ for this method. I'm trying to understand why Proc transpose has this weird result. Do you have any advices?

 

What weird result are you referring to?  PROC TRANSPOSE creates new variables from ID values in the order they are encountered.  What else did you expect?  My advice is create a dummy ID to establish the order you seek.  That is what I posted above.

LL5
Pyrite | Level 9 LL5
Pyrite | Level 9
I like your advices and it workd perfectly. but I didn't understand why the my original ID value (dates)was not arranged in order.
LL5
Pyrite | Level 9 LL5
Pyrite | Level 9
Thanks ballarwd. Is the date not arranged/sorted in order due to format? Or because of the missing indicator? Really appreciated for your advices.
ballardw
Super User

When your variable is character then the character sort order is used. Which compares the first character in one value to the first in the second. So 12/16/2017 comes before 3/10/2015 because 1 comes before 3. If your data only represents one year then if you force character date-appearing values to have leading zeroes then 03/ compared to 12/ will sort correctly but with multiple years then you have all the january values together (which may be desireable).

 

Generally if a value is supposed to be a date, a time or datetime it is worth creating the correct SAS date/ time/ datetime value. Not only do the dates then sort correctly but then they are more useable in graphs (character values don't work for axis very well such as scatter or series) and you can do summaries by time intervals just by changing the format. Consider:

 

DATA STEP1;
INFILE CARDS MISSOVER;
informat date mmddyy8.;
format date mmddyy8.;
INPUT DATE  UNIT$ AMT IND$;
CARDS;
3/31/15 UNIT1 150 Y
3/31/16 UNIT1 200
4/30/16 UNIT1 300 Y
9/30/16 UNIT1 350 Y
12/31/16 UNIT1 400 Y
3/31/17 UNIT1 450 Y
4/30/17 UNIT1 500 Y
3/31/15 UNIT2 250
3/31/16 UNIT2 200 Y
4/30/16 UNIT2 400 Y
9/30/16 UNIT2 150 Y
12/31/16 UNIT2 200
3/31/17 UNIT2 450 Y
4/30/17 UNIT2 500 Y
3/31/15 UNIT3 550
3/31/16 UNIT3 200
4/30/16 UNIT3 300
9/30/16 UNIT3 350 Y
12/31/16 UNIT3 400 Y
3/31/17 UNIT3 450 Y
4/30/17 UNIT3 500 Y
;
RUN;

proc tabulate data=step1;
  title 'Annual Sum';
  class date unit;
  format date year4.;
  var amt;
  table unit, date*amt*sum;
run;title;

proc tabulate data=step1;
  title 'Monthly Sum';
  class date unit;
  format date yymon7.;
  var amt;
  table unit, date*amt*sum;
run;title;

proc tabulate data=step1;
  title 'Cross year Monthly Sum';
  class date unit;
  format date monname10.;
  var amt;
  table unit, date*amt*sum;
run;title;

With character values you would have to go through a lot of extra work to pull the character apart and create the correct period of interest and add variables to do it.

 

Coupled with the ability to create custom date formats with Proc Format and the functions to determine intervals between dates or to increment dates (or times or datetimes) the advantages really go to the proper values.

LL5
Pyrite | Level 9 LL5
Pyrite | Level 9

Ballardw,

 

Thanks a lot for your explanations and showing me the examples of using date format in Proc Tabulate to get different result! This apporach is eye opening and these techiques are extremely helpful to me!

 

And sorry for my stubbornness as I need Proc Transpose instead of Proc Tabulate to achieve my end result, so I applied the date format in my original data step (step 1) and run the Proc Sort and Proc Transpose again, I am still seeing the same result – 03/31/2016 came at last. I wonder why…

 

Anyway I could use the Proc Tabulate result or move around the columns from the Proc Transpose result in an excel spreadsheet, but If you can give me additional advices, that’d be great.

 

DATA STEP1;

INFILE CARDS MISSOVER;

INFORMAT DATE mmddyy10.;

FORMAT DATE mmddyy10.;

INPUT DATE UNIT$ AMT IND$;

CARDS;

3/31/15 UNIT1 150 Y

3/31/16 UNIT1 200

4/30/16 UNIT1 300 Y

9/30/16 UNIT1 350 Y

12/31/16 UNIT1 400 Y

3/31/17 UNIT1 450 Y

4/30/17 UNIT1 500 Y

3/31/15 UNIT2 250

3/31/16 UNIT2 200 Y

4/30/16 UNIT2 400 Y

9/30/16 UNIT2 150 Y

12/31/16 UNIT2 200

3/31/17 UNIT2 450 Y

4/30/17 UNIT2 500 Y

3/31/15 UNIT3 550

3/31/16 UNIT3 200

4/30/16 UNIT3 300

9/30/16 UNIT3 350 Y

12/31/16 UNIT3 400 Y

3/31/17 UNIT3 450 Y

4/30/17 UNIT3 500 Y

;

RUN;

 

PROC SORT DATA=STEP1;

BY UNIT DATE;

RUN;

 

PROC TRANSPOSE DATA=STEP1 OUT=STEP2;

WHERE IND = 'Y';

BY UNIT;

ID DATE;

VAR AMT;

RUN;

 

PROC PRINT DATA=STEP2;

RUN;

Obs

UNIT

_NAME_

03/31/2015

04/30/2016

09/30/2016

12/31/2016

03/31/2017

04/30/2017

03/31/2016

1

UNIT1

AMT

150

300

350

400

450

500

.

2

UNIT2

AMT

.

400

150

.

450

500

200

3

UNIT3

AMT

.

.

350

400

450

500

.

 

 

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 15 replies
  • 7871 views
  • 6 likes
  • 5 in conversation