G'day!
I hope you can help!
I have the following sample data in this structure and format (but lots more spanning more years):
DATA WORK.TEST_0000;
LENGTH
LocID 8
Number 8
Pct_HpA 8
Pct_Bh 8
Pct_Menin 8
Pct_Rot 8
Pct_Ap 8
yr 8 ;
FORMAT
LocID BEST12.
Number BEST12.
Pct_HpA BEST12.
Pct_Bh BEST12.
Pct_Menin BEST12.
Pct_Rot BEST12.
Pct_Ap BEST12.
yr BEST12. ;
INFORMAT
LocID BEST12.
Number BEST12.
Pct_HpA BEST12.
Pct_Bh BEST12.
Pct_Menin BEST12.
Pct_Rot BEST12.
Pct_Ap BEST12.
yr BEST12. ;
INFILE DATALINES4
DLM='7F'x
MISSOVER
DSD ;
INPUT
LocID : BEST32.
Number : BEST32.
Pct_HpA : BEST32.
Pct_Bh : BEST32.
Pct_Menin : BEST32.
Pct_Rot : BEST32.
Pct_Ap : BEST32.
yr : BEST32. ;
DATALINES4;
12187689.494.794.396.698.220082009
2213594.810099.299.310020082009
3244990.596.395.598.198.820082009
4475489.194.693.997.498.220082009
5252981.787.289.395.495.820082009
12155687.192.193.394.195.720092010
2214289.594.793.99494.720092010
3240893.498.697.499.310020092010
447869195.894.797.298.420092010
5252785.691.690.897.398.320092010
12170888.394.393.595.397.220102011
2213586.591.790.99191.720102011
3241988.794.391.495.998.320102011
4482590.495.894.297.298.220102011
5257584.691.688.194.997.620102011
;;;;
And I need it to look like this with:
1. the variables starting with 'Pct' in a new column variable called 'Names'
2. the associated values of the Pct_HpA, Pct_Bh, etc in a new column called 'Pcts'
3. and their associated values per year, LocID, and Number listed per line as below:
LocID |
Number |
Names |
Pcts |
Yr |
1 |
21876 |
Pct_HpA |
89.4 |
20082009 |
1 |
21876 |
Pct_Bh |
94.7 |
20082009 |
1 |
21876 |
Pct_Menin |
94.3 |
20082009 |
1 |
21876 |
Pct_Rot |
96.6 |
20082009 |
1 |
21876 |
Pct_Ap |
98.2 |
20082009 |
2 |
2135 |
Pct_HpA |
94.8 |
20082009 |
2 |
2135 |
Pct_Bh |
100 |
20082009 |
2 |
2135 |
Pct_Menin |
99.2 |
20082009 |
2 |
2135 |
Pct_Rot |
99.3 |
20082009 |
2 |
2135 |
Pct_Ap |
100 |
20082009 |
3 |
2449 |
Pct_HpA |
89.1 |
20082009 |
3 |
2449 |
Pct_Bh |
94.6 |
20082009 |
3 |
2449 |
Pct_Menin |
93.9 |
20082009 |
3 |
2449 |
Pct_Rot |
97.4 |
20082009 |
3 |
2449 |
Pct_Ap |
98.2 |
20082009 |
4 |
4754 |
Pct_HpA |
89.1 |
20082009 |
4 |
4754 |
Pct_Bh |
94.6 |
20082009 |
4 |
4754 |
Pct_Menin |
93.9 |
20082009 |
4 |
4754 |
Pct_Rot |
97.4 |
20082009 |
4 |
4754 |
Pct_Ap |
98.2 |
20082009 |
5 |
2529 |
Pct_HpA |
81.7 |
20082009 |
5 |
2529 |
Pct_Bh |
87.2 |
20082009 |
5 |
2529 |
Pct_Menin |
89.3 |
20082009 |
5 |
2529 |
Pct_Rot |
95.4 |
20082009 |
5 |
2529 |
Pct_Ap |
95.8 |
20082009 |
1 |
21556 |
Pct_HpA |
87.1 |
20092010 |
1 |
21556 |
Pct_Bh |
92.1 |
20092010 |
1 |
21556 |
Pct_Menin |
93.3 |
20092010 |
1 |
21556 |
Pct_Rot |
94.1 |
20092010 |
1 |
21556 |
Pct_Ap |
95.7 |
20092010 |
2 |
2142 |
Pct_HpA |
89.5 |
20092010 |
2 |
2142 |
Pct_Bh |
94.7 |
20092010 |
2 |
2142 |
Pct_Menin |
93.9 |
20092010 |
2 |
2142 |
Pct_Rot |
94 |
20092010 |
2 |
2142 |
Pct_Ap |
94.7 |
20092010 |
3 |
2408 |
Pct_HpA |
93.4 |
20092010 |
3 |
2408 |
Pct_Bh |
98.6 |
20092010 |
3 |
2408 |
Pct_Menin |
97.4 |
20092010 |
3 |
2408 |
Pct_Rot |
99.3 |
20092010 |
3 |
2408 |
Pct_Ap |
100 |
20092010 |
4 |
4786 |
Pct_HpA |
91 |
20092010 |
4 |
4786 |
Pct_Bh |
95.8 |
20092010 |
4 |
4786 |
Pct_Menin |
94.7 |
20092010 |
4 |
4786 |
Pct_Rot |
97.2 |
20092010 |
4 |
4786 |
Pct_Ap |
98.4 |
20092010 |
5 |
2527 |
Pct_HpA |
85.6 |
20092010 |
5 |
2527 |
Pct_Bh |
91.6 |
20092010 |
5 |
2527 |
Pct_Menin |
90.8 |
20092010 |
5 |
2527 |
Pct_Rot |
97.3 |
20092010 |
5 |
2527 |
Pct_Ap |
98.3 |
20092010 |
1 |
21708 |
Pct_HpA |
88.3 |
20102011 |
1 |
21708 |
Pct_Bh |
94.3 |
20102011 |
1 |
21708 |
Pct_Menin |
93.5 |
20102011 |
1 |
21708 |
Pct_Rot |
95.3 |
20102011 |
1 |
21708 |
Pct_Ap |
97.2 |
20102011 |
2 |
2135 |
Pct_HpA |
86.5 |
20102011 |
2 |
2135 |
Pct_Bh |
91.7 |
20102011 |
2 |
2135 |
Pct_Menin |
90.9 |
20102011 |
2 |
2135 |
Pct_Rot |
91 |
20102011 |
2 |
2135 |
Pct_Ap |
91.7 |
20102011 |
3 |
2419 |
Pct_HpA |
88.7 |
20102011 |
3 |
2419 |
Pct_Bh |
94.3 |
20102011 |
3 |
2419 |
Pct_Menin |
91.4 |
20102011 |
3 |
2419 |
Pct_Rot |
95.9 |
20102011 |
3 |
2419 |
Pct_Ap |
98.3 |
20102011 |
4 |
4825 |
Pct_HpA |
90.4 |
20102011 |
4 |
4825 |
Pct_Bh |
95.8 |
20102011 |
4 |
4825 |
Pct_Menin |
94.2 |
20102011 |
4 |
4825 |
Pct_Rot |
97.2 |
20102011 |
4 |
4825 |
Pct_Ap |
98.2 |
20102011 |
5 |
2575 |
Pct_HpA |
84.6 |
20102011 |
5 |
2575 |
Pct_Bh |
91.6 |
20102011 |
5 |
2575 |
Pct_Menin |
88.1 |
20102011 |
5 |
2575 |
Pct_Rot |
94.9 |
20102011 |
5 |
2575 |
Pct_Ap |
97.6 |
20102011 |
I'm not getting the results I'm needing using: https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-wide-to-long-using-proc-transpose/
Any help you can give will be so very much appreciated! Thank you!
Post the code you tried, proc transpose is the correct approach here.
@jcis7 wrote:
G'day!
I hope you can help!
I have the following sample data in this structure and format (but lots more spanning more years):
DATA WORK.TEST_0000; LENGTH LocID 8 Number 8 Pct_HpA 8 Pct_Bh 8 Pct_Menin 8 Pct_Rot 8 Pct_Ap 8 yr 8 ; FORMAT LocID BEST12. Number BEST12. Pct_HpA BEST12. Pct_Bh BEST12. Pct_Menin BEST12. Pct_Rot BEST12. Pct_Ap BEST12. yr BEST12. ; INFORMAT LocID BEST12. Number BEST12. Pct_HpA BEST12. Pct_Bh BEST12. Pct_Menin BEST12. Pct_Rot BEST12. Pct_Ap BEST12. yr BEST12. ; INFILE DATALINES4 DLM='7F'x MISSOVER DSD ; INPUT LocID : BEST32. Number : BEST32. Pct_HpA : BEST32. Pct_Bh : BEST32. Pct_Menin : BEST32. Pct_Rot : BEST32. Pct_Ap : BEST32. yr : BEST32. ; DATALINES4; 12187689.494.794.396.698.220082009 2213594.810099.299.310020082009 3244990.596.395.598.198.820082009 4475489.194.693.997.498.220082009 5252981.787.289.395.495.820082009 12155687.192.193.394.195.720092010 2214289.594.793.99494.720092010 3240893.498.697.499.310020092010 447869195.894.797.298.420092010 5252785.691.690.897.398.320092010 12170888.394.393.595.397.220102011 2213586.591.790.99191.720102011 3241988.794.391.495.998.320102011 4482590.495.894.297.298.220102011 5257584.691.688.194.997.620102011 ;;;;
And I need it to look like this with:
1. the variables starting with 'Pct' in a new column variable called 'Names'
2. the associated values of the Pct_HpA, Pct_Bh, etc in a new column called 'Pcts'
3. and their associated values per year, LocID, and Number listed per line as below:
LocID
Number
Names
Pcts
Yr
1
21876
Pct_HpA
89.4
20082009
1
21876
Pct_Bh
94.7
20082009
1
21876
Pct_Menin
94.3
20082009
1
21876
Pct_Rot
96.6
20082009
1
21876
Pct_Ap
98.2
20082009
2
2135
Pct_HpA
94.8
20082009
2
2135
Pct_Bh
100
20082009
2
2135
Pct_Menin
99.2
20082009
2
2135
Pct_Rot
99.3
20082009
2
2135
Pct_Ap
100
20082009
3
2449
Pct_HpA
89.1
20082009
3
2449
Pct_Bh
94.6
20082009
3
2449
Pct_Menin
93.9
20082009
3
2449
Pct_Rot
97.4
20082009
3
2449
Pct_Ap
98.2
20082009
4
4754
Pct_HpA
89.1
20082009
4
4754
Pct_Bh
94.6
20082009
4
4754
Pct_Menin
93.9
20082009
4
4754
Pct_Rot
97.4
20082009
4
4754
Pct_Ap
98.2
20082009
5
2529
Pct_HpA
81.7
20082009
5
2529
Pct_Bh
87.2
20082009
5
2529
Pct_Menin
89.3
20082009
5
2529
Pct_Rot
95.4
20082009
5
2529
Pct_Ap
95.8
20082009
1
21556
Pct_HpA
87.1
20092010
1
21556
Pct_Bh
92.1
20092010
1
21556
Pct_Menin
93.3
20092010
1
21556
Pct_Rot
94.1
20092010
1
21556
Pct_Ap
95.7
20092010
2
2142
Pct_HpA
89.5
20092010
2
2142
Pct_Bh
94.7
20092010
2
2142
Pct_Menin
93.9
20092010
2
2142
Pct_Rot
94
20092010
2
2142
Pct_Ap
94.7
20092010
3
2408
Pct_HpA
93.4
20092010
3
2408
Pct_Bh
98.6
20092010
3
2408
Pct_Menin
97.4
20092010
3
2408
Pct_Rot
99.3
20092010
3
2408
Pct_Ap
100
20092010
4
4786
Pct_HpA
91
20092010
4
4786
Pct_Bh
95.8
20092010
4
4786
Pct_Menin
94.7
20092010
4
4786
Pct_Rot
97.2
20092010
4
4786
Pct_Ap
98.4
20092010
5
2527
Pct_HpA
85.6
20092010
5
2527
Pct_Bh
91.6
20092010
5
2527
Pct_Menin
90.8
20092010
5
2527
Pct_Rot
97.3
20092010
5
2527
Pct_Ap
98.3
20092010
1
21708
Pct_HpA
88.3
20102011
1
21708
Pct_Bh
94.3
20102011
1
21708
Pct_Menin
93.5
20102011
1
21708
Pct_Rot
95.3
20102011
1
21708
Pct_Ap
97.2
20102011
2
2135
Pct_HpA
86.5
20102011
2
2135
Pct_Bh
91.7
20102011
2
2135
Pct_Menin
90.9
20102011
2
2135
Pct_Rot
91
20102011
2
2135
Pct_Ap
91.7
20102011
3
2419
Pct_HpA
88.7
20102011
3
2419
Pct_Bh
94.3
20102011
3
2419
Pct_Menin
91.4
20102011
3
2419
Pct_Rot
95.9
20102011
3
2419
Pct_Ap
98.3
20102011
4
4825
Pct_HpA
90.4
20102011
4
4825
Pct_Bh
95.8
20102011
4
4825
Pct_Menin
94.2
20102011
4
4825
Pct_Rot
97.2
20102011
4
4825
Pct_Ap
98.2
20102011
5
2575
Pct_HpA
84.6
20102011
5
2575
Pct_Bh
91.6
20102011
5
2575
Pct_Menin
88.1
20102011
5
2575
Pct_Rot
94.9
20102011
5
2575
Pct_Ap
97.6
20102011
I'm not getting the results I'm needing using: https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-wide-to-long-using-proc-transpose/
Any help you can give will be so very much appreciated! Thank you!
I am going to guess that your order for specifying the variables in your by statement had issues.
Since you show starting data that is not "sorted" by LOCID, then you cannot start a "by locid" in proc transpose.
So consider
by notsorted locid number yr ;
to get all the values you need on all of the output records.
The Name= option on proc transpose will set the name of the variable that has the original variables that transposed and Prefix sets the start of the name for the transposed values (hint: will get at least a 1 suffix)
and you can RENAME variables on any output data set where there is something like Out=mydatset (rename=(thisvar = thatvar))
Got It : - ) !!! Do you have a good link to spell out the notsorted option?
proc transpose data=TEST_0000 out=long3a;
by notsorted locid number yr;
run;
data long4;
set long3a (rename=(COL1=Pcts _NAME_=Names));
label Names='Names';
run;
Thank you so very much : - )! You all are so awesome and helpful!!!
specifies that observations with the same BY value are grouped together but are not necessarily sorted in alphabetical or numeric order.
Restriction | You cannot use the NOTSORTED option with the MERGE and UPDATE statements. |
Tips | The NOTSORTED option can appear anywhere in the BY statement. |
Using the NOTSORTED option is useful if you have data that falls into other logical groupings such as chronological order or categories. | |
Example | BY-Group Processing with Nonsorted Data |
I think a little extra code will get you even closer to what you want. i.e.,
proc transpose data=TEST_0000 out=long3a(rename=(COL1=Pcts)) name=Names;
by notsorted locid number yr;
run;
data long4;
set long3a;
label Names='Names';
Names=substr(Names,5);
run;
However, you will get an undesirable result if any adjacent records share the same values in their by values. e.g., the following input dataset won't result (i don't think) what you expect to achieve:
DATA WORK.TEST_0000;
LENGTH
LocID 8
Number 8
Pct_HpA 8
Pct_Bh 8
Pct_Menin 8
Pct_Rot 8
Pct_Ap 8
yr 8 ;
FORMAT
LocID BEST12.
Number BEST12.
Pct_HpA BEST12.
Pct_Bh BEST12.
Pct_Menin BEST12.
Pct_Rot BEST12.
Pct_Ap BEST12.
yr BEST12. ;
INFORMAT
LocID BEST12.
Number BEST12.
Pct_HpA BEST12.
Pct_Bh BEST12.
Pct_Menin BEST12.
Pct_Rot BEST12.
Pct_Ap BEST12.
yr BEST12. ;
INFILE DATALINES4
DLM='7F'x
MISSOVER
DSD ;
INPUT
LocID : BEST32.
Number : BEST32.
Pct_HpA : BEST32.
Pct_Bh : BEST32.
Pct_Menin : BEST32.
Pct_Rot : BEST32.
Pct_Ap : BEST32.
yr : BEST32. ;
DATALINES4;
12187689.494.794.396.698.220082009
121876289.4294.7294.3296.6298.220082009
2213594.810099.299.310020082009
3244990.596.395.598.198.820082009
4475489.194.693.997.498.220082009
5252981.787.289.395.495.820082009
12155687.192.193.394.195.720092010
2214289.594.793.99494.720092010
3240893.498.697.499.310020092010
447869195.894.797.298.420092010
5252785.691.690.897.398.320092010
12170888.394.393.595.397.220102011
2213586.591.790.99191.720102011
3241988.794.391.495.998.320102011
4482590.495.894.297.298.220102011
5257584.691.688.194.997.620102011
;;;;
run;
However, the %untranspose macro will (I think) give you the desired result in both cases) and not need an extra data step. e.g.:
filename ut url 'http://tiny.cc/untranspose_macro';
%include ut ;
%untranspose(data=TEST_0000,
out=want(rename=(_name_=names _value_=pct)),
by=locid number yr,
prefix=Pct_,
var=HpA Bh Menin Rot Ap)
Art, CEO, AnalystFinder.com
You are transposing BY row where the source data may NOT be ordered according BY variables that are surrogates for uniquely representing a row. If the BY values are disordered, or repeated a simple BY statement will not work.
For the case of row-by-row transposing of uncertain data, I would recommend creating a VIEW that has an additional computed column ROWNUM that will ensure a distinct identification of a row when prefaced to your original BY group. The VIEW can be deleted after the TRANSPOSE.
Example:
data have; INPUT LocID Number Pct_HpA Pct_Bh Pct_Menin Pct_Rot Pct_Ap yr; DATALINES; 1 21876 89.4 94.7 94.3 96.6 98.2 20082009 2 2135 94.8 100 99.2 99.3 100 20082009 3 2449 90.5 96.3 95.5 98.1 98.8 20082009 4 4754 89.1 94.6 93.9 97.4 98.2 20082009 5 2529 81.7 87.2 89.3 95.4 95.8 20082009 1 21556 87.1 92.1 93.3 94.1 95.7 20092010 2 2142 89.5 94.7 93.9 94 94.7 20092010 3 2408 93.4 98.6 97.4 99.3 100 20092010 4 4786 91 95.8 94.7 97.2 98.4 20092010 5 2527 85.6 91.6 90.8 97.3 98.3 20092010 ; data have_v / view=have_v; set have; rownum + 1; run; proc transpose data=have_v out=want ( rename=(_name_=Names col1=Pcts) drop=rownum ) ; by rownum locid number yr; var Pct:; label Names=' '; run; proc delete data=have_v(mt=view); run;
Result
Can you help me understand the use of the colon in the 3rd to last line in the full code below (var Pct:;)?
Thank you!
data have_v / view=have_v;
set have;
rownum + 1;
run;
proc transpose
data=have_v
out=want ( rename=(_name_=Names col1=Pcts)
drop=rownum
)
;
by rownum locid number yr;
var Pct:;
label Names=' ';
run;
VAR PCT: ;
is a statement that is using a SAS Variable List. In this case PCT: means all variables whose name starts with PCT.
Learn more about Variable Lists in the documentation
A SAS variable is an abbreviated method of referring to a list of variable names. SAS enables you to use the following variable lists:
- numbered range lists
- name range lists
- name prefix lists
- special SAS name lists
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.