Hi,
Above is the link of the file that I need to switch from short to long format using data steps.
I attached the project questions, I am very new to programming and have no idea on how to continue.
I am hoping that someone here could please help me out.
First,
I imported the text file to SAS which I did successfully
Now I have to change the format from short to long
I don’t know how to approach this because SAS says the file has only one variable while there are about 6 or more columns.
SET is used to read data from a table, INPUT is used to read data from an external file (or inline via DATALINES).
The @ used at the end of an INPUT statement keeps the input pointer on the same line. This allows you to read the data and directly output it into a long format.
data epilepsy;
infile datalines truncover;
input ID $ Treatment Age ESC @;
do week=1 to 4;
input Seizurecount @;
output;
end;
datalines;
1 0 31 11 5 3 3 3
2 0 30 11 3 5 3 3
3 0 25 6 2 4 0 5
4 0 36 8 4 4 1 4
5 0 22 66 7 18 9 21
6 0 29 27 5 2 8 7
7 0 31 12 6 4 0 2
8 0 36 52 40 20 23 12
9 0 37 23 5 6 6 5
10 0 28 10 14 13 6 0
11 0 36 52 26 12 6 22
12 0 24 33 12 6 8 5
13 0 28 18 4 4 6 2
14 0 36 42 7 9 12 14
15 0 26 87 16 24 10 9
16 0 26 50 11 0 0 5
17 0 28 18 0 0 3 3
18 0 31 111 37 29 28 29
19 0 32 18 3 5 2 5
20 0 21 20 3 0 6 7
21 0 29 12 3 4 3 4
22 0 21 9 3 4 3 4
23 0 32 17 2 3 3 5
24 0 25 28 8 12 2 8
25 0 30 55 18 24 76 25
26 0 40 9 2 1 2 1
27 0 19 10 3 1 4 2
28 0 22 47 13 15 13 12
29 1 18 76 11 14 9 8
30 1 32 38 8 7 9 4
31 1 20 19 0 4 3 0
32 1 20 10 3 6 1 3
33 1 18 19 2 6 7 4
34 1 24 24 4 3 1 3
35 1 30 31 22 17 19 16
36 1 35 14 5 4 7 4
37 1 57 11 2 4 0 4
38 1 20 67 3 7 7 7
39 1 22 41 4 18 2 5
40 1 28 7 2 1 1 0
41 1 23 22 0 2 4 0
42 1 40 13 5 4 0 3
43 1 43 46 11 14 25 15
44 1 21 36 10 5 3 8
45 1 35 38 19 7 6 7
46 1 25 7 1 1 2 4
47 1 26 36 6 10 8 8
48 1 25 11 2 1 0 0
49 1 22 151 102 65 72 63
50 1 32 22 4 3 2 4
51 1 25 42 8 6 5 7
52 1 35 32 1 3 1 5
53 1 21 56 18 11 28 13
54 1 41 24 6 3 4 0
55 1 32 16 3 5 4 3
56 1 26 22 1 23 19 8
57 1 21 25 2 3 0 1
58 1 36 13 0 0 0 0
59 1 37 12 1 4 3 2
;
run;
If your actual data is stored in an external file then eventually post this file (or some of the first lines of the file) here as attachment so that we can show how to read such a file (uses of a FILENAME statement).
That means you did not import the data correctly. Please post the code and log from the import step. It will say the number of variables and observations which is one indication that the file was read correctly.
@Stacy wrote:
Hi,
Above is the link of the file that I need to switch from short to long format using data steps.
I attached the project questions, I am very new to programming and have no idea on how to continue.
I am hoping that someone here could please help me out.
First,
I imported the text file to SAS which I did successfully
Now I have to change the format from short to long
I don’t know how to approach this because SAS says the file has only one variable while there are about 6 or more columns.
Libname sasuser 'C:\Users\Ifeoma\Downloads';
Data sasuser.epilepsy;
SET epilepsy;
input ID $ Treatment Age ESC TSC1 TSC2 TSC3 TSC4 $;
datalines;
1 0 31 11 5 3 3 3
2 0 30 11 3 5 3 3
3 0 25 6 2 4 0 5
4 0 36 8 4 4 1 4
5 0 22 66 7 18 9 21
6 0 29 27 5 2 8 7
7 0 31 12 6 4 0 2
8 0 36 52 40 20 23 12
9 0 37 23 5 6 6 5
10 0 28 10 14 13 6 0
11 0 36 52 26 12 6 22
12 0 24 33 12 6 8 5
13 0 28 18 4 4 6 2
14 0 36 42 7 9 12 14
15 0 26 87 16 24 10 9
16 0 26 50 11 0 0 5
17 0 28 18 0 0 3 3
18 0 31 111 37 29 28 29
19 0 32 18 3 5 2 5
20 0 21 20 3 0 6 7
21 0 29 12 3 4 3 4
22 0 21 9 3 4 3 4
23 0 32 17 2 3 3 5
24 0 25 28 8 12 2 8
25 0 30 55 18 24 76 25
26 0 40 9 2 1 2 1
27 0 19 10 3 1 4 2
28 0 22 47 13 15 13 12
29 1 18 76 11 14 9 8
30 1 32 38 8 7 9 4
31 1 20 19 0 4 3 0
32 1 20 10 3 6 1 3
33 1 18 19 2 6 7 4
34 1 24 24 4 3 1 3
35 1 30 31 22 17 19 16
36 1 35 14 5 4 7 4
37 1 57 11 2 4 0 4
38 1 20 67 3 7 7 7
39 1 22 41 4 18 2 5
40 1 28 7 2 1 1 0
41 1 23 22 0 2 4 0
42 1 40 13 5 4 0 3
43 1 43 46 11 14 25 15
44 1 21 36 10 5 3 8
45 1 35 38 19 7 6 7
46 1 25 7 1 1 2 4
47 1 26 36 6 10 8 8
48 1 25 11 2 1 0 0
49 1 22 151 102 65 72 63
50 1 32 22 4 3 2 4
51 1 25 42 8 6 5 7
52 1 35 32 1 3 1 5
53 1 21 56 18 11 28 13
54 1 41 24 6 3 4 0
55 1 32 16 3 5 4 3
56 1 26 22 1 23 19 8
57 1 21 25 2 3 0 1
58 1 36 13 0 0 0 0
59 1 37 12 1 4 3 2
;
run;
Data epilepsy1;
Set epilepsy;
Week = 1;
Seizurecount = TSC1;
drop TSC1 TSC2 TSC3 TSC4;
run;
Data epilepsy2;
Set epilepsy;
Week = 2;
Seizurecount = TSC2;
drop TSC1 TSC2 TSC3 TSC4;
run;
Data epilepsy3;
Set epilepsy;
Week = 3;
Seizurecount = TSC3;
drop TSC1 TSC2 TSC3 TSC4;
run;
Data epilepsy4;
Set epilepsy;
Week = 4;
Seizurecount = TSC4;
drop TSC1 TSC2 TSC3 TSC4;
run;
Data sasuser.epilepsylong;
Set epilepsy1 epilepsy2 epilepsy3 epilepsy4;
run;
After running the code, I got the following errors;
NOTE: Variable TSC1 is uninitialized.
WARNING: The variable TSC2 in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable TSC3 in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable TSC4 in the DROP, KEEP, or RENAME list has never been referenced.
NOTE: There were 94 observations read from the data set WORK.EPILEPSY.
NOTE: The data set WORK.EPILEPSY1 has 94 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
2243 Data epilepsy2;
2244 Set epilepsy;
2245 Week = 2;
2246 Seizurecount = TSC2;
2247 drop TSC1 TSC2 TSC3 TSC4;
2248 run;
NOTE: Variable TSC2 is uninitialized.
WARNING: The variable TSC1 in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable TSC3 in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable TSC4 in the DROP, KEEP, or RENAME list has never been referenced.
NOTE: There were 94 observations read from the data set WORK.EPILEPSY.
NOTE: The data set WORK.EPILEPSY2 has 94 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
2249 Data epilepsy3;
2250 Set epilepsy;
2251 Week = 3;
2252 Seizurecount = TSC3;
2253 drop TSC1 TSC2 TSC3 TSC4;
2254 run;
NOTE: Variable TSC3 is uninitialized.
WARNING: The variable TSC1 in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable TSC2 in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable TSC4 in the DROP, KEEP, or RENAME list has never been referenced.
NOTE: There were 94 observations read from the data set WORK.EPILEPSY.
NOTE: The data set WORK.EPILEPSY3 has 94 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
2255 Data epilepsy4;
2256 Set epilepsy;
2257 Week = 4;
2258 Seizurecount = TSC4;
2259 drop TSC1 TSC2 TSC3 TSC4;
2260 run;
NOTE: Variable TSC4 is uninitialized.
WARNING: The variable TSC1 in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable TSC2 in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable TSC3 in the DROP, KEEP, or RENAME list has never been referenced.
NOTE: There were 94 observations read from the data set WORK.EPILEPSY.
NOTE: The data set WORK.EPILEPSY4 has 94 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
2261 Data sasuser.epilepsylong;
2262 Set epilepsy1 epilepsy2 epilepsy3 epilepsy4;
2263 run;
NOTE: There were 94 observations read from the data set WORK.EPILEPSY1.
NOTE: There were 94 observations read from the data set WORK.EPILEPSY2.
NOTE: There were 94 observations read from the data set WORK.EPILEPSY3.
NOTE: There were 94 observations read from the data set WORK.EPILEPSY4.
NOTE: The data set SASUSER.EPILEPSYLONG has 376 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
I dont see what youre talking about. The data sets I used are the same.
@Stacy wrote:
I dont see what youre talking about. The data sets I used are the same.
No they aren't.
Tables are referenced using <libref>.<table name>. If you only use the table name like in "set epilepsy" then SAS defaults to WORK as your libref.
Now look at your code:
Libname sasuser 'C:\Users\Ifeoma\Downloads';
Data sasuser.epilepsy;
SET epilepsy;
input ID $ Treatment Age ESC TSC1 TSC2 TSC3 TSC4 $;
datalines;
You are creating a table under libref SASUSER but then later in your code you try to use the table under WORK (as you're only using a one level name).
Also: What's this combination of SET and INPUT statement about? That's most likely not going to work as you'd expect.
You also might want to remove the $ at the end of your INPUT line. Why would you want to only read TSC4 as character.
And last but not least: Not sure why you need to replicate your data just with a different number for Week - but if really necessary then code as below would be more efficient
data epilepsylong(drop=TSC:);
set epilepsy;
array TSC {*} TSC1-TSC4;
do Week=1 to 4;
Seizurecount=TSC[Week];
output;
end;
run;
Your first step doesn't make any sense.
Data sasuser.epilepsy;
SET epilepsy;
input ID $ Treatment Age ESC TSC1 TSC2 TSC3 TSC4 $;
datalines;
Why do you have both a SET statement and an INPUT statement.
That would mean that you are assuming that the existing dataset EPILEPSY has the same number of observations as the data that will be read from the datalines AND that the observations are in the exact same order so that it makes sense to add new variables (or overwrite the existing variables) with the data from the datalines.
I used both a set and input statement because in the file theres no column specofying the variable.
;Likethis;
Patient ID, Treatment (0=Placebo, 1=Progabide), Age,
Baseline 8 week seizure count, First 2 week seizure count,
Second 2 week seizure count, Third 2 week seizure count,
Fourth 2 week seizure count.
1 0 31 11 5 3 3 3
2 0 30 11 3 5 3 3
3 0 25 6 2 4 0 5
4 0 36 8 4 4 1 4
5 0 22 66 7 18 9 21
6 0 29 27 5 2 8 7
7 0 31 12 6 4 0 2
8 0 36 52 40 20 23 12
9 0 37 23 5 6 6 5
10 0 28 10 14 13 6 0
11 0 36 52 26 12 6 22
12 0 24 33 12 6 8 5
13 0 28 18 4 4 6 2
14 0 36 42 7 9 12 14
15 0 26 87 16 24 10 9
16 0 26 50 11 0 0 5
17 0 28 18 0 0 3 3
18 0 31 111 37 29 28 29
19 0 32 18 3 5 2 5
20 0 21 20 3 0 6 7
21 0 29 12 3 4 3 4
22 0 21 9 3 4 3 4
The numbers dont have a heading, so I thought by using input I could give the columns a heading before converting from short format to long format.
Also, the set recorded as 94 observations with one variable, so I used the input statement to create variables for each column.
SET is used to read data from a table, INPUT is used to read data from an external file (or inline via DATALINES).
The @ used at the end of an INPUT statement keeps the input pointer on the same line. This allows you to read the data and directly output it into a long format.
data epilepsy;
infile datalines truncover;
input ID $ Treatment Age ESC @;
do week=1 to 4;
input Seizurecount @;
output;
end;
datalines;
1 0 31 11 5 3 3 3
2 0 30 11 3 5 3 3
3 0 25 6 2 4 0 5
4 0 36 8 4 4 1 4
5 0 22 66 7 18 9 21
6 0 29 27 5 2 8 7
7 0 31 12 6 4 0 2
8 0 36 52 40 20 23 12
9 0 37 23 5 6 6 5
10 0 28 10 14 13 6 0
11 0 36 52 26 12 6 22
12 0 24 33 12 6 8 5
13 0 28 18 4 4 6 2
14 0 36 42 7 9 12 14
15 0 26 87 16 24 10 9
16 0 26 50 11 0 0 5
17 0 28 18 0 0 3 3
18 0 31 111 37 29 28 29
19 0 32 18 3 5 2 5
20 0 21 20 3 0 6 7
21 0 29 12 3 4 3 4
22 0 21 9 3 4 3 4
23 0 32 17 2 3 3 5
24 0 25 28 8 12 2 8
25 0 30 55 18 24 76 25
26 0 40 9 2 1 2 1
27 0 19 10 3 1 4 2
28 0 22 47 13 15 13 12
29 1 18 76 11 14 9 8
30 1 32 38 8 7 9 4
31 1 20 19 0 4 3 0
32 1 20 10 3 6 1 3
33 1 18 19 2 6 7 4
34 1 24 24 4 3 1 3
35 1 30 31 22 17 19 16
36 1 35 14 5 4 7 4
37 1 57 11 2 4 0 4
38 1 20 67 3 7 7 7
39 1 22 41 4 18 2 5
40 1 28 7 2 1 1 0
41 1 23 22 0 2 4 0
42 1 40 13 5 4 0 3
43 1 43 46 11 14 25 15
44 1 21 36 10 5 3 8
45 1 35 38 19 7 6 7
46 1 25 7 1 1 2 4
47 1 26 36 6 10 8 8
48 1 25 11 2 1 0 0
49 1 22 151 102 65 72 63
50 1 32 22 4 3 2 4
51 1 25 42 8 6 5 7
52 1 35 32 1 3 1 5
53 1 21 56 18 11 28 13
54 1 41 24 6 3 4 0
55 1 32 16 3 5 4 3
56 1 26 22 1 23 19 8
57 1 21 25 2 3 0 1
58 1 36 13 0 0 0 0
59 1 37 12 1 4 3 2
;
run;
If your actual data is stored in an external file then eventually post this file (or some of the first lines of the file) here as attachment so that we can show how to read such a file (uses of a FILENAME statement).
Hi,
Thanks for the reply. I imported the file to SAS. I attached the file.
How do I convert it to long format without using datalines
Hi Patrick,
The code worked thanks.
Thank you
Good to hear and thanks for posting your external file.
Given that it's not a lot of data I believe easiest for you is to go with the inline data (datalines) approach which works already for you.
Yes that is true and thanks.
Please, how can I go in converting the ESC column and Seizure count column into weekly rates
@Stacy wrote:
Yes that is true and thanks.
Please, how can I go in converting the ESC column and Seizure count column into weekly rates
I have no idea what these "weekly rates" are so you would have to describe the logic in detail and provide examples based on the sample data posted.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.