Dear All,
Greetings for the Day..!!
Well I have trade level data for 50 days. For each day, there are 5 columns (namely Day, Flag, Price, Qty, Type) . Since my data is for 50 days, which means there are total 250 columns. (There are 60,000 plus rows for each day.)
Data:
8000 Call | 8100 Call | 8200 Call | ||||||||||||
Day | Flag | Price | Qty | Type | Day | Flag | Price | Qty | Type | Day | Flag | Price | Qty | Type |
May 4, 2015 | TRADE | 320 | 4 | May 5, 2015 | TRADE | 368.05 | 1 | May 6, 2015 | TRADE | 355 | 5 | |||
May 4, 2015 | TRADE | 320.1 | 2 | May 5, 2015 | TRADE | 368 | 3 | May 6, 2015 | TRADE | 354.25 | 2 | |||
May 4, 2015 | TRADE | 316.85 | 1 | May 6, 2015 | TRADE | 368 | 4 | May 6, 2015 | TRADE | 354.3 | 2 | |||
May 4, 2015 | TRADE | 317.4 | 1 | May 6, 2015 | TRADE | 368 | 2 | May 6, 2015 | TRADE | 354.2 | 1 | |||
May 4, 2015 | TRADE | 311.05 | 1 | May 7, 2015 | TRADE | 361.65 | 2 | May 6, 2015 | TRADE | 354.4 | 1 | |||
May 4, 2015 | TRADE | 320.3 | 18 | May 7, 2015 | TRADE | 362.8 | 7 | May 6, 2015 | TRADE | 350.05 | 1 | |||
May 4, 2015 | TRADE | 318 | 18 | May 8, 2015 | TRADE | 361.85 | 3 | May 6, 2015 | TRADE | 347.05 | 2 | |||
May 4, 2015 | TRADE | 318 | 1 | May 8, 2015 | TRADE | 362.8 | 1 | May 6, 2015 | TRADE | 345.85 | 7 | |||
May 4, 2015 | TRADE | 318 | 2 | May 9, 2015 | TRADE | 362.15 | 8 | May 6, 2015 | TRADE | 342 | 3 | |||
May 4, 2015 | TRADE | 318 | 2 | May 9, 2015 | TRADE | 366.05 | 1 | May 6, 2015 | TRADE | 341 | 131 | |||
May 4, 2015 | TRADE | 318.05 | 2 | May 10, 2015 | TRADE | 368 | 1 | May 6, 2015 | TRADE | 343 | 12 | |||
May 4, 2015 | TRADE | 318 | 7 | May 10, 2015 | TRADE | 368 | 3 | May 6, 2015 | TRADE | 343 | 10 | |||
May 4, 2015 | TRADE | 320 | 40 | May 11, 2015 | TRADE | 367 | 8 | May 6, 2015 | TRADE | 345.1 | 1 | |||
Desired Output:
Day | Flag | Price | Qty | Type |
May 4, 2015 | 8000 | 320 | 4 | Call |
May 4, 2015 | 8000 | 320.1 | 2 | Call |
May 4, 2015 | 8000 | 316.85 | 1 | Call |
May 4, 2015 | 8000 | 317.4 | 1 | Call |
May 4, 2015 | 8000 | 311.05 | 1 | Call |
May 4, 2015 | 8000 | 320.3 | 18 | Call |
May 4, 2015 | 8000 | 318 | 18 | Call |
May 4, 2015 | 8000 | 318 | 1 | Call |
May 4, 2015 | 8000 | 318 | 2 | Call |
May 4, 2015 | 8000 | 318 | 2 | Call |
May 4, 2015 | 8000 | 318.05 | 2 | Call |
May 4, 2015 | 8000 | 318 | 7 | Call |
May 4, 2015 | 8000 | 320 | 40 | Call |
May 5, 2015 | 8100 | 368.05 | 1 | Call |
May 5, 2015 | 8100 | 368 | 3 | Call |
May 5, 2015 | 8100 | 368 | 4 | Call |
May 5, 2015 | 8100 | 368 | 2 | Call |
May 5, 2015 | 8100 | 361.65 | 2 | Call |
May 5, 2015 | 8100 | 362.8 | 7 | Call |
May 5, 2015 | 8100 | 361.85 | 3 | Call |
May 5, 2015 | 8100 | 362.8 | 1 | Call |
May 5, 2015 | 8100 | 362.15 | 8 | Call |
May 5, 2015 | 8100 | 366.05 | 1 | Call |
May 5, 2015 | 8100 | 368 | 1 | Call |
May 5, 2015 | 8100 | 368 | 3 | Call |
May 5, 2015 | 8100 | 367 | 8 | Call |
May 6, 2015 | 8200 | 355 | 5 | Call |
May 6, 2015 | 8200 | 354.25 | 2 | Call |
May 6, 2015 | 8200 | 354.3 | 2 | Call |
May 6, 2015 | 8200 | 354.2 | 1 | Call |
May 6, 2015 | 8200 | 354.4 | 1 | Call |
May 6, 2015 | 8200 | 350.05 | 1 | Call |
May 6, 2015 | 8200 | 347.05 | 2 | Call |
May 6, 2015 | 8200 | 345.85 | 7 | Call |
May 6, 2015 | 8200 | 342 | 3 | Call |
May 6, 2015 | 8200 | 341 | 131 | Call |
May 6, 2015 | 8200 | 343 | 12 | Call |
May 6, 2015 | 8200 | 343 | 10 | Call |
May 6, 2015 | 8200 | 345.1 | 1 | Call |
Stpes required to get desired output:
1. Column Flag should consider first 4 letters of Header present in first row (e.g.8000 Call Header will show flag as 8000)
2. Column Type should consider 6-9 letters of Header (e.g. 8000 Call will show Type as Call)
3. Now,Append each days data one under other.
Request to kindly help me to get me desired output.
Thanks and Regards,
Vishal Kapasi
I would just read it in and dump it back out into a text file so that I could read it in properly.
For some reason when I read the XLSX file I made from the first 4 records in your sample data the dates are coming in as the excel number of days format instead of the displayed format date format. If you get actual 'May 4, 2015' in your intermediate text file then used an INFORMAT like ANYDTDTE. to convert it to a date instead of the formal I used to convert from Excel number of days to SAS number of days.
ods rtf file='c:\downloads\manycol.rtf' style=minimal;
proc import datafile='c:\downloads\manycol.xlsx' dbms=xlsx out=wide replace ;
getnames=No ;
run;
proc print data=wide ; run;
filename tall temp ;
data _null_;
do col =1 by 5 ;
_p=1 ;
set wide nobs=nobs point=_p ;
array vars _character_ ;
coltype=vars(col);
do _p2=3 to nobs ;
set wide point=_p2 ;
file tall dsd ;
put coltype @ ;
do _j=0 to 4 ;
put vars(col+_j) @;
end;
put ;
end;
if col+5 > dim(vars) then stop;
end;
stop;
run;
data want ;
infile tall dsd truncover ;
length coltype $20 Day 8 Flag $20 Price 8 Qty 8 Type $20 ;
format day yymmdd10.;
input coltype -- type ;
day = day - ('01JAN1960'd - '01JAN1900'D + 2) ;
run;
proc print data=want ;
run;
ods rtf close;
Obs | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
1 | 8000 Call | 8100 Call | 8200 Call | ||||||||||||
2 | Day | Flag | Price | Qty | Type | Day | Flag | Price | Qty | Type | Day | Flag | Price | Qty | Type |
3 | 42128 | TRADE | 320 | 4 | 42129 | TRADE | 368.05 | 1 | 42130 | TRADE | 355 | 5 | |||
4 | 42128 | TRADE | 320.1 | 2 | 42129 | TRADE | 368 | 3 | 42130 | TRADE | 354.25 | 2 | |||
5 | 42128 | TRADE | 316.85 | 1 | 42130 | TRADE | 368 | 4 | 42130 | TRADE | 354.3 | 2 | |||
6 | 42128 | TRADE | 317.4 | 1 | 42130 | TRADE | 368 | 2 | 42130 | TRADE | 354.2 | 1 |
Obs | coltype | Day | Flag | Price | Qty | Type |
1 | 8000 Call | 5/4/2015 | TRADE | 320 | 4 | |
2 | 8000 Call | 5/4/2015 | TRADE | 320.1 | 2 | |
3 | 8000 Call | 5/4/2015 | TRADE | 316.85 | 1 | |
4 | 8000 Call | 5/4/2015 | TRADE | 317.4 | 1 | |
5 | 8100 Call | 5/5/2015 | TRADE | 368.05 | 1 | |
6 | 8100 Call | 5/5/2015 | TRADE | 368 | 3 | |
7 | 8100 Call | 5/6/2015 | TRADE | 368 | 4 | |
8 | 8100 Call | 5/6/2015 | TRADE | 368 | 2 | |
9 | 8200 Call | 5/6/2015 | TRADE | 355 | 5 | |
10 | 8200 Call | 5/6/2015 | TRADE | 354.25 | 2 | |
11 | 8200 Call | 5/6/2015 | TRADE | 354.3 | 2 | |
12 | 8200 Call | 5/6/2015 | TRADE | 354.2 | 1 |
What does your variable name look like ? They are not able to be the same name for each day.
This doesn't sound so much of a question as more of a "please write my code for me". Try posting some simple test data in the form of a datastep and required output. I would think something along the lines of (also I would question why you have that kind of data in the first place):
data want;
set have;
array items{3,5} day_8000 flag_8000 price_8000 ... /* put variable names here */;
do i=1 to 3;
flag=8000 + ((i-1)*100);
day=items{i,1};
price=items{i,2};
...
output;
end;
run;
Key word to search for here is TRANSPOSE.
Method 1 - run several proc transposes and merge data.
Method 2 - Use a data step with multiple arrays to flip the data.
Method 3 - Use the macro in the link below to transpose the data
Here are some resources to help you get started:
See example 3 here:
http://www.ats.ucla.edu/stat/sas/modules/widetolong_data.htm
A better way to flip data
http://www.sascommunity.org/wiki/A_Better_Way_to_Flip_(Transpose)_a_SAS_Dataset
I have a suspicion that the first thing is reading the data. Do you have the starting data as a SAS data set or not?
If not, is the source date file a text file? Is the data fixed column, csv, tab delimited or something else? Or is this an Excel or other spreadsheet data source?
Hi,
Well my dataset is in excel. Since excel could not function properly on large data sets. So if it could work on SAS, by apply ing code, it will be great.
The first row of data let's call it Header for each day need to be divided in 2 columns. E.g. 8000 call will be split in 2 columns, where flag will consider 8000, n type will consider call.
Once these 2 columns modified, for each day., we will have 5 columns for each day.. Then need to append each days trade one under other, so that 250 columns (50 days), will convert into 5 required columns.
Thanks
Vishal Kapasi
I would contact whoever is providing that data. It may well be that someone in the past requested such output for a different purpose and could provide the data in the format you need without any obnoxious code needed.
One thing to consider is what happens if the number of columns or the order changes? I am not saying this can't be done but I find that the earlier in a process that things are addressed the better in the long run.
Tom has provided an excellent answer, however I would also just vote in favor of ballardw. Your problem is that you are using Excel which is not a tool to be dealing with data. Where is the documentation, the data transfer specification and agreement, if you don't have these then your throwing darts blindfold, the sheet could completely change next time, or have different cell formats or any numerous problems associated with Excel.
I would just read it in and dump it back out into a text file so that I could read it in properly.
For some reason when I read the XLSX file I made from the first 4 records in your sample data the dates are coming in as the excel number of days format instead of the displayed format date format. If you get actual 'May 4, 2015' in your intermediate text file then used an INFORMAT like ANYDTDTE. to convert it to a date instead of the formal I used to convert from Excel number of days to SAS number of days.
ods rtf file='c:\downloads\manycol.rtf' style=minimal;
proc import datafile='c:\downloads\manycol.xlsx' dbms=xlsx out=wide replace ;
getnames=No ;
run;
proc print data=wide ; run;
filename tall temp ;
data _null_;
do col =1 by 5 ;
_p=1 ;
set wide nobs=nobs point=_p ;
array vars _character_ ;
coltype=vars(col);
do _p2=3 to nobs ;
set wide point=_p2 ;
file tall dsd ;
put coltype @ ;
do _j=0 to 4 ;
put vars(col+_j) @;
end;
put ;
end;
if col+5 > dim(vars) then stop;
end;
stop;
run;
data want ;
infile tall dsd truncover ;
length coltype $20 Day 8 Flag $20 Price 8 Qty 8 Type $20 ;
format day yymmdd10.;
input coltype -- type ;
day = day - ('01JAN1960'd - '01JAN1900'D + 2) ;
run;
proc print data=want ;
run;
ods rtf close;
Obs | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
1 | 8000 Call | 8100 Call | 8200 Call | ||||||||||||
2 | Day | Flag | Price | Qty | Type | Day | Flag | Price | Qty | Type | Day | Flag | Price | Qty | Type |
3 | 42128 | TRADE | 320 | 4 | 42129 | TRADE | 368.05 | 1 | 42130 | TRADE | 355 | 5 | |||
4 | 42128 | TRADE | 320.1 | 2 | 42129 | TRADE | 368 | 3 | 42130 | TRADE | 354.25 | 2 | |||
5 | 42128 | TRADE | 316.85 | 1 | 42130 | TRADE | 368 | 4 | 42130 | TRADE | 354.3 | 2 | |||
6 | 42128 | TRADE | 317.4 | 1 | 42130 | TRADE | 368 | 2 | 42130 | TRADE | 354.2 | 1 |
Obs | coltype | Day | Flag | Price | Qty | Type |
1 | 8000 Call | 5/4/2015 | TRADE | 320 | 4 | |
2 | 8000 Call | 5/4/2015 | TRADE | 320.1 | 2 | |
3 | 8000 Call | 5/4/2015 | TRADE | 316.85 | 1 | |
4 | 8000 Call | 5/4/2015 | TRADE | 317.4 | 1 | |
5 | 8100 Call | 5/5/2015 | TRADE | 368.05 | 1 | |
6 | 8100 Call | 5/5/2015 | TRADE | 368 | 3 | |
7 | 8100 Call | 5/6/2015 | TRADE | 368 | 4 | |
8 | 8100 Call | 5/6/2015 | TRADE | 368 | 2 | |
9 | 8200 Call | 5/6/2015 | TRADE | 355 | 5 | |
10 | 8200 Call | 5/6/2015 | TRADE | 354.25 | 2 | |
11 | 8200 Call | 5/6/2015 | TRADE | 354.3 | 2 | |
12 | 8200 Call | 5/6/2015 | TRADE | 354.2 | 1 |
Dear Mr. Tom,
Big Thanks as you have made me get desired output.
Brilliant Effort.!! Thanks a ton..
Thanks and Regards,
Vishal Kapasi
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.