Hello everyone,
I am working on cleaning up a dataset that seems to be very complicated and I have been unable to convert it into a way that I want it to be. Please see the attached file. Each company (referred to as Firm A, B, C, ... ) has a few variables (Date, Var1, Var2, and Var 3) and each variable has up to 2,000 observations. There is a column of 'blank' between the Var3 column of Firm_i and the date variable column of Firm_(i+1). In a spreadsheet, there are about 800 Firms. That is, each firm has 4 variables and one blank column so there are approximately 4,000 columns in the spreadsheet. I would like to convert this into a dataset that is easy for me to analyze on SAS. (Please see the 'after' example.)
Is there any way I can use SAS to manipulate this kind of dataset? My first thoughts were to use Proc Transpose but it did not work the way I wanted it to be. If anyone could share any ideas regarding the solution to this problem, I would be very grateful.
Thank you very much for your time.
Do you have to read the spreadsheet? Can you save the data to a CSV file instead?
If so then something like this should work, assuming that the structure is as predictable as you showed.
data firms (keep=firmno firm)
values(keep=firmno date var1-var3 )
;
infile 'myfile.csv' dsd truncover lrecl=1000000 ;
length firmno 8 firm $20 date var1-var3 8 dummy $1 ;
informat date mmddyy. ;
format date yymmdd10. ;
if _n_=1 then do;
do firmno=1 by 1 until (firm=' ');
input firm 4*dummy @;
if firm ne ' ' then output firms ;
end;
input / ;
nfirms=firmno;
retain nfirms;
end;
do firmno = 1 to nfirms ;
input date var-var3 dummy @;
output values;
end;
input;
run;
proc sql ;
create table want as
select * from firms a, values b
where a.firmno = b.firmno
order by b.firmno,b.date
;
quit;
Do you have to read the spreadsheet? Can you save the data to a CSV file instead?
If so then something like this should work, assuming that the structure is as predictable as you showed.
data firms (keep=firmno firm)
values(keep=firmno date var1-var3 )
;
infile 'myfile.csv' dsd truncover lrecl=1000000 ;
length firmno 8 firm $20 date var1-var3 8 dummy $1 ;
informat date mmddyy. ;
format date yymmdd10. ;
if _n_=1 then do;
do firmno=1 by 1 until (firm=' ');
input firm 4*dummy @;
if firm ne ' ' then output firms ;
end;
input / ;
nfirms=firmno;
retain nfirms;
end;
do firmno = 1 to nfirms ;
input date var-var3 dummy @;
output values;
end;
input;
run;
proc sql ;
create table want as
select * from firms a, values b
where a.firmno = b.firmno
order by b.firmno,b.date
;
quit;
Dear Tom,
It worked perfectly and now the dataset looks exactly what I wanted.
Thank you very much for your help!
Kishiyo.
I have data like this
data ds;
x=123456789;
run;
Then I want output like this
output;
x
1
2
3
4
5
6
7
8
9
how to do.....????
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.