BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kishiyo
Calcite | Level 5

Hello everyone, 

 

I am working on cleaning up a dataset that seems to be very complicated and I have been unable to convert it into a way that I want it to be. Please see the attached file. Each company (referred to as Firm A, B, C, ... ) has a few variables (Date, Var1, Var2, and Var 3) and each variable has up to 2,000 observations. There is a column of 'blank' between the Var3 column of Firm_i and the date variable column of Firm_(i+1). In a spreadsheet, there are about 800 Firms. That is, each firm has 4 variables and one blank column so there are approximately 4,000 columns in the spreadsheet. I would like to convert this into a dataset that is easy for me to analyze on SAS. (Please see the 'after' example.) 

 

Is there any way I can use SAS to manipulate this kind of dataset? My first thoughts were to use Proc Transpose but it did not work the way I wanted it to be. If anyone could share any ideas regarding the solution to this problem, I would be very grateful. 

 

Thank you very much for your time. 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Do you have to read the spreadsheet?  Can you save the data to a CSV file instead?

If so then something like this should work, assuming that the structure is as predictable as you showed.

data firms (keep=firmno firm)
     values(keep=firmno date var1-var3 )
;
  infile 'myfile.csv' dsd truncover lrecl=1000000 ;
  length firmno 8 firm $20 date var1-var3 8 dummy $1 ;
  informat date mmddyy. ;
  format date yymmdd10. ;
  if _n_=1 then do;
    do firmno=1 by 1 until (firm=' ');
      input firm 4*dummy @;
      if firm ne ' ' then output firms ;
    end;
   input / ;
   nfirms=firmno;
   retain nfirms;
  end;
  do firmno = 1 to nfirms ;
    input date var-var3 dummy @;
    output values;
  end;
  input;
run;

proc sql ;
  create table want as 
    select * from firms a, values b
    where a.firmno = b.firmno
   order by b.firmno,b.date
  ;
quit;
  

View solution in original post

3 REPLIES 3
Tom
Super User Tom
Super User

Do you have to read the spreadsheet?  Can you save the data to a CSV file instead?

If so then something like this should work, assuming that the structure is as predictable as you showed.

data firms (keep=firmno firm)
     values(keep=firmno date var1-var3 )
;
  infile 'myfile.csv' dsd truncover lrecl=1000000 ;
  length firmno 8 firm $20 date var1-var3 8 dummy $1 ;
  informat date mmddyy. ;
  format date yymmdd10. ;
  if _n_=1 then do;
    do firmno=1 by 1 until (firm=' ');
      input firm 4*dummy @;
      if firm ne ' ' then output firms ;
    end;
   input / ;
   nfirms=firmno;
   retain nfirms;
  end;
  do firmno = 1 to nfirms ;
    input date var-var3 dummy @;
    output values;
  end;
  input;
run;

proc sql ;
  create table want as 
    select * from firms a, values b
    where a.firmno = b.firmno
   order by b.firmno,b.date
  ;
quit;
  
kishiyo
Calcite | Level 5

Dear Tom, 

 

It worked perfectly and now the dataset looks exactly what I wanted. 

Thank you very much for your help! 

 

Kishiyo. 

Ayaan
Calcite | Level 5

I have data like this 

data ds;

x=123456789;

run;

 

Then I want output like this

 output;

   x

   1

   2 

   3 

   4

   5

   6

   7

   8

   9

 

how to do.....????

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 3436 views
  • 1 like
  • 3 in conversation