BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
lioradam
Obsidian | Level 7

Hi all,

I have the following data set:


/* --------------------------------------------------------------------
   Code generated by a SAS task
   
   Generated on Saturday, December 3, 2022 at 10:48:36 PM
   By task:     Import Data Wizard
   
   Source file: C:\Users\Lior\Downloads\example.xlsx
   Server:      Local File System
   
   Output data: WORK.EXAMPLE_0000
   Server:      Local
   -------------------------------------------------------------------- */

/* --------------------------------------------------------------------
   This DATA step reads the data values from DATALINES within the SAS
   code. The values within the DATALINES were extracted from the Excel
   source file by the Import Data wizard.
   -------------------------------------------------------------------- */

DATA WORK.EXAMPLE_0000;
    LENGTH
        PRD_YR             8
        GVKEY            $ 6 ;
    FORMAT
        PRD_YR           BEST12.
        GVKEY            $CHAR6. ;
    INFORMAT
        PRD_YR           BEST12.
        GVKEY            $CHAR6. ;
    INFILE DATALINES4
        DLM='7F'x
        MISSOVER
        DSD ;
    INPUT
        PRD_YR           : BEST32.
        GVKEY            : $CHAR6. ;
DATALINES4;
1999001034
2001001034
2002001034
2003001034
2004001034
2006001034
2007001034
2018001045
2019001045
2020001045
2014001050
2020001076
2002001078
2003001078
2004001078
2005001078
2006001078
2007001078
2008001078
2009001078
2010001078
2011001078
2012001078
2013001078
2014001078
2015001078
2016001078
2017001078
2018001078
2019001078
2021001078
;;;;



 

and I wrote the following code in the order number of each row belonging to a certain firm. meaning when the firm switches to a new firm, the numbering should start again.

 

 if first.GVKEY then year_of_series=0;
 year_of_series+1;

But when I run this code I just receive a file that numbers all the rows in the file as one series.

 

Does anyone know why the code is not working?

 

Thanks in advance,

Lior

1 ACCEPTED SOLUTION

Accepted Solutions
Sajid01
Meteorite | Level 14

It works

Please see the code below

DATA WORK.EXAMPLE_0000;
    LENGTH
        PRD_YR             8
        GVKEY            $ 6 ;
    FORMAT
        PRD_YR           BEST12.
        GVKEY            $CHAR6. ;
    INFORMAT
        PRD_YR           BEST12.
        GVKEY            $CHAR6. ;
    INFILE DATALINES4
        DLM='7F'x
        MISSOVER
        DSD ;
    INPUT
        PRD_YR           : BEST32.
        GVKEY            : $CHAR6. ;
DATALINES4;
1999001034
2001001034
2002001034
2003001034
2004001034
2006001034
2007001034
2018001045
2019001045
2020001045
2014001050
2020001076
2002001078
2003001078
2004001078
2005001078
2006001078
2007001078
2008001078
2009001078
2010001078
2011001078
2012001078
2013001078
2014001078
2015001078
2016001078
2017001078
2018001078
2019001078
2021001078
;;;;
proc sort data=EXAMPLE_0000;
by GVKEY;
run;
data EXAMPLE_0001;
Retain year_of_series GVKEY PRD_YR;
set EXAMPLE_0000;
by GVKEY;
if first.GVKEY then  year_of_series=0;
year_of_series+1;
run;

View solution in original post

5 REPLIES 5
PaigeMiller
Diamond | Level 26

@lioradam wrote:

 

... and I wrote the following code in the order number of each row belonging to a certain firm. meaning when the firm switches to a new firm, the numbering should start again.

 

 if first.GVKEY then year_of_series=0;
 year_of_series+1;

 


Please don't ever show us a partial data step. This code can't be evaluated unless you show us the ENTIRE data step where it is used. I have a guess why you are getting the wrong results, but I shouldn't have to guess.

--
Paige Miller
Astounding
PROC Star
Your clue would appear in the log. (You must read the log every time!) There should be a message telling you that first.GVKEY is unitialized. That means it was never assigned a value, which means your DATA step is missing a BY statement.

After the SET statement add:

by gvkey;

That's what creates first.gvkey and last.gvkey.
Sajid01
Meteorite | Level 14

It works

Please see the code below

DATA WORK.EXAMPLE_0000;
    LENGTH
        PRD_YR             8
        GVKEY            $ 6 ;
    FORMAT
        PRD_YR           BEST12.
        GVKEY            $CHAR6. ;
    INFORMAT
        PRD_YR           BEST12.
        GVKEY            $CHAR6. ;
    INFILE DATALINES4
        DLM='7F'x
        MISSOVER
        DSD ;
    INPUT
        PRD_YR           : BEST32.
        GVKEY            : $CHAR6. ;
DATALINES4;
1999001034
2001001034
2002001034
2003001034
2004001034
2006001034
2007001034
2018001045
2019001045
2020001045
2014001050
2020001076
2002001078
2003001078
2004001078
2005001078
2006001078
2007001078
2008001078
2009001078
2010001078
2011001078
2012001078
2013001078
2014001078
2015001078
2016001078
2017001078
2018001078
2019001078
2021001078
;;;;
proc sort data=EXAMPLE_0000;
by GVKEY;
run;
data EXAMPLE_0001;
Retain year_of_series GVKEY PRD_YR;
set EXAMPLE_0000;
by GVKEY;
if first.GVKEY then  year_of_series=0;
year_of_series+1;
run;
lioradam
Obsidian | Level 7

Thank you!

lioradam
Obsidian | Level 7

Hi,

The "BY statement" indeed was written incorrectly.

Thanks a lot!

 

Lior

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 688 views
  • 3 likes
  • 4 in conversation