Help using Base SAS procedures

Number of months in which the company remain in same zip code area for a given year

Accepted Solution Solved
Reply
Regular Contributor
Posts: 162
Accepted Solution

Number of months in which the company remain in same zip code area for a given year

Good days to all,

I attached a file containing the following variables:

crsp_fundn = Identification code for a company

chgdt = Beginning date for contact information

chgenddt = Ending date for contact information

City =City for contact information

State = State for contact information

zip = Zipcodes for contact information.

My main task now is to calculate 'number of months in which the company keep an unique zipcode for a given year' starting for year 2003 - 2010.

For example, company with crsp_fundn=105, keep the same zipcode for 12 mths for each year 2003 - 2010.

My expected output:

All my input:

crsp_fundn = Identification code for a company

chgdt = Beginning date for contact information

chgenddt = Ending date for contact information

City =City for contact information

State = State for contact information

zip = Zipcodes for contact information.

Plus:

number of months which company keep a given zipcode for a specific year (for years 2003 - 2010; one value ie. months for each year)

Thank you.

mspak

Attachment

Accepted Solutions
Solution
‎03-07-2012 10:14 AM
PROC Star
Posts: 7,356

Re: Number of months in which the company remain in same zip code area for a given year

Yes they can provide the same results if applied correctly.  However, in your case, I think a simple datastep would be both easier and quicker.  e.g.:

libname art "c:\art";

data want (drop=inSmiley Happy;

  set art.fmutualzip;

  array years in:;

  do over years;

    if years ge 180 then do;

      year=input(substr(vname(years),3),4.);

      output;

    end;

  end;

run;

View solution in original post


All Replies
PROC Star
Posts: 7,356

Number of months in which the company remain in same zip code area for a given year

While SAS has functions to count the number of month boundaries crossed between two dates, I think that would end up giving you rather biased results.  I think you could get more accurate results by obtaining the number of days at particular zip codes.

The following is only offered as a possible method.  I did not check to ensure that it is doing everything correctly .. you would have to do that on your own:

libname thedata "c:\";

data temp (rename=

    (years1=in2003

    years2=in2004

    years3=in2005

    years4=in2006

    years5=in2007

    years6=in2008

    years7=in2009

    years8=in2010))

;

  set thedata.mutualcom (

    where=(year(chgdt) le 2010 and

      year(chgdt) gt 2002)

    drop=city state);

  array years(2003:2010);

  if year(chgenddt) ge 2011 then

    chgenddt='31DEC2010'd;

  if year(chgdt) lt 2003 then

    chgdt='01JAN2003'd;

  do year=year(chgdt) to year(chgenddt);

    if year ne year(chgenddt) then

      end=mdy(12,31,year);

    else end=chgenddt;

    if year gt year(chgdt) then

      start=mdy(1,1,year);

    else start=chgdt;

    years(year)=end-start+1;

  end;

run;

proc summary data=temp nway;

  var in:;

  class crsp_fundn zip;

  output out=want (drop=_Smiley Happy

             sum=;

run;

Regular Contributor
Posts: 162

Number of months in which the company remain in same zip code area for a given year

Thank you very much Smiley Happy

Super User
Posts: 9,662

Number of months in which the company remain in same zip code area for a given year

You'd better post some output to clarify what you want.

Ksharp

Regular Contributor
Posts: 162

Re: Number of months in which the company remain in same zip code area for a given year

Hi all,

I ran the program suggested  and yield the output with file name "fmutualzip". I get the number of days in which the mutual fund companies available for a given year. I would like to set a criteria that if a particular mutual fund available for 180 or more days, then the firms is considered as "available for the year".

My purpose is to calculate the distance between the US firms (as attached file "comp") and mutual fund companies available for years 2003 to 2010.  How should I restructure the data as per in "fmutualzip" into a format that enables distance calculation?

Thank you for any help.

Regards,

MEI SEN

Attachment
Attachment
Super User
Posts: 9,662

Re: Number of months in which the company remain in same zip code area for a given year

Sorry. I don't understand what you mean.

What is Distance you refered to ?

Ksharp

Regular Contributor
Posts: 162

Re: Number of months in which the company remain in same zip code area for a given year

geographic distance..I have zipcodes for both mutual fund companies and US listed companies in files. I used zipcodes to calculate distance between two locations.

As all firms in US, the zipcitydistance function is suitable for distance calculation. But i don't know how to restructure the data in "fmutualzip"  into the following columns:

crsp_fundn (mutual fund identification code)   zip  year

I wish only have one "year" column. The mutual fund which available for 180 days or more considered as "mutual companies "available" for a given year.

Thanks.

mspak

PROC Star
Posts: 7,356

Re: Number of months in which the company remain in same zip code area for a given year

Do the zipcode for mutual funds change over time?  If not, I would just get rid of all of the duplicate records in file comp and eliminate the year variable from that file.  If they do change over time, I think you will have to restructure one of your files so that they only have one year variable each, or both have wide records that include fields for multiple years.

Regular Contributor
Posts: 162

Re: Number of months in which the company remain in same zip code area for a given year

Hi art297,

the zipcodes for mutual funds might change as they might move their headquarter to another area. The original files that I posted have all starting date and ending date for the same zipcode for a given mutual funds. I couldn't include the mutual funds that have been in a given area for a very short period of time (perhaps only 1 day). Therefore, I set a criteria that if the mutual fund is available for at least 180 days for a given year, then I should calculate the distance between the mutual fund companies and the US listed firms.

At the final stage, I have to calculate how many mutual funds companies are located within 60 miles radius from the US firms' headquarter.

Thank you.

Regards,

mspak

PROC Star
Posts: 7,356

Re: Number of months in which the company remain in same zip code area for a given year

Then, if it were me, I would transpose the fmutual file so that it (1) each year was on a separate record (with the same variable name as in your comp file) and (2) only keep those records that meet your criteria.

Then, the problem becomes a simple one-to-one merge that can either be done in a datastep or with proc sql.

Regular Contributor
Posts: 162

Re: Number of months in which the company remain in same zip code area for a given year

Thanks,

I am new user of SAS and still exploring countless of the SAS functions. As per information that I read, I can use  either proc transpose or proc summary using IDGROUP option or data step to transpose data. Let me to explore more. Will these tranposition methods give rise to the same result?

Smiley Happy

Regards,

mspak

Solution
‎03-07-2012 10:14 AM
PROC Star
Posts: 7,356

Re: Number of months in which the company remain in same zip code area for a given year

Yes they can provide the same results if applied correctly.  However, in your case, I think a simple datastep would be both easier and quicker.  e.g.:

libname art "c:\art";

data want (drop=inSmiley Happy;

  set art.fmutualzip;

  array years in:;

  do over years;

    if years ge 180 then do;

      year=input(substr(vname(years),3),4.);

      output;

    end;

  end;

run;

Regular Contributor
Posts: 162

Number of months in which the company remain in same zip code area for a given year

Hi art297,

Thank you very much. I will also explore more for SAS transposition methods for my future use.

Regards,

mspak

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 184 views
  • 3 likes
  • 3 in conversation