BookmarkSubscribeRSS Feed
silasskovsbo
Calcite | Level 5

Dear all.

I work with panel data (data sets with 5 mio. observations pr year and hundreds of variables) at University of Aarhus, department of Economics, and am looking for a program which can do the following:

For some variables are the all observations missing for some years.It would be of great use to have a program which could make a table with one observation for each year telling for all the variables whether this variable is fully missing for the specific year. Missing is defined as SAS standards for both numeric and characters.

Example of data

Header 1Header 2Header 3Header 4
PersonidentifierYearMaleIncome
1199911000
2199901500
120001.
220000.
1200111400
220010.

And wanted table:

Header 1Header 2Header 3
YearMale_missIncome_miss
199900
200001
200100

Hope it makes sense and that somebody it able to help.

Thanks a lot!

Best // Silas

2 REPLIES 2
Ksharp
Super User

OK.

data have;
input Personidentifier     Year     Male     Income ;
cards;
1     1999     1     1000
2     1999     0     1500
1     2000     1     .
2     2000     0     .
1     2001     1     1400
2     2001     0     .
;
run;
proc sql;
create table want as
 select  Year,nmiss(Male)=count(*) as male_miss,nmiss(Income)=count(*) as income_miss
  from have
   group by Year ;
quit;

Ksharp

Haikuo
Onyx | Level 15

In case performance is one of your concerns (It seems to me you have a big table), comparing the following data step approach with Ksharp's Proc SQL, choose the one that runs faster:

Also note, the following solution can also ease your hard coding to some extend.

data have;

  input Personidentifier Year Male Income;

  cards;

1 1999 1 1000

2 1999 0 1500

1 2000 1 .

2 2000 0 .

1 2001 1 1400

2 2001 0 .

;

proc sql NOPRINT;

  select name into :old separated by ' ' from dictionary.columns

  where upcase(name) not in ('PERSONIDENTIFIER', 'YEAR')

  AND LIBNAME='WORK' AND MEMNAME='HAVE';

  SELECT cats(name,'_','miss') INTO :NEW separated by ' ' from dictionary.columns

  where upcase(name) not in ('PERSONIDENTIFIER', 'YEAR')

  AND LIBNAME='WORK' AND MEMNAME='HAVE';

QUIT;

data want;

retain year &new;

array new &new ;

do i=1 to dim(new);

  new(i)=1;

end;

  DO UNTIL (LAST.YEAR);

  SET HAVE ;

BY YEAR NOTSORTED; /* IN CASE 'YEAR' IS ONLY CLUSTERED, NOT SORTED*/

array old &old.;

do i=1 to dim(old);

new(i)=ifn(not missing(old(i)),0,new(i));

end;

  end;

  keep year &new;

  run;

Haikuo

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 842 views
  • 1 like
  • 3 in conversation