Hi, I am newbie to SAS and i need help desperately.
getting straight to the point, I am trying to process the sas table into another table as described below:
Given:
data work;
input ID JOB_CD YEAR;
cards;
1 4 2006
1 4 2007
1 3 2008
1 5 2009
1 4 2010
2 5 2005
2 5 2006
7 1 2007
;
run;
WANT:
ID JOB_CD YEAR FIRST_YEAR LAST_YEAR
1 4 2006 2006 2007
1 4 2007 2006 2007
1 3 2008 2008 2008
1 5 2009 2009 2009
1 4 2010 2010 2010
2 5 2005 2005 2006
2 5 2006 2005 2006
7 1 2007 2007 2007
the actual data is much larger but,
basically I am trying to group the data by ID and JOB_CD, and get first_year and last_year for each job within each ID.
It seemed to work with using "BY ID JOB_CD", FIRST.ID, LAST.ID, RETAIN statements but the problem is, given the case above,
ID JOB_CD YEAR FIRST_YEAR LAST_YEAR
1 4 2006 2006 2010
but as you can see this is not what i want. This guy with ID 1 had had a job 4 then changed a job for a few years before changing back to the original job.
It does not matter if I use SQL for it.
Simpler the better, but I really want this to work.
Thanks for reading.
Be cautious. Just grouping by id and job_cd will not solve the fact that id 1 has two separate periods with job_cd = 4.
I propose some trickery:
data
  int1 (keep=id job_cd period year)
  int2 (keep=id period first_year last_year)
;
set work;
by id job_cd notsorted;
retain period_count first_year;
if first.id then period = 0;
if first.job_cd
then do;
  period + 1;
  first_year = year;
end;
if last.job_cd
then do;
  last_year = year;
  output int2;
end;
output int1;
run;
data want;
merge
  int1
  int2
;
by id period;
drop period;
run;While there are a couple of ways to do this, here are some tools that I expect would be "next in line" for you to learn. You can skip the PROC SORT if the data set is already in the proper order.
proc sort data=have;
by id job_cd;
run;
proc summary data=have;
var year;
by id job_cd;
output out=years (drop=_type_ _freq_) min(year) = first_year max(year) = last_year;
run;
data want;
merge have years;
by id job_cd;
run;
Be cautious. Just grouping by id and job_cd will not solve the fact that id 1 has two separate periods with job_cd = 4.
I propose some trickery:
data
  int1 (keep=id job_cd period year)
  int2 (keep=id period first_year last_year)
;
set work;
by id job_cd notsorted;
retain period_count first_year;
if first.id then period = 0;
if first.job_cd
then do;
  period + 1;
  first_year = year;
end;
if last.job_cd
then do;
  last_year = year;
  output int2;
end;
output int1;
run;
data want;
merge
  int1
  int2
;
by id period;
drop period;
run;You can use @Astounding's method by first creating a period variable like I did and then use that in the proc summary and the merge instead of job_cd.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
