BookmarkSubscribeRSS Feed
J_CKY
Obsidian | Level 7

Hi,

I am new to SAS, especially for do loop etc.

I have a dataset named "table" looks like below:

ID        Data      Months

21123     data1     201203

24578     data2     201203

12345     data3     201205

12346     data4     201204

12678     data5     201301

...

..

.

etc.


I currently need to run some sql join as following:


proc sql;

  create table _201204 as

  select A.*, B.score

  from table as A

  inner join

  scoredata as B

  on a.id=b.id

WHERE months=(201203) OR months=(201204)

  ;

quit;

proc sql;

  create table _201205 as

  select A.*, B.score

  from table as A

  inner join

  scoredata as B

  on a.id=b.id

WHERE months=(201203) OR months=(201204) OR months=(201205)

  ;

quit;


proc sql;

  create table _201206 as

  select A.*, B.score

  from table as A

  inner join

  scoredata as B

  on a.id=b.id

WHERE months=(201203) OR months=(201204) OR months=(201205) OR months=(201206)

  ;

quit;


So as time goes, the code keep growing longer and longer.

I hope to improve the code using a macro + hash table and do loops.

I have constructed a hash table using the data from "table"

I referred the following global forum document, it help in someway, but I couldn't get my head round of getting a full solution.

http://support.sas.com/resources/papers/proceedings12/069-2012.pdf

Any help from the community is much appreciated.

Thanks,

Jack

4 REPLIES 4
SUN59338
Obsidian | Level 7

it's depends on your datasets 'table' and 'scoredata'. if both datasets are not changed any more, you should be able to know how many unique values for variable months. Assume the varialbe ID in your scoredata has unique value for each record.

you can try something like this

data _201203  _201204 _201205 _201206  /*<>, you may also use macro here to list all*/;

length score 8.;

if _n_ eq 1 then do;

dcl hash sc(dataset:"Scoredata");

sc.definekey('ID");

sc.definedata("Score");

sc.definedone();

call missing(score);

end;

set table;

if sc.find() ne  0 then put "Warning: ID not found from ScoreData "  ID;

else do;

     select(months);

          when('201203") output _201203 _201204 _201205 _201206;

          when("201204") output _201204 _201205 _201206;

          when("201205") output _201205 201206 ;

          when("201206") output _201206;

          otherwise put "Warning: Other months value -" months;

     end;

end;

run;

if your dataset table keep update, I would like to suggest use Proc Append to add new to proper dataset _2012<>, rather than add to 'table' then split all the data again.

if there are duplicated values for ID variable in scoredata, the code above need some modifcation.

Ron_MacroMaven
Lapis Lazuli | Level 10

based on the presence of dates in your repetition

I recommend

http://www.sascommunity.org/wiki/Macro_Loops_with_Dates

http://support.sas.com/resources/papers/proceedings13/343-2013.pdf

note: you can reduce your where clause

from:

months=(201203) OR months=(201204) OR months=(201205) OR months=(201206)


to:

where Months between 201203 and 201206;

Ron Fehd  current.or.past dates maven

J_CKY
Obsidian | Level 7

forgot to mention that the "scoredata" is also monthly data. So there are loads of different copy of "scoredataMONYYYY" e.g. scoredateMar2012

I have now streamline the code using a macro as following:

%macro mc1(mth);


data _null_;

call symput('mth2', put(input(put(&mth,z6.),yymmn6.),monyy7.));

run;


proc sql;

  create table _&mth as

  select A.*, B.score

  from table as A

  inner join

  scoredata&mth2 as B

  on a.id=b.id

WHERE 201203 months &mth

  ;

quit;

%mend;


mc1(201203);

mc1(201204);

mc1(201205);

...

..

.

mc1(201303);

Now my question become how can I use a do loop to repeat the running of macro.

I know one way is to create a hash table from the dataset "table"

data _null_;

length months best12.;

if _n_ = 1 do;

declare hash m (dataset:"table", ordered: "a");

rc = definekey(months);

m.definedone();

end;

rc=m.output (dataset: "tab_mths");

run;

Ron_MacroMaven
Lapis Lazuli | Level 10

Your macro parameter consists of two parts: ccyy and mm

Macro DateText can extract ccyy and mm from an integer of a date:

http://www.sascommunity.org/wiki/Macro_Loops_with_Dates#Macro_DateText

%Macro Do_This

        (start=%sysfunc(mdy(3,1,2012))

        ,stop=%sysfunc(today())

        ,interval=month

);

%do ThisDate = &Start %to &Stop;

    %let D_Begin = %sysfunc(intnx(&Interval,&start,0,begin));

    %let D_End = %sysfunc(intnx(&Interval,&start,0,end));

    %let ccyymm = %datetext(date=&ThisDate

    ,text = %nrstr(work.&year-&mm

    );

    %put note: ccYYmm = &ccYYmm.;

    *...;

    %let &ThisDate = %eval(&D_End+1);

    %end;

%mend;

%do_This()

SAS Innovate 2025: Register Today!

 

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1958 views
  • 3 likes
  • 3 in conversation