BookmarkSubscribeRSS Feed
jonam
Calcite | Level 5

Hi siva,

Based on your requirement i wrote the following code; I've assumed you've one (group, newstore) as key , another (group, modelid).  Is there any chance that you'll have same model id  in different groups? For the current scenario i kept the array limit as 3. but you can change that based on your need.

data g_s_m;

input Group          Newstore          Modelid;

datalines;

   10549          6453               5582          

   10549          6453               5217          

   10549          6453               2871 

   10549          2001               5582          

   10549          2001               5217          

   10549          2001               2871

   10548          9648               6453          

   10548          9648               3251          

   10548          9648               2205          

   10548          9453               6453          

   10548          9453               3251          

   10548          9453               2205    

   ;

/* source 1 */

Proc sql;

create table temp as

select

g_s.group, g_s.newstore,g_m.modelid

from

          (Select distinct

                group

                    , newstore

          from g_s_m

          ) g_s,

          (select distinct

                    group

                    , modelid

          from g_s_m) as g_m

where g_s.group=g_m.group

order by group, newstore, modelid;

quit;

proc print;run;

/*I realised,  For the subset of data provided creation of source1 produced exactly the same result as g_s_m */ . So instead of that directly we can use the g_s_m table */

/* using source1 to create source2 with unique modelid at group level */

proc sql;

create table group_modelid as

select distinct group,  modelid from

temp

order by group, modelid desc

;

quit;

/* transpose modelid at group */

proc transpose data=group_modelid out =grp_col_mod  prefix=model_;

by group;

var modelid;

run;

proc sql;

create table input1

as

select distinct group, newstore

from temp;

quit;

data final;

merge input1 (in=a) grp_col_mod (in=b);

by group;

if a and b;

run;

data final1(keep=group newstore final_mod);

set final;

by group;

array model_sel model_1-model_3;

if first.group then index=1;

else index+1;

final_mod=model_sel{index};

run;

proc print;

run;

Message was edited by: Jeyvinth Manoj Rayan

Sivanandam
Calcite | Level 5

Hi,

i ran the sample code which you gave it was as i expected ..

Could you please explain the below sample code

data final1(keep=group newstore final_mod);

set final;

by group;

array model_sel model_1-model_3;

if first.group then index=1;

else index+1;

final_mod=model_sel{index};//What this line will do

run;

proc print;

run;

art297
Opal | Level 21

Sivanandam,

I didn't write the code, but can definitely explain it.

The file, final, apparently has your values in multiple records in the order that you want them, in variables model_1 to model_3, with the desired value for the first record (in each group) in model_1, the second record in Model_2, etc.

The code keeps track of which record it is looking at by using

if first.group then index=1;

else index+1;

Then, it selects final_mod by picking the first, second or third value, respectively:

final_mod=model_sel{index}; /*<-What this line will do*/

jonam
Calcite | Level 5

Thanks for jumping in to explain Art.  appreciated!. i was away for sometime.

Basically  what i'm doing here is

1. My input1 is ordered by what you want at group, newstore.

2. I transposed the modelid by group level to have unique model id.

3. so once i merge in the above step (Input1, grp_col_mod). I'm actually appending the unique/distict model id at group level with

my "input1".

if first.group then index=1;

else index+1;

final_mod=model_sel{index};//What this line will do

4.  Now my first two columns are ready,(i mean group, newstore).  So every group. i'm assiging the modelid staring from 1 to 3(in this case).

if a group has multiple records i'm incrementing my index to fetch index+1 record.

Hope this clarifes.  welcome your queries if you still not clear about it.  Let us know about the other scenarios if it's not covered.

NN
Quartz | Level 8 NN
Quartz | Level 8

Hi,

Not sure but will this work.

data a;

infile datalines delimiter='|';

input a : $10.  b : $10.  c : $10.;

datalines;

10549|6453|5582

10549|6453|5217

10549|6453|2871

10549|2001|5582

10549|2001|5217

10549|2001|2871

10548|9648|6453

10548|9648|3251

10548|9648|2205

10548|9453|6453

10548|9453|3251

10548|9453|2205

;

run;

proc sort data=a;

by descending a descending b descending c;

run;

data b;

set a;

by descending a descending b descending c;

retain _b _c;

format _b _c $100.;

if first.a then do;

_b=b;

_c=c;

flag=1;

end;

else do;

    if findw(_b,b,',','T')=0 then do;

         if findw(_c,c,',','T') = 0 then do;

        flag=1;

        _c=cats(_c,',', c);

        _b=cats(_b,',', b);

        end;

    end;   

end;

if flag=1 then output;

drop _b _c;

run;

Tom
Super User Tom
Super User

Your original data seems to have all new stores matched to all model ids.

If this is true then you can use a data step many-to-many merge on GROUP to assign stores to models.

data have ;

  input Group Newstore Modelid ;

cards;

10549 6453 5582 

10549 6453 5217 

10549 6453 2871 

10549 2001 5582 

10549 2001 5217 

10549 2001 2871

10548 9648 6453 

10548 9648 3251 

10548 9648 2205 

10548 9453 6453 

10548 9453 3251 

10548 9453 2205 

run;

proc sort data=have(keep=group modelid) out=modelids nodupkey;

  by group modelid;

run;

proc sort data=have(keep=group newstore) out=stores nodupkey;

  by group newstore;

run;

 

* Need to reset the IN flag to keep it from matching the last store to all of the remaining models ;

data want ;

  in1=0;

  merge stores(in=in1) modelids;

  by group ;

  if in1;

  put group newstore modelid ;

run;

If you want to introduce some randomness into the issue then sort the stores and models randomly within the groups before the merge.

jonam
Calcite | Level 5

Nice!. reseting the in flag is a new learning for me Smiley Happy

Sivanandam
Calcite | Level 5

Hi Tom,

Could  you please epalin about IN flag in detail so that we can learn about that.

Thanks,

Siva

Sivanandam
Calcite | Level 5

Thanks Art.

Tom
Super User Tom
Super User

The flags created by the IN= dataset option will be boolean flags indicating whether the dataset contributed to the current observation.

For 1-to-1 and 1-to-N matches the meaning is clear.

But when you have N-to-M matches the logic is a little more complex.  SAS will basically match the records 1 to 1 until it runs out of records for that BY group from one of the sources.  In that case the result is that SAS will carry forward the variables for the last observation from that dataset for the rest of the BY group.  This includes the IN= flag.  Resetting the flag (either at the end of the step or before the MERGE statement at the beginning of the step) will cause it to not be set until SAS reads another observation from that dataset.  You might want to also add a CALL MISSING statement to force the variables to missing as they will also retain their values. You do not need that in this program for the variables coming from the STORES dataset, but you might need it for the variables coming from the MODELIDS dataset if it is possbile to have more newstores than modelids for a group.

art297
Opal | Level 21

I noticed that the code you said worked didn't actually provide what you originally said was the desired result, thus I presume that it doesn't make any difference which modelid is selected second, as long as the newstore and modelid aren't the same as the first one.

If that is correct, couldn't you just use some simple code like:

data g_s_m;

  input Group  Newstore  Modelid;

  datalines;

10549  6453  5582         

10549  6453  5217         

10549  6453  2871

10549  2001  5582         

10549  2001  5217         

10549  2001  2871

10548  9648  6453         

10548  9648  3251         

10548  9648  2205         

10548  9453  6453         

10548  9453  3251         

10548  9453  2205   

;

data want (keep=group newstore modelid);

  set g_s_m;

  by group newstore notsorted;

  retain first_newstore first_modelid;

  if first.group and first.newstore then do;

    n=1;

    first_newstore=newstore;

    first_modelid=modelid;

    output;

  end;

  else if newstore ne first_newstore and n eq 1 then do;

    if modelid ne first_modelid then do;

      n+1;

      output;

    end;

  end;

run;

Of course, if you data aren't already grouped by group and, within each group grouped by newstore, you would first have to sort the data.  E.g., using something like:

proc sort data=g_s_m;

  by group newstore;

run;

Sivanandam
Calcite | Level 5

HI All,

Thanks for all your response.It's very usefull.

am currently working with various input scenerios.Once it complete let you know.

Thanks,

Siva.

Ksharp
Super User

Sorry ,so late to reponse you.

Tom has already gave you an answer, My code is another way if you would like.

"it varies based up on group number."

So I assume that modelid is different from each group.

data have ;
  input Group Newstore Modelid ;
cards;
10549 6453 5582 
10549 6453 5217 
10549 6453 2871 
10549 2001 5582 
10549 2001 5217 
10549 2001 2871
10548 9648 6453 
10548 9648 3251 
10548 9648 2205 
10548 9453 6453 
10548 9453 3251 
10548 9453 2205 
run;
proc sort data=have(keep=group modelid) out=modelids nodupkey;
  by group modelid;
run;
proc sort data=have(keep=group newstore) out=newstore nodupkey;
 by group newstore;
run;
data result(where=(newstore is not missing));
 merge newstore modelids;
 by group; output;
 call missing(newstore);
run;

Ksharp

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 28 replies
  • 1746 views
  • 6 likes
  • 9 in conversation