how to create multiple sub-databases from a time series database?

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

how to create multiple sub-databases from a time series database?

I have a dataset like below:

StudyIDSereisHbCO2
S00112149
S00122251
S00132353
S00142455
S00152557
S00162659
S00172761
S00182863
S00192965
S001103067
S00211845
S00222047
S00232249
S00242451
S00252653
S002628

55

 

I'd like to create a dataset like follow. cutting data from series 1- 4 as 1st sub-dataset, then cutting from series from 3-6 as 2nd sub-dataset, and so on. would you please help?

StudyIDSereisHbCO2Subset
S001121491
S001222511
S001323531
S001424551
S001323532
S001424552
S001525572
S001626592
S001527613
S001628633
S001729653
S001830673
S001731694
S001832714
S001933734
S0011034754
S002118451
S002220471
S002322491
S002424511
S002326532
S002428552
S002530572
S002632592

  

 


Accepted Solutions
Solution
‎06-14-2016 10:23 AM
Super User
Posts: 9,878

Re: how to create multiple sub-databases from a time series database?

Oh. I just realize the raw data is different with output . Try this one :



data have;
infile cards expandtabs truncover;
input StudyID $	Sereis	Hb	CO2;
cards;
S001	1	21	49
S001	2	22	51
S001	3	23	53
S001	4	24	55
S001	5	25	57
S001	6	26	59
S001	7	27	61
S001	8	28	63
S001	9	29	65
S001	10	30	67
S002	1	18	45
S002	2	20	47
S002	3	22	49
S002	4	24	51
S002	5	26	53
S002	6	28	55
;
run;
data temp;
 merge have have(keep=StudyID rename=(StudyID=_StudyID) firstobs=3);
 if mod(_n_,2)=1 or StudyID ne lag(StudyID) then group+1;
 idx=1;output;
 if StudyID=lag2(StudyID) and StudyID=_StudyID then do;idx=2;output;end;
 drop _StudyID;
run;
proc sort data=temp;
 by StudyID group idx Sereis;
run;
data temp;
 set temp;
 by StudyID;
 if first.StudyID then do;k=0;g=0;end;
 k+1;
 if mod(k,4)=1 then g+1;
 drop k group idx;
run;
proc sql noprint;
 select distinct cats('want',g) into : data separated by ' '
  from temp;
 select distinct cats('when(',g,') output want',g,';') into : select separated by ' '
  from temp;
run;
data &data ;
 set temp;
 select(g);
 &select
 otherwise;
 end;
 drop g;
run;
 


View solution in original post


All Replies
Super User
Posts: 11,139

Re: how to create multiple sub-databases from a time series database?

You will have to explain all of the rules involved in going from 10 records within an id and selectively dupicating some of them such as why were series 3 through 8 duplicated but 1, 2, 9 and 10 not?

Are there always exactly 10 initial values of series for an Id? If fewer or more what would the duplication pattern be?

Is the exact output order critical?

 

Occasional Contributor
Posts: 5

Re: how to create multiple sub-databases from a time series database?

sure. this is a sample data from our study. we measured each patient's Hb and CO2 over time. like patient-1 (studyID=S001) had 10 time point (Sereis 1-10) measurements, but patient-2 (S002) only had 6 time point (Sereis 1-6) measurements. we want to cut the data for a speciafic time windows (here every 4 time points), but the next cut window should be re-set from half of previous window (i.e. 1st sub-set: from sereis 1 to 4; 2nd sub-set: from sereis 3 to 6 and so on). so there is no dulplicated at both begining and ending time point when we move the window forward.

 

thanks for help! 

 

Super User
Posts: 19,188

Re: how to create multiple sub-databases from a time series database?

In general you can do this as follows, this creates two datasets, one for females and one for males.

If you have many categories, or need some sort of dynamic set up it's more complex and you'll need to provide details for a better sample. This should help you get started.

 

data female male;
set sashelp.class;
if sex='F' then output female;
else if sex='M' then output male;
run;
Occasional Contributor
Posts: 5

Re: how to create multiple sub-databases from a time series database?

thanks. good to know.

Super User
Posts: 9,878

Re: how to create multiple sub-databases from a time series database?

Does every StudyID group have the number which is multiply of 4 -- n*4 ?
If that was , that would be easy.

data temp;
 set have;
 by StudyID;
 if first.StudyID then group=0;
 if mod(_n_,4)=1 then group+1;
run;

after that use HashTable or SELECT to split the table.


Occasional Contributor
Posts: 5

Re: how to create multiple sub-databases from a time series database?

Thanks Xia. Yes. I'd like to cut data by using same time window (here is 4 time points) for each studyID. but I need to re-set next window start point from half of the previous window. i.e. 1st group: from sereis 1-4; 2nd group: from sereis 3-6 (not from sereis 5-8).

Solution
‎06-14-2016 10:23 AM
Super User
Posts: 9,878

Re: how to create multiple sub-databases from a time series database?

Oh. I just realize the raw data is different with output . Try this one :



data have;
infile cards expandtabs truncover;
input StudyID $	Sereis	Hb	CO2;
cards;
S001	1	21	49
S001	2	22	51
S001	3	23	53
S001	4	24	55
S001	5	25	57
S001	6	26	59
S001	7	27	61
S001	8	28	63
S001	9	29	65
S001	10	30	67
S002	1	18	45
S002	2	20	47
S002	3	22	49
S002	4	24	51
S002	5	26	53
S002	6	28	55
;
run;
data temp;
 merge have have(keep=StudyID rename=(StudyID=_StudyID) firstobs=3);
 if mod(_n_,2)=1 or StudyID ne lag(StudyID) then group+1;
 idx=1;output;
 if StudyID=lag2(StudyID) and StudyID=_StudyID then do;idx=2;output;end;
 drop _StudyID;
run;
proc sort data=temp;
 by StudyID group idx Sereis;
run;
data temp;
 set temp;
 by StudyID;
 if first.StudyID then do;k=0;g=0;end;
 k+1;
 if mod(k,4)=1 then g+1;
 drop k group idx;
run;
proc sql noprint;
 select distinct cats('want',g) into : data separated by ' '
  from temp;
 select distinct cats('when(',g,') output want',g,';') into : select separated by ' '
  from temp;
run;
data &data ;
 set temp;
 select(g);
 &select
 otherwise;
 end;
 drop g;
run;
 


Occasional Contributor
Posts: 5

Re: how to create multiple sub-databases from a time series database?

Thank you so much! this works very well in our project.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 496 views
  • 1 like
  • 4 in conversation