We use standard 5-fold cross-validation to evaluate our BiC(RoBiC, ...) system: we first divide the data into 5 balanced folds FF = {F_{1}, …, F_{5}}, then use the information in FF − { F_{i} } to find labels for each instance r_{i,j} ∈ F_{i}. Recall, however, that BiC(RoBiC, ...)'s first step involves finding the biclusters based on both (the non-label part of) FF − { F_{i} } and F_{i}; i.e., on all of FF. This means the biclusters (and hence the classification) for r_{5,1} depend on r_{5,2}, r_{5,3}, ..., r_{5,15}, as well as on F_{1}, …, F_{4}.
Does this make a difference? In particular, how does this scenario compare with the more "standard" version, where the label for r_{5,1} depends only on itself and F_{1}, …, F_{4}, but NOT on r_{5,2}, r_{5,3}, ..., r_{5,15}?
To find out, we took 4/5 of the BreastCancer data as the training set D, which here has 61 instances. We then considered each of the remaining 15 elements R = { r_{1}, r_{2}, ..., r_{15} } one by one. For each r_{i}, we used the set of instances D_{i} = D ∪ { r_{i} } to produce a set of k=30 biclusters, B_{i} = { B_{i,1}, B_{i,2}, ..., B_{i,k} } = RoBiC( D_{i}, k).
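The single-addition procedure above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: `robic` is any user-supplied callable that takes a list of instances and k and returns k biclusters, each bicluster is reduced to its gene set, and `toy_robic` is a purely hypothetical stand-in that does not implement RoBiC.

```python
from typing import Callable, Hashable, List, Sequence, Set

# Assumption: a bicluster is represented only by its set of genes.
Bicluster = Set[Hashable]


def biclusters_per_addition(
    train: Sequence,
    held_out: Sequence,
    robic: Callable[[list, int], List[Bicluster]],
    k: int = 30,
) -> List[List[Bicluster]]:
    """Return B_i = robic(D ∪ {r_i}, k) for each held-out instance r_i."""
    results = []
    for r in held_out:
        d_i = list(train) + [r]  # D_i = D ∪ {r_i}
        results.append(robic(d_i, k))
    return results


def toy_robic(instances: list, k: int) -> List[Bicluster]:
    """Hypothetical stand-in for RoBiC (NOT the real algorithm):
    bicluster j holds the genes that are non-zero in more than j instances."""
    n_genes = len(instances[0])
    return [
        {g for g in range(n_genes) if sum(1 for x in instances if x[g]) > j}
        for j in range(k)
    ]


if __name__ == "__main__":
    train = [[1, 0, 1], [1, 1, 0]]      # plays the role of D
    held_out = [[0, 1, 1], [1, 0, 0]]   # plays the role of R
    for i, b_i in enumerate(biclusters_per_addition(train, held_out, toy_robic, k=2), 1):
        print(f"B_{i} =", b_i)
```

Passing the biclustering routine in as a parameter keeps the loop independent of any particular RoBiC implementation.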
Of course, these B_{i} bicluster sets could, in principle, be very different from one another. We can allay some of our worries if we find that these 15 different bicluster sets are similar to one another, and also to the biclusters obtained using the full set of instances FF, B^{*} = RoBiC( FF, k). Below we present two ways to measure these similarities, focussing on just the first three biclusters of each set; i.e., comparing the members of {B_{i,1}}_{i} = {B_{1,1}, B_{2,1}, ..., B_{15,1}} with one another and with B^{*}_{1}; then comparing {B_{i,2}}_{i} with each other and with B^{*}_{2}; and finally dealing with {B_{i,3}}_{i} and B^{*}_{3}. For notation: each bicluster B_{i,j} involves a particular set of genes G_{i,j}, and similarly B^{*}_{j} involves the gene set G^{*}_{j}.
(See also UseOnlyTraining for another way to use only the training data.)
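We measure the similarity of two gene sets A and B with the F-measure, the harmonic mean of precision and recall:
F(A, B) = Fmeasure(A, B) = 2 |A ∩ B| / ( |A| + |B| )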
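As a small illustration, the sketch below computes an F-measure between two gene sets, assuming the usual definition as the harmonic mean of precision and recall (equivalently 2 |A ∩ B| / ( |A| + |B| )). The function name and the gene identifiers are hypothetical and chosen only for this example.

```python
def f_measure(a, b):
    """F-measure between two sets, treating `b` as the reference set.

    Assumes the standard F1 definition; returns 0.0 when the sets are disjoint.
    """
    a, b = set(a), set(b)
    overlap = len(a & b)
    if overlap == 0:
        return 0.0
    precision = overlap / len(a)  # fraction of a that is also in b
    recall = overlap / len(b)     # fraction of b that is also in a
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    g_i1 = {"geneA", "geneB", "geneC"}    # hypothetical G_{i,1}
    g_star = {"geneB", "geneC", "geneD"}  # hypothetical G_{1}^{*}
    print(f_measure(g_i1, g_star))
```

Because this form of the measure is symmetric, it does not matter which of the two sets is treated as the reference.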

We therefore computed the 15 values F( G_{i,1}, G_{1}^{*}), one for each single-patient addition, using the first bicluster of each bicluster set. These values are shown in the far-left region of the left plot in Figure 1 below, as a box-and-whisker plot (produced with Matlab's BOXPLOT). We see that the mean is around 0.85, and one standard deviation is only a few percent. The middle region of this plot corresponds to the second biclusters, { F( G_{i,2}, G_{2}^{*}) }, and the far-right region to the third biclusters, { F( G_{i,3}, G_{3}^{*}) }.
We also compared all (15 choose 2) = 105 pairs F( G_{i,1}, G_{j,1} ), for i ≠ j. The right graph of Figure 1 shows those values, for the first, second and third biclusters. Notice the average F-score here is around 0.95 for both the first and second biclusters.
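The all-pairs comparison can be sketched as below. It assumes the set-overlap form of the F-measure, 2 |A ∩ B| / ( |A| + |B| ); the function names and the toy gene sets are hypothetical and are not the study's data.

```python
from itertools import combinations
from statistics import mean


def f_measure(a, b):
    """Set-overlap F-measure (assumed form): 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def pairwise_f(gene_sets):
    """F-measure for every unordered pair (i < j) of gene sets."""
    return [f_measure(a, b) for a, b in combinations(gene_sets, 2)]


if __name__ == "__main__":
    # 15 toy gene sets standing in for G_{1,1}, ..., G_{15,1}.
    toy_sets = [set(range(i, i + 20)) for i in range(15)]
    scores = pairwise_f(toy_sets)
    print(len(scores), "pairs, mean F =", round(mean(scores), 3))
```

With 15 sets, `combinations` yields each unordered pair exactly once, which matches the 15-choose-2 count.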
Figure 3 deals with genes. We note that almost 300 genes (of around 400) appear in all 15 versions of bicluster #1, and around 800 genes (of 1000) appear in all 15 versions of bicluster #2.
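Counting the genes shared by every version of a bicluster amounts to a set intersection, sketched below. The gene names are hypothetical, and reporting the union size as the reference total is an assumption made only for this example; the text does not state how its totals were obtained.

```python
def genes_in_all(gene_sets):
    """Genes that appear in every one of the given gene sets."""
    sets = [set(g) for g in gene_sets]
    return set.intersection(*sets) if sets else set()


def genes_in_any(gene_sets):
    """Genes that appear in at least one of the given gene sets."""
    return set().union(*[set(g) for g in gene_sets])


if __name__ == "__main__":
    # Three toy versions of the same bicluster's gene set.
    versions = [
        {"g1", "g2", "g3"},
        {"g2", "g3", "g4"},
        {"g2", "g3"},
    ]
    core = genes_in_all(versions)
    print(len(core), "of", len(genes_in_any(versions)), "genes appear in all versions")
```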