Data format

  First row - learning parameters

    errorgoal - stop learning once the error goal is reached.
    numberofiterations - stop learning once the maximum number of iterations is reached.

  Second row - variable names

  The remaining rows - data values for training

    , - the separator
    ? - indicates a missing value
    : - indicates the queried value
    @ - indicates the label of the query
    & - indicates the frequency with which the query is asked (a uniform distribution is assumed in our experiments)
    everything else - indicates an actual value

  The following is an example for the simple A->X->C network:

  errorgoal,.000000001,numberofiterations,50,maxnumberoflines,2
  A,X,C
  0,?,:0@1.0&1
  1,?,:1@1.0&1

  These data indicate that P(C=0|A=0)=1 and P(C=1|A=1)=1.
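
  The format above can be read with a few lines of code. The following is a
  minimal sketch in Python, not the reader used by the original tool. The
  function names (parse_cell, parse_dataset) and the returned dictionaries are
  hypothetical. It assumes that the first row alternates parameter names and
  numeric values, that every query cell has the form :value@label&frequency,
  and that the label and frequency are numbers, as in the example. The meaning
  of maxnumberoflines is not described here, so it is read like any other
  parameter.

```python
EXAMPLE = """errorgoal,.000000001,numberofiterations,50,maxnumberoflines,2
A,X,C
0,?,:0@1.0&1
1,?,:1@1.0&1
"""


def parse_cell(cell):
    """Classify one data cell as missing, query, or observed (hypothetical helper)."""
    cell = cell.strip()
    if cell == "?":
        return {"kind": "missing"}
    if cell.startswith(":"):
        # Assumed layout of a query cell: ':<value>@<label>&<frequency>'
        value, rest = cell[1:].split("@", 1)
        label, frequency = rest.split("&", 1)
        return {
            "kind": "query",
            "value": value,
            "label": float(label),
            "frequency": float(frequency),
        }
    return {"kind": "observed", "value": cell}


def parse_dataset(text):
    """Return (parameters, variable names, rows) for a file in the format above."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # First row: name,value,name,value,... (assumed to be numeric values)
    header = lines[0].split(",")
    params = {header[i]: float(header[i + 1]) for i in range(0, len(header) - 1, 2)}
    # Second row: variable names
    names = lines[1].split(",")
    # Remaining rows: one cell per variable
    rows = [
        dict(zip(names, (parse_cell(c) for c in ln.split(","))))
        for ln in lines[2:]
    ]
    return params, names, rows


params, names, rows = parse_dataset(EXAMPLE)
```

  For the example, rows[0] marks A as observed with value 0, X as missing, and
  C as the queried variable with value 0, label 1.0 and frequency 1.0.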

Data generation

Input

  Number of queries to generate
  Evidence variables
  Query variables (may overlap with evidence variables)
  Amount of missing evidence

Procedure

  For each training query to be generated
        For each evidence variable
              Select it as observed evidence with the specified probability
        Assign values to the selected evidence variables using their prior probability distribution
        Randomly select a query variable
        Set a value for the query variable using its inferred probability distribution
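
  The procedure can be sketched as follows. This is a minimal Python sketch
  under stated assumptions, not the original generator. The name
  generate_queries and its parameters are hypothetical. The network itself is
  not modelled: the caller supplies sample_prior (draws values for the selected
  evidence variables from their prior) and posterior (returns the inferred
  distribution of a query variable given the evidence, as a value-to-probability
  dictionary). The sketch also assumes that a single probability p_observed
  applies to every evidence variable, and that the query value is sampled from
  the inferred distribution.

```python
import random


def generate_queries(n_queries, evidence_vars, query_vars, p_observed,
                     sample_prior, posterior, rng=None):
    """Generate training queries following the four steps of the procedure."""
    rng = rng or random.Random()
    queries = []
    for _ in range(n_queries):
        # Select each evidence variable as observed with probability p_observed.
        selected = [v for v in evidence_vars if rng.random() < p_observed]
        # Assign values to the selected variables from their prior
        # (hypothetical callback supplied by the caller).
        evidence = sample_prior(selected, rng)
        # Randomly select a query variable. The source does not say how a
        # query variable that is also observed should be handled; this
        # sketch leaves that to the posterior callback.
        query_var = rng.choice(query_vars)
        # Set the query value by sampling from the inferred distribution
        # (hypothetical callback returning {value: probability}).
        dist = posterior(query_var, evidence)
        values = list(dist)
        weights = [dist[v] for v in values]
        query_value = rng.choices(values, weights=weights, k=1)[0]
        queries.append({
            "evidence": evidence,
            "query_var": query_var,
            "query_value": query_value,
        })
    return queries
```

  Passing a seeded random.Random instance as rng makes the generated set
  reproducible.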