Merged prob.h, factorgraph.h, factograph.cpp from SVN head (broken!)
[libdai.git] / TODO
1 IMPORTANT
2
3 - Find out which methods need connected factor graphs (at least TreeEP and JunctionTree).
4
5 - Forwardport the Bart patch
6
7 - Get rid of multiple inheritance
8
9 - Introduce a namespace libDAI
10
11 - Investigate the performance penalty from the use of Properties map
12
13 - Another design question that needs to be answered: should a BP object own a
14 FactorGraph, or only store a reference to a FactorGraph (which can optimize
15 memory), or should it be a FactorGraph with additional variables and functions?
16 Probably, the first or second option would be the best. Since FactorGraphs are
17 assumed to be rather static, it would make sense to store *references* to
18 FactorGraphs inside AIAlgs. Or maybe some smart_ptr? It would make sense to
19 prevent unnecessary copying of FactorGraphs. Also, general marginalization
20 functions now copy a complete object. Instead, it would make more sense that
21 they construct a new object without copying the FactorGraph. Or they can be made
22 simply methods of the general InfAlg class.
23
24
25 DIFFICULT
26
27 - Bug: TreeEP sometimes returns NANs.
28
29 - Bug: TreeEP::logZ() seems to be wrong (?).
30
31 - Bug: strange things happen for at least JTree and TreeEP if the factor graph is not connected.
32 Should be fixed by finding out which methods demand connected factor graphs and these
33 methods should check whether the supplied factor graph is connected.
34
35 - Kees' trick for preventing NANs in GBP updates: if quantity smaller than epsilon, replace by epsilon.
36
37 - Bug: MF_SEQRANDOM does not work on alarm2.fg (normalization error: Z = 0)
38
39
40 OPTIONAL
41
42 - Restore brute force EXACT method.
43
44 - Alternative implementation of undo factor changes: the only things that have to be
45 undone are setting a factor to all ones and setting a factor to a Kronecker delta. This
46 could also be implemented in the factor itself, which could maintain its state
47 (one/delta/full) and act accordingly.
48
49 - Regression test could be enhanced by
50 * reporting number of iterations
51 * reporting all single node marginal errors
52
53 - Think about the _fac2ind variable: a better implementation of this feature is
54 probably possible.
55
56 - Changed() methods?
57
58 - It might be an idea to forbid modification of the structure of a factor graph
59 (except in constructors). Then we could do away with Regenerate and similar methods.
60
61 - Think about whether using the L-1 norm as a distance measure between new and
62 old messages/beliefs would be better, since || P(x_A, x_B) - Q(x_A, xB) ||_1 < epsilon
63 implies that || P(x_A) - Q(x_A) ||_1 < epsilon. However, for a product of messages
64 we have the weaker || P1(x_A) P2(x_A) - Q1(x_A) Q2(x_A) ||_1 < 2 epsilon if
65 || P1(x_A) - Q1(x_A) ||_1 < epsilon AND || P2(x_A) - Q1(x_A) ||_1 < epsilon...
66
67 - Implement damping.
68
69 - Optimize GBP with precalculated indices.
70
71 - Variable elimination should be implemented generically using a function
72 object that tells you which variable to delete.
73
74 - Improve general support for graphs and trees.
75
76 - What to do in case of divide-by-zero? For HUGIN, the correct way to handle
77 this is to define x/0 = 0. For other methods, this may not be the case...
78 Maybe a good solution would be to make different flavours of Prob which differ
79 in the way that they handle zeroes (two issues here: normalization of an all-zero
80 factor, which appears to be a sign that something is wrong, and division-by-zero
81 in operator/, operator/= and inverse(), which is probably harmless and can be fixed
82 by defining x/0 = 0. I think it is harmless because usually it occurs by calculating
83 some marginal _in the absence of some factor_ as an intermediate step. Afterwards,
84 this factor is always multiplied in again, which sets any value on these positions
85 to zero. Another difference would be how to deal with underflows: GBP often has
86 numbers that converge to zero, but instead this gives an underflow. A solution is
87 to set everything smaller than some epsilon to zero. Finally, one could decide to
88 add support for sparse potentials. Oh, and whether it can handle negative entries.
89
90 - Optimize BP_SEQMAX (it should use a better data structure than a vector for the residuals).
91 Also it does strange things in case of negative potentials.
92
93 - Investigate the possibility to calculate logZ *during* the inference algorithm
94 by keeping track of all normalizations.
95
96 - Write documentation.