Merged SVN head ...
[libdai.git] / TODO
1 IMPORTANT
2
3 - Idea: a FactorGraph and a RegionGraph are often equipped with
4 extra properties for nodes and edges. The code to initialize those
5 is often quite similar; maybe this can be abstracted to classes
6 like ExtFactorGraph and ExtRegionGraph (Ext stands for Extended), e.g.
7 template <typename NodeProperties, typename EdgeProperties>
8 class ExtFactorGraph : public FactorGraph {
9 public:
10 std::vector<NodeProperties> nodeProps;
11 std::vector<std::vector<EdgeProperties> > edgeProps;
12 // blabla
13 }
14 A disadvantage of this approach may be worse cachability.
15 A problem is if there are nog properties for nodes (type 1), nodes (type 2)
16 or edges. Maybe this can be solved using specializations, or using variadac
17 template arguments or something similar?
18 Idea: you could define a "class Empty {}", and add some code that checks for
19 the typeid, comparing it with Empty, and doing something special in that case
20 (e.g., not allocating memory).
21 The main disadvantage of this approach seems to be that it leads to even more
22 entanglement.
23
24 - THIS SMELLS LIKE A GOOD IDEA: instead of polymorphism by inheritance,
25 use polymorphism by template parameterization. For example, the real reason
26 you introduced the complicated inheritance scheme was for functions like
27 Factor calcMarginal( const InferenceAlgorithm & obj, const VarSet & ns, bool reInit );
28 Now instead, you could use a template function:
29 template<typename InferenceAlgorithm>
30 Factor calcMarginal( const InferenceAlgorithm &obj, const VarSet &ns, bool reInit );
31 This would assume that InferenceAlgorithm supports certain methods, like
32 clone(), grm(), ......
33 Ideally, you would use concepts to define different classes of inference
34 algorithms with different capabilities, for example the ability to calculate logZ,
35 the ability to calculate marginals, the ability to calculate bounds, the ability
36 to calculate MAP states, etcetera. Then, use traits classes in order to be able to
37 query the capabilities of the model. For example, you would be able to query whether
38 the inference algorithm supports calculation of logZ.
39
40 - If you ever do a rewrite, make sure that the graphical properties are
41 not entangled with the probabilistic properties. E.g., a factor graph
42 really should be represented as a bipartite graph, with a separate array
43 of variable labels and dimensions, and a seperate array of (pointers to)
44 factor tables. In this way, each factor could be implemented differently,
45 e.g., we could have some sparse factors, some noisy-OR factors, some dense
46 factors, some arbitrary precision factors, etc. Or we could make more use
47 of templates to have a more generic factor graph. Maybe in the end,
48 HasA relations are better than IsA relations...
49 Also, the current setup is stupid: I wrote a new function that works
50 on FactorGraphs, and I had to write boiler plate code for it in graphicalmodel.h
51 and in regiongraph.h (which is stupid).
52
53 - Clean up
54
55 - Use Boost::uBLAS framework to deal with matrices, especially, with
56 2D sparse matrices. See http://www.boost.org/libs/numeric/ublas/doc/matrix_sparse.htm
57 and tests/errorbounds/errorbounds3
58
59 - Introduce naming scheme:
60 all Vars and VarSets should be named v_..., e.g. v_i instead of i
61 all Factors should be named f_..., e.g. f_I instead of I
62 all indices should be named _..., e.g. _k instead of k
63 all iterators should be named i_, e.g. i_i is an iterator to i
64 all const_iterators should be named ci_, e.g. ci_i is an iterator to i
65
66 - Improve weightedgraph or use Boost::Graph
67
68 - Iterations and maxDiff are only interesting for iterative inference
69 algorithms. Yet, tests/test wants to know these values in a generic
70 way. Maybe we have to think of some way (e.g. using a Properties object)
71 to return these values from run(). Then we can simply look whether a
72 InferenceAlgorithm supports these fields. What other results could
73 we return? Time. MaxDiff during initialization for LC methods.
74 Or maybe we could use some traits mechanism which we can ask whether the
75 object has _iterations and _maxdiff variables.
76
77 - Think about whether the cavity initialization belongs to init() or to run().
78
79 - Fix LCLin.
80
81 - setFactor and setFactors should only change Probs, not Factors.
82
83 - Simplify Properties framework: each Property should be a std::string.
84 Each DAIAlg should have its own _properties struct and handle conversion.
85
86 - Forwardport the Bart patch
87
88 - Another design question that needs to be answered: should a BP object own a
89 FactorGraph, or only store a reference to a FactorGraph (which can optimize
90 memory), or should it be a FactorGraph with additional variables and functions?
91 Probably, the first or second option would be the best. Since FactorGraphs are
92 assumed to be rather static, it would make sense to store *references* to
93 FactorGraphs inside AIAlgs. Or maybe some smart_ptr? It would make sense to
94 prevent unnecessary copying of FactorGraphs. Also, general marginalization
95 functions now copy a complete object. Instead, it would make more sense that
96 they construct a new object without copying the FactorGraph. Or they can be made
97 simply methods of the general InfAlg class.
98
99 - Add comments in example.cpp, add documentation.
100
101 - Write documentation.
102
103 - Improve error handling.
104
105
106 DIFFICULT
107
108 - Bug: TreeEP sometimes returns NANs.
109
110 - Bug: TreeEP::logZ() seems to be wrong (?).
111
112 - Kees' trick for preventing NANs in GBP updates: if quantity smaller than epsilon, replace by epsilon.
113
114
115 OPTIONAL
116
117 - Use Expat XML parser for IO and define a XML based fileformat for .fg files?
118
119 - Port to GNU autotool chain?
120
121 - Another design question that needs to be answered: should a BP object own a
122 FactorGraph, or only store a reference to a FactorGraph (which can optimize
123 memory), or should it be a FactorGraph with additional variables and functions?
124 Probably, the first or second option would be the best. Since FactorGraphs are
125 assumed to be rather static, it would make sense to store *references* to
126 FactorGraphs inside AIAlgs. Or maybe some smart_ptr? It would make sense to
127 prevent unnecessary copying of FactorGraphs. Also, general marginalization
128 functions now copy a complete object. Instead, it would make more sense that
129 they construct a new object without copying the FactorGraph. Or they can be made
130 simply methods of the general InfAlg class.
131
132 - Forwardport the Bart/Max patch.
133
134 - Alternative implementation of undo factor changes: the only things that have to be
135 undone are setting a factor to all ones and setting a factor to a Kronecker delta. This
136 could also be implemented in the factor itself, which could maintain its state
137 (one/delta/full) and act accordingly.
138
139 - Think about the _fac2ind variable: a better implementation of this feature is
140 probably possible.
141
142 - Changed() methods?
143
144 - Optimize GBP with precalculated indices.
145
146 - Optimize all indices as follows: keep a cache of all indices that have been
147 computed (use a hash). Then, instead of computing on the fly, use the precomputed
148 ones.
149
150 - Variable elimination should be implemented generically using a function
151 object that tells you which variable to delete.
152
153 - Improve general support for graphs and trees.
154
155 - Add support for sparse potentials.
156
157 - Optimize BP_SEQMAX (it should use a better data structure than a vector for the residuals).