Improved documentation of factor.h, ...
[libdai.git] / TODO
1 TODO:
2
3 - Improve documentation.
4 * COPYING
5 * FILEFORMAT
6 * Merge README with mainpage, removing duplicate text
7 * Document examples, tests and utils
8
9 - Adapt http://www.boost.org/development/requirements.html#Design_and_Programming
10
11 - Use "gcc -MM" to generate dependencies for targets.
12 Maybe even better: switch to cmake (see http://lwn.net/Articles/188693/)
13 cross-platform build system.
14
15
16 OPTIMIZATION:
17
18 - BipartiteGraph::isConnected should be optimized using boost::graph
19
20 - Replace VarSets by smallSet<size_t> where appropriate, in order to minimize the use of findVar().
21
22 - A DAIAlg<T> should not inherit from a FactorGraph/RegionGraph, but should
23 store a reference to it. This prevents needless copying of (possibly large)
24 data structures. Disadvantage: the caller must not change the data structure
25 between calls. (Maybe a smart_ptr would help here?)
26 Also, general marginalization functions now copy a complete object. Instead, it
27 would make more sense that they construct a new object without copying the
28 FactorGraph. Or they can simply be made methods of the general InfAlg class.
29
30
31 IDEAS WORTH EXPLORING:
32
33 - Use a PropertySet as output of a DAIAlg, instead of functions like maxDiff and Iterations().
34
35 - Cache second-order neighborhoods (delta's) in BipGraph?
36
37 - If you ever do a rewrite, make sure that the graphical properties are not
38 entangled with the probabilistic properties. E.g., a factor graph really
39 should be represented as a bipartite graph, with a separate array of variable
40 labels and dimensions, and a seperate array of (pointers to) factor tables. In
41 this way, each factor could be implemented differently, e.g., we could have
42 some sparse factors, some noisy-OR factors, some dense factors, some arbitrary
43 precision factors, etc. Or we could make more use of templates to have a more
44 generic factor graph.
45
46 - Use Boost::uBLAS framework to deal with matrices, especially, with
47 2D sparse matrices.
48 See http://www.boost.org/libs/numeric/ublas/doc/matrix_sparse.htm
49 and tests/errorbounds/errorbounds3
50
51 - Introduce naming scheme:
52 all Vars and VarSets should be named v_..., e.g. v_i instead of i
53 all Factors should be named f_..., e.g. f_I instead of I
54 all indices should be named _..., e.g. _k instead of k
55 all iterators should be named i_, e.g. i_i is an iterator to i
56 all const_iterators should be named ci_, e.g. ci_i is an iterator to i
57
58 - Define a better fileformat for .fg files (maybe using XML)?
59
60 - Alternative implementation of undo factor changes: the only things that have to be
61 undone are setting a factor to all ones and setting a factor to a Kronecker delta. This
62 could also be implemented in the factor itself, which could maintain its state
63 (one/delta/full) and act accordingly.
64
65 - Think about the RegionGraph::fac2OR variable: a better implementation of this feature is
66 probably possible.
67
68 - Optimize GBP with precalculated indices.
69
70 - Optimize all indices as follows: keep a cache of all indices that have been
71 computed (use a hash). Then, instead of computing on the fly, use the precomputed ones.
72
73 - Variable elimination should be implemented generically using a function
74 object that tells you which variable to delete.
75
76 - Improve general support for graphs and trees.
77
78 - Add support for sparse potentials.
79
80 - Optimize BP_SEQMAX (it should use a better data structure than a vector for the residuals).
81
82
83 BAD IDEAS:
84
85 - A FactorGraph and a RegionGraph are often equipped with
86 additional properties for nodes and edges. The code to initialize those
87 is often quite similar; maybe this can be abstracted to classes
88 like ExtFactorGraph and ExtRegionGraph (Ext stands for Extended), e.g.
89 template <typename NodeProperties, typename EdgeProperties>
90 class ExtFactorGraph : public FactorGraph {
91 public:
92 std::vector<NodeProperties> nodeProps;
93 std::vector<std::vector<EdgeProperties> > edgeProps;
94 // blabla
95 }
96 A disadvantage of this approach may be worse cachability.
97 A problem is if there are nog properties for nodes (type 1), nodes (type 2)
98 or edges. Maybe this can be solved using specializations, or using variadac
99 template arguments or something similar?
100 Idea: you could define a "class Empty {}", and add some code that checks for
101 the typeid, comparing it with Empty, and doing something special in that case
102 (e.g., not allocating memory).
103 The main disadvantage of this approach seems to be that it leads to even more
104 entanglement.
105
106 - Instead of polymorphism by inheritance, use polymorphism by template
107 parameterization. For example, the real reason you introduced the complicated
108 inheritance scheme was for functions like
109 Factor calcMarginal( const InferenceAlgorithm & obj, const VarSet & ns, bool reInit );
110 Now instead, you could use a template function:
111 template<typename InferenceAlgorithm>
112 Factor calcMarginal( const InferenceAlgorithm &obj, const VarSet &ns, bool reInit );
113 This would assume that InferenceAlgorithm supports certain methods, like
114 clone(), grm(), ...... Ideally, you would use concepts to define different
115 classes of inference algorithms with different capabilities, for example the
116 ability to calculate logZ, the ability to calculate marginals, the ability to
117 calculate bounds, the ability to calculate MAP states, etcetera. Then, use
118 traits classes in order to be able to query the capabilities of the model. For
119 example, you would be able to query whether the inference algorithm supports
120 calculation of logZ. Unfortunately, this is compile-time polymorphism, whereas
121 tests/test needs runtime polymorphism.