Hierarchical log-linear models are also quite similar to the approach of reconstructability analysis (and to mask analysis [50]). They investigate which combined effects of variables are required for a good approximation of the overall relation, or in short, what structure among the variables is necessary to describe the data sufficiently.
Log-linear models in general start out with count tables (contingency tables). As in RA, the data are then projected onto the hypothesized subrelations. These projections remain represented as counts, whereas RA usually works with some other measure (mostly probabilities). The overall relation is then rebuilt as a maximum likelihood estimate (MLE) via the iterative proportional fitting algorithm (Deming-Stephan algorithm), rather than by the unbiased join procedure used in RA. Other reconstruction algorithms are also known, e.g. the Newton-Raphson algorithm [30, p. 22].
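The projection and rebuilding step can be sketched as follows. This is a minimal illustration of iterative proportional fitting, not the implementation used in [30]; the function name `ipf`, its parameters, and the example table are assumptions made for this sketch, and NumPy is assumed to be available.

```python
import numpy as np

def ipf(counts, margins, tol=1e-10, max_iter=1000):
    """Iterative proportional fitting (hypothetical helper for illustration).

    counts  : observed contingency table with any number of dimensions
    margins : list of axis tuples naming the hypothesized subrelations,
              e.g. [(0, 1), (1, 2)] for the model [12][23]
    Returns fitted counts whose listed margins equal the observed ones.
    """
    counts = np.asarray(counts, dtype=float)
    fitted = np.ones_like(counts)          # start from a uniform table
    all_axes = set(range(counts.ndim))
    for _ in range(max_iter):
        max_change = 0.0
        for keep in margins:
            # Project observed and fitted tables onto the subrelation `keep`.
            drop = tuple(sorted(all_axes - set(keep)))
            target = counts.sum(axis=drop, keepdims=True)
            current = fitted.sum(axis=drop, keepdims=True)
            # Rescale so the fitted margin matches the observed margin.
            ratio = np.divide(target, current,
                              out=np.zeros_like(target), where=current > 0)
            new = fitted * ratio
            max_change = max(max_change, np.abs(new - fitted).max())
            fitted = new
        if max_change < tol:               # no margin needed further adjustment
            break
    return fitted

# Made-up 2x2 table, fitted under independence (only the one-variable margins).
observed = np.array([[10.0, 20.0], [30.0, 40.0]])
independent = ipf(observed, margins=[(0,), (1,)])
```

For the independence model of a two-way table the result coincides with the familiar closed form (row total times column total divided by the grand total); for models without a closed form, such as [12][13][23] on three variables, the loop simply runs until the margins stop changing.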
While RA describes its models just by subsets of variables, this method puts
the main emphasis on describing the result as a ``log-linear model''.
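The cell counts $F_{ij}$ of two variables 1, 2 are expressed in the following way:
$$F_{ij} = \eta \, \tau^{1}_{i} \, \tau^{2}_{j} \, \tau^{12}_{ij} \tag{23}$$
$$\ln F_{ij} = \ln \eta + \ln \tau^{1}_{i} + \ln \tau^{2}_{j} + \ln \tau^{12}_{ij} \tag{24}$$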
These models are called ``saturated'' as they represent the whole relationship between the two variables. The $\eta$ and the $\tau$'s can be calculated from the counts; for details see [30]. A ``simplified'' model is obtained by ignoring some of the interaction terms, e.g. by assuming $\tau^{12}_{ij} = 1$ for all $i, j$. This ``unsaturated'' model is used for representing the projected and reconstructed data. Note that hierarchical models require that a model containing a high-order $\tau$ (e.g. $\tau^{12}_{ij}$) also contains all of its lower-order $\tau$'s (e.g. $\tau^{1}_{i}$ and $\tau^{2}_{j}$). The order of a term here is the number of variables interacting in it.
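As an illustration of how the parameters of a saturated two-variable model can be obtained from the counts, the sketch below uses the common effect-coding convention, in which the overall term is the geometric mean of all cell counts and each set of effects multiplies to one. This convention is an assumption of the sketch, as are the function name `saturated_parameters` and the example table; the exact definitions used here are those of [30].

```python
import numpy as np

def saturated_parameters(F):
    """Multiplicative parameters of a saturated two-way log-linear model.

    Assumes effect coding (log effects sum to zero over each index) and
    strictly positive cell counts. Hypothetical helper for illustration.
    """
    logF = np.log(np.asarray(F, dtype=float))
    grand = logF.mean()                      # log of the overall term
    row = logF.mean(axis=1) - grand          # log main effects of variable 1
    col = logF.mean(axis=0) - grand          # log main effects of variable 2
    inter = logF - grand - row[:, None] - col[None, :]   # log interaction
    return np.exp(grand), np.exp(row), np.exp(col), np.exp(inter)

# Made-up 2x2 table of counts.
F = np.array([[10.0, 20.0], [30.0, 40.0]])
eta, tau1, tau2, tau12 = saturated_parameters(F)

# The saturated model reproduces the observed counts exactly.
rebuilt = eta * tau1[:, None] * tau2[None, :] * tau12
```

Replacing `tau12` by ones in the last line shows which term an unsaturated model drops. The fitted counts of that simpler model are normally re-estimated from the margins (for instance by iterative proportional fitting) rather than read off the saturated parameters.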
Though RA and hierarchical log-linear models are quite similar in what they actually do, they differ mainly in their approaches. RA emphasizes the full set of possible models and the search through model space for a large number of variables. Log-linear models concentrate more on statistical aspects and on the interactions between a small number of variables. Because of the similarities, it is at least interesting to follow the development and history of both methods.