Graphing a transition matrix, part 2

Regarding the request for a “good graphical way of showing changes in the distribution of a population among quantile categories,” Antony Unwin sends in this:

MultBarsFatherSonIncome.png

He writes:

Twenty-five lines of different angles and lengths and widths proportional to probabilities does not sound easy to interpret. Personally, I would go for a mosaicplot in multiple barchart form. The attached plot was produced using iplots with this code in R for some English income data (1767 cases in all) for quintiles of fathers (rows) and sons (columns). As the quintile variables were coded 1, ..., 5, they had to be redefined as factors. (Using Mondrian would have been easier.)

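library(iplots)  # provides imosaic()
attach(sic58)    # sic58: the data set containing fqn58 and sqn58

# The quintiles are coded 1 to 5, so convert them to factors first.
fqn58 <- as.factor(fqn58)  # fathers' quintiles (rows)
sqn58 <- as.factor(sqn58)  # sons' quintiles (columns)
imosaic(fqn58, sqn58, type = "mul")  # mosaic plot in multiple-barchart form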

I prefer my own suggestion (see link above) but maybe this is just a matter of taste.
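
For readers without iplots, the multiple-barchart layout Unwin describes can be roughly approximated in base R. This is only a sketch under stated assumptions: the 5x5 `counts` table is invented for illustration (it is not the sic58 data), and stacking one barchart per father quintile is a stand-in for the `imosaic(..., type="mul")` display, not a reproduction of it.

```r
# Hypothetical counts: rows = father's quintile, columns = son's quintile.
# These numbers are made up for illustration; they are NOT the sic58 data.
counts <- matrix(
  c(30, 25, 20, 15, 10,
    25, 25, 20, 18, 12,
    20, 20, 22, 20, 18,
    15, 18, 20, 24, 23,
    10, 12, 18, 23, 37),
  nrow = 5, byrow = TRUE,
  dimnames = list(father = paste0("Q", 1:5), son = paste0("Q", 1:5))
)

# Row-conditional probabilities: P(son's quintile | father's quintile).
trans <- prop.table(counts, margin = 1)

# One small barchart per father quintile, all sharing the same y-axis
# so that bar heights can be compared across panels.
old_par <- par(mfrow = c(5, 1), mar = c(2, 4, 1, 1))
for (i in seq_len(nrow(trans))) {
  barplot(trans[i, ], ylim = c(0, max(trans)), las = 1,
          ylab = paste("father", rownames(trans)[i]))
  # Pale gray horizontal reference lines as a light-weight scale.
  abline(h = seq(0.1, max(trans), by = 0.1), col = "grey80")
}
par(old_par)
```

Using a common `ylim` for every panel is the design choice that matters here: without it, each panel would rescale to its own maximum and the heights would no longer be comparable.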

4 thoughts on “Graphing a transition matrix, part 2”

  1. This is okay, but I think it badly needs a y-axis scale. Maybe pale gray horizontal lines? And it would be more readable with a bit more empty space between the rectangles.

    Also, what if you want to compare the rectangles vertically as well as horizontally? This shows clearly if you start at location x, how likely is it you will end up at location y? But what if you want to also know the probability that y came from x? Maybe circles would work better than rectangles for that.

  2. The multiple barchart approach is okay, but not really that good for transition matrices if they are strongly structured (e.g. p_ii >> p_ij, i ≠ j). Some bars risk being reduced to a single pixel-width, to accommodate the larger values.

    Several years ago Tak Wing Chan and I used area to represent the quantity instead, in square tables relating husbands' and wives' educational level (which are structurally similar to transition tables).

    You can see an example at http://teaching.sociology.ul.ie/bhalpin/transitio

  3. @yolio Scales could be nice, but would likely be intrusive. Interactive querying is a helpful solution to make detailed information available. Graphics are more for representing qualitative structure than for providing quantitative precision. If the plot were to be printed I would include scale information in the caption.

    Making all possible comparisons with one graphic is ambitious. Circles are tricky to compare and my group prefers the squares suggested by BrendanH, which are available in iplots using imosaic(vars, type="fluct"). It's a good display, particularly when the rows and columns are to be treated equally. If you want to compare the current column values, it would be better to draw a plot for that purpose with the variables switched: imosaic(sqn58, fqn58, type="mul"). This has the disadvantage that the columns are now in the rows, but the much greater advantage that the bar heights can be directly compared. Just be careful of the interpretation: is the probability that y came from x actually what is shown?

    @BrendanH The point about strongly structured matrices is a good one, but it is then the heights that become small; the widths remain the same. There are examples of such matrices in Chapter 11 of the book "Graphics of Large Datasets", where ceiling-censored zooming is used to get round the problem. This grows the smaller cells at the cost of limiting the size of the biggest ones. Needless to say, this is most effective when used interactively.

  4. Have a look at page 19 of
    http://www.iariw.org/papers/2008/jantti.pdf

    These authors are comparing income/wealth matrices like these across several countries. The labelling in their figure is a bit too cluttered, but the use of dots gets the message across.

    I presume in the above diagram (like the one by Jantti et al) the bars add up to 20% across and down. It might be worth having a total row and column to make this clear. (There is an error in the Jantti et al figure for Germany where they don't add up).
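
Two points raised in the comments can be checked numerically: whether the cells add up to 20% across and down, and how "where do people starting at x end up?" relates to "where did people at y come from?". The sketch below uses an invented 5x5 `counts` table (an assumption for illustration, not the sic58 data) whose margins happen to be exactly equal; real quintile data with ties or missing cases may not behave this neatly.

```r
# Hypothetical counts: rows = father's quintile, columns = son's quintile.
# Invented for illustration; NOT the sic58 data.
counts <- matrix(
  c(30, 25, 20, 15, 10,
    25, 25, 20, 18, 12,
    20, 20, 22, 20, 18,
    15, 18, 20, 24, 23,
    10, 12, 18, 23, 37),
  nrow = 5, byrow = TRUE,
  dimnames = list(father = paste0("Q", 1:5), son = paste0("Q", 1:5))
)

# Joint proportions over the whole table, with a total row and column added.
joint <- prop.table(counts)
joint_with_totals <- addmargins(joint)
print(round(joint_with_totals, 3))

# The two conditional views of the same table.
p_son_given_father <- prop.table(counts, margin = 1)  # each row sums to 1
p_father_given_son <- prop.table(counts, margin = 2)  # each column sums to 1

# When every row and column margin is exactly 20%, both views divide each
# joint cell by the same 0.2, so they agree cell by cell.
print(all.equal(p_son_given_father, p_father_given_son))
```

If the margins are not all equal, the two conditional tables differ, which is when redrawing the plot with the variables switched becomes informative.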
