4-Set Venn Diagrams

Classic Venn diagrams use overlapping circles to depict sets and their intersections. These work very well for data that divides into 2 or 3 sets. We have had a few requests over the past couple of years to extend the Venn Diagram component in Pipeline Pilot to handle higher numbers of sets.

There are a number of possible ways of representing 4 sets diagrammatically, and one or two possibilities for higher numbers. The Wikipedia article http://en.wikipedia.org/wiki/Venn_diagram shows some examples of these.

The most common way of representing 4 sets appears to be with overlapping ellipses, e.g.:

ellipses.png

Note that in this kind of diagram the sizes of the ellipses and their overlaps are not proportional to the sizes of the sets and their intersections.

The Canvas components in Pipeline Pilot allow you to draw ellipses, and it is relatively simple to work out the numerical overlaps between the sets. The attached "4 Set quasi-Venn Diagram" component will draw a diagram such as the one above and optionally label the various regions with either a description or a count of their contents. In addition, it produces a table of the various sets and set intersections with a count of the contents. A final option allows you to turn off the intersection labels to produce a less cluttered chart.
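
For anyone who wants to check the counts outside Pipeline Pilot, here is a rough Python sketch of the overlap arithmetic. It is not the component's own implementation. The function name region_counts and the example data are made up for illustration, and it assumes each region should hold only the items that belong to exactly that combination of sets.

```python
from itertools import combinations

def region_counts(sets):
    """Count the items that fall in exactly each combination of the named sets.

    `sets` maps a set name to a Python set of item identifiers
    (hypothetical input format, chosen for this sketch).
    """
    names = sorted(sets)
    counts = {}
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            # Items present in every set of this combination...
            inside = set.intersection(*(sets[n] for n in combo))
            # ...minus anything that also appears in a set outside it.
            others = [sets[n] for n in names if n not in combo]
            outside = set().union(*others)
            counts[combo] = len(inside - outside)
    return counts

# Made-up example data: seven items spread over four sets.
example = {
    "set1": {1, 2, 3, 4},
    "set2": {3, 4, 5},
    "set3": {4, 6},
    "set4": {1, 4, 7},
}
counts = region_counts(example)

for combo, n in counts.items():
    print(" & ".join(combo), n)
```

With 4 sets this gives 15 regions (every non-empty combination), and because each item lands in exactly one region the counts add up to the number of distinct items.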

To produce the plot, the component takes input records with an array property "Set" listing the sets to which the record belongs, e.g. set1;set4.
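
To make the input format concrete, here is a small Python sketch of how records of that shape could be tallied by membership. It is only an illustration under assumptions: the records are modelled as plain dictionaries, the "Set" value is treated as a semicolon-delimited string as in the example above, and the names tally_memberships, "id" and cpd1..cpd4 are hypothetical.

```python
from collections import Counter

def tally_memberships(records, delimiter=";"):
    """Count records by the exact combination of sets they belong to."""
    tally = Counter()
    for record in records:
        # Split "set1;set4" into names; order and stray spaces do not matter.
        members = frozenset(
            name.strip()
            for name in record["Set"].split(delimiter)
            if name.strip()
        )
        if members:  # skip records that belong to no set
            tally[members] += 1
    return tally

# Made-up records standing in for the component's input.
records = [
    {"id": "cpd1", "Set": "set1;set4"},
    {"id": "cpd2", "Set": "set1"},
    {"id": "cpd3", "Set": "set4;set1"},
    {"id": "cpd4", "Set": "set2;set3;set4"},
]
tally = tally_memberships(records)

for members, n in tally.items():
    print(";".join(sorted(members)), n)
```

Using a frozenset as the key means set1;set4 and set4;set1 are counted as the same region.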

The attached protocol, "4 Compound Sets Venn", illustrates the use of the component using 4 random subsets from the Asinex data.

I would welcome any comments or suggestions for improvements.

Malcolm