Anisotropic measurements of the Baryon Acoustic Oscillation (BAO) feature within a galaxy survey enable joint inference about the Hubble parameter H(z) and the angular diameter distance D_A(z). These measurements are typically obtained from moments of the measured two-point clustering statistics with respect to μ, the cosine of the angle to the line of sight. The position of the BAO feature in each moment depends on a combination of D_A(z) and H(z), and measuring the positions in two or more moments breaks this parameter degeneracy. We derive analytic formulae for the parameter combinations measured from moments given by Legendre polynomials, power laws and top-hat wedges in μ, showing explicitly what is being measured by each in real space for both the correlation function and power spectrum, and in redshift space for the power spectrum. The large volume covered by modern galaxy samples means that, on the BAO scale, the correlation function can be well approximated as uncorrelated between different values of μ, with errors that are approximately independent of μ. Using these approximations, we derive the information content of various moments. We show that measurements made using either the monopole and quadrupole, or the monopole and the μ^2 power-law moment, are optimal for anisotropic BAO measurements, in that they contain all of the available information using two moments, the minimal number required to measure both H(z) and D_A(z). We test our predictions using 600 mock galaxy samples, matched to the SDSS-III Baryon Oscillation Spectroscopic Survey CMASS sample, and find good agreement. Our results should enable the optimal extraction of information from future galaxy surveys such as eBOSS, DESI and Euclid.
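As a rough illustration of why each moment constrains only one combination of D_A(z) and H(z), the sketch below averages a direction-dependent BAO shift over μ. It is not the derivation used in this work: it assumes the standard dilation parameters α_par (proportional to 1/H) and α_perp (proportional to D_A), the usual geometric relation α(μ) = sqrt(μ^2 α_par^2 + (1 - μ^2) α_perp^2), and that the shift seen in a uniformly weighted moment is simply the μ-average of α(μ). The function names are hypothetical. Under these assumptions the averaged shift agrees, to first order in small deviations from the fiducial cosmology, with the familiar combination α_par^(1/3) α_perp^(2/3).

```python
import numpy as np


def alpha_mu(mu, alpha_par, alpha_perp):
    """Assumed BAO dilation at angle cosine mu (standard geometric relation)."""
    return np.sqrt(mu**2 * alpha_par**2 + (1.0 - mu**2) * alpha_perp**2)


def uniform_average_shift(alpha_par, alpha_perp, n=100_000):
    """Midpoint-rule average of alpha(mu) over 0 <= mu <= 1 with uniform weight."""
    mu = (np.arange(n) + 0.5) / n
    return alpha_mu(mu, alpha_par, alpha_perp).mean()


def isotropic_combination(alpha_par, alpha_perp):
    """First-order equivalent combination: alpha_par^(1/3) * alpha_perp^(2/3)."""
    return alpha_par ** (1.0 / 3.0) * alpha_perp ** (2.0 / 3.0)


if __name__ == "__main__":
    # Illustrative (made-up) dilation values close to the fiducial value of 1.
    a_par, a_perp = 1.02, 0.99
    print(uniform_average_shift(a_par, a_perp))
    print(isotropic_combination(a_par, a_perp))
```

The two printed numbers differ only at second order in the deviations from unity, which is why a single uniformly weighted moment cannot separate α_par from α_perp and a second moment with a different μ weighting is needed.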