<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>bit-player</title>
	<atom:link href="http://www.bit-player.org/feed" rel="self" type="application/rss+xml" />
	<link>http://bit-player.org</link>
	<description>An amateur's outlook on computation and mathematics.</description>
	<pubDate>Thu, 02 Feb 2012 19:38:54 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.3</generator>
	<language>en</language>
			<item>
		<title>The Right Click</title>
		<link>http://bit-player.org/2012/the-right-click</link>
		<comments>http://bit-player.org/2012/the-right-click#comments</comments>
		<pubDate>Sat, 21 Jan 2012 22:39:41 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[computing]]></category>

		<category><![CDATA[modern life]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=1072</guid>
		<description><![CDATA[For a few hours yesterday the front page of the New York Times was stealing  right clicks. If I right-clicked on a hyperlinked headline (or option-clicked, or made a two-fingered tap on the trackpad), I did not get the usual context menu; instead, I was taken directly to the target of the link. This [...]]]></description>
			<content:encoded><![CDATA[<p>For a few hours yesterday the front page of the <em>New York Times</em> was stealing  right clicks. If I right-clicked on a hyperlinked headline (or option-clicked, or made a two-fingered tap on the trackpad), I did not get the usual context menu; instead, I was taken directly to the target of the link. This is the proper behavior for an ordinary mouse click&#8212;or a left click with a two-button mouse&#8212;but not for a right click. </p>
<p>The first time this happened, I thought it was just a slip-of-the-finger, but the error was consistently repeatable across two different machines and three different browsers (Firefox, Chrome, Safari). Furthermore, it affected only the <em>New York Times</em>. Indeed, it was only the front page of the <em>Times</em> that was misbehaving; right clicks elsewhere in the paper worked normally.</p>
<p>The cause of this problem may have been an innocent goof, but I&#8217;m skeptical. When the <em>Times</em> first put up a paywall, not quite a year ago, readers quickly found <a href="http://www.practicalhacks.com/2011/05/31/how-to-hack-the-new-york-times-paywall/">holes</a> in it. One of those holes involves right-clicking a link to get a copy of the URL, pasting it in the browser address bar, and removing the referrer cruft following the question mark. My guess is that someone at the <em>Times</em> decided it was time to close the hole.</p>
<p>I hasten to add that freeloading is not <em>my</em> reason for right-clicking on <em>Times</em> headlines. I pay my $15 per doublefortnight. But my newsreading habit is to peruse the entire front page, opening each article that interests me in a separate tab. The &#8220;open in new tab&#8221; command lives in the right-click contextual menu.</p>
<p>Regardless of <em>why</em> the <em>Times</em> was interfering with my Second Amendment right to bear mouse buttons, I was curious about <em>how</em> they were doing it. They weren&#8217;t just disabling the contextual menu entirely. (You can read a scornful account of <em>that</em> nefarious practice at <a href="http://javascript.about.com/library/blright.htm">About.com</a>, which identifies itself as &#8220;A part of The New York Times Company.&#8221; (Not, in my view, the best part.)) On the NYT front page, right clicks worked as usual in ordinary text; they were only hijacking right clicks on links.</p>
<p>Regrettably, I&#8217;m not going to be able to answer the how&#8217;d-they-do-it question. Before I could find the offending code, some grownup at the <em>Times</em> called off the whole crazy experiment, and normal right-clickery was restored.</p>
<p>Although I couldn&#8217;t find the click-stealer, I found plenty else. The <em>Times</em>, it seems, prints all the JavaScript that fits. Some of it is unsurprising. jQuery is loaded. There are scripts to run slide shows and videos, to manage cookies, to serve ads, to provide menus and other navigation aids. But there&#8217;s lots more:</p>
<ul>
<li><code>beacon.js  </code> This may have something to do with all those little files named 1px.gif floating around like packing peanuts.</li>
<li><code>revenuescience.js  </code> Apparently a product of an outfit called <a href="http://www.audiencescience.com/">Audience Science</a>. &#8220;AudienceScience is processing trillions of behaviors per day and over 270 billion attributes at any given moment.&#8221; You don&#8217;t say.</li>
<li><code>krux-4.7.2.js   </code> The web site of <a href="http://www.kruxdigital.com/">Krux</a> (which I had never heard of before) says: &#8220;Krux helps large and small websites control, energize, and responsibly monetize consumer data across screens and sources.&#8221; Reading further, I get the impression they are in the business of preventing snoopers from snooping on the snoopers who snoop on us. I&#8217;m certainly not having much luck snooping on <em>their</em> code. It looks like this:
<p><code>function(a){e(a)||A(b,c(a))}),h(b,c(a[1]),e(f)?f:function({o.js.apply(null,j)})):h(b,c(a[1]));</code></p></li>
<li><code>gw.js   </code>Even deeper obfuscation. I believe this is a JavaScript program whose function is to write another JavaScript program into the page header. It seems to be one of the tools that Audience Science uses to process those trillions of &#8220;behaviors&#8221; per day.</li>
</ul>
<p>Phooey on them, I say.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2012/the-right-click/feed</wfw:commentRss>
		</item>
		<item>
		<title>Sugarpixels</title>
		<link>http://bit-player.org/2012/sugarpixels</link>
		<comments>http://bit-player.org/2012/sugarpixels#comments</comments>
		<pubDate>Sun, 01 Jan 2012 19:42:29 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[off-topic]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=1070</guid>
		<description><![CDATA[ Yes, that&#8217;s one of those annoying QR codes that seem to be turning up all over the place lately. The jumble of red and blue pixels appears on the front of the holiday greeting cards that my wife Rosalind and I sent to a few friends and family-members last week. Since the card has [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://bit-player.org/sugarpixels/"><img class="alignright" src="http://bit-player.org/wp-content/uploads/2012/01/qr-red-blue-200pts.jpg" alt="QR code in red and blue pixels" border="0" width="200" height="200" /></a> Yes, that&#8217;s one of those annoying QR codes that seem to be turning up all over the place lately. The jumble of red and blue pixels appears on the front of the holiday greeting cards that my wife Rosalind and I sent to a few friends and family-members last week. Since the card has an online component, I thought I might share it here with a wider circle of acquaintances. So let me take this opportunity to wish everyone a happy 2012&#8212;all 366 days of it.</p>
<p>If you have a QR reader on your cell phone, you should be able to scan the code directly from the screen. You&#8217;ll be taken to a web page where the same pattern appears, and tapping the red targets should eventually get you to a greeting-of-the-season. Unfortunately, the JavaScript driving this transformation doesn&#8217;t seem to run very well on some phones. (I&#8217;ve tested it on iPhones, iPads and a Palm Pre, where it is balky but functional; I&#8217;ve had a report that it fails entirely on at least one Android phone.) If anyone can offer hints or clues about what I might be doing wrong, I&#8217;ll be grateful.</p>
<p>In any case, the small-screen version of the program isn&#8217;t nearly as nice as the one for grown-up computers, which you can reach by <a href="http://bit-player.org/sugarpixels/">clicking here</a> or on the image above. And there&#8217;s more about my travails with cell-phone JavaScript (and SVG) in the <a href="http://bit-player.org/sugarpixels/nerdnotes.html">nerd notes</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2012/sugarpixels/feed</wfw:commentRss>
		</item>
		<item>
		<title>The acceleration of history</title>
		<link>http://bit-player.org/2011/the-acceleration-of-history</link>
		<comments>http://bit-player.org/2011/the-acceleration-of-history#comments</comments>
		<pubDate>Tue, 20 Dec 2011 22:27:43 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=1067</guid>
		<description><![CDATA[
Four hundred years ago, the idea that the Earth goes around the Sun rather than vice versa was not just a scientific breakthrough but also a cultural bombshell. People were asked to reimagine the world they were living in. Not everyone welcomed the opportunity. Books were burned. In the case of Giordano Bruno, an author [...]]]></description>
			<content:encoded><![CDATA[
<blockquote><p>Four hundred years ago, the idea that the Earth goes around the Sun rather than vice versa was not just a scientific breakthrough but also a cultural bombshell. People were asked to reimagine the world they were living in. Not everyone welcomed the opportunity. Books were burned. In the case of Giordano Bruno, an author was burned.</p>
<p>In the modern world, cosmological revolutions seem to cause hardly a ripple in public consciousness. Inflation, dark matter, dark energy&#8212;these ideas also call for a reimagining of the world we live in, but they have provoked very little fuss outside the community of science. It&#8217;s certainly a relief that no one will be burned at the stake over matters of cosmological doctrine. But are we really more liberal and open-minded, or just not paying attention?</p>
</blockquote>
<p>Those are the final paragraphs of <a href="http://www.americanscientist.org/issues/pub/a-box-of-universe">my new column</a> in <em>American Scientist</em>. Here I want to say a few words more about the reception of these new ideas in cosmology, but first I should explain that the column is really about something else, namely the <a href="http://hipacc.ucsc.edu/Bolshoi/">Bolshoi computer simulation</a> of the large-scale structure of the universe, led by Joel Primack of UC Santa Cruz and Anatoly Klypin of New Mexico State University.</p>
<p>While preparing to write the column, I picked up Marcia Bartusiak&#8217;s recent book <em><a href="http://www.marciabartusiak.com/books.html">The Day We Found the Universe</a></em>, which tells the story of the discovery that the &#8220;nebulae&#8221; we see in the sky are actually distant galaxies much like our own&#8212;what Kant called &#8220;island universes.&#8221; It&#8217;s a grand story, and Bartusiak gives a splendid account of it, with engaging portraits of the dozen or so principal players. Highly recommended. </p>
<p>I&#8217;m not going to retell the whole story here, but I want to point out that it took 175 years for the idea of island universes to be accepted by astronomers. The earliest known proposal was by Thomas Wright in 1750; Bartusiak&#8217;s story culminates on January 1, 1925, when Edwin Hubble&#8217;s paper &#8220;Cepheids in Spiral Nebulae&#8221; was read to a joint session of the American Astronomical Society and the American Association for the Advancement of Science. In between, there was a great deal of backing and forthing. For example, William Herschel, the preeminent observational astronomer of the 18th century, initially supported the island-universe theory, but later he changed his mind. As late as 1900 many astronomers believed the nebulae were relatively small, nearby objects&#8212;perhaps protostars about to condense. It took new instruments and a barrelful of observational evidence to overturn this view. (Specifically: telescopes that could resolve individual stars in distant galaxies, better spectroscopes, better photographic film, the understanding of redshifts, the discovery of a relation between period and luminosity in the stars called Cepheid variables.)</p>
<p>I find it wholly unsurprising that people might need a century or two to digest such a major shift in how we view the universe around us. What&#8217;s remarkable is that lately the pace of change has accelerated, and nobody seems to be having much trouble keeping up.</p>
<p>Consider what&#8217;s happened in cosmology in the 80-some years since Hubble&#8217;s revelation. There was the battle between the steady-state and the big-bang models, which can be traced back to the 1920s and 30s and was finally resolved in the 1960s with the discovery of the cosmic background radiation. Then there&#8217;s &#8220;dark matter.&#8221; Fritz Zwicky pointed out in the 1930s that the dynamics of galaxies imply there&#8217;s a lot more mass out there than we&#8217;re seeing, and this discrepancy became more troubling with later observations. By the 1980s or 90s most astronomers had accepted the remarkable conclusion that we don&#8217;t know what the universe is made of; all of the familiar &#8220;baryonic&#8221; matter of stars and planets is a minority constituent; the bulk of the mass is some unidentified stuff that Primack dubbed cold dark matter.</p>
<p>Even weirder (if that&#8217;s possible) is the notion of cosmic inflation: In a period of 10<sup>&ndash;36</sup> second, the universe expanded by a factor of 10<sup>78</sup>. The inflationary hypothesis was first put forward in 1980, was tweaked a bit later in that decade, and was soon swallowed whole by the cosmological community (with the exception of a very few skeptics). </p>
<p>Finally comes &#8220;dark energy,&#8221; the force that&#8217;s causing the cosmic expansion to accelerate. It&#8217;s well known that this concept goes back to the early years of general relativity, with Einstein&#8217;s cosmological constant &Lambda;. But Einstein soon disavowed the idea, and it remained moribund until about 15 years ago, when two groups of astronomers found direct observational evidence that the expansion <em>is</em> indeed accelerating. The resurrection of &Lambda; was so quick and total that this year&#8217;s Nobel prize in physics was awarded for this work.</p>
<p>I find it astonishing and disquieting to live in a universe that&#8217;s so very different from the one I was born into. We already had external galaxies in my childhood, and Fred Hoyle and George Gamow were sparring over the big-bang/steady-state issue. But I grew up with no inkling of dark matter, dark energy or cosmic inflation. Now it turns out that most of the universe disappeared over the event horizon in the inflationary era, a fraction of a second after it all began, and long before any of us had a chance to see what we were missing. Of what&#8217;s left, less than 1 percent is the kind of matter we know and love&#8212;and nobody has a very good idea what the rest of all that stuff might be.</p>
<p>Given the contentious history of earlier innovations in cosmology&#8212;starting, of course, with the post-Copernican civil war&#8212;I would have expected more controversy over these ideas. But the whole rapid-fire series of head-spinning revolutions seems to have been accepted rather placidly, both within astronomy and by the wider scientific community. Why so little resistance? Is the evidence so compelling as to overwhelm all opposition? Or, on the contrary, have we become so complacently accepting of what experts tell us to believe that we&#8217;ve lost all independent judgment?</p>
<p>In a telephone conversation I asked Primack how he would explain the lack of controversy. He broadened the scope of the question, pointing out that when you consider the public at large, rather than the scientific community, the issue is not uncritical acceptance but rather ignorance and indifference. A population that doubts Darwinian evolution and anthropogenic climate change is not too easily convinced by evidence or cowed by authority. If no one has risen up to denounce the teaching of dark matter and dark energy in the public schools, it&#8217;s simply because they are unaware of those ideas. I think Primack is right about this, but I don&#8217;t understand why questions about the basic nature of the universe&#8212;which once excited such passion&#8212;could now lie beneath the notice even of the most benighted citizens.</p>
<p>(By the way, the headline on this post is borrowed from my former boss, Gerard Piel, who published a book under that title. Now that Gerry is gone, I can confess that I never read the book, but I always liked the title.)</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2011/the-acceleration-of-history/feed</wfw:commentRss>
		</item>
		<item>
		<title>Chebfun</title>
		<link>http://bit-player.org/2011/chebfun</link>
		<comments>http://bit-player.org/2011/chebfun#comments</comments>
		<pubDate>Tue, 13 Dec 2011 23:34:29 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[computing]]></category>

		<category><![CDATA[mathematics]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=1060</guid>
		<description><![CDATA[I went to a magic show the other day. Nick Trefethen was giving a demo of Chebfun, a Matlab extension package he is building in collaboration with his Oxford students and colleagues. In the course of the talk, several mathematical rabbits were pulled out of numerical hats.
The key idea in Chebfun is to represent any [...]]]></description>
			<content:encoded><![CDATA[<p>I went to a magic show the other day. <a href="http://people.maths.ox.ac.uk/trefethen/">Nick Trefethen</a> was giving a demo of <a href="http://www2.maths.ox.ac.uk/chebfun/">Chebfun</a>, a Matlab extension package he is building in collaboration with his Oxford students and colleagues. In the course of the talk, several mathematical rabbits were pulled out of numerical hats.</p>
<p>The key idea in Chebfun is to represent any function of a real variable by a polynomial approximation.</p>
<pre>
  >> f = chebfun('sin(x) + sin(x.^2)', [0 10]);
  >> plot(f)
</pre>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/12/chebfun-f.png" alt="graph of chebfun f" border="0" width="450" height="298" /></p>
<p>That wiggly line <em>looks</em> like a graph of <em>y</em> = sin(<em>x</em>) + sin(<em>x</em><sup>2</sup>), but that&#8217;s an illusion. What is being plotted here is a certain polynomial of degree 118 that happens to approximate sin(<em>x</em>) + sin(<em>x</em><sup>2</sup>) with high precision.</p>
<p>As I understand it, the chebfun construction algorithm works something like this. First you select <em>N+1</em> points in the interval where the function is defined, and construct the unique polynomial of degree <em>N</em> that passes through all the points. If the error of this approximation is below a threshold, you&#8217;ve found your chebfun. Otherwise, choose a larger sample of points and try again.</p>
<p>The sample points are not evenly spaced across the interval. They are Chebyshev points, whose distribution varies as a cosine function, denser at the extremes and sparser in the middle. In this case, the process converged with 119 Chebyshev points:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/12/f-with-sample-points.png" alt="the function f along with the 119 sample points that determine the polynomial" border="0" width="450" height="316" /></p>
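<p>To make the construction loop a little more concrete, here is a minimal sketch in plain Python. It is not Chebfun&#8217;s code, and I make no claim that it matches Chebfun&#8217;s internals. The function names (<code>cheb_points</code>, <code>bary_eval</code>, <code>build</code>) are made up for this illustration, and so are the choices to double the degree at each retry, to measure the error on an evenly spaced grid, and to stop at a tolerance of 10<sup>&ndash;10</sup>. The polynomial is evaluated with the barycentric interpolation formula, which avoids ever computing its coefficients.</p>
<pre>
import math

def cheb_points(n, a, b):
    """Return the n+1 Chebyshev points on [a, b], densest near the ends."""
    mid, half = (a + b) / 2, (b - a) / 2
    return [mid + half * math.cos(math.pi * j / n) for j in range(n + 1)]

def bary_eval(xs, ys, x):
    """Evaluate at x the degree-n polynomial through the points (xs, ys)."""
    n = len(xs) - 1
    num = den = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        if x == xj:
            return yj                  # x is exactly a sample point
        w = (-1.0) ** j                # barycentric weights alternate in sign
        if j == 0 or j == n:
            w *= 0.5                   # and are halved at the two endpoints
        t = w / (x - xj)
        num += t * yj
        den += t
    return num / den

def build(f, a, b, tol=1e-10, n=8, n_max=4096, checks=1000):
    """Double the degree n until the interpolant is within tol of f."""
    while True:
        xs = cheb_points(n, a, b)
        ys = [f(x) for x in xs]
        # Assumed stopping rule: compare against f on an evenly spaced grid.
        grid = [a + (b - a) * k / checks for k in range(checks + 1)]
        err = max(abs(f(t) - bary_eval(xs, ys, t)) for t in grid)
        if tol >= err:
            return xs, ys
        if n >= n_max:
            raise RuntimeError("no convergence up to degree %d" % n_max)
        n *= 2

# The same function as in the Chebfun session, on the same interval.
f = lambda x: math.sin(x) + math.sin(x * x)
xs, ys = build(f, 0.0, 10.0)
print("degree:", len(xs) - 1)
</pre>
<p>Because this sketch only doubles the degree, it will generally overshoot the smallest degree that would do the job; a real system presumably trims the result afterward.</p>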
<p>In one respect the example above is an easy one: The function is quite smooth. Here&#8217;s something more challenging:</p>
<pre>
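  >> x = chebfun('x', [0 10]);   % identity function on [0, 10]; assumed setup, not shown in the original session
  >> hat = 1-abs(x-5)/5;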
  >> h = max(f, hat);
  >> plot(h)
</pre>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/12/rabbit-ears.png" alt="the rabbit-in-the-hat function" border="0" width="450" height="296" /></p>
<p>This is where we pull the rabbit out of the hat&#8212;or at least several pairs of rabbit ears. To deal with the <del datetime="2011-12-20T14:32:57+00:00">discontinuities</del> <ins datetime="2011-12-20T15:13:20+00:00">sharp corners</ins> in this curve, the Chebfun system assembles 25 polynomial segments, each defined on a different interval. Some are linear, some of higher degree. But the entire structure is still treated as a single function, which can be operated on by other functions. For example, <code>sum(h)</code> calculates the integral over [0, 10], returning the result 8.598303617326401. And here&#8217;s the square root of those rabbit ears:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/12/square-root-of-rabbit-ears.png" alt="Square root of rabbit ears" border="0" width="450" height="315" /></p>
<p>These are neat tricks, but why would one <em>want</em> to work with polynomial approximations to a function, rather than with the function itself? I&#8217;m too new to all this to answer that question with confidence, so I&#8217;ll quote the <a href="http://www2.maths.ox.ac.uk/chebfun/guide/html/guide1.shtml">Chebfun Guide</a>:</p>
<blockquote><p>The aim of Chebfun is to &#8220;feel symbolic but run at the speed of numerics&#8221;. More precisely our vision is to achieve for functions what floating-point arithmetic achieves for numbers: rapid computation in which each successive operation is carried out exactly apart from a rounding error that is very small in relative terms.</p>
</blockquote>
<p>For those who want to know more, I offer a few pointers:</p>
<p>The first published paper on Chebfun:</p>
<p class="biblio">Battles, Zachary, and Lloyd N. Trefethen. 2004. An extension of MATLAB to continuous functions and operators. <em>SIAM Journal on Scientific Computing</em> 25:1743–1770. (<a href="http://www2.maths.ox.ac.uk/chebfun/publications/chebfun_paper.pdf">PDF</a>)</p>
<p>Trefethen&#8217;s argument favoring floating-point arithmetic over symbolic computation or exact rational arithmetic:</p>
<p class="biblio">Trefethen, Lloyd N. 2007. Computing numerically with functions instead of numbers. <em>Mathematics in Computer Science</em> 1:9&#8211;19. (<a href="http://people.maths.ox.ac.uk/trefethen/trefethen_functions.pdf">PDF</a>)</p>
<p>A provocative account of why polynomial approximation is not as wonky as you may think:</p>
<p class="biblio">Trefethen, Lloyd N. 2011. Six myths of polynomial interpolation and quadrature. <em>Mathematics Today</em>. (<a href="http://people.maths.ox.ac.uk/trefethen/mythspaper.pdf">PDF</a>)</p>
<p>Finally, Trefethen has a forthcoming book on Chebfun and related matters (which I have only just begun to read):</p>
<p class="biblio">Trefethen, Lloyd N. To appear. <em>Approximation Theory and Approximation Practice</em>. (<a href="http://www2.maths.ox.ac.uk/chebfun/ATAP/ATAPJune11.pdf">PDF</a>)</p>
<p>Chebfun runs inside <a href="http://www.mathworks.com/products/matlab/">Matlab</a>, the numerical computing environment from Mathworks. Chebfun itself has recently become open-source software (under a BSD license), but Matlab is proprietary. As far as I can tell, Chebfun does not (yet?) run under Octave, the open-source alternative to Matlab.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2011/chebfun/feed</wfw:commentRss>
		</item>
		<item>
		<title>How Did the Stars Get Their Points?</title>
		<link>http://bit-player.org/2011/how-did-the-stars-get-their-points</link>
		<comments>http://bit-player.org/2011/how-did-the-stars-get-their-points#comments</comments>
		<pubDate>Thu, 08 Dec 2011 16:19:38 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[linguistics]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=1054</guid>
		<description><![CDATA[
Those are hot young stars in the Large Magellanic Cloud&#8212;one of the puppy-dog galaxies that follow the Milky Way around&#8212;photographed by the Hubble Space Telescope. (Detail cropped from a Wikipedia image.) Note that four rays seem to emanate from each of the brightest stars. The rays are not, of course, true beams of light radiating [...]]]></description>
			<content:encoded><![CDATA[<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/12/hst-bright-stars-in-lmc-450.jpg" border="0" alt="a field of bright stars and dust clouds in the Large Magellanic Cloud, photographed by the Hubble Space Telescope, courtesy Wikipedia" width="450" height="450" /></p>
<p>Those are hot young stars in the Large Magellanic Cloud&#8212;one of the puppy-dog galaxies that follow the Milky Way around&#8212;photographed by the Hubble Space Telescope. (Detail cropped from <a href="http://en.wikipedia.org/wiki/File:Starsinthesky.jpg">a Wikipedia image</a>.) Note that four rays seem to emanate from each of the brightest stars. The rays are not, of course, true beams of light radiating in the four cardinal directions. They are <a href="http://www.telescope-optics.net/spider.htm">an artifact of the telescope&#8217;s structure</a>: a diffraction pattern created by the four vanes of the &#8220;spider&#8221; that supports the secondary mirror within the barrel of the telescope. Many other telescopes have three-vane spiders that yield a six-pointed diffraction pattern.</p>
<p><img class="alignright" src="http://bit-player.org/wp-content/uploads/2011/12/escher-stars.png" border="0" alt="Stars, engraving by M. C. Escher, from Wikipedia" width="220" height="271" />Recently, in my lovable know-it-all manner, I was holding forth on the idea that this diffraction effect&#8212;a mere accident of instrumental design&#8212;might actually be the source of the familiar iconographic star, with its five or six angular points. In other words, we think of a star as something spiky, poking out in various directions, because we&#8217;re used to seeing telescopic images with this diffractive defect. At right is <a href="http://en.wikipedia.org/wiki/Stars_%28M._C._Escher%29">M. C. Escher&#8217;s interpretation</a> of what <em>stellar</em> means. For other examples see the Hollywood Walk of Fame or the flags of the U.S. and the E.U. and those of <a href="http://en.wikipedia.org/wiki/Gallery_of_sovereign-state_flags">more than 50 other countries</a>, not to mention Texas.</p>
<p>Well, it turns out my cute idea about the cultural influence of telescopic photos is utterly bogus. If you need any evidence, the engraving reproduced below should suffice. It shows the muse Astronomia (a.k.a. Urania) pointing out the moon and stars to Ptolemy. The stars are five- or six-pointed scribbles that beg to be called <em>asterisks</em>. The engraving appears in the <a href="http://www.er.uqam.ca/nobel/r14310/Ptolemy/Reisch.html"><em>Margarita Philosophica</em> of Gregor Reisch</a>, published in 1504, which is a full century before Galileo turned his telescope to the heavens. Whatever those engraved stars are, they are not artifacts of telescope spider vanes.</p>
<p><img class="alignright" src="http://bit-player.org/wp-content/uploads/2011/12/ptolemy-and-astronomia-450.png" border="0" alt="Ptolemy and Astronomia with stars and moon from Margarita Philosophica 1504" width="450" height="572" /></p>
<p>The dictionary offers further evidence. For example, the <em>starfish</em> (genus <em>Asterias</em>, class <em>Asteroidea</em>) has had that name at least since 1538. And the <em>asterisk</em>&#8212;the typographical mark&#8212;has a citation in the OED going all the way back to 1382. These terms make sense only if the concept of a star was already associated in most people&#8217;s minds with a spiky polygon, rather than a dimensionless point of light in the night sky.</p>
<p>And that&#8217;s what puzzles me, because the stars really do appear to be dimensionless points of light. When I stare at the sky, I see some twinkling going on, but nowhere do I see pentagrams and hexagrams pinned to black velvet, or even the slightest hint of angularity. So where did this tradition get started? Did the Greek word &#7940;&sigma;&tau;&rho;&omicron;&nu; already convey a sense of symmetrical spikiness, so that ancient Athenians would have understood why we call certain flowers <em>asters</em>? Is the same iconography prevalent in other cultures, say in China? Those 50+ star-studded flags (including China&#8217;s) suggest that the conventional stellar icon is at least recognized globally, but they don&#8217;t tell us where and when it all began. After my telescopic theory fell apart, I had a second hypothesis, namely that the star icon might come from the symbol-happy world of astrology, but I&#8217;ve found no support for this idea either. So I throw the question out to the starry void: How did the star get its points?</p>
<p><strong>Addendum 2011-12-16</strong>: The illuminating comments below on ancient Egyptian paintings of stars would appear to settle part of my question: Well over 2,000 years ago, at least some people were already drawing stars in much the same way a modern kindergartner does. What I&#8217;d still like to know is <em>why</em>. Yes, there are many plausible just-so stories, but you&#8217;d think that someone at the time might have offered a word of explanation.</p>
<p>The other day I spent a pleasant afternoon leafing through <em>The History and Practice of Ancient Astronomy</em>, by James Evans (New York: Oxford University Press, 1998). It&#8217;s quite a thorough introduction to Greek and Egyptian ideas about the sky, but I did not find an answer to my question about the points of stars. The astronomers of that period were engrossed in charting the positions and motions of the stars, but one gets the impression they had no interest whatever in the nature of those bright objects&#8212;what they look like up close, what they&#8217;re made of, why they shine. Of course I don&#8217;t really believe the ancients were so lacking curiosity. Surely Aristotle holds forth somewhere on the substance of the stars? But I haven&#8217;t found it yet.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2011/how-did-the-stars-get-their-points/feed</wfw:commentRss>
		</item>
		<item>
		<title>TNT Is Not TeX</title>
		<link>http://bit-player.org/2011/tnt-is-not-tex</link>
		<comments>http://bit-player.org/2011/tnt-is-not-tex#comments</comments>
		<pubDate>Tue, 06 Dec 2011 04:09:21 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[computing]]></category>

		<category><![CDATA[mathematics]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=1049</guid>
		<description><![CDATA[
The curious document above was produced sometime in the spring of 1980 by Don Knuth to show off the typographical prowess of his new programs, TeX and Metafont. The software was then being introduced to the mathematical community through the publication of TeX and Metafont: New Directions in Typesetting, and I was writing a news [...]]]></description>
			<content:encoded><![CDATA[<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/12/knuth-tex-specimen-1980-450px.png" alt="Knuth TeX specimen 1980 450px" border="1" width="450" height="178" /></p>
<p>The curious document above was produced sometime in the spring of 1980 by Don Knuth to show off the typographical prowess of his new programs, TeX and Metafont. The software was then being introduced to the mathematical community through the publication of <em><a href="http://www.worldcat.org/title/tex-and-metafont-new-directions-in-typesetting/oclc/005751341">TeX and Metafont: New Directions in Typesetting</a></em>, and I was writing a news item about it for <em>Scientific American</em>. At the time it seemed like a quaint, quirky and quixotic project, worth a column of type in the magazine even if nothing came of it in the long run. I would not have guessed that 30 years later TeX would be the foundation of a huge software superstructure&#8212;and would still be a part of my own professional life.</p>
<p>TeX is not the oldest software still in widespread use, but it may be the most stable. In the core of the system&#8212;the typesetting engine&#8212;very little has changed since 1990. And there will be even fewer revisions going forward. The current version of TeX is 3.1415926. Knuth <a href="http://www.ntg.nl/maps/05/34.pdf">has decreed</a> that on his death the version number should be set equal to &pi; and no further changes should ever be made. &#8220;From that moment on, all &#8216;bugs&#8217; will be permanent &#8216;features.&#8217;&#8221;</p>
<p>I think&#8212;though this is subject to interpretation&#8212;that what Knuth wants to protect from all future meddling is not <a href="ftp://tug.ctan.org/pub/tex-archive/systems/knuth/dist/tex/tex.web">the text of the program itself</a>, or even the underlying algorithms and data structures, but rather its operational specification. His intent in freezing TeX is to ensure that the same input should always yield the same output. Specifically, any software that calls itself TeX is supposed to pass his TRIP test suite.</p>
<p>I am of two minds about this policy. Mind One agrees with Knuth&#8217;s declaration: &#8220;Let us regard these systems as fixed points, which should give the same results 100 years from now that they produce today.&#8221; It&#8217;s comforting to think that all the TeX documents I&#8217;ve written over the years will still be readable a century hence. But Mind Two reminds me that in practice I have trouble maintaining TeX documents even for a few months, much less decades or centuries. What about those presentations done with the <em>foils</em> class that stopped working after an upgrade and that I&#8217;ve never bothered to fix? Or the articles using  the <em>pstricks</em> package that won&#8217;t compile under <em>pdflatex</em>? TeX itself may be a fixed point in the software universe, but everything else spins dizzily around it.</p>
<p>The skeptical Mind Two has another argument as well: Under Knuth&#8217;s edict it&#8217;s not just the TeX markup language that can&#8217;t change; it&#8217;s also the architecture of the system. Knuth created his flawless souffl&eacute;s and d&aelig;mon diarrh&oelig;a at <a href="http://www.tug.org/TUGboat/tb26-1/beebe.pdf">an ASCII terminal wired to a PDP-10</a>, and the only way he could see the product of his labors was to walk down the hall and retrieve hard copy from the AlphaType machine. We are no longer accustomed to such barbarities. TeX has been hauled halfway into the world of modern computing. Front-end software such as TeXShop provides a pleasanter interface. But the core programs still run in batch mode, as they did in the Dark Ages. To make even the smallest change in a document, you still need to throw away all the existing output and run a whole file (or set of files) through the compiler tool chain. Sometimes you have to do it twice. Or <a href="http://amath.colorado.edu/documentation/LaTeX/reference/faq/bibstyles.html#commands">four times</a>. Isn&#8217;t this ridiculous in a world of event-driven, interactive, multithreaded software? Will we still have to press the <em>Typeset</em> button in 2111?</p>
<p>Mind One replies: Of course not. By then we&#8217;ll just throw Moore&#8217;s Law at it: Automatically rerun TeX <em>n</em> times for every keystroke in the editor.</p>
<p>At this point Mind Three pipes up. (Did I mention that I&#8217;m of three minds?) The problem here, she says, is not that we can&#8217;t or shouldn&#8217;t alter TeX. It&#8217;s the utterly depressing notion that we&#8217;re incapable of building anything better, and that TeX will still be the typesetter to beat after another century. Surely, if we just stand tippytoe on the shoulders of Don Knuth, we can see a little farther. Who was the architect who said that every great building should have a bomb in the basement, set to blow itself up after 50 years and thereby clear the land for something greater still? Let&#8217;s make a new improved TeX. We&#8217;ll call it TNT.</p>
<p>Minds One and Two pounce in unison: You think we haven&#8217;t thought of that? What about <a href="http://www.tug.org/TUGboat/tb14-3/tb40taylor.pdf">&epsilon;-TeX</a>? <a href="http://en.wikipedia.org/wiki/New_Typesetting_System">NTS</a>? <a href="http://www.extex.org/">ExTeX</a>? What about <a href="http://www.luatex.org/index.html">LuaTeX</a>&#8230;?</p>
<p class="centered">&#8226;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8226;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8226;</p>
<p>This trinitarian meditation was inspired by a <a href="http://vallettaventures.com/post/13124883568/the-price-of-a-messy-codebase-no-latex-for-the-ipad">blog post I stumbled upon last week</a>, in which Valletta Ventures, publisher of TeXPad for the Macintosh, describes its attempt to port TeX (and also LaTeX) to the iPad and eventually concedes defeat. The failure could be blamed on the scrutineers at the Apple App Store, who insist that every iPad program must be bundled up in a single executable. (My current TeX /bin directory has 342 entries.) But even if we were to let Apple off the hook here, the project still seems truly quaint, quirky and quixotic. Mind One says you shouldn&#8217;t expect to run a system as large and complex as TeX on a puffed-up cellphone. But Mind Two says: Why not? The iPad probably has more computational oomph than Knuth&#8217;s 1980 PDP-10.</p>
<p>In the end my sympathies lie with Mind Three, who sees the barrier to putting TeX on the iPad not as a lost opportunity but as a thin, bright glimmer of hope on the horizon. Maybe this protected market&#8212;the walled garden of Cupertino&#8212;will induce some young genius to create the next great mathematical writing system, an iPad app so good it will induce envy in all of us poor TeX users.</p>
<p>Looking at the issue more broadly, I think we often value stability and reliability a little too highly, and innovation too lowly. The world of computer science is overpopulated by walking fossils&#8212;not just TeX but also Unix, the Intel x86 architecture, TCP/IP. <a href="http://www.americanscientist.org/issues/pub/qwerks-of-history">Quoting myself</a>:</p>
<blockquote><p>What has everybody been doing for the past 35 years? Can it be true that technologies conceived in the era of time-sharing, teletypes and nine-track tape are the very best that computer science has to offer in the 21st century?</p>
</blockquote>
<p>As a remedy for this situation, the bomb in the basement may be a bit extreme. But I wonder if we shouldn&#8217;t try something like a reverse patent, where the whole world gets free use of an invention for the first 17 years, but then there&#8217;s an escalating schedule of royalties or taxes for those who fail to come up with a brighter idea.</p>
<p class="centered">&#8226;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8226;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8226;</p>
<p>One final question. When Knuth counts LAZY FOXES in his typographic specimen, where does he get the peculiar number 854.9176302? I would have thought 85491.76320.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2011/tnt-is-not-tex/feed</wfw:commentRss>
		</item>
		<item>
		<title>Pretirement</title>
		<link>http://bit-player.org/2011/pretirement</link>
		<comments>http://bit-player.org/2011/pretirement#comments</comments>
		<pubDate>Wed, 23 Nov 2011 20:23:48 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[modern life]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=1046</guid>
		<description><![CDATA[As a high school kid in the 1960s, I wrote a snarky term paper arguing that retirement is wasted on old people. By the time you get your promised years of leisure, you&#8217;re too worn out to enjoy them. So I proposed a new order of working life: Everybody gets five or ten years off [...]]]></description>
			<content:encoded><![CDATA[<p>As a high school kid in the 1960s, I wrote a snarky term paper arguing that retirement is wasted on old people. By the time you get your promised years of leisure, you&#8217;re too worn out to enjoy them. So I proposed a new order of working life: Everybody gets five or ten years off at the start, when they&#8217;re still full of spunk, in exchange for a promise to keep trudging away on the treadmill right up to the end.</p>
<p>I wasn&#8217;t able to arrange such a pretirement for myself, but the world now seems to be coming around to my way of thinking. Here&#8217;s some evidence, with data courtesy of the <a href="http://data.bls.gov/pdq/querytool.jsp?survey=ln">Bureau of Labor Statistics</a>:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/11/pretirement-graph.png" alt="employment-to-population ratio for age groups 16-24 and 65+" border="0" width="444" height="481" /></p>
<p>The proportion of Americans who stay on the job after age 65 was falling steadily for many years and got down to about 10 percent in the 1980s; but it has been rising since then, and the rate of increase accelerated after 2000. Today almost 17 percent of the 65+ cohort are still working. Meanwhile, the analogous curve for youths aged 16 to 24 is pretty much a mirror image. The employment rate peaked in the 1980s and has been declining since then. In the years between 2000 and 2010 it fell from just under 60 percent to 45 percent.</p>
<p>My pretirement hypothesis&#8212;the notion that we&#8217;re giving people an opportunity to waste their youth on the golf course rather than their old age&#8212;is just about the most benevolent interpretation one could possibly put on these trends. A less-rosy reading of the same data puts the blame on old geezers like me who just won&#8217;t get out of the way and give the youngsters their turn. For some reason, this view seems to be prevalent among recent grads who expected a job offer at the end of their studies but instead got only a bill from Sallie Mae.</p>
<p>An <a href="http://www.nytimes.com/2011/11/20/opinion/sunday/retirement-goodbye-golden-years.html?">op-ed piece</a> in the Sunday <em>New York Times</em> takes issue with this sour diagnosis. Edward L. Glaeser, an economist at Harvard, argues that what motivates the elders who linger in the work force is not greed, selfishness or indifference to their children&#8217;s aspirations; it&#8217;s economic necessity. Their houses are underwater; their 401k&#8217;s have swooned; they can&#8217;t afford to retire. Furthermore, the kids should be grateful that grandmom and grandpop have hung on to the family business:</p>
<blockquote><p>It&#8217;s counterintuitive, but the forever work life of older Americans may turn out to be a good thing for young workers&#8230;. Recent studies in Britain and Germany find a positive correlation between labor-force participation among the elderly and youth employment. It&rsquo;s not that older workers never crowd out younger workers, but there are myriad ways in which older workers also increase employment among the young. As older workers earn more, they can afford to buy more products produced by the young. Older workers may be entrepreneurs who employ younger workers, and they may pass along valuable skills to the young.</p>
<p>America has a terrible youth unemployment problem&#8230;. We have reason to worry that the current economic slowdown will create a lost generation of Americans who are now in their 20s. But it&#8217;s a mistake to imagine we can fix the problem of youth unemployment by encouraging older workers to retire.
</p></blockquote>
<p><a href="http://en.wikipedia.org/wiki/Edward_Glaeser">According to Wikipedia</a>, Glaeser is 44 years old&#8212;right in the middle between the involuntary pretirees and the never-gonna-retirees.</p>
<p>Glaeser doesn&#8217;t discuss the demographic context of these changes, and neither did I in my high school term paper. Looking back on it now, I see a serious flaw in my proposal. Retirement plans, such as the Social Security system, work best with a pointy population pyramid, so that a wide base of young earners supports a smaller number of pensioners. My plan called for reversing the flow of resources, which would not have worked out well given the age structure of the U.S. population in the 1960s, with my own generation of Baby Boomers fattening the base of the pyramid. But the situation is different now; the <a href="http://populationpyramid.net/?country=United_States_of_America&#038;year=2020">pyramid is slimming down</a>, and citizens in their 60s may soon outnumber those in their 20s. Maybe pretirement is worth a second look. </p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2011/pretirement/feed</wfw:commentRss>
		</item>
		<item>
		<title>(McCarthyism)</title>
		<link>http://bit-player.org/2011/mccarthyism</link>
		<comments>http://bit-player.org/2011/mccarthyism#comments</comments>
		<pubDate>Wed, 26 Oct 2011 02:38:29 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[computing]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=1042</guid>
		<description><![CDATA[John McCarthy, the mind behind Lisp, died yesterday at age 84. The photographs below are from one of his web sites at Stanford. Many of his papers are available through a different web page. (I don&#8217;t know how long either of these sites will remain on the air.)

The Kiplingesque just-so story of how Lisp got [...]]]></description>
			<content:encoded><![CDATA[<p>John McCarthy, the mind behind Lisp, died yesterday at age 84. The photographs below are from <a href="http://www-formal.stanford.edu/jmc/personal.html">one of his web sites</a> at Stanford. Many of his papers are available through <a href="http://www-formal.stanford.edu/jmc/">a different web page</a>. (I don&#8217;t know how long either of these sites will remain on the air.)</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/10/john-mccarthy.jpg" alt="John McCarthy, younger and older" border="0" width="450" height="297" /></p>
<p>The Kiplingesque just-so story of how Lisp got its parentheses has been told several times, by McCarthy and others, but today seems an appropriate occasion to trot it out again. In the winter of 1958&ndash;59 McCarthy was at work on a paper that would eventually be published in <em>Communications of the ACM</em> as &#8220;<a href="http://www-formal.stanford.edu/jmc/recursive.html">Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I.</a>&#8221; (There was never a Part II.) He was looking for good example programs to show that the new language was &#8220;neater than Turing machines&#8221; as a formalism for describing computable functions. In the end he came up with a demo much better than &#8220;Hello world!&#8221;</p>
<p>The following paragraphs are from McCarthy&#8217;s presentation to the first History of Programming Languages meeting in 1977 (<em>ACM SIGPLAN Notices</em>, Vol. 13, No. 8, August 1978; there&#8217;s <a href="http://www-formal.stanford.edu/jmc/history/lisp/node3.html">an HTML-ized version</a>).</p>
<blockquote><p>Another way to show that LISP was neater than Turing machines was to write a universal LISP function and show that it is briefer and more comprehensible than the description of a universal Turing machine. This was the LISP function <em>eval[e, a]</em>, which computes the value of a LISP expression <em>e</em>&#8212;the second argument <em>a</em> being a list of assignments of values to variables (<em>a</em> is needed to make the recursion work). Writing <em>eval</em> required inventing a notation representing LISP functions as LISP data, and such a notation was devised for the purposes of the paper with no thought that it would be used to express LISP programs in practice.</p>
<p>S. R. Russell noticed that <em>eval</em> could serve as an interpreter for LISP, promptly hand coded it, and we now had a programming language with an interpreter.</p>
</blockquote>
<p>The &#8220;notation representing LISP functions as LISP data&#8221; was the parenthesized prefix syntax now beloved by Lispers and reviled or ridiculed by just about everybody else. McCarthy himself was not a big fan of this notation. He believed&#8212;to the end of his life, as far as I know&#8212;that the language deserved a more conventional &#8220;algebraic&#8221; syntax, perhaps something in the Algol tradition. The parenthesized lists were called S-expressions; the elements of the fancier notation were to be called M-expressions. In the 1977 talk McCarthy continued:</p>
<blockquote><p>The unexpected appearance of an interpreter tended to freeze the form of the language, and some of the decisions made rather lightheartedly for the &#8220;Recursive functions&#8230;&#8221; paper later proved unfortunate&#8230;. The project of defining M-expressions precisely and compiling them or at least translating them into S-expressions was neither finalized nor explicitly abandoned. It just receded into the indefinite future, and a new generation of programmers appeared who preferred internal notation to any FORTRAN-like or ALGOL-like notation that could be devised.</p>
</blockquote>
<p>I guess I&#8217;m part of that new, unregenerate, generation, who have never been weaned away from their (((()))).</p>
<p>In 2005 I attended an International Lisp Conference at Stanford. McCarthy was present throughout the proceedings but kept a low profile until the final discussion session, when he rose from his seat to make this pronouncement:</p>
<blockquote><p>If someone set off a bomb in this room, it would wipe out half of the worldwide Lisp community. That might not be a bad thing for Lisp, because it would have to be reinvented.</p>
</blockquote>
<p>There were two provocative aspects to this statement. First, the lecture hall where we were gathered held no more than a couple of hundred people, so if we represented half of the worldwide Lisp community, the whole outfit must be pretty small potatoes. Second, if obliterating half the Lisp community would be good for the language, then we must have gone badly astray somewhere in the past 50+ years.</p>
<p>My own view (for what it&#8217;s worth) is quite different. I&#8217;m simply amazed that so many fundamentally sound ideas were formulated so clearly so early in the history of computation. If McCarthy could get so much right in 1958, what the hell have we been doing since then?</p>
<p class="centered">&#8226;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8226;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8226;</p>
<p>One more reminiscence of John McCarthy. He was a great prognosticator and futurist. In the introduction to a <em>Scientific American</em> special issue on information (September 1966), he wrote presciently about the prospect of putting a computer in every home and about what we now call cloud computing. (He filled in more details on these themes in <a href="http://www-formal.stanford.edu/jmc/hoter.html">a 1970 paper</a>.)</p>
<blockquote><p>No stretching of the demonstrated technology is required to envision computer consoles installed in every home and connected to public-utility computers through the telephone system. The console might consist of a typewriter keyboard and a television screen that can display text and pictures. Each subscriber will have his own private file space in the computer that he can consult and alter at any time.</p>
</blockquote>
<p>But not all of McCarthy&#8217;s predictions were precisely on target. In 1989 he wrote about the threat of the fax machine:</p>
<blockquote><p>Unless e-mail is freed from dependence on the networks, I predict it will be supplanted by the telefax for most uses in spite of the fact it is more advantageous&#8230;. Unless e-mail is separated from special networks, telefaxing will prevail because it works by using the existing telephone network directly.</p>
</blockquote>
<p>I say we should let him slide on that one. Overall, the world is a richer place for his contributions to it.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2011/mccarthyism/feed</wfw:commentRss>
		</item>
		<item>
		<title>The n-ball game</title>
		<link>http://bit-player.org/2011/the-n-ball-game</link>
		<comments>http://bit-player.org/2011/the-n-ball-game#comments</comments>
		<pubDate>Sat, 22 Oct 2011 17:12:00 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[mathematics]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=1037</guid>
		<description><![CDATA[
The area enclosed by a circle is &#960;r2. The volume inside a sphere is 4&#8725;3&#960;r3. These are formulas I learned too early in life. Having committed them to memory as a schoolboy, I ceased to ask questions about their origin or meaning. In particular, it never occurred to me to wonder how the two formulas [...]]]></description>
			<content:encoded><![CDATA[<blockquote>
<p>The area enclosed by a circle is &#960;<em>r</em><sup>2</sup>. The volume inside a sphere is <sup>4</sup>&#8725;<sub>3</sub>&#960;<em>r</em><sup>3</sup>. These are formulas I learned too early in life. Having committed them to memory as a schoolboy, I ceased to ask questions about their origin or meaning. In particular, it never occurred to me to wonder how the two formulas are related, or whether they could be extended beyond the familiar world of two- and three-dimensional objects to the geometry of higher-dimensional spaces. What&rsquo;s the volume bounded by a four-dimensional sphere? Is there some master formula that gives the measure of a round object in <em>n</em> dimensions?</p></blockquote>
<p>The text above is the opening paragraph of my new column in <em>American Scientist</em>. If you&#8217;d like to know how the story comes out, by all means go read the full column in the format of your choice: <a href="http://dx.doi.org/10.1511/2011.93.442">HTML</a>, <a href="http://www.americanscientist.org/libraries/documents/201110101628308738-2011-11CompSciHayes.pdf">PDF</a>, or ink-on-paper at better newsstands everywhere. For those in a hurry, here&#8217;s the gist in one equation, one code snippet and one graph:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/10/n-ball-volume-formula.png" alt="V(n,r)=\frac{\pi^\frac{n}{2} r^n}{\Gamma(\frac{n}{2}+1)}" border="0" width="134" height="42" /></p>
<p>The equation is the &#8220;master formula&#8221; mentioned above: Plug in the radius <em>r</em> and the number of spatial dimensions <em>n</em>, and you&#8217;ll get back the volume of the corresponding ball. (If the gamma function in the denominator is unfamiliar, think of it as a factorial that makes sense even when the argument is not an integer.)</p>
<pre>
        v[0, r_] := 1
        v[1, r_] := 2r
        v[n_, r_] := (2 Pi r^2/n) * v[n - 2, r]
</pre>
<p>This version of the formula, given here in Mathematica notation, works only for integer <em>n</em>. It defines the volume of a 0-ball as 1 and the volume of a 1-ball as 2<em>r</em>. For larger <em>n</em>, the volume is calculated recursively: It&#8217;s 2&#960;<em>r</em><sup>2</sup>/<em>n</em> times the volume of a ball with the same radius in <em>n</em>&ndash;2 dimensions.</p>
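<p>For readers without Mathematica, here is a minimal Python sketch of the same recursion, set beside the gamma-function formula so the two can be compared. The function names <code>ball_volume_recursive</code> and <code>ball_volume_gamma</code> are my own labels for illustration, not anything from the column.</p>

```python
import math


def ball_volume_recursive(n, r=1.0):
    """Volume of an n-ball of radius r, for integer n >= 0, via the two-step recursion."""
    if n == 0:
        return 1.0
    if n == 1:
        return 2.0 * r
    # Each step multiplies the (n-2)-ball volume by 2*pi*r^2 / n.
    return (2.0 * math.pi * r * r / n) * ball_volume_recursive(n - 2, r)


def ball_volume_gamma(n, r=1.0):
    """Volume from the closed form pi^(n/2) * r^n / Gamma(n/2 + 1); n need not be an integer."""
    return math.pi ** (n / 2) * r ** n / math.gamma(n / 2 + 1)


if __name__ == "__main__":
    for n in range(7):
        print(n, round(ball_volume_recursive(n), 4), round(ball_volume_gamma(n), 4))
```

<p>For integer <em>n</em> the two functions should agree to within floating-point rounding; only the gamma version accepts fractional <em>n</em>.</p>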
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/10/n-ball-volume-graph.png" alt="graph of the volume of a unit ball in n dimensions as a function of n from 0 to 20" border="0" width="450" height="232" /></p>
<p> Finally, the graph shows the volume of a unit <em>n</em>-ball (<em>i.e.</em>, <em>r</em> = 1) for values of <em>n</em> from 0 through 20. When I first saw these data, two things took me by complete surprise. First, I was perplexed to learn that the volume of an <em>n</em>-ball dwindles away to nothing as <em>n</em> gets large. Second, I was even more surprised that the relation is not monotonic but has a peak at finite <em>n</em>. If we consider only integer <em>n</em>, the unit 5-ball has the largest volume. If we allow the spatial dimension to become a continuous variable, the maximum is at approximately <em>n</em> = 5.26. </p>
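<p>The location of the peak is easy to check numerically. The sketch below uses the gamma-function formula with <em>r</em> = 1 and a crude grid search in steps of 0.001; the step size and the search range from 0 to 20 are arbitrary choices of mine, so the continuous answer is only good to about three decimal places.</p>

```python
import math


def unit_ball_volume(n):
    """Volume of the unit n-ball; n may be fractional."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


# Best integer dimension among n = 0, 1, ..., 20.
best_integer_n = max(range(21), key=unit_ball_volume)

# Coarse grid search over real n in [0, 20] with step 0.001 (an arbitrary resolution).
best_real_n = max((k / 1000 for k in range(20001)), key=unit_ball_volume)

if __name__ == "__main__":
    print("largest unit ball, integer n:", best_integer_n)
    print("approximate continuous maximum at n =", best_real_n)
    print("volume at n = 20:", unit_ball_volume(20))
```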
<p>These facts still seem pretty weird to me. And, although they are hardly new discoveries&#8212;the formula cited above goes back to the middle of the 19th century&#8212;they seem not to be widely known. How come nobody ever told me about this stuff?</p>
<p>Well, it turns out somebody <em>did</em> tell me, long ago. A few days after the current issue of <em>American Scientist</em> was sent to the press, I came upon an old Martin Gardner column with the title &#8220;Circles and spheres, and how they kiss and pack.&#8221; Writing about the <em>n</em>-sphere (by which he means the same object I&#8217;m calling the <em>n</em>-ball), Martin remarks:</p>
<blockquote><p>And something very queer happens to its <em>n</em>-volume as <em>n</em> increases. The area of the unit circle is, of course, &pi;. The volume of the unit sphere is 4.1+. The unit 4-sphere&#8217;s hypervolume is 4.9+. In 5-space the volume is still larger, 5.2+, then in 6-space it decreases to 5.1+ and thereafter steadily declines. Indeed, as <em>n</em> approaches infinity the hypervolume of a unit <em>n</em>-sphere approaches zero!</p></blockquote>
<p>These words were published in the May, 1968, issue of <em>Scientific American</em>. I was a faithful reader in those days and surely saw the column, but I retain no shred of memory.</p>
<p>In the same column Martin discusses another mind-boggler that I also mention. Consider this configuration of disks in a square:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/10/disks-in-a-box.png" alt="Disks in a box" border="0" width="300" height="335" /></p>
<p>The maroon disk in the middle, tangent to the four blue disks, has a radius of &radic;2 &ndash; 1. In the analogous three-dimensional arrangement, with eight blue balls, the radius of the central maroon ball is &radic;3 &ndash; 1; in <em>n</em> dimensions it is &radic;<em>n</em> &ndash; 1. Now look at what happens when <em>n</em> = 9: The maroon ball, though still surrounded on all sides by blue balls, has expanded to a radius of 2 and thus reaches the edge of the enclosing cube. I learned of this conundrum from Barry Cipra, who gives an account of it in <em>WHIMS I</em>, the first volume in the AMS series <em>What&#8217;s Happening in the Mathematical Sciences (1991)</em>. Barry wasn&#8217;t able to tell me anything about the provenance of the problem. Martin describes it as &#8220;an unpublished paradox discovered by Leo Moser,&#8221; a Canadian mathematician who died just a few years later. As far as I can tell, Moser never did publish anything about the problem. If anyone knows more about its origin, I&#8217;d be eager to hear about it.</p>
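<p>The arithmetic behind the paradox can be sketched in a few lines of Python. I am assuming the setup implied by the numbers above: blue balls of radius 1 centered at the points (&plusmn;1, &hellip;, &plusmn;1), inside a box that runs from &ndash;2 to 2 along every axis. The constant and function names are mine.</p>

```python
import math

BLUE_RADIUS = 1.0   # assumed radius of each blue ball
HALF_SIDE = 2.0     # assumed box extends from -2 to +2 on every axis


def central_radius(n):
    """Radius of the central ball tangent to all 2^n blue balls in n dimensions."""
    # A blue centre sits at (+-1, ..., +-1), at distance sqrt(n) from the origin;
    # subtracting the blue radius leaves the room available for the central ball.
    return math.sqrt(n) * BLUE_RADIUS - BLUE_RADIUS


if __name__ == "__main__":
    for n in (2, 3, 4, 9, 10):
        r = central_radius(n)
        status = "reaches or passes the wall" if r >= HALF_SIDE else "inside the box"
        print(n, round(r, 4), status)
```

<p>Under these assumptions the central ball touches the wall at <em>n</em> = 9 and pokes through it for every larger <em>n</em>.</p>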
<p>A further note: The new <em>American Scientist</em> had not been out more than a day or two before I began getting letters arguing that the whole tale I&#8217;m telling is nonsensical because balls (or other shapes) that differ in dimension have volumes measured in different units. A unit 3-ball is neither larger nor smaller than a unit 2-ball, because one kind of volume is measured in cubic units and the other in square units. This issue is addressed in my column, though evidently not to the satisfaction of all readers. In retrospect, I think my discussion would have been clearer (and less controversial) if I had stated everything in terms of volume ratios rather than volumes. In other words, the numbers labeling the ordinate of the graph above should be understood as measuring the ratio of the volume of a unit <em>n</em>-ball to the volume of a unit <em>n</em>-cube. All measurements involve some such comparison, but in this case it&#8217;s really helpful to make it explicit.</p>
<p>I do believe that the curve in that graph is trying to tell us something important about geometry in higher-dimensional spaces. I have no clear idea what. Wisdom and insight are always welcome in the comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2011/the-n-ball-game/feed</wfw:commentRss>
		</item>
		<item>
		<title>Divisive diversions</title>
		<link>http://bit-player.org/2011/divisive-diversions</link>
		<comments>http://bit-player.org/2011/divisive-diversions#comments</comments>
		<pubDate>Sun, 04 Sep 2011 14:18:06 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[computing]]></category>

		<category><![CDATA[mathematics]]></category>

		<category><![CDATA[problems and puzzles]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=1022</guid>
		<description><![CDATA[The ever-puzzling Peter Winkler offered three problems in the August Communications of the ACM:


Does every positive integer divide some number of the form 1{0,1}*&#8212;that is, a positive integer whose decimal representation includes no digits other than 0 and 1?


Does every positive integer divide a Fibonacci number?


Is there an odd perfect number (an odd integer equal [...]]]></description>
			<content:encoded><![CDATA[<p>The ever-puzzling Peter Winkler offered <a href="http://mags.acm.org/communications/201108?pg=122#pg122">three problems</a> in the August <em>Communications of the ACM</em>:</p>
<blockquote><ol>
<li>
<p>Does every positive integer divide some number of the form 1{0,1}*&#8212;that is, a positive integer whose decimal representation includes no digits other than 0 and 1?</p>
</li>
<li>
<p>Does every positive integer divide a Fibonacci number?</p>
</li>
<li>
<p>Is there an odd perfect number (an odd integer equal to the sum of its proper divisors)?</p>
</li>
</ol>
</blockquote>
<p><a href="http://mags.acm.org/communications/201109?#pg112">Answers to Problems 1 and 2</a> have now been published in the September <em>CACM</em>, so I won&#8217;t worry too much about spoiling anyone&#8217;s fun with the discussion below. Still, if you&#8217;d like to take a crack at the first two problems, do so before you read on. (As for Problem 3, you needn&#8217;t worry about my giving away the secret.)</p>
<p>Winkler&#8217;s solutions are based on the pigeonhole principle. If you divide successive 1{01}* numbers or successive Fibonacci numbers by any fixed integer <em>n</em>, the remainders necessarily lie between 0 and <em>n</em>&#8211;1. Furthermore, the sequence of remainders repeats cyclically. For example, the Fibonacci numbers modulo 3 are:</p>
<p align=center>0 1 1 2 0 2 2 1 0 1 1 2 0 2 2 1 0 . . .</p>
<p>The trick is to show that the cyclic sequence of remainders always includes zero. For details see Winkler&#8217;s solution page mentioned above. (<em>CACM</em> is behind a paywall, but I <em>think</em> these links will work for nonsubscribers.) A 2007 <a href="http://lanl.arxiv.org/abs/0712.3509v1">paper by Tanya Khovanova</a> also explains what&#8217;s going on, and gives a couple of further enticing problems.</p>
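<p>The remainder sequence is simple to generate without ever computing a large Fibonacci number, since only the last two remainders matter. Here is a small Python sketch; <code>fib_residues</code> and <code>least_index</code> are names I made up for the purpose. The second function relies on the published result that a zero always turns up; otherwise its loop would not be guaranteed to stop.</p>

```python
def fib_residues(n, count):
    """First `count` Fibonacci numbers reduced modulo n, starting from F(0) = 0."""
    a, b = 0, 1 % n
    residues = []
    for _ in range(count):
        residues.append(a)
        a, b = b, (a + b) % n
    return residues


def least_index(n):
    """Smallest m >= 1 such that n divides F(m)."""
    a, b = 1 % n, 1 % n   # F(1) and F(2) modulo n
    m = 1
    while a != 0:
        a, b = b, (a + b) % n
        m += 1
    return m


if __name__ == "__main__":
    print(*fib_residues(3, 17))
    print([least_index(n) for n in range(1, 11)])
```

<p>With modulus 3 the first function reproduces the row of remainders shown above.</p>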
<p>The pigeonhole argument answers the questions as posed, but it tells us very little about the structure of the solutions. <em>Which</em> 1{01}* numbers and <em>which</em> Fibonacci numbers are divisible by various integers <em>n</em>? Are there any interesting patterns in the results? I was curious, so I started computing.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/09/01-stars-1-30.png" alt="least 1{01}* numbers divisible by n from 1 to 30" border="0" width="444" height="293" /></p>
<p>The graph above shows the smallest 1{01}* numbers divisible by each <em>n</em> from 1 through 30. The standout pattern is the series of tall flagpoles for <em>n</em> a multiple of 9. It appears that 1{01}* numbers that include 9 among their divisors are rarities. I didn&#8217;t foresee this pattern, although I should have. It&#8217;s connected with the long-forgotten ritual of &#8220;casting out nines,&#8221; which in turn is based on the following fact: A decimal number is divisible by 9 if and only if the sum of its digits is divisible by 9. (Question: What are the smallest 1{01}* numbers divisible by 99 and by 999? Answers at the end of this article.)</p>
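<p>One simple way to find these numbers is brute force: write the counting numbers in binary, read each string of 0s and 1s as if it were decimal, and stop at the first one divisible by <em>n</em>. The Python sketch below does just that. I make no claim that this is how the graph was produced, and the function name is invented; the method is fine for small <em>n</em> but slows down for the flagpole cases.</p>

```python
from itertools import count


def smallest_01_multiple(n):
    """Smallest positive integer with only the decimal digits 0 and 1 that n divides."""
    for k in count(1):
        # bin(k) looks like '0b1011'; dropping the prefix leaves a string of 0s and 1s
        # that always begins with 1. Reading those strings as decimal numbers visits
        # the 0-1 numbers in increasing order.
        candidate = int(bin(k)[2:])
        if candidate % n == 0:
            return candidate


if __name__ == "__main__":
    for n in range(1, 10):
        print(n, smallest_01_multiple(n))
```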
<p>Below is the analogous graph for Fibonacci numbers divisible by values of <em>n</em> between 1 and 30. </p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/09/fibodivs-1-to-30.png" alt="graph of least m such that n divides F(M) for n from 1 to 30" border="0" width="435" height="280" /></p>
<p>There are some interesting patterns here, too, but we need a bigger sample to see them clearly.</p>
<p>By the way, it&#8217;s important to notice that the ordinate axis of the Fibonacci graph gives the <em>index</em> of each Fibonacci number, not the Fibonacci number itself. I adopt this indexing convention:</p>
<table rules="rows" align="center">
<tr align="right">
<td width="20"><em>m</em></td>
<td width="20">0</td>
<td width="20">1</td>
<td width="20">2</td>
<td width="20">3</td>
<td width="20">4</td>
<td width="20">5</td>
<td width="20">6</td>
<td width="20">7</td>
<td width="20">8</td>
<td width="20">9</td>
<td width="20">&#8230;</td>
</tr>
<tr align="right">
<td width="20"><em>F(m)</em></td>
<td width="20">0</td>
<td width="20">1</td>
<td width="20">1</td>
<td width="20">2</td>
<td width="20">3</td>
<td width="20">5</td>
<td width="20">8</td>
<td width="20">13</td>
<td width="20">21</td>
<td width="20">34</td>
<td width="20">&#8230;</td>
</tr>
</table>
<p>The diagram below offers another way of looking at the mapping from integers <em>n</em> (on the left) to the smallest Fibonacci number <em>F(m)</em> divisible by <em>n</em> (on the right):</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/09/bipartite-n-100.png" alt="bipartite graph of the mapping from n to n | F(m)" border="0" width="449" height="801" /></p>
<p>A few Fibonacci numbers are highly popular&#8212;notably <em>F</em>(12), <em>F</em>(24), <em>F</em>(30), <em>F</em>(60). Indeed, <em>F</em>(24) is the destination of 12 of the first 100 values of <em>n</em>. The reason for this clustering is not a deep mystery; Fibonacci numbers of the form <em>F</em>(6<em>k</em>) tend to be very &#8220;smooth&#8221; numbers, with an abundance of small factors. <em>F</em>(60) is equal to 1,548,008,755,920, a number that has 960 divisors. <em>F</em>(240) has more than 1.3 million divisors. These smooth numbers simply have more chances to be the smallest <em>F(m)</em> divisible by some <em>n</em>.</p>
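<p>The divisor count for <em>F</em>(60) can be checked with a few lines of code. This is only a sketch using naive trial division, which is adequate for <em>F</em>(60) but would be far too slow for something like <em>F</em>(240); the helper names are hypothetical.</p>

```python
def fib(m):
    """F(m) with the convention F(0) = 0, F(1) = 1."""
    a, b = 0, 1
    for _ in range(m):
        a, b = b, a + b
    return a


def factorize(x):
    """Prime factorization by trial division, as {prime: exponent}."""
    factors = {}
    p = 2
    while x >= p * p:
        while x % p == 0:
            factors[p] = factors.get(p, 0) + 1
            x //= p
        p += 1
    if x > 1:  # whatever is left over is a single prime
        factors[x] = factors.get(x, 0) + 1
    return factors


def divisor_count(x):
    """Number of divisors: the product of (exponent + 1) over the primes."""
    count = 1
    for exponent in factorize(x).values():
        count *= exponent + 1
    return count


print(fib(60), factorize(fib(60)), divisor_count(fib(60)))
```

<p>The abundance of divisors comes from having many distinct small primes, each of which doubles the count.</p>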
<p>But the clustering also highlights an asymmetry. The solution of the Winkler problem says that we can draw a line from every <em>n</em> on the left to a specific <em>F(m)</em> on the right, the least Fibonacci number divisible by that <em>n</em>. What about the converse? Is every <em>F(m)</em> the least Fibonacci number divisible by some <em>n</em>? Can we draw a line from every <em>F(m)</em> on the right to some <em>n</em> on the left? The diagram as shown, which covers all <em>n</em> up to 100, has many gaps in the set of nodes on the right; for example, none of the numbers between <em>F</em>(61) and <em>F</em>(67) are among the smallest Fibonacci numbers divisible by an <em>n</em> &le; 100. If we were to extend the computation to larger values of <em>n</em>, would all the gaps in the righthand column eventually be filled in? Let me state that question a little more formally: For every <em>m</em>, is there an <em>n</em> that divides <em>F(m)</em> but does not divide <em>F(k)</em> for any <em>k</em> &lt; <em>m</em>? </p>
<p>This is a trick question. The answer is No because of <em>F</em>(2), which is &#8220;shadowed&#8221; by <em>F</em>(1), since <em>F</em>(1) = <em>F</em>(2) = 1. But <em>F</em>(2) is unique in this respect. If we set aside this one exception, is the statement true for all other <em>F(m)</em>? Now the answer is Yes, but trivially so. Every <em>F(m)</em> is divisible by <em>F(m)</em> itself, which cannot possibly divide any <em>F(k)</em> with <em>k</em> &lt; <em>m</em>. (In the case of Fibonacci numbers that are prime, <em>F(m)</em> and 1 are obviously the <em>only</em> divisors.)</p>
<p>To avoid the trivial solution, we need to ask a more tightly constrained question, which I&#8217;ll phrase as a conjecture:</p>
<blockquote><p>If <em>F(m)</em> has any divisors <em>n</em> with 1 &lt; <em>n</em> &lt; <em>F(m)</em>, then at least one of those <em>n</em> does not divide any <em>F(k)</em> for <em>k</em> &lt; <em>m</em>.</p>
</blockquote>
<p>Is the conjecture true? If so, where do all those new divisors come from? What mechanism guarantees that every nonprime <em>F(m)</em> will introduce at least one divisor never seen before in the sequence of Fibonacci numbers? If the conjecture is false, then there are &#8220;unselfish&#8221; Fibonacci numbers that share all their proper divisors with their smaller siblings, keeping none for their own exclusive use. What is the smallest such unselfish <em>F(m)</em>? (For what it&#8217;s worth, a computational search shows that any counterexample to the conjecture must lie beyond <em>F</em>(382).)</p>
<p>I&#8217;m going to leave this question as a challenge. I&#8217;ll give an answer in an update&#8212;in the unlikely event that no one posts a complete solution in the comments section in the next 10 minutes. One further note: Unselfish numbers <em>do</em> exist among the 1{01}* numbers; the smallest example is 1111, whose only proper divisors are 11 and 101, both of which obviously divide lesser 1{01}* numbers. Thus if you want to assert that Fibonacci numbers behave differently in this respect, you might want to think about what distinguishes the two sequences.</p>
<p>Finally, more about patterns of divisibility in Fibonacci numbers. The dots in the figure below show the least <em>m</em> such that <em>n</em> divides <em>F(m)</em> for all <em>n</em> up to 1,000:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/09/fibodivs1000-dots.png" alt="patterns of Fibonacci divisibility for n up to 1,000" border="0" width="426" height="600" /></p>
<p>It&#8217;s interesting that the dots tend to line up along certain rays, namely those whose slope is a ratio of small integers. The slopes <em>m/n</em> = 1 and 1/2 are the most clearly delineated, but there are also aggregations detectable by eye at <em>m/n</em> = 1/4, 1/3, 2/3, 3/4 and 3/2. The ray at <em>m/n</em> = 2 has only four data points on it in the range up to <em>n</em> = 1,000, but it is significant for another reason: It marks the absolute boundary of the <em>m/n</em> ratio. In other words, not only is it true that every <em>n</em> divides some <em>F(m)</em>, but furthermore the <em>m</em> in question is never greater than 2<em>n</em>.</p>
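<p>For anyone who wants to reproduce the dots, here is a minimal sketch of the computation, written for clarity rather than speed. The name <em>entry_point</em> is my own label for the least <em>m</em> such that <em>n</em> divides <em>F(m)</em>; the code works with remainders modulo <em>n</em> so that the numbers never grow large.</p>

```python
def entry_point(n):
    """Least m >= 1 such that n divides F(m), with F(0) = 0, F(1) = 1."""
    a, b = 1 % n, 1 % n  # F(1) and F(2), reduced modulo n
    m = 1
    while a != 0:
        a, b = b, (a + b) % n
        m += 1
    return m


points = [(n, entry_point(n)) for n in range(1, 1001)]

# No dot lies above the ray m = 2n ...
assert all(2 * n >= m for n, m in points)

# ... and these are the values of n whose dots lie exactly on it.
on_boundary = [n for n, m in points if m == 2 * n]
print(on_boundary)
```

<p>The loop is guaranteed to stop because, as argued above, zero always turns up in the cycle of remainders.</p>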
<p>There is no sign of such radial streaks or other distinctive patterns in the equivalent graph for the 1{01}* numbers:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/09/binstars-1000-dots.png" alt="patterns of 1{01}* divisibility for n up to 1000" border="0" width="448" height="626" /></p>
<p>As with the Fibonacci graphs, the vertical axis here represents not the numerical magnitude of a 1{01}* number but its index within the sequence. For example, the smallest 1{01}* number divisible by 18 is 1,111,111,110, which is the 1,022nd number in the 1{01}* sequence. (It&#8217;s more than coincidence that 1111111110<sub>2</sub> = 1022<sub>10</sub>.) Hence there&#8217;s a dot at <em>n</em> = 18 and vertical coordinate 1,022. I have colored orange all the dots associated with <em>n</em> that are multiples of 9. The dots along the upper margin of the graph are off-scale and actually belong at much higher elevations. For example, the dot for <em>n</em> = 99 should be at height 262,143 (the index of 111,111,111,111,111,111). The dot at <em>n</em> = 999 belongs at height 134,217,727 (the index of 111,111,111,111,111,111,111,111,111).</p>
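<p>The correspondence between index and binary numeral suggests a very simple search procedure. What follows is a brute-force sketch, with a function name of my own choosing: count upward through the indices, write each one in binary, read the result as a decimal number, and stop at the first multiple of <em>n</em>. It handles <em>n</em> = 99 in a moment, but the 134 million steps needed for <em>n</em> = 999 call for patience or a smarter algorithm.</p>

```python
def least_binstar(n):
    """Return (index, value) of the smallest 1{01}* number divisible by n.

    The k-th 1{01}* number is the binary numeral for k read as decimal.
    """
    k = 1
    while True:
        value = int(bin(k)[2:])  # e.g. k = 5 gives '101', read as 101
        if value % n == 0:
            return k, value
        k += 1


print(least_binstar(18))
print(least_binstar(99))
```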
<p><strong>Update 2011-10-09</strong>: More than a month ago I promised an answer to the question posed above: Is there a composite Fibonacci number for which all the proper divisors are also divisors of smaller Fibonacci numbers? When I asked the question, I had an answer for it. But a week later when I sat down to write up the proof, it fell apart like wet tissue. Since then the problem has been constantly with me. I&#8217;ve been waking up with it in the morning; I go to bed with it at night; it comes back to visit at idle moments during the day. Several times I&#8217;ve thought I had found a solution, but then the argument fell to pieces again. If some kindly reader had posted a full proof in the comments, I could have responded, &#8220;Yes, yes, exactly so. That&#8217;s just what I had in mind.&#8221; But only one reader came to my rescue (thanks, unekdoud!); although that suggestion seemed to be heading in the right direction, I wasn&#8217;t able to fill in all the details.</p>
<p>Now I have yet another proof. It&#8217;s the middle of the night as I write this, but I&#8217;m going to stay up and get this posted before the idea disintegrates again.</p>
<p>Here, copied from above, is a more precise statement of the problem, phrased as a conjecture:</p>
<blockquote><p>If <em>F(m)</em> has any divisors <em>n</em> with 1 &lt; <em>n</em> &lt; <em>F(m)</em>, then at least one of those <em>n</em> does not divide any <em>F(k)</em> for <em>k</em> &lt; <em>m</em>.</p>
</blockquote>
<p>I claim the conjecture is true, based on this assertion: If every divisor of <em>F(m)</em> also divides some smaller <em>F(k)</em>, then the <em>F(k)</em> in question cannot be greater than <em>F(m/</em>2<em>)</em>, while at least one of the divisors must be no smaller than &radic;<em>F(m)</em>. But this is impossible, because &radic;<em>F(m)</em> &gt; <em>F(m/</em>2<em>)</em> for all <em>m</em> &gt; 2. (For odd <em>m</em>, take the ceiling of <em>m</em>/2.)</p>
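<p>The inequality at the end of that paragraph can at least be spot-checked numerically. The sketch below (my own helper names) avoids floating-point square roots by comparing <em>F(m)</em> with the square of <em>F</em> at the ceiling of <em>m</em>/2. A finite check is evidence, not proof.</p>

```python
def fib(m):
    """F(m) with the convention F(0) = 0, F(1) = 1."""
    a, b = 0, 1
    for _ in range(m):
        a, b = b, a + b
    return a


def sqrt_bound_holds(m):
    """True if sqrt(F(m)) > F(ceil(m/2)), tested in exact integer arithmetic."""
    half = (m + 1) // 2  # ceiling of m/2
    return fib(m) > fib(half) ** 2


print(all(sqrt_bound_holds(m) for m in range(3, 501)))
print(sqrt_bound_holds(2))  # the bound fails here, since F(2) = F(1) = 1
```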
<p>The key to the proof is again the cyclic pattern of remainders observed when the members of the Fibonacci sequence are taken modulo an integer. In particular, if an integer <em>a</em> divides <em>F(m)</em>, then <em>a</em> also divides <em>F(</em>2<em>m)</em>, <em>F(</em>3<em>m)</em>, <em>F(</em>4<em>m)</em>, . . .   &nbsp;&nbsp;For example, <em>F</em>(8) = 21 is divisible by 3 and 7, and these numbers are also divisors of <em>F</em>(16) = 987 and <em>F</em>(24) = 46,368. </p>
<p>An immediate consequence of this cyclic structure is the remarkable fact that <em>F(k)</em> divides <em>F(m)</em> if and only if <em>k</em> divides <em>m</em> (setting aside <em>k</em> = 2, since <em>F</em>(2) = 1 divides everything). Again let me cite an example: <em>F</em>(15) = 610 is divisible by <em>F</em>(3) = 2 and by <em>F</em>(5) = 5 and by no other Fibonacci numbers apart from 1 and 610 itself. (The prime factors of 610 are 2, 5 and 61. Note that the factor 61 divides no smaller Fibonacci number, and so <em>F</em>(15) is a confirming instance of the conjecture.)</p>
<p>A further observation is that if <em>m</em> is prime, then <em>F(m)</em> cannot be divisible by any smaller Fibonacci number greater than 1. </p>
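<p>Here is a small sketch that tests the divisibility rule over a finite range; the function names are hypothetical. It starts at <em>k</em> = 3 because <em>F</em>(1) = <em>F</em>(2) = 1 divides every number, which would make <em>k</em> = 2 a spurious exception.</p>

```python
def fib_table(limit):
    """List of F(0) .. F(limit)."""
    fibs = [0, 1]
    while limit >= len(fibs):
        fibs.append(fibs[-1] + fibs[-2])
    return fibs


def rule_holds(limit):
    """Check that F(k) divides F(m) exactly when k divides m, for 3 <= k <= m <= limit."""
    fibs = fib_table(limit)
    return all(
        (fibs[m] % fibs[k] == 0) == (m % k == 0)
        for m in range(3, limit + 1)
        for k in range(3, m + 1)
    )


def fib_divisor_indices(m):
    """Indices k with 3 <= k and k smaller than m such that F(k) divides F(m)."""
    fibs = fib_table(m)
    return [k for k in range(3, m) if fibs[m] % fibs[k] == 0]


print(rule_holds(80))
print(fib_divisor_indices(15))
```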
<p>Let&#8217;s look at the divisors of <em>F(m)</em> for composite values of <em>m</em>. We know that such divisors exist. They include every <em>F(k)</em> for which <em>k</em> divides <em>m</em>, as well as all the proper divisors of each such <em>F(k)</em>. Suppose, contrary to the conjecture, that these known divisors comprise <em>all</em> the divisors of <em>F(m)</em>. Then for each integer <em>a</em> that divides <em>F(m)</em> we can ask which <em>F(k)</em> it also divides. Actually, a given <em>a</em> might divide <em>many</em> <em>F(k)</em>, but consider just the smallest member of this set. If <em>a</em> divides this minimal <em>F(k)</em>, then it also divides <em>F(</em>2<em>k)</em>, <em>F(</em>3<em>k)</em>, and so on, but it divides no other Fibonacci numbers. Thus <em>F(m)</em> must be a member of this series, or in other words <em>m</em> must be a multiple of <em>k</em>. The smallest such multiple is <em>m</em> = 2<em>k</em>. This gives us half of the proof: If a divisor of <em>F(m)</em> also divides <em>F(k)</em>, <em>k</em> can be no larger than <em>m</em>/2.</p>
<p>The second half is easier. Divisors come in pairs; they are integers <em>a</em> and <em>b</em> such that <em>ab</em> = <em>F(m)</em>. Furthermore, if <em>a</em> &le; &radic;<em>F(m)</em>, then <em>b</em> &ge; &radic;<em>F(m)</em>. Thus we conclude that <em>b</em> cannot be less than the square root of <em>F(m)</em> or more than <em>F(m/</em>2<em>)</em>&#8212;a contradiction.</p>
<p>The same reasoning applies with even greater force in the case of prime <em>m</em>. If we imagine that a divisor of <em>F(m)</em> is also a divisor of some smaller <em>F(k)</em>, we are driven to the conclusion that <em>k</em> divides <em>m</em>, which can&#8217;t be so for prime <em>m</em>.</p>
<p>In the course of working all this out in my bumbling-stumbling way, while making lots of lists of Fibonacci numbers and their divisors, the patterns I was seeing suggested a slightly stronger conjecture: </p>
<blockquote><p>Every Fibonacci number that is not a perfect power has at least one prime factor that appears in no smaller Fibonacci number.</p>
</blockquote>
<p>The perfect-power exception excludes exactly five Fibonacci numbers: <em>F</em>(0) = 0, <em>F</em>(1) = 1, <em>F</em>(2) = 1, <em>F</em>(6) = 8 = 2<sup>3</sup> and <em>F</em>(12) = 144 = 12<sup>2</sup>. No other Fibonacci numbers are perfect powers; I find it interesting that <a href="http://www-irma.u-strasbg.fr/~bugeaud/travaux/fibo.pdf">this fact was proved</a> only in the past few years, and only with the use of industrial-grade mathematical machinery. On the other hand, &#8220;my&#8221; conjecture <a href="http://www.jstor.org/stable/1967797">was proved</a> almost a century ago by the American mathematician R. D. Carmichael. If I had known that fact a few weeks ago, I would have slept better. But maybe learned less.</p>
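<p>For readers who would rather see the pattern than take it on authority, here is a minimal sketch of the kind of search I was doing (the function names are made up for this illustration). For each <em>m</em> it strips from <em>F(m)</em> every factor shared with an earlier Fibonacci number; if anything greater than 1 survives, then <em>F(m)</em> has a prime factor that appears in no smaller Fibonacci number.</p>

```python
from math import gcd


def fib_table(limit):
    """List of F(0) .. F(limit)."""
    fibs = [0, 1]
    while limit >= len(fibs):
        fibs.append(fibs[-1] + fibs[-2])
    return fibs


def has_new_prime(m, fibs):
    """True if F(m) has a prime factor dividing no F(k) with 0 < k < m."""
    rest = fibs[m]
    for k in range(1, m):
        g = gcd(rest, fibs[k])
        while g > 1:  # remove every prime that F(k) already owns
            rest //= g
            g = gcd(rest, fibs[k])
    return rest > 1


LIMIT = 150
fibs = fib_table(LIMIT)
exceptions = [m for m in range(1, LIMIT + 1) if not has_new_prime(m, fibs)]
print(exceptions)
```

<p>Within this range the only indices that come up empty-handed are the ones on the perfect-power list (apart from <em>F</em>(0), which the search skips).</p>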
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2011/divisive-diversions/feed</wfw:commentRss>
		</item>
		<item>
		<title>Driving the dreamboat</title>
		<link>http://bit-player.org/2011/driving-the-dreamboat</link>
		<comments>http://bit-player.org/2011/driving-the-dreamboat#comments</comments>
		<pubDate>Wed, 17 Aug 2011 23:57:22 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[computing]]></category>

		<category><![CDATA[modern life]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=1013</guid>
		<description><![CDATA[
Slide behind the wheel of this dreamboat. Push the electronic control button. Then sit back and let transistors take over.
There&#8217;s something curiously tentative about this vision of the future of motoring, as seen from 1964. You&#8217;re invited to push the button and let the transistors take over. But you&#8217;ve still got your hands on the [...]]]></description>
			<content:encoded><![CDATA[<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/08/rca-electronic-car-of-tomorrow-ad.jpg" alt="RCA electronic car of tomorrow ad" border="0" width="450" height="395" /></p>
<blockquote><p>Slide behind the wheel of this dreamboat. Push the electronic control button. Then sit back and let transistors take over.</p></blockquote>
<p>There&#8217;s something curiously tentative about this vision of the future of motoring, as seen from 1964. You&#8217;re invited to push the button and let the transistors take over. But you&#8217;ve still got your hands on the wheel; apparently you&#8217;re still responsible for driving the dreamboat.</p>
<p>Other early discussions of automatic automobiles are also fuzzy about exactly who or what is in charge. A notable example is the General Motors <a href="http://www.archive.org/details/ToNewHor1940">Futurama exhibit</a> at the 1939 World&#8217;s Fair in New York. &#8220;Safe distance between cars is maintained by automatic radio control,&#8221; intones the narrator, above creepy organ music. This certainly suggests something other than seat-of-the-pants driving. But the next sentence narrows the scope of that automatic control: &#8220;Curved sides assist the driver in keeping his car within the proper lane under all circumstances.&#8221; Thus the technology is merely assistive, not autonomous. And what&#8217;s that about &#8220;curved sides&#8221;? Norman Bel Geddes, the designer of the exhibit, explains all in <em><a href="http://openlibrary.org/books/OL7205622M/Magic_motorways.">Magic Motorways</a></em>, published in 1940. It&#8217;s very low-tech. Freeway lanes are to be separated by high curbs of concave cross-section, which deflect a straying car back into its lane. (Later in the book Bel Geddes also discusses more elaborate guidance systems, involving buried conductors.)</p>
<p>The reprise of Futurama at the 1964 World&#8217;s Fair&#8212;an exhibit that I attended, along with 29 million other people&#8212;was even vaguer about the question of autonomous vehicles. We saw lots of miniature automobiles moving in close order along gleaming freeways, and personally I came away with the impression that all those vehicles were under computer control. But the <a href="http://www.phrenicea.com/futurama_chip.htm">transcript</a> of the narration includes only a single sentence on the topic, and it&#8217;s open to almost any interpretation: &#8220;Vehicles electronically paced, travel routes remarkably safe, swift and efficient.&#8221;</p>
<p>Why so coy about the prospect of cars that would drive themselves without human intervention? Maybe the concept was just too outlandish for credibility, particularly in 1939. Or maybe GM recognized that their natural audience is made up of car enthusiasts, who want to <em>drive</em> their dreamboats, not just be carried along as electronically paced, radio-controlled passengers.</p>
<p>In any case, the coyness has now evaporated, and these days everybody is talking about truly autonomous vehicles. DARPA runs contests for them; an Italian group has driven them across Europe and Asia; Google has a &#8220;<a href="http://techcrunch.com/2010/10/09/google-automated-cars/">secret</a>&#8221; fleet of them. And I too am talking about autonomous vehicles: &#8220;<a href="http://www.americanscientist.org/issues/pub/2011/6/leave-the-driving-to-it">Leave the Driving to It</a>&#8221; is my latest <em>American Scientist</em> column.</p>
<p>Note: The artwork above is from an RCA advertisement in the September 1964 issue of <em>Scientific American</em>. Stylistically, the painting owes something to the Futurama exhibits, but I&#8217;d like to make a wild guess that the (uncredited) artist who created this rendering lived in Minneapolis. That brightly lighted, colonnaded building to the right of center looks to me very much like a building at Hennepin and Washington (now owned by ING) that was completed in 1964, just as this ad appeared. The architect was Minoru Yamasaki, the designer of the World Trade Center.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2011/driving-the-dreamboat/feed</wfw:commentRss>
		</item>
		<item>
		<title>Probabilities of probabilities</title>
		<link>http://bit-player.org/2011/probabilities-of-probabilities</link>
		<comments>http://bit-player.org/2011/probabilities-of-probabilities#comments</comments>
		<pubDate>Tue, 16 Aug 2011 19:42:05 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[mathematics]]></category>

		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=1006</guid>
		<description><![CDATA[
In simple games, one can calculate the exact probability of every outcome, and so the expected winnings can also be determined exactly&#8230;.
When I wrote the words above in a recent American Scientist column, I didn&#8217;t think I was saying anything controversial. So I was taken by surprise when a reader objected to the very notion [...]]]></description>
			<content:encoded><![CDATA[<blockquote><p>In simple games, one can calculate the exact probability of every outcome, and so the expected winnings can also be determined exactly&#8230;.</p></blockquote>
<p>When I wrote the words above in a <a href="http://www.americanscientist.org/issues/pub/2011/4/quasirandom-ramblings">recent <em>American Scientist</em> column</a>, I didn&#8217;t think I was saying anything controversial. So I was taken by surprise when a reader objected to the very notion of <em>exact probability</em>. My correspondent argued that probabilities can never be determined exactly, only measured to within some error bound. He also ruled out the use of limits in defining probabilities, allowing only finite processes. After a brief correspondence with my critic, I set the matter aside; but it keeps coming back to haunt me in idle moments, and I think I should try to clarify my thoughts.</p>
<p>The &#8220;simple games&#8221; I had in mind were those based on coin-flipping or dice-rolling or card-dealing. For example, I would say that the exact probability of getting heads when you flip a fair coin is 1/2. (Indeed, that&#8217;s how I would <em>define</em> a fair coin.) Likewise the exact probability of rolling a 12 with a pair of unbiased dice is 1/36, and the exact probability of dealing four aces from a properly shuffled deck is 1/270,725. The arithmetic behind these numbers is straightforward. You&#8217;ll find a multitude of similar examples in the exercises of any introductory textbook on probability. The trouble is, if you ask me to show you a fair coin or an unbiased die or a properly shuffled deck, I can&#8217;t do it. I certainly can&#8217;t prove that any specific coin or die or deck has the properties claimed. So my critic has a point: Those <em>exact probabilities</em> I was talking about come from a suppositional world of ideal randomizing devices that don&#8217;t exist&#8212;or can&#8217;t be shown to exist&#8212;in the physical universe.</p>
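<p>The arithmetic can be done in exact rational numbers. This is just a sketch of the textbook calculation under the idealized assumptions named above (fair coin, unbiased dice, and a four-card hand dealt from a well-shuffled 52-card deck); the variable names are mine.</p>

```python
from fractions import Fraction
from math import comb

# One fair coin: two equally likely outcomes.
p_heads = Fraction(1, 2)

# Two unbiased dice: a 12 requires a six on each die.
p_twelve = Fraction(1, 6) ** 2

# Four cards from 52: exactly one of the C(52, 4) possible hands is all aces.
p_four_aces = Fraction(1, comb(52, 4))

print(p_heads, p_twelve, p_four_aces)
```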
<p>Of course mathematics is full of objects that we can&#8217;t construct out of Lego bricks and duct tape&#8212;dimensionless points, the real number line, Hilbert space. Just as I can&#8217;t exhibit a fair coin, I can&#8217;t show you an equilateral triangle&#8212;or rather I can&#8217;t draw you a triangle and then prove that the three sides of that particular triangle are equal. The existence of perfect circles and right angles and other such Platonic apparatus is something we just have to accept if we want to do a certain kind of mathematics. I&#8217;m happy to view the fair coin in this light&#8212;as a hypothetical device that comes out of the same cabinet where I keep my Turing machines and my Cantor sets. The question is whether we can (or should) take the more radical step of banishing such idealized paraphernalia altogether.</p>
<p>Suppose we insist that the only way of determining a probability is to measure it by experiment. Take a coin out of your pocket and flip it 100 times; if it comes up heads 55 times, then <em>p(H)</em> = 0.55. But when you repeat the experiment, you might get 51 heads, suggesting <em>p(H)</em> = 0.51. Or maybe you would now conclude that <em>p(H)</em> = 0.53. Or you might adopt some other method of inferring a probability from the experimental evidence, something more sophisticated than just taking the mean of a sample. There&#8217;s an elaborate and highly developed technology for just this purpose. It&#8217;s called statistics, and it works wonders. Still, to the extent that the assigned probability depends on the outcome of a finite number of trials (and remember: taking limits is out of bounds), we&#8217;re never going to settle on a single, definite and immutable value for the probability.</p>
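<p>A simulated version of the experiment shows the same wobble. This is a rough sketch with hypothetical names, and it begs the question in an obvious way: the pseudorandom generator stands in for exactly the kind of ideal fair coin whose existence is in dispute. The seed is arbitrary and serves only to make the run repeatable.</p>

```python
import random


def estimate_p_heads(flips, rng):
    """Flip a simulated coin `flips` times and return the observed fraction of heads."""
    heads = sum(1 for _ in range(flips) if 0.5 > rng.random())
    return heads / flips


rng = random.Random(2011)  # arbitrary seed, for repeatability only
estimates = [estimate_p_heads(100, rng) for _ in range(5)]
print(estimates)
```

<p>Each run of 100 flips yields its own value of <em>p(H)</em>, and nothing in the data says which one to prefer.</p>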
<p>Alan H&aacute;jek, in an <a href="http://plato.stanford.edu/entries/probability-interpret/">article</a> in the <em>Stanford Encyclopedia of Philosophy</em>, calls this approach to probability &#8220;finite frequentism.&#8221; </p>
<blockquote><p>Where the classical interpretation [i.e., the probability theory of Laplace, Pascal, <em>et al</em>.] counted all the possible outcomes of a given experiment, finite frequentism counts actual outcomes. It is thus congenial to those with empiricist scruples. It was developed by Venn (1876), who in his discussion of the proportion of births of males and females, concludes: &#8220;probability <em>is</em> nothing but that proportion.&#8221;</p></blockquote>
<p>H&aacute;jek calls attention to several worrisome aspects of this doctrine. Most of the problems take us right back to where this discussion began, namely to the fact that we can never learn the exact probability of anything. And this ignorance leads to awkward consequences. In the standard calculus of probabilities, we know that if an event occurs with probability <em>p</em>, then two independent occurrences of the same event have probability <em>p</em><sup>2</sup>. How do we apply this rule in an environment where <em>p</em> changes every time we measure it? I suppose a stalwart empiricist might reply that if you want to know <em>p(HH)</em>, then that&#8217;s what you should be measuring. The empiricist might also point out that the loss of the calculus is not a defect of the theory; it&#8217;s just the human condition. We really are ignorant of exact probabilities, and we&#8217;ll only get into trouble if we pretend otherwise.</p>
<p>We could adopt a new calculus based on probabilities of probabilities, in which <em>p(H)</em> is not a number but a distribution&#8212;maybe a normal distribution determined by the mean and variance of the experimental data. Then <em>p</em><sup>2</sup> becomes the product of two such distributions. But beware: The shape of that normal curve we&#8217;ve just smuggled into our reasoning is defined by a process that involves the moral equivalent of flipping a fair coin, not to mention taking limits. Maybe we should instead use a discrete, experimental approximation to the normal distribution, created with a physical device such as a <a href="http://en.wikipedia.org/wiki/Galton_board">Galton board</a>.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/08/hexstat-galton-board-sciam-1964.jpg" alt="the Hexstat in action" border="0" width="450" height="523" /></p>
<p class="centered">&#8226;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8226;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&#8226;</p>
<p>Reading on in H&aacute;jek&#8217;s long encyclopedia article, I find that the finite frequentists are not the only school of thought subject to withering criticism; H&aacute;jek finds deep flaws in <em>every</em> interpretation of probability theory, without exception. (That&#8217;s a philosopher&#8217;s job, I guess.)</p>
<p>Joseph Doob, <a href="http://www.jstor.org/stable/2974673">writing</a> 15 years ago in the <em>American Mathematical Monthly</em>, took the position that we have a perfectly sound <em>mathematical</em> theory of probability (formulated mainly by Kolmogorov in the 1930s, and founded on measure theory); the only problem is that it doesn&#8217;t connect very well with the world of everyday experience. I would like to quote at length what Doob has to say about the law of large numbers:</p>
<blockquote><p>In a repetitive scheme of independent trials, such as coin tossing, what strikes one at once is what has been christened the <em>law of large numbers</em>. In the simple context of coin tossing it states that in some sense the number of heads in <em>n</em> tosses divided by <em>n</em> has limit 1/2 as the number of tosses increases. The key words here are <em>in some sense</em>. If the law of large numbers is a mathematical theorem, that is, if there is a mathematical model for coin tossing, in which the law of large numbers is formulated as a mathematical theorem, either the theorem is true in one of the various mathematical limit concepts or it is not. On the other hand, if the law of large numbers is to be stated in a real world nonmathematical context, it is not at all clear that the limit concept can be formulated in a reasonable way. The most obvious difficulty is that in the real world only finitely many experiments can be performed in finite time. Anyone who tries to explain to students what happens when a coin is tossed mumbles words like <em>in the long run</em>, <em>tends</em>, <em>seems to cluster near</em>, and so on, in a desperate attempt to give form to a cloudy concept. Yet the fact is that anyone tossing a coin observes that for a modest number of coin tosses the number of heads in <em>n</em> tosses divided by <em>n</em> seems to be getting closer to 1/2 as <em>n</em> increases. The simplest solution, adopted by a prominent Bayesian statistician, is the vacuous one: never discuss what happens when a coin is tossed. A more common equally satisfactory solution is to leave fuzzy the question of whether the context under discussion is or is not mathematics. Perhaps the fact that the assertion is called a law is an example of this fuzziness.</p></blockquote>
<p>Note that the coins Doob is tossing seem to be drawn from the Platonic closet of ideal hardware.</p>
<p>None of my reading and pondering has made me a convert to finite frequentism, but at the same time I am grateful for this challenge to my easygoing confidence that I know what probabilities are and how to calculate with them. I surely know nothing of the sort. And on balance I think it might be a good idea if introductory accounts of probability theory put a little more emphasis on calculating from real-world data, with less reliance on fair coins and unbiased dice. In this connection I can recommend the probability section of Joseph Mazur&#8217;s diverting book <em><a href="http://members.authorsguild.net/mazur/work3.htm">Euclid in the Rainforest</a></em>.</p>
<p>Four miscellaneous related notes, in lieu of a conclusion:</p>
<p>(1) One might guess that any discrepancies between real coins and ideal ones would amount to only minor biases (unless you&#8217;re <a href="http://comptop.stanford.edu/u/preprints/heads.pdf">wagering with Persi Diaconis</a>). Perhaps so, but consider what happened a century ago when several eminent statisticians tried <a href="http://bit-player.org/bph-publications/AmSci-2001-07-Hayes-randomness.pdf">large-scale experiments</a> in generating random numbers with dice, playing cards and numbered slips of paper drawn from a bowl or a bag. Not one of those efforts produced results that passed statistical tests of randomness (including the predictions of the law of large numbers). As late as 1955, even the big Rand Corporation table of a million random digits (generated by a custom-made electronic device) had to be fudged a little after the fact. </p>
<p>(2) H&aacute;jek&#8217;s indictment of finite frequentism includes this charge: The scheme &#8220;rules out irrational probabilities; yet our best physical theories say otherwise.&#8221; Elsewhere in the essay he elaborates on this point, mentioning that quantum mechanics posits &#8220;irrational probabilities such as 1/&radic;2.&#8221; At first I thought this a very acute criticism, but now I&#8217;m not so sure. In the quantum contexts most familiar to me, 1/&radic;2 is a commonly encountered <em>amplitude</em>; the corresponding <em>probability</em> is |1/&radic;2|<sup>2</sup> = 1/2. Is it true that quantum mechanics necessarily requires irrational probabilities?</p>
<p>(3) Mark Kac, writing on probability in <em>Scientific American</em> (September 1964, p. 96) asks why probability theory was such a late-blooming flower among the branches of mathematics. It was neglected through most of the 18th and 19th centuries. Kac offers this explanation:</p>
<blockquote><p>Why this apathy toward the subject among professional mathematicians? There were various reasons. The main one was the feeling that the entire theory seemed to be built on loose and nonrigorous foundations. Laplace&#8217;s definition of probability, for instance, is based on the assumption that all the possible outcomes in question are equally likely; since this notion itself is a statement of probability, the definition appears to be a circular one.</p></blockquote>
<p>I&#8217;m skeptical of this hypothesis. It&#8217;s surely true that probability had shaky foundations, but so did other areas of mathematics. In particular, analysis was a ramshackle mess from the time of Newton and Leibniz until 1951, when <a href="http://www.haverford.edu/physics/songs/lehrer/delta.htm">Tom Lehrer</a> finally gave the world the epsilon-delta notation. Yet analysis was the height of fashion all through that period.</p>
<p>(4) If you adhere <em>strictly</em> to the finite-frequentist doctrine that a probability does not exist until you measure it, can you ever be surprised by anything?</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2011/probabilities-of-probabilities/feed</wfw:commentRss>
		</item>
		<item>
		<title>A slight discrepancy</title>
		<link>http://bit-player.org/2011/a-slight-discrepancy</link>
		<comments>http://bit-player.org/2011/a-slight-discrepancy#comments</comments>
		<pubDate>Thu, 23 Jun 2011 13:24:54 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[computing]]></category>

		<category><![CDATA[mathematics]]></category>

		<category><![CDATA[physics]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=999</guid>
		<description><![CDATA[
The image above shows the mesh top of a patio table, photographed after a soaking rain. Some of the openings in the mesh retain drops of water. What can we say about the distribution of those drops? Are they sprinkled randomly over the surface? The rainfall process that deposited them certainly seems random enough, but [...]]]></description>
			<content:encoded><![CDATA[<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/06/table-with-raindrops-0761-px450.jpg" alt="patio table with a pattern of embedded raindrops" border="0" width="450" height="367" /></p>
<p>The image above shows the mesh top of a patio table, photographed after a soaking rain. Some of the openings in the mesh retain drops of water. What can we say about the distribution of those drops? Are they sprinkled randomly over the surface? The rainfall process that deposited them certainly seems random enough, but to my eye the pattern of occupied sites in the mesh looks suspiciously even and uniform.</p>
<p>For ease of analysis I have isolated a square piece of the tabletop image (avoiding the central umbrella hole), and extracted the coordinates of all the drops within the square. There are 394 drops, which I plot below as blue dots:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/06/raindrop-394-dots.png" alt="positions of 394 raindrops on a tabletop" border="0" width="450" height="449" /></p>
<p>Again: Does this pattern look like the outcome of a random process? </p>
<p>I come to this question in the aftermath of writing an <a href="http://www.americanscientist.org/issues/pub/2011/4/quasirandom-ramblings"><em>American Scientist</em> column</a> that explores two varieties of simulated randomness: <em>pseudo</em> and <em>quasi</em>. Pseudorandomness needs no introduction here. A pseudorandom algorithm for selecting points in a square tries to ensure that all points have the same probability of being chosen and that all the choices are independent of one another. Here&#8217;s an array of 394 pseudorandom dots constrained to lie on a skewed and rotated lattice somewhat like that of the mesh tabletop:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/06/random-60grid-skew-dots-xy1.png" alt="394 pseudorandom dots on a skewed 60-by-60 grid" border="0" width="450" height="450" /></p>
<p>Quasirandomness is a less familiar concept. In selecting quasirandom points the aim is not equiprobability or independence but rather equidistribution: spraying the points as uniformly as possible across the area of the square. Just how to achieve this aim is not obvious. For the 394 quasirandom dots shown below, the <em>x</em> coordinates form a simple arithmetic progression, but the <em>y</em> coordinates are permuted by a digit-twiddling process. (The underlying algorithm was invented in the 1930s by the Dutch mathematician J. G. van der Corput, who worked with one-dimensional sequences, and extended to two dimensions in the 1950s by K. F. Roth. For more details, see my <a href="http://www.americanscientist.org/issues/pub/2011/4/quasirandom-ramblings"><em>American Scientist</em> column</a>, page 286, or the splendid <a href="http://bookshelf.theopensourcelibrary.org/2010_CharlesUniversity_GeometricDiscrepancy.pdf">book by Jiri Matousek</a>.)</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/06/vandercorput-394-dots.png" alt="394 dots scattered over a square by the Vandercorput algorithm" border="0" width="450" height="450" /></p>
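<p>The digit-twiddling step can be sketched in a few lines of Python. This is a minimal illustration, not the program that drew the figure: it assumes base 2, takes <em>x</em> = <em>i</em>/<em>N</em> for the arithmetic progression, and uses the hypothetical names <code>van_der_corput</code> and <code>quasirandom_points</code>. The <em>y</em> coordinate is the &#8220;radical inverse&#8221; of the index: write <em>i</em> in binary and mirror the digits about the binary point.</p>

```python
def van_der_corput(i, base=2):
    """Radical inverse of the non-negative integer i in the given base.

    The digits of i are reflected about the radix point, so in base 2
    the index 6 (binary 110) becomes 0.011 in binary, which is 0.375.
    """
    value, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        value += digit / denom
    return value


def quasirandom_points(n, base=2):
    """n points in the unit square: x steps evenly, y is the radical inverse.

    Assumption: x = i/n and base 2; the original figure may have used
    a different offset or scaling.
    """
    return [(i / n, van_der_corput(i, base)) for i in range(n)]


points = quasirandom_points(8)
```

<p>Successive <em>y</em> values jump back and forth across the interval (0, 1/2, 1/4, 3/4, 1/8, &#8230;), which is what keeps any horizontal band from filling up much faster than its neighbors.</p>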
<p>Which of these point sets, the <em>pseudo</em> or the <em>quasi</em>, is a better match for the raindrops? Here are the three patterns in miniature, placed side-by-side as an aid to comparison:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/06/pseudo-quasi-raindrop-trio.png" alt="pseudorandom, quasirandom and raindrop patterns" border="0" width="450" height="165" /></p>
<p>Each of the panels has a distinctive texture. The pseudorandom pattern has both tight clusters and large voids. The quasirandom dots are more evenly spaced (though there are several close pairs of points), but they also form distinctive, small-scale repetitive motifs, most notably a hexagonal structure that repeats with not-quite-crystalline regularity. As for the raindrops, they appear to be spread over the area at almost uniform density (in this respect resembling the quasirandom pattern), but the texture shows hints of swirly rather than latticelike structures (more like the pseudorandom example).</p>
<p>Rather than just eyeballing the patterns, we could try a quantitative approach to describing or classifying them. There are lots of tools for this purpose&#8212;radial distribution functions, nearest-neighbor statistics, Fourier methods&#8212;but my main motive for bringing up this question in the first place is to play with a new toy that I learned about in the course of reading up on quasirandomness. It is a quantity called <em>discrepancy</em>, which attempts to measure how much a point set departs from a uniform spatial distribution.</p>
<p>There are lots of variations on the concept of discrepancy, but I&#8217;m going to discuss just one measurement scheme. The idea is to superimpose rectangles of various shapes and sizes on the pattern, allowing only rectangles with sides parallel to the <em>x</em> and <em>y</em> axes. Here are three example rectangles drawn on the raindrop pattern:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/06/raindrop-394-dots-with-rects.png" alt="raindrop pattern with three axis-parallel rectangles" border="0" width="450" height="449" /></p>
<p>Now count the number of dots enclosed by each rectangle, and compare it with the number of dots that <em>would</em> be enclosed&#8212;based on the rectangle&#8217;s area&#8212;if the distribution of dots were perfectly uniform throughout the square. The absolute value of the difference is the discrepancy <em>D(R)</em> associated with rectangle <em>R</em>:</p>
<p class="centered"><em>D</em>(<em>R</em>) = | <em>N</em> &middot; area(<em>R</em>) &ndash; #(<em>P</em> &cap; <em>R</em>) |,</p>
<p>where <em>N</em> is the total number of dots and #(<em>P</em> &cap; <em>R</em>) denotes the number of dots of the point set <em>P</em> that lie in rectangle <em>R</em>. For example, the rectangle at the upper left covers 10 percent of the area of the square, so its fair share of dots would be 0.1 &times; 394 = 39.4 dots. The rectangle actually encloses only 37 dots, and so the discrepancy associated with the rectangle is |39.4 &ndash; 37| = 2.4. Note that the density of dots in the rectangle could be either higher or lower than the overall average; in either case the absolute-value operation would give a positive discrepancy.</p>
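<p>As a sketch of the definition, here is a small Python function that evaluates <em>D</em>(<em>R</em>) for one axis-parallel rectangle. It assumes the dots live in the unit square (so area is already a fraction of the whole) and that dots on the boundary count as inside; the name <code>rect_discrepancy</code> and the four sample dots are made up for illustration.</p>

```python
def rect_discrepancy(points, x0, y0, x1, y1):
    """D(R) = |N * area(R) - #(P in R)| for the rectangle [x0, x1] x [y0, y1].

    Assumptions: points lie in the unit square, and the rectangle is
    treated as closed (boundary dots are counted as inside).
    """
    n = len(points)
    area = (x1 - x0) * (y1 - y0)
    inside = sum(1 for x, y in points if x0 <= x <= x1 and y0 <= y <= y1)
    return abs(n * area - inside)


# Four illustrative dots; the lower-left quarter of the square
# "should" hold 4 * 0.25 = 1 dot but actually holds 2.
dots = [(0.1, 0.1), (0.2, 0.2), (0.6, 0.6), (0.9, 0.9)]
d = rect_discrepancy(dots, 0.0, 0.0, 0.5, 0.5)
```

<p>The same arithmetic applied to the worked example gives |0.1 &times; 394 &ndash; 37| = 2.4.</p>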
<p>For the pattern as a whole, we can define the global discrepancy <em>D</em> as the worst-case value of this measure, or in other words the maximum discrepancy taken over all possible rectangles drawn in the square. Van der Corput asked whether point sets could be constructed with arbitrarily low discrepancy, so that <em>D</em> would always remain below some fixed bound as <em>N</em> goes to infinity. The answer is No, at least in one and two dimensions; the minimal growth rate is <em>O</em>(log <em>N</em>). Pseudorandom patterns generally have even higher discrepancy, <em>O</em>(&radic;<em>N</em>).</p>
<p>How can you find the rectangle that has maximum discrepancy for a given point set? When I first read the definition of discrepancy, I thought it would be impossible to compute exactly, because there are infinitely many rectangles to be considered. But after thinking about it a while, I realized there are only finitely many candidate rectangles that might possibly maximize the discrepancy. They are the rectangles in which each of the four sides passes through at least one dot. (Exception: Candidate rectangles can also have sides lying on the boundary of the enclosing square.) </p>
<p>Suppose we encounter the following rectangle:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/06/rectangle-bounds-1.png" alt="rectangle with three sides anchored by points" border="0" width="332" height="139" /></p>
<p>The left, top and bottom sides each pass through a dot, but the right side lies in &#8220;empty space.&#8221; This configuration cannot be the rectangle of maximum discrepancy. We could push the right edge leftward until it just intersects a dot:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/06/rectangle-bounds-2.png" alt="rectangle reshaped to maximize density of points" border="0" width="332" height="139" /></p>
<p>This change in shape reduces the area of the rectangle without altering the number of dots enclosed, and thus increases the density of dots. Alternatively, we could push the edge the other way:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/06/rectangle-bounds-3.png" alt="rectangle stretched to maximize area" border="0" width="332" height="139" /></p>
<p>Now we have <em>increased</em> the area, again without changing the count of enclosed dots, and thereby lowered the density. At least one of these actions must increase <em>D</em>(<em>R</em>), compared with the initial configuration. Thus a rectangle with a side lying in empty space can never be the rectangle of maximum discrepancy; the maximum must occur among the rectangles with all four sides touching dots or the edges of the square. By enumerating all rectangles in this finite set, we can find the global maximum.</p>
<p>There is also the irksome question of whether the rectangle is to be considered &#8220;closed&#8221; (meaning that points on the boundary are included in the area) or &#8220;open&#8221; (boundary points are excluded). I&#8217;ve sidestepped that problem by tabulating results for both open and closed boundaries. The closed form gives the highest discrepancy for densely populated rectangles, and the open form maximizes discrepancy for sparsely populated rectangles.</p>
<p>By considering only extrema of the discrepancy function, we make the counting problem finite&#8212;but not easy! In a square with <em>N</em> dots (all with distinct <em>x</em> and <em>y</em> coordinates), how many rectangles have to be considered? This turns out to be a really sweet little problem, with a simple combinatorial solution. I&#8217;m not going to reveal the answer here, but if you don&#8217;t feel like working it out for yourself, you could look it up in <a href="http://oeis.org/A000537">the OEIS</a> or see a short paper by <a href="http://www.math.hmc.edu/~benjamin/papers/rectangles.pdf">Benjamin, Quinn and Wurtz</a>. What I <em>will</em> say is that for <em>N</em> = 394, the number of rectangles is 6,055,174,225&#8212;or double that if you count open and closed rectangles separately. For each rectangle, it&#8217;s necessary to figure out how many of the 394 points are inside and how many are outside. Pretty big job.</p>
<p>One way to reduce the computational burden is to retreat to a simpler measure of discrepancy. Much of the literature on quasirandom patterns in two or more dimensions uses a quantity called star discrepancy, or <em>D*</em>. The idea is to consider only rectangles &#8220;anchored&#8221; at the lower left corner of the square (which we can conveniently identify with the origin of the <em>xy</em> plane). In this case the number of rectangles is just <em>N</em><sup>2</sup>, or about 150,000 for <em>N</em> = 394.</p>
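<p>A brute-force version of the star-discrepancy search might look like the following Python sketch. It is an illustration under stated assumptions, not the program used for the figures: the dots are taken to lie in the unit square, the candidate upper right corners are built from the dots&#8217; own <em>x</em> and <em>y</em> coordinates plus the far edges of the square, and each candidate is scored both as a closed and as an open rectangle. The name <code>star_discrepancy</code> is hypothetical.</p>

```python
def star_discrepancy(points):
    """Largest |N * area - count| over rectangles anchored at (0, 0).

    Candidate corners (a, b) use every dot's x for a and every dot's y
    for b, plus the edges of the unit square. The closed count can only
    exceed the fair share, the open count can only fall short of it,
    so both are checked. Roughly N**3 work: fine for a few hundred dots.
    """
    n = len(points)
    xs = sorted({x for x, _ in points} | {1.0})
    ys = sorted({y for _, y in points} | {1.0})
    best = 0.0
    for a in xs:
        for b in ys:
            expected = n * a * b
            closed = sum(1 for x, y in points if x <= a and y <= b)
            opened = sum(1 for x, y in points if x < a and y < b)
            best = max(best, closed - expected, expected - opened)
    return best


# One dot in the middle of the square: the closed box ending at the dot
# holds 1 dot where 0.25 is expected.
d_star = star_discrepancy([(0.5, 0.5)])
```

<p>Counting the dots afresh for every corner is wasteful; sorting the dots and accumulating counts would be faster, but the plain triple loop stays closest to the definition.</p>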
<p>Here is the rectangle that defines the global star discrepancy of the raindrop pattern:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/06/star-discrepancy-raindrops-1.png" alt="maximal star-discrepancy rectangle for the raindrop pattern" border="0" width="450" height="449" /></p>
<p>The dark green rectangle at the bottom covers about 55 percent of the square and ought to encompass 216.9 dots, if the distribution were truly uniform. The actual number of dots included (assuming a &#8220;closed&#8221; rectangle) is 238, for a discrepancy of 21.1. No other rectangle anchored to the corner (0,0) has a greater discrepancy. (Note: Because of limited graphic resolution, the <em>D*</em> rectangle appears to extend horizontally all the way across the unit square; in fact the right edge lies at <em>x</em> = 0.999395.)</p>
<p>What does this result tell us about the nature of the raindrop pattern? Well, for starters, the discrepancy is fairly close to &radic;<em>N</em> (which is 19.8 for <em>N</em> = 394) and not particularly close to log <em>N</em> (which is 6.0 for natural logs). Thus we get no support for the idea that the raindrop pattern might be more quasirandom than pseudorandom. The <em>D*</em> values for the other patterns shown above are in the ranges expected: 25.9 for the pseudorandom and 4.4 for the quasirandom. Contrary to the visual impression, the raindrop distribution seems to have more in common with a pseudorandom point set than a quasirandom one&#8212;at least by the <em>D*</em> criterion.</p>
<p>What about calculating the unrestricted discrepancy <em>D</em>&#8212;that is, looking at <em>all</em> rectangles rather than just the anchored rectangles of <em>D*</em>? A moment&#8217;s thought shows that this exhaustive enumeration of rectangles can&#8217;t change the basic conclusion in this case; <em>D</em> can never be less than <em>D*</em>, and so we can&#8217;t hope to move from &radic;<em>N</em> toward log <em>N</em>. But I was curious about the computation anyway. Among those six billion rectangles, which one has the greatest discrepancy? Is it possible to answer this question without heroic efforts?</p>
<p>The obvious, straightforward algorithm for <em>D</em> generates all candidate rectangles in turn, measures their area, counts the dots inside, and keeps track of the extreme discrepancies seen along the way. I found that a program implementing this algorithm could determine the exact discrepancy of 100 pseudorandom or quasirandom dots in a few minutes. This outcome might seem to offer some encouragement for pushing on to higher <em>N</em>; however, the running time almost doubles every time <em>N</em> increases by 10, which suggests the computation would take a couple of centuries at <em>N</em> = 394.</p>
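<p>The straightforward algorithm can be sketched in Python as follows. This is a reconstruction from the description above rather than the original program; it assumes dots in the unit square, skips zero-area rectangles, and uses the hypothetical name <code>full_discrepancy</code>. Every pair of candidate <em>x</em> positions is combined with every pair of candidate <em>y</em> positions, and each rectangle is scored in both its closed and open forms.</p>

```python
from itertools import combinations


def full_discrepancy(points):
    """Exhaustive search for the maximum |N * area - count| over all
    axis-parallel rectangles whose sides pass through dots or lie on
    the boundary of the unit square.

    Assumption: degenerate (zero-area) rectangles are not considered.
    The work grows roughly as N**5, so this is for small N only.
    """
    n = len(points)
    xs = sorted({x for x, _ in points} | {0.0, 1.0})
    ys = sorted({y for _, y in points} | {0.0, 1.0})
    best = 0.0
    for x0, x1 in combinations(xs, 2):
        for y0, y1 in combinations(ys, 2):
            expected = n * (x1 - x0) * (y1 - y0)
            closed = sum(1 for x, y in points
                         if x0 <= x <= x1 and y0 <= y <= y1)
            opened = sum(1 for x, y in points
                         if x0 < x < x1 and y0 < y < y1)
            best = max(best, closed - expected, expected - opened)
    return best


# Two nearby dots: the small closed square spanning both holds 2 dots
# where only 2 * 0.01 = 0.02 are expected.
d_full = full_discrepancy([(0.5, 0.5), (0.6, 0.6)])
```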
<p>I&#8217;ve invested a few days&#8217; work in efforts to speed up this calculation. Most of the running time is spent in the routine that counts the dots in each rectangle. Deciding whether a given dot is inside, outside or on the boundary takes eight arithmetic comparisons; thus, at <em>N</em> = 394, there are more than 3,000 comparisons for each of the six billion rectangles. The most effective time-saving device I&#8217;ve discovered is to precompute all the comparisons. For each point that can become the lower left corner of a rectangle, I precompute a list of all the pattern dots above and to the right. For each potential upper right corner of a rectangle, I compile a similar list of dots below and to the left. These lists are stored in a hash table indexed by the corner coordinates. Given a rectangle, I can then count the number of interior dots just by taking the set intersection of the two lists.</p>
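<p>The precomputation trick can be illustrated with a short Python sketch. The details here are assumptions: dots are identified by their index in the list, the tables are ordinary dictionaries keyed by corner coordinates, boundary dots count as inside, and the names <code>build_corner_tables</code> and <code>count_closed</code> are invented for the example.</p>

```python
def build_corner_tables(points):
    """For every candidate corner, precompute which dots lie above and to
    the right of it, and which lie below and to the left of it.

    Dots are stored as indices into `points`. Memory grows roughly as
    N**3, so this is a sketch of the idea rather than a tuned program.
    """
    xs = sorted({x for x, _ in points} | {0.0, 1.0})
    ys = sorted({y for _, y in points} | {0.0, 1.0})
    above_right, below_left = {}, {}
    for cx in xs:
        for cy in ys:
            above_right[(cx, cy)] = frozenset(
                i for i, (x, y) in enumerate(points) if x >= cx and y >= cy)
            below_left[(cx, cy)] = frozenset(
                i for i, (x, y) in enumerate(points) if x <= cx and y <= cy)
    return above_right, below_left


def count_closed(above_right, below_left, x0, y0, x1, y1):
    """Dots in the closed rectangle with lower left corner (x0, y0) and
    upper right corner (x1, y1): those above-right of the first corner
    AND below-left of the second."""
    return len(above_right[(x0, y0)] & below_left[(x1, y1)])


dots = [(0.2, 0.2), (0.5, 0.5), (0.8, 0.8)]
ar, bl = build_corner_tables(dots)
inside = count_closed(ar, bl, 0.2, 0.2, 0.5, 0.5)
```

<p>Once the tables exist, no coordinate comparisons are needed inside the main loop; each rectangle costs one set intersection.</p>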
<p>With this trick, the estimated running time for <em>N</em> = 394 came down from two centuries to about two weeks. A big improvement&#8212;just enough encouragement to induce me to spend yet another day on further refinements. Replacing the hash table with an <em>N</em> &times; <em>N</em> array helped a little. And then I figured out a way to ignore all the smallest rectangles, those that cannot possibly be the max-<em>D</em> rectangle because they either contain too few dots or have too small an area. This improvement finally brought the running time down to the overnight range. The illustration below, which shows the rectangle of maximum discrepancy <em>D</em> for the raindrop pattern, took six hours to compute.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/06/full-discrepancy-raindrops.png" alt="rectangle yielding the largest exact discrepancy for the raindrop pattern" border="0" width="450" height="449" /></p>
<p>The max-<em>D</em> rectangle is clearly a slight refinement of the <em>D*</em> rectangle for the same point set. The <em>D</em> rectangle &#8220;should&#8221; contain 204.3 dots but actually has 229, for a discrepancy of <em>D</em> = 24.7.</p>
<p>Of course knowing that the exact discrepancy is 24.7 rather than 21.1 tells us nothing more about the nature of the raindrop pattern. As a matter of fact, I feel I know less and less about it as I compute more and more.</p>
<p>When I started this project, I had a pet theory about what might be happening in the tabletop to smooth out the distribution of drops and thereby make the pattern look more quasi- than pseudorandom. To begin with, think of droplets lying on a smooth pane of glass rather than a metal mesh. If two small droplets come close enough to touch, they merge into one larger drop, because that configuration has lower energy associated with surface tension. Perhaps something similar could happen in the mesh. If two adjacent cells of the mesh are both filled with water, and if the metal channel between them is wet, then water could flow freely from one drop to the other. The larger drop will almost surely grow at the expense of the smaller one, and the latter will eventually be annihilated. Thus we would expect a deficit of drops in adjacent cells, compared with a purely random distribution.</p>
<p>This idea still sounds pretty good to me. The only trouble is: It explains a phenomenon that may not exist. I don&#8217;t know whether my discrepancy measurements actually reveal anything important about the three patterns, but at the very least the measurements fail to provide evidence that the raindrop distribution is different from the pseudorandom distribution. True, the patterns <em>look</em> different, but how much should we trust our perceptual apparatus in a case like this? If you ask people to draw dots at random, most do a bad job of it, typically making the distribution too smooth and even. Maybe the brain is equally challenged when trying to <em>recognize</em> randomness.</p>
<p>Nevertheless, I suspect there <em>is</em> some mathematical property that will more effectively distinguish between these patterns. If anyone else would like to sink some time into the search, the <em>xy</em> coordinates for the three point sets are in a <a href="http://bit-player.org/wp-content/uploads/2011/06/discrepancy-coordinates.txt" title="discrepancy-coordinates.txt" alt="discrepancy-coordinates.txt">text file here</a>.</p>
<p><strong>Update 2011-06-24</strong>: This is just a brief note to suggest that if you&#8217;ve read this far, please go on and read the comments, too. There&#8217;s much of value there. I want to thank all those who took the trouble to propose alternative explanations or algorithms, and to point out weaknesses in my analysis. Special thanks to themathsgeek, who within 40 minutes after I first posted the item had come up with a far superior program for computing discrepancies. Thanks also to Iain, who followed up my offhand remarks about the perception of random patterns with actual experiments.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2011/a-slight-discrepancy/feed</wfw:commentRss>
		</item>
		<item>
		<title>Don&#8217;t try to read this proof!</title>
		<link>http://bit-player.org/2011/dont-try-to-read-this-proof</link>
		<comments>http://bit-player.org/2011/dont-try-to-read-this-proof#comments</comments>
		<pubDate>Tue, 07 Jun 2011 17:51:22 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[mathematics]]></category>

		<category><![CDATA[problems and puzzles]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=980</guid>
		<description><![CDATA[On the subject of the Collatz conjecture (also known as the 3x+1 problem), Paul Erd&#337;s remarked: &#8220;Mathematics is not yet ready for such problems.&#8221; Shizuo Kakutani joked that the problem was a Cold War invention of the Russians meant to slow the progress of mathematics in the West. Richard Guy listed it in an article [...]]]></description>
			<content:encoded><![CDATA[<p>On the subject of the Collatz conjecture (also known as the 3<em>x</em>+1 problem), Paul Erd&#337;s remarked: &#8220;Mathematics is not yet ready for such problems.&#8221; Shizuo Kakutani joked that the problem was a Cold War invention of the Russians meant to slow the progress of mathematics in the West. Richard Guy listed it in an article titled &#8220;Don&#8217;t try to solve these problems!&#8221; All of these warnings to the unwary have had the expected effect: A <a href="http://arxiv.org/abs/math/0608208">bibliography compiled by Jeffrey Lagarias</a> cites 200 works on the Collatz conjecture, and there are hundreds of other papers that didn&#8217;t make the list. This past week brought <a href="http://preprint.math.uni-hamburg.de/public/papers/hbam/hbam2011-09.pdf">one more preprint</a>: a claimed proof of the conjecture by <a href="http://www.math.uni-hamburg.de/home/opfer/index.html">Gerhard Opfer</a> of the University of Hamburg.</p>
<p>The Collatz conjecture can be stated in terms of this little recursive program:</p>
<pre>    def collatz(x):
        # Recursive 3x+1 iteration for a positive integer x.
        # The function returns (halts) when x reaches 1.
        if x == 1:
            return
        elif x % 2 == 0:
            collatz(x // 2)
        else:
            collatz(3 * x + 1)
</pre>
<p>The conjecture makes the following claim: If <em>x</em> is any positive integer, then the program eventually halts. For example, starting with <em>x</em>=3, the successive values of <em>x</em> are 3, 10, 5, 16, 8, 4, 2, 1&#8212;and on reaching <em>x</em>=1 the program halts. You could try a few other starting values for yourself; <em>x</em>=27 is a popular choice; <em>x</em>=319,804,831 is even better. But please remember Guy&#8217;s advice. Also note that the Collatz conjecture has been <a href="http://www.numbertheory.org/php/collatz.html">verified numerically</a> by Tom&aacute;s Oliveira e Silva for all <em>x</em> up to 20 &times; 2<sup>58</sup>. Thus if you&#8217;re searching for a counterexample, you may as well start somewhere north of 5,764,607,523,034,234,880.</p>
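<p>For anyone who wants to try a few starting values, here is a small iterative Python sketch that records the whole trajectory instead of merely halting. The function name <code>collatz_trajectory</code> is made up for this example; the loop form is used only so that long trajectories do not run into a recursion limit.</p>

```python
def collatz_trajectory(x):
    """Return the list of values visited by the 3x+1 iteration,
    starting at x and ending at 1 (if the iteration ever gets there)."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    trajectory = [x]
    while x != 1:
        x = x // 2 if x % 2 == 0 else 3 * x + 1
        trajectory.append(x)
    return trajectory


print(collatz_trajectory(3))   # [3, 10, 5, 16, 8, 4, 2, 1]
print(20 * 2**58)              # 5764607523034234880
```

<p>Starting from <em>x</em>=27 the trajectory takes 111 steps and climbs as high as 9,232 before it finally comes down to 1.</p>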
<p>The conjecture is named for Lothar Collatz (1910&ndash;1990), who investigated the curious properties of the 3<em>x</em>+1 iteration when he was a young student, circa 1930. Opfer, the author of the new proposed proof, was a Ph.D. student of Collatz in the 1960s. This conjunction suggests a novelistic storyline: A mathematician struggles all his life with an intractable problem, then hands it on to his student, who dedicates his own career to the task, finally achieving triumphant success some 80 years after the story began. But apparently that&#8217;s not how it happened. Collatz never returned to the 3<em>x</em>+1 problem after the 1930s; almost all of his mature work was in numerical analysis. Opfer is also a numerical analyst and only recently turned to the 3<em>x</em>+1 problem. There&#8217;s no grand saga of a multigenerational obsession here.</p>
<p>Much of Opfer&#8217;s paper is beyond my understanding, but I can piece together a crude guide to a few of the basic ideas. The proof begins with a mathematical change of venue. A problem originally posed in terms of the simplest kind of arithmetic on the natural numbers is transported to the realm of <a href="http://en.wikipedia.org/wiki/Holomorphic_function">holomorphic functions on the complex plane</a>. It&#8217;s like being swept up from a farm in Kansas and set down in the Land of Oz. The idea for this transformation comes from the <a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.4482">work of Lothar Berg and G&uuml;nter Meinardus</a> in the 1990s, who established a direct connection between these two realms. They set up a certain system of equations for functions of a complex variable <em>z</em>, then showed that the Collatz conjecture is true if and only if the solutions of the equations all lie in a certain region. In the simplest case, the region is the open unit disc&#8212;the disc centered at the origin with |<em>z</em>| &lt; 1. Who&#8217;d&#8217;ve thunk it? What we do in Oz has consequences back in Kansas!</p>
<p>How can those two very different problems be yoked together? As I (tenuously) understand it, the complex functions of Berg and Meinardus can be represented by formal power series whose integer coefficients encode information about the sequence of numbers generated in the Collatz iteration. The rest of the Opfer proof is all about those coefficients. Thus we click our heels and return from the Emerald City to the land of number theory.</p>
<p>At this point Opfer begins reasoning in terms of algorithms that generate sets of coefficients. I&#8217;m on much friendlier terms with algorithms than I am with holomorphic functions, but strangely enough this is where I begin to lose my footing as I try to follow Opfer&#8217;s steps. His aim is to show that all coefficients vanish except for those whose indices lie in a certain congruence class. (Specifically, he is seeking to prove that all coefficients &eta;<sub><em>j</em></sub> = 0 except those for which <em>j</em> = 3<em>k</em>&ndash;1, <em>k</em> &isin; 1, 2, 3, &#8230; .) He argues that his algorithms for generating the coefficients yield exactly this property. But this is where I get lost.</p>
<p>Opfer is not the first to announce a proof of the Collatz conjecture. The earlier attempts did not stand up to scrutiny, and in <a href="http://math.stackexchange.com/questions/43051/collatz-finally-solved/43082#43082">various</a> <a href="http://www.reddit.com/r/math/comments/hpn3g/learned_gentlemen_of_rmath_im_counting_on_you/c1xo3qq">corners</a> <a href="http://mathlesstraveled.com/2011/06/04/the-collatz-conjecture-is-safe-for-now/">of</a> <a href="http://www.reddit.com/r/math/comments/hqqph/collatz_3n_1_conjecture_solved/c1xp6iu">nerdom</a> there&#8217;s already skepticism about this try, too. As for me, I can&#8217;t say whether there are gaps in Opfer&#8217;s proof because there are such wide gaps in my understanding of it. The paper has been submitted to <em>Mathematics of Computation</em>, which is certainly an appropriate journal for this work, and I&#8217;m content to wait and see what comes of the refereeing process.</p>
<p>Like many others, I first learned of the 3<em>x</em>+1 problem from Martin Gardner&#8217;s <em>Scientific American</em> column in the early 1970s. A decade later, when I briefly had a chance to fill Martin&#8217;s space in the magazine after his retirement, 3<em>x</em>+1 was one of the first subjects I wrote about (<a href="http://bit-player.org/bph-publications/SciAm-1984-01-Hayes-hailstone.pdf">nearly illegible PDF</a>&#8212;sorry; it&#8217;s my only copy). I&#8217;m delighted to have this opportunity to poke at the problem again. Even if the Opfer proof comes to nothing, it has given me an incentive to read some of the recent literature, including that illuminating trip to Oz courtesy of Berg and Meinardus. I would also like to recommend a book by G&uuml;nther J. Wirsching, <em>The Dynamical System Generated by the 3n + 1 Function</em> (Springer, 1998). A version of at least one chapter is <a href="http://www.math.uni-bielefeld.de/baake/algdyn/posden.pdf">available online</a>.</p>
<p>Jeffrey Lagarias also has a recent book: <em><a href="http://books.google.com/books?id=hekJ7JDMEVkC&#038;lpg=PP1&#038;pg=PP1#v=onepage&#038;q&#038;f=false">The Ultimate Challenge: The 3<em>x</em>+1 Problem</a></em> (AMS, 2010), but I haven&#8217;t seen it yet. The volume reprints a number of survey articles and historical documents, including a 1985 <em>American Mathematical Monthly</em> <a href="http://www.cecm.sfu.ca/organics/papers/lagarias/index.html">article</a> by Lagarias himself that is still the best starting point for those who want to ignore all good advice and dive into the problem. I am particularly eager to see another chapter: an English translation of the only paper in which Collatz discusses his work on 3<em>x</em>+1; it is an account in Chinese of a talk by Collatz at Qufu Normal University in 1986.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2011/dont-try-to-read-this-proof/feed</wfw:commentRss>
		</item>
		<item>
		<title>Only correlate!</title>
		<link>http://bit-player.org/2011/only-correlate</link>
		<comments>http://bit-player.org/2011/only-correlate#comments</comments>
		<pubDate>Sun, 29 May 2011 00:16:09 +0000</pubDate>
		<dc:creator>brian</dc:creator>
		
		<category><![CDATA[computing]]></category>

		<category><![CDATA[modern life]]></category>

		<guid isPermaLink="false">http://bit-player.org/?p=976</guid>
		<description><![CDATA[I&#8217;m not actually a shill for Google Labs, although it may seem that way from all my recent (and ongoing) attention to the Google Ngram Viewer: four posts (1, 2, 3, 4) and an American Scientist column, so far. What I particularly like about Google Labs is that they share their toys. They create Big [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m not actually a shill for Google Labs, although it may seem that way from all my recent (and ongoing) attention to the Google Ngram Viewer: four posts (<a href="http://bit-player.org/2010/googling-the-lexicon">1</a>, <a href="http://bit-player.org/2011/314">2</a>, <a href="http://bit-player.org/2011/the-library-of-babble">3</a>, <a href="http://bit-player.org/2011/zipfy-n-grams">4</a>) and an <em>American Scientist</em> <a href="http://www.americanscientist.org/issues/pub/2011/3/bit-lit">column</a>, so far. What I particularly like about Google Labs is that they share their toys. They create Big Data projects that everybody can play with. For those of us without a server farm on the back 40, that&#8217;s a rare opportunity.</p>
<p>The latest Labs release is <a href="http://correlate.googlelabs.com/">Google Correlate</a>. If you have a time series&#8212;data expressed as a function of date, for any subinterval of the period since 2003&#8212;Correlate will try to identify Google search queries that exhibit a similar temporal pattern of activity. All this is easier to understand with an example. For a specimen time series, consider the interest-rate index known as the 1-year CMT, which is published every week. I scraped seven years of CMT data from <a href="http://mortgage-x.com/general/indexes/default.asp">this web site</a>, and uploaded the file to Correlate. I got back a list of 100 phrases whose popularity as Google search terms has followed a trajectory more or less similar to that of the interest rates. As it happens, none of those highly correlated terms has an obvious connection to financial affairs. Roughly half of them are related to cell phones (&#8220;cingular&#8221; and &#8220;treo&#8221; turn up over and over). But the term with the strongest correlation (<em>r</em>=0.9751) is the phrase &#8220;pill identification&#8221;:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/05/interest-pills-450.png" alt="graph of time-series correlation between 1-year CMT interest rate data and Google searches for 'pill identification'" border="0" width="450" height="278" /></p>
<p>In other words, the gradual rise in interest rates during the mid-2000s was paralleled by a steady growth in the number of people seeking help in identifying the contents of mysterious unlabeled vials in the medicine cabinet. Then, sometime in 2007, both trends reversed direction. Why should these particular variables be so closely correlated? If there is a reason, I have no idea what it is. And I must immediately insert the obligatory disclaimer: Correlation is not causation. Emphatically so in this case. If you are trying to predict the future course of interest rates, I do not recommend tracking popular interest in pill identification. Or vice versa.</p>
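<p>For readers who want to see what a figure like <em>r</em>=0.9751 measures, here is a minimal sketch. It assumes <em>r</em> is the ordinary Pearson correlation coefficient. The function name <code>pearson_r</code> and the sample numbers are invented for illustration; they are not the CMT data or anything returned by Correlate.</p>

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Undefined (division by zero) if either series is constant.
    """
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    covariance = sum(a * b for a, b in zip(dx, dy))
    spread_x = sum(a * a for a in dx)
    spread_y = sum(b * b for b in dy)
    return covariance / math.sqrt(spread_x * spread_y)

# Made-up weekly values, for illustration only.
rates = [1.2, 1.5, 2.1, 2.8, 3.4, 3.1, 2.0]
queries = [10.0, 12.0, 17.0, 22.0, 27.0, 25.0, 16.0]
r = pearson_r(rates, queries)
```

<p>The coefficient ignores units and offsets: multiplying a series by a positive constant or adding a constant to it leaves <em>r</em> unchanged. That is why an interest rate in percent can be compared with a query volume at all.</p>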
<p>At a more personal level, there&#8217;s a time series I have been <a href="http://bit-player.org/2010/the-state-of-the-spamosphere">tracking</a> since 2007: the volume of spam arriving in my email inbox. My records are monthly, whereas Google Correlate wants weekly data, so I did some resampling and smoothing, and came up with this:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/05/spam-ashford-450.png" alt="graph of correlation between Brian Hayes's spam receipts and the Google query 'ashford blackboard login'" border="0" width="450" height="278" /></p>
<p>The best match, shown in the graph, is the mildly enigmatic query &#8220;ashford blackboard login.&#8221; Many of the other correlated series suggest a seasonal theme that I can understand in retrospect but that I did not see coming before looking at the results: &#8220;honda accord 2009,&#8221; &#8220;celica 2009,&#8221; &#8220;rav4 2009,&#8221; &#8220;2009 altima coupe,&#8221; &#8220;new cars 2009,&#8221; &#8220;2009 ranger,&#8221; etc. The most distinctive features of the spam curve are a peak in the fall of 2008, a deep dip the following winter, and an even stronger surge in the summer of 2009. Evidently shoppers for cars in the 2009 model year followed a similar trend line. (But again I would caution that spam volume is unlikely to be a good predictor of automobile sales.)</p>
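<p>The monthly-to-weekly conversion mentioned above can be done in many ways, and the post does not say which was used. The sketch below is one plausible approach, not a record of the actual procedure: linear interpolation onto a weekly grid followed by a short moving average. The function names, the four-weeks-per-month simplification, and the sample counts are all assumptions.</p>

```python
def monthly_to_weekly(monthly, weeks_per_month=4):
    """Linearly interpolate monthly values onto a weekly grid.

    Treats every month as weeks_per_month weeks long (a simplification).
    """
    weekly = []
    for a, b in zip(monthly, monthly[1:]):
        for k in range(weeks_per_month):
            t = k / weeks_per_month
            weekly.append(a + t * (b - a))
    weekly.append(monthly[-1])
    return weekly

def moving_average(values, window=3):
    """Centered moving average; the window shrinks at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Hypothetical monthly spam counts, not the real records.
spam_monthly = [100, 140, 80, 120]
spam_weekly = moving_average(monthly_to_weekly(spam_monthly))
```

<p>Interpolation invents no new information, so the weekly series is only as detailed as the monthly one. Smoothing removes the sharp corners that interpolation leaves at each month boundary.</p>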
<p>These results might be taken to suggest that every conceivable time series must be correlated with <em>some</em> set of Google queries, however farfetched the association. I tried submitting a few random walks, covering the same time span as the spam series, and they too fetched up matching queries from the Google database:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/05/random-atttilt-450.png" alt="correlation graph for a random walk and the query 'att tilt software'" border="0" width="450" height="278" /></p>
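<p>A random walk of the kind described above takes only a few lines to generate. This sketch assumes the simplest variety, a unit step up or down each week. The post does not specify the step distribution, and the length of 200 weeks is an arbitrary stand-in for the span of the spam series.</p>

```python
import random

def random_walk(n_steps, step=1.0, start=0.0, seed=None):
    """Return n_steps positions of a walk that moves +step or -step each time."""
    rng = random.Random(seed)
    position = start
    walk = []
    for _ in range(n_steps):
        position += rng.choice((-step, step))
        walk.append(position)
    return walk

# One weekly probe series; the seed only makes the run repeatable.
weekly_probe = random_walk(200, seed=42)
```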
<p>At the opposite end of the spectrum from a random walk, I tried some rigidly artificial probes, such as a series with nonzero entries only in the month of May. Sure enough, there are search-engine queries that follow the same recurrent annual pattern:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/05/mays-jlabs-450.png" alt="correlation of a time series with nonzero entries only in the month of May and the Google query 'j labs'" border="0" width="450" height="278" /></p>
<p>A time series that has all of its energy concentrated in a single pulse elicits from the database a variety of flash-in-the-pan topics&#8212;queries that came and went and were never heard of again.</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/05/sept05-wolframtones-450.png" alt="correlation of a time series with a pulse in September 2005 and the query 'wolframtones'" border="0" width="450" height="278" /></p>
<p>Without too much work we could enumerate all such one-month wonders.</p>
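<p>The artificial probes described above are easy to construct. The following sketch builds a May-only series and a single-month pulse on a weekly grid. The start and end dates, the function names, and the choice to label each week by its first day are assumptions made for illustration.</p>

```python
from datetime import date, timedelta

def weekly_dates(start, end):
    """Dates one week apart, from start up to and including end."""
    n_weeks = (end - start).days // 7 + 1
    return [start + timedelta(weeks=k) for k in range(n_weeks)]

def may_only(dates, level=1.0):
    """Series that is nonzero only for weeks that begin in May."""
    return [level if d.month == 5 else 0.0 for d in dates]

def single_pulse(dates, year, month, level=1.0):
    """Series with all of its energy in one calendar month."""
    return [level if (d.year, d.month) == (year, month) else 0.0 for d in dates]

# Hypothetical date range chosen for the example.
dates = weekly_dates(date(2004, 1, 4), date(2010, 12, 26))
may_probe = may_only(dates)
pulse_probe = single_pulse(dates, 2005, 9)
```

<p>Calling <code>single_pulse</code> once for every year and month in the range would produce the full set of probes needed to enumerate the one-month wonders.</p>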
<p>It is not the case, however, that <em>every</em> possible time series has a close correlate somewhere in the Google collection. Here is an example of a series for which Correlate finds no query that matches closely enough to bother reporting:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/05/mini-miles.jpg" alt="weekly driving mileage, late 2003 to late 2010" border="0" width="450" height="364" /></p>
<p>This is a weekly record of miles driven in the family car. Should we be surprised that not a single series among the tens of millions of queries in the Google database comes close to matching this pattern? One approach to this question is to ask just how many series of this kind might exist. The mileage record covers 364 weeks. As a lower bound, suppose the mileage associated with each week could have just two possible values: either we drove the car or we didn&#8217;t, so the mileage is either zero or greater than zero. Then there are 2<sup>364</sup> (or about 10<sup>110</sup>) possible time series&#8212;many orders of magnitude greater than the total number of Google searches since the company was founded. Thus the set of queries in the Google archive must be an extremely sparse subset of all possible time series. Most of the series we could construct would necessarily come up empty. (I note in passing that there&#8217;s interesting structure in that mileage log of mine, which I never knew about until I graphed it&#8212;but that&#8217;s a story for another day.)</p>
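<p>The arithmetic behind that estimate can be checked directly, since Python integers have unlimited precision. The ceiling of 10<sup>15</sup> distinct queries is a deliberately generous, hypothetical figure used only to show the size of the gap; it is not a number from Google.</p>

```python
import math

weeks = 364
n_series = 2 ** weeks                 # exact count of on/off weekly series
exponent = weeks * math.log10(2)      # log10 of that count, about 109.6
n_digits = len(str(n_series))         # number of decimal digits in the count

# Hypothetical upper limit on distinct queries, for comparison only.
hypothetical_queries = 10 ** 15
gap = exponent - math.log10(hypothetical_queries)   # orders of magnitude
```

<p>The exact value is roughly 3.8 &#215; 10<sup>109</sup>, which rounds to 10<sup>110</sup> on a logarithmic scale. Even the generous query ceiling falls short by more than 90 orders of magnitude.</p>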
<p>A really interesting question is how Google Correlate does it. Even with &#8220;only&#8221; tens of millions of queries in the database, comparing a submitted series with all the candidates would be impossibly expensive. A <a href="http://correlate.googlelabs.com/whitepaper.pdf">white paper</a> explains: </p>
<blockquote><p>In our Approximate Nearest Neighbor (ANN) system, we achieve a good balance of precision and speed by using a two-pass hash-based system. In the first pass, we compute an approximate distance from the target series to a hash of each series in our database. In the second pass, we compute the exact distance function on the top results returned from the first pass.</p></blockquote>
<p>Thus the basic strategy is precomputation: Spend a lot of time in advance computing a succinct signature or hash associated with each time series in the database; then quickly compare hash values when looking to match a submitted time series.</p>
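<p>To make the two-pass idea concrete, here is a toy sketch. It is not the scheme from the white paper: as a stand-in for Google&#8217;s hash it uses random projections, and it compares signature with signature, whereas the paper compares the target series with a hash of each stored series. All names and data are invented. Each series is first centered and scaled to unit length, so that the squared distance between two series equals 2 minus 2<em>r</em> and the nearest neighbor is also the best correlate.</p>

```python
import math
import random

def normalize(series):
    """Center and scale to zero mean and unit length (fails on constant series)."""
    mean = sum(series) / len(series)
    centered = [v - mean for v in series]
    norm = math.sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered]

def make_projections(length, n_dims, seed=0):
    """Random Gaussian directions used to build short signatures."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(length)] for _ in range(n_dims)]

def signature(series, projections):
    """Short sketch of a series: its dot product with each direction."""
    return [sum(p * v for p, v in zip(row, series)) for row in projections]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_index(database, projections):
    """Precompute (name, normalized series, signature) once, ahead of time."""
    index = []
    for name, series in database.items():
        unit = normalize(series)
        index.append((name, unit, signature(unit, projections)))
    return index

def search(target, index, projections, shortlist=10, top=3):
    """Two-pass lookup: rank by signature, then rerank the shortlist exactly."""
    unit = normalize(target)
    sig = signature(unit, projections)
    # Pass 1: cheap approximate distance on the short signatures.
    rough = sorted(index, key=lambda e: sq_dist(sig, e[2]))[:shortlist]
    # Pass 2: exact distance, computed only for the shortlist.
    exact = sorted(rough, key=lambda e: sq_dist(unit, e[1]))
    return [name for name, _, _ in exact[:top]]

# Hypothetical toy database; names and values are invented.
rng = random.Random(1)
database = {"query_%d" % i: [rng.random() for _ in range(30)] for i in range(8)}
projections = make_projections(length=30, n_dims=6)
index = build_index(database, projections)
target = [3.0 * v + 7.0 for v in database["query_5"]]   # rescaled copy
matches = search(target, index, projections, shortlist=4, top=1)
```

<p>The saving comes from pass 1, which touches only six numbers per stored series instead of thirty. In a toy of this size the difference is negligible, but with tens of millions of long series it could decide whether a search is feasible at all.</p>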
<p>A few further miscellaneous notes:</p>
<p>Google Correlate evolved from earlier work on tracking influenza outbreaks by monitoring search-engine queries. Initially this required a batch computation lasting hours, even when run on hundreds of computers. The new hash-based search takes less than a second. (Algorithms and data structures still count for more than hardware.)</p>
<p>Google Correlate includes a geographic component alongside the temporal database. If you have data distributed over the 50 U.S. states, you can retrieve Google queries that exhibit a similar spatial pattern. (I have not experimented with this system.)</p>
<p>Even if you don&#8217;t have a time series or a geographic data set of your own, you can play with the new service by cross-correlating one search query against others. For example, enter the term &#8220;solstice&#8221; in the search box, and you&#8217;ll see a graph with exactly the pattern of twice-a-year spikes that you might expect. You also get a list of other search terms whose temporal pattern has similar features. One of those correlated terms is &#8220;italian seafood salad.&#8221; A glance at the corresponding graph suggests there&#8217;s only half a correlation in this case:</p>
<p><img class="centered" src="http://bit-player.org/wp-content/uploads/2011/05/solstice-salad-450.png" alt="correlation of 'solstice' and 'italian seafood salad'" border="0" width="450" height="278" /></p>
<p>I didn&#8217;t know until just a few minutes ago that <em>frutti di mare</em> was a dish to be eaten at the winter solstice.</p>
]]></content:encoded>
			<wfw:commentRss>http://bit-player.org/2011/only-correlate/feed</wfw:commentRss>
		</item>
	</channel>
</rss>

