en Judgment of Princeton

The Judgment of Princeton was a wine tasting (or blind tasting) event held on 8 June 2012 during a conference of the American Association of Wine Economists held at Princeton University in Princeton, New Jersey. The purpose of this event was to compare, by a blind tasting, of several French wines against wines produced in New Jersey in order to gauge the quality and development of the New Jersey wine industry. Because New Jersey's wine industry is relatively young and small, it has received little attention in the world wine market. The state's wine production has experienced growth in recent years largely as a result of state legislators offering new opportunities for winery licensing and repealing Prohibition-era laws that have constrained the industry's development in past years. This event was modeled after a 1976 blind tasting event dubbed the "Judgment of Paris" in which French wines were compared to several wines produced in California when that state's wine industry was similarly young and developing. The New Jersey wine industry heralded the results and asserted that the rating of New Jersey wines by the blind tasting's judges was a victory for the state's wine industry.^[1]

Details

The Judgment of Princeton, held at Princeton University on Friday, June 8, 2012, was a structured blind tasting of top New Jersey wines against top French wines from Bordeaux and Burgundy.^[2]^[3]^[4]^[5]^[6]^[7] The event was based on the famous 1976 Judgment of Paris (wine), in which California wines famously beat French wines in a blind tasting. The Judgment of Princeton was spearheaded by George M. Taber, who had been in Paris for the original Judgment of Paris and later written a book on the subject.^[8] Along with Taber, the tasting was organized and carried out by economists Orley Ashenfelter, Richard E. Quandt, Karl Storchmann, and Mark Censits, owner of CoolVines, a local wine and spirits shop, who acted in the role of merchant Steven Spurrier, gathering the competition wines from the NJ winemakers and selecting and sourcing the French wines against which they were to be pitted. The French wines were sourced from the same estates as the original wines of the Paris tasting. The event also included other members of the American Association of Wine Economists, who then posted the data set from the tastings online as an open invitation to further analysis.^[9]^[10]

The judges

Of the nine judges in Princeton, six were American, two French, and one Belgian. They are listed here in alphabetical order.

Name	Affiliation	Nationality
Jean-Marie Cardebat	Université de Bordeaux	France
Tyler Colman	DrVino.com	USA
John Foy	The Star-Ledger, thewineodyssey.com	USA
Olivier Gergaud	BEM Management School	France
Robert Hodgson	Fieldbrook Winery	USA
Danièle Meulders	Université Libre de Bruxelles	Belgium
Linda Murphy	Decanter (magazine), American Wine	USA
Jamal Rayyis	Gilbert & Gaillard Wine Magazine	USA
Francis Schott	Stage Left Restaurant, RestaurantGuysRadio.com	USA

Controversy

The judges were told, in advance, similar to the set up in the Judgment of Paris, that six wines in each flight of ten were from New Jersey. Subsequently, several of the judges complained about the revelation of their judgments, as also occurred in the Judgment of Paris.

Interpretation of results

In 1999, Quandt and Ashenfelter published a paper in the journal "Chance" that questioned the statistical interpretation of the results of the 1976 Judgment of Paris. The authors noted that a "side-by-side chart of best-to-worst rankings of 18 wines by a roster of experienced tasters showed about as much consistency as a table of random numbers," and reinterpreted the data, altering the results slightly, using a formula that they argued was more statistically valid (and less conclusive).^[11] Quandt’s later paper "On Wine Bullshit" poked fun at the seemingly random strings of adjectives that often accompanied experts' published wine ratings.^[12] More recent work by Robin Goldstein, Hilke Plassmann, Robert Hodgson, and other economists and behavioral scientists has shown high variability and inconsistency both within and between blind tasters; and little correlation has been found between price and preference, even among wine experts, in tasting settings in which labels and prices have been concealed.^[13]^[14]

Methodology

The blind tasting panel was made up of nine expert judges, with each wine graded out of 20 points. The tasting was performed behind closed doors at Princeton University, and results were kept secret from the judges until they were analyzed by Quandt and announced later that day. According to an algorithm devised by Quandt, each judge's set of ratings was converted to a set of personal rankings, which were in turn tabulated cumulatively by “votes against," with a lower score better (representing higher cumulative rankings) and a higher score worse (representing lower cumulative rankings). The data were then tested by Quandt for statistically significant differences between tasters and wines using the same software he had previously employed to re-analyze the Judgment of Paris results.^[15]

The reveal

Shortly after the tasting was completed and the results tabulated, Taber, Quandt, and Ashenfelter announced the results to an audience of media, New Jersey winemakers, wine economists, and the judges themselves. The event took place in an auditorium at Princeton’s Woodrow Wilson School of Public and International Affairs as part of the American Association of Wine Economists’ annual meeting. Due to the technical limitations of Quandt's custom-built, floppy-disk-powered FORTRAN system, it was necessary for Goldstein to scrawl the results onto a giant chalkboard, eliciting murmurs of disapproval from the audience over his poor handwriting.^[16]

Results

White wines

“Votes against” in the Ashenfelter-Quandt methodology are indicated here. (The maximum possible score in this tasting would have been 9, and the minimum 90.) Only one wine was significantly better, statistically, than the other wines: the Beaune 1er Cru Clos de Mouches 2010, the cheapest of the four white Burgundies in the lot. The rest of the wines were statistically indistinguishable from each other based on the data, meaning that no conclusions can be drawn from the rankings of wines #2 to #10.^[16]

Significantly better than the other wines:

Rank	Votes Against	Winery	Wine	Vintage	Origin
1.	33.5	Joseph Drouhin	Beaune Clos des Mouches	2009	France

Not statistically distinguishable from each other:

Rank	Votes Against	Winery	Wine	Vintage	Origin
2.	38	Unionville Vineyards	Pheasant Hill Chardonnay	2010	New Jersey
3.	45.5	Heritage Vineyards	Chardonnay	2010	New Jersey
4.	47.5	Silver Decoy Winery	Black Feather Chardonnay	2010	New Jersey
5.	52	Domaine Leflaive	Puligny-Montrachet	2009	France
6. (tie)	53	Bellview Winery	Chardonnay	2010	New Jersey
6. (tie)	53	Marc-Antonin Blain	Bâtard-Montrachet Grand Cru	2009	France
8.	54.5	Amalthea Cellars	Chardonnay	2008	New Jersey
9.	57.5	Ventimiglia Vineyard	Chardonnay	2010	New Jersey
10.	60.5	Jean Latour-Labille	Meursault-Charmes Premier Cru	2008	France

Red wines

“Votes against” in the Ashenfelter-Quandt methodology are indicated. (The maximum possible score in this tasting would have been 9, and the minimum 90.) The only wine that was significantly worse, statistically, than the other wines was #10, the Four JG’s Cabernet Franc 2008, from New Jersey. The rest of the wines were statistically indistinguishable from each other based on the data, meaning that no conclusions can be drawn from the rankings of wines #1 to #9.^[16]