Reference class problem

In statistics, the reference class problem is the problem of deciding what class to use when calculating the probability applicable to a particular case.

For example, to estimate the probability of an aircraft crashing, we could refer to the frequency of crashes among various different sets of aircraft: all aircraft, this make of aircraft, aircraft flown by this company in the last ten years, etc. In this example, the aircraft for which we wish to calculate the probability of a crash is a member of many different classes, in which the frequency of crashes differs. It is not obvious which class we should refer to for this aircraft. In general, any case is a member of very many classes among which the frequency of the attribute of interest differs. The reference class problem is the question of which of these classes is the most appropriate to use.
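
The aircraft example can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the `Flight` record, the make and operator labels, the `crash_frequency` helper, and above all the records themselves, which are invented and deliberately tiny. It only shows that the same aircraft receives a different estimate depending on which class the frequency is taken over.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flight:
    make: str
    operator: str
    crashed: bool


# Invented records, for illustration only (not real accident data).
FLIGHTS = (
    [Flight("A", "X", False)] * 3 + [Flight("A", "X", True)]
    + [Flight("A", "Y", False)] * 4
    + [Flight("B", "X", True), Flight("B", "X", False)]
)


def crash_frequency(flights, **required):
    """Relative frequency of crashes among flights matching every required attribute."""
    members = [
        f for f in flights
        if all(getattr(f, name) == value for name, value in required.items())
    ]
    if not members:
        raise ValueError("empty reference class")
    return sum(f.crashed for f in members) / len(members)


# The same make-A aircraft flown by operator X belongs to all four classes.
for label, criteria in [
    ("all aircraft", {}),
    ("this make", {"make": "A"}),
    ("this operator", {"operator": "X"}),
    ("this make and operator", {"make": "A", "operator": "X"}),
]:
    print(f"{label}: {crash_frequency(FLIGHTS, **criteria):.3f}")
```

Nothing in the data says which of the four figures is the right one for the individual aircraft; that choice is exactly what the reference class problem asks about.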

More formally, many arguments in statistics take the form of a statistical syllogism:

  1. $X$ proportion of $F$ are $G$
  2. $I$ is an $F$
  3. Therefore, the chance that $I$ is a $G$ is $X$

$F$ is called the "reference class", $G$ is the "attribute class", and $I$ is the individual object. How is one to choose an appropriate class $F$?

In Bayesian statistics, the problem arises as that of deciding on a prior probability for the outcome in question (or when considering multiple outcomes, a prior probability distribution).

History

John Venn stated in 1876 that "every single thing or event has an indefinite number of properties or attributes observable in it, and might therefore be considered as belonging to an indefinite number of different classes of things", leading to problems with how to assign probabilities to a single case. He used as an example the probability that John Smith, a consumptive Englishman aged fifty, will live to sixty-one.[1]

The name "problem of the reference class" was given by Hans Reichenbach, who wrote, "If we are asked to find the probability holding for an individual future event, we must first incorporate the event into a suitable reference class. An individual thing or event may be incorporated in many reference classes, from which different probabilities will result."[2]

There has also been discussion of the reference class problem in philosophy[3] and in the life sciences, e.g., clinical trial prediction.[4]

Legal applications

Applying Bayesian probability in practice involves assessing a prior probability which is then applied to a likelihood function and updated through the use of Bayes' theorem. Suppose we wish to assess the probability of guilt of a defendant in a court case in which DNA (or other probabilistic) evidence is available. We first need to assess the prior probability of guilt of the defendant. We could say that the crime occurred in a city of 1,000,000 people, of whom 15% meet the requirements of being the same sex, age group and approximate description as the perpetrator. That suggests a prior probability of guilt of 1 in 150,000. We could cast the net wider and say that there is, say, a 25% chance that the perpetrator is from out of town, but still from this country, and construct a different prior estimate. We could say that the perpetrator could come from anywhere in the world, and so on.
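
A minimal sketch of how the choice of reference class feeds through Bayes' theorem follows. The city figures (1,000,000 people, 15% matching the description) come from the paragraph above; the likelihood ratio assigned to the DNA evidence and the 50,000,000-person "country" class are hypothetical values chosen only to show the effect, and the function names are illustrative.

```python
def prior_of_guilt(population: int, matching_fraction: float) -> float:
    """Uniform prior over everyone in the population who fits the description."""
    pool = round(population * matching_fraction)
    if pool < 2:
        raise ValueError("reference class must contain at least two people")
    return 1 / pool


def posterior_of_guilt(prior: float, likelihood_ratio: float) -> float:
    """Bayes' theorem in odds form: posterior odds = prior odds * likelihood ratio."""
    if not 0 < prior < 1:
        raise ValueError("prior must lie strictly between 0 and 1")
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Assumption: the evidence is a million times more likely if the defendant is guilty.
ASSUMED_LIKELIHOOD_RATIO = 1_000_000

city_prior = prior_of_guilt(1_000_000, 0.15)       # 1 in 150,000, as in the text
country_prior = prior_of_guilt(50_000_000, 0.15)   # hypothetical wider class

for label, prior in [("city", city_prior), ("country", country_prior)]:
    posterior = posterior_of_guilt(prior, ASSUMED_LIKELIHOOD_RATIO)
    print(f"{label}: prior = {prior:.2e}, posterior = {posterior:.3f}")
```

With the same evidence, the wider reference class gives a much lower posterior probability of guilt, so the conclusion depends heavily on a choice the evidence itself does not settle.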

Legal theorists have discussed the reference class problem particularly with reference to the Shonubi case. Charles Shonubi, a Nigerian drug smuggler, was arrested at JFK Airport on December 10, 1991, and convicted of heroin importation. The severity of his sentence depended not only on the amount of drugs on that trip, but also on the total amount of drugs he was estimated to have imported on seven previous occasions on which he was not caught. Five separate legal cases debated how that amount should be estimated. In one case, "Shonubi III", the prosecution presented statistical evidence of the amount of drugs found on Nigerian drug smugglers caught at JFK Airport in the period between Shonubi's first and last trips. There has been debate over whether that is the (or a) correct reference class to use, and if so, why.[5][6]

Other legal applications involve valuation. For example, houses might be valued using the data in a database of house sales of "similar" houses. To decide on which houses are similar to a given one, one needs to know which features of a house are relevant to price. Number of bathrooms might be relevant, but not the eye color of the owner. It has been argued that such reference class problems can be solved by finding which features are relevant: a feature is relevant to house price if house price covaries with it (it affects the likelihood that the house has a higher or lower value), and the ideal reference class for an individual is the set of all instances which share with it all relevant features.[7][8]
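
The feature-selection idea can be sketched as follows. This is a rough illustration under stated assumptions, not the method of the cited papers: the house records are invented, prices are in arbitrary units, "covaries with price" is approximated by a crude comparison of group means, and the threshold of 25 is arbitrary.

```python
from statistics import mean

# Invented sales records, for illustration only; prices in arbitrary units.
HOUSES = [
    {"bathrooms": 1, "owner_eye_color": "brown", "price": 200},
    {"bathrooms": 1, "owner_eye_color": "blue", "price": 200},
    {"bathrooms": 1, "owner_eye_color": "brown", "price": 210},
    {"bathrooms": 1, "owner_eye_color": "blue", "price": 190},
    {"bathrooms": 2, "owner_eye_color": "brown", "price": 300},
    {"bathrooms": 2, "owner_eye_color": "blue", "price": 300},
    {"bathrooms": 2, "owner_eye_color": "brown", "price": 290},
    {"bathrooms": 2, "owner_eye_color": "blue", "price": 310},
]


def mean_price_spread(houses, feature):
    """Gap between the highest and lowest mean price across the feature's values."""
    groups = {}
    for house in houses:
        groups.setdefault(house[feature], []).append(house["price"])
    means = [mean(prices) for prices in groups.values()]
    return max(means) - min(means)


def relevant_features(houses, features, threshold):
    """Keep features whose values go with a noticeable difference in mean price."""
    return [f for f in features if mean_price_spread(houses, f) > threshold]


def reference_class(houses, target, features):
    """All houses that share every relevant feature with the target house."""
    return [h for h in houses if all(h[f] == target[f] for f in features)]


target = {"bathrooms": 2, "owner_eye_color": "blue"}
relevant = relevant_features(HOUSES, ["bathrooms", "owner_eye_color"], threshold=25)
similar = reference_class(HOUSES, target, relevant)
estimate = mean(h["price"] for h in similar)
print(relevant, len(similar), estimate)
```

In this toy data the number of bathrooms separates prices and eye color does not, so the reference class for the target is every two-bathroom house regardless of the owner's eye color.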

References

  1. ^ J. Venn, The Logic of Chance (2nd ed., 1876), p. 194.
  2. ^ H. Reichenbach, The Theory of Probability (1949), p. 374.
  3. ^ A. Hájek, The Reference Class Problem is Your Problem Too, Synthese 156 (2007): 185–215.
  4. ^ Atanasov, Pavel D.; Joseph, Regina; Feijoo, Felipe; Marshall, Max; Siddiqui, Sauleh (2021-12-09). "Human Forest vs. Random Forest in Time-Sensitive COVID-19 Clinical Trial Prediction". SSRN Electronic Journal. Rochester, NY. SSRN 3981732.
  5. ^ M. Colyvan, H.M. Regan and S. Ferson, Is it a crime to belong to a reference class?, Journal of Political Philosophy 9 (2001): 168–181.
  6. ^ Tillers, Peter (2005). "If wishes were horses: discursive comments on attempts to prevent individuals from being unfairly burdened by their reference classes". Law, Probability and Risk. 4 (1–2): 33–49. doi:10.1093/lpr/mgi001.
  7. ^ Franklin, James (Mar 2010). "Feature selection methods for solving the reference class problem" (PDF). Columbia Law Review Sidebar. 110. Retrieved 30 June 2021.
  8. ^ Franklin, James (2011). "The objective Bayesian conceptualisation of proof and reference class problems". Sydney Law Review. 33: 545–561. Retrieved 30 June 2021.