User:GoldenRing/BRFA Draft


Operator: Bradv

Time filed: 12:49, Tuesday, June 25, 2019 (UTC)

Function overview: Automatically place word count templates on statements at WP:A/R/C and related pages.

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: Not yet

Links to relevant discussions (where appropriate): This has been discussed on the clerks-l mailing list with no objections.

Edit period(s): Continuous.

Estimated number of pages affected: One or two pages per arbitration case.

Namespace(s): Wikipedia

Exclusion compliant (Yes/No): No

Function details: This bot will edit only pages in the (new) category Category:ArbCom pages with automatic evidence length headers. It will run every time a page in that category is edited; if that is not acceptable, running once every 24 hours would be a reasonable fallback. Initially this category will contain only Wikipedia:Arbitration/Requests/Case, but we envisage it soon being used on case evidence pages.

Each time a page in that category is edited, the bot will retrieve both the wikitext of that page and the rendered HTML of the same version.
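
As a rough illustration of the retrieval step (the bot's source is not yet published, so this is only a sketch): the MediaWiki Action API's action=parse module can return both the wikitext and the rendered HTML of a given revision in one request. The function name parse_request_url and the choice of JSON format options are assumptions made for this example.

```python
from urllib.parse import urlencode

# English Wikipedia's Action API endpoint.
API_URL = "https://en.wikipedia.org/w/api.php"


def parse_request_url(revid: int) -> str:
    """Build a request URL for the HTML and wikitext of one revision.

    Hypothetical helper: it only builds the URL and sends nothing.
    """
    params = {
        "action": "parse",
        "oldid": revid,            # parse exactly this revision
        "prop": "text|wikitext",   # rendered HTML plus raw wikitext
        "format": "json",
        "formatversion": 2,
    }
    return API_URL + "?" + urlencode(params)
```

Requesting both properties for the same oldid keeps the HTML and the wikitext in step, even if the page is edited again while the bot is working.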

Working on the HTML text:

  • For each h3 tag in the page:
    • If the h3 tag has the wordcount-ignore class, skip this section.
    • Otherwise, extract all DOM elements and text between the h3 tag and the next h3 tag, or the end of the page if there is no next h3 tag.
    • Count the visible words in the extracted elements and text. For the purposes of counting words, the following are ignored:
      • Any tag that jQuery considers :hidden.
      • Any tag with the wordcount-ignore class.
      • Any <div> tag with the siteSub, contentSub or jump-to-nav classes.
      • Any <span> tag with the localcomments class.
      • Any string matching the regular expression (\d{1,2}):(\d{2}), (\d{1,2}) ([A-Z][a-z]+) (\d{4}) \(UTC\) (this removes timestamps).
    • Count the diffs in the extracted elements and text. Initially this would be any wikilink matching Special:Diff/[0-9]+ or any URL with a host matching en(\.m)?\.wikipedia\.org and a diff= or oldid= parameter, though I anticipate that this will require some tweaking.
    • If the extracted text already contains a {{ArbCom evidence length header}} template, update the word parameter with the new word count and the diff parameter with the new diff count.
    • If the extracted text does not already contain a {{ArbCom evidence length header}} and the word count exceeds 450, add the template with the calculated word count and diff count.
  • Save the page.
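
The steps above could be sketched roughly as follows. This is not the bot's actual code: it uses only the Python standard library, assumes well-formed HTML as MediaWiki emits it, and leaves out the jQuery :hidden check, which needs a rendering engine. All function and class names are hypothetical, and the template syntax with named word and diff parameters is an assumption based on the description above.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs

TIMESTAMP_RE = re.compile(r"\d{1,2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4} \(UTC\)")
DIFF_PATH_RE = re.compile(r"^/wiki/Special:Diff/[0-9]+")
WIKI_HOSTS = {"en.wikipedia.org", "en.m.wikipedia.org"}
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def is_ignored(tag, classes):
    """True if the element and everything inside it is excluded from counts."""
    if "wordcount-ignore" in classes:
        return True
    if tag == "div" and classes & {"siteSub", "contentSub", "jump-to-nav"}:
        return True
    return tag == "span" and "localcomments" in classes


def is_diff_link(href):
    """True for Special:Diff wikilinks and for URLs with diff= or oldid=."""
    url = urlparse(href)
    if not url.netloc:
        return bool(DIFF_PATH_RE.match(url.path))
    if url.netloc in WIKI_HOSTS:
        query = parse_qs(url.query)
        return "diff" in query or "oldid" in query
    return False


def count_words(text):
    """Count whitespace-separated words after removing signature timestamps."""
    return len(TIMESTAMP_RE.sub(" ", text).split())


class SectionCounter(HTMLParser):
    """Collect visible text and diff links for each h3-delimited section."""

    def __init__(self):
        super().__init__()
        self.sections = []
        self.current = None
        self.in_heading = False
        self.skip_depth = 0  # nesting depth inside an ignored element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        if tag == "h3":
            self.current = {"heading": "", "text": [], "diffs": 0,
                            "ignored": "wordcount-ignore" in classes}
            self.sections.append(self.current)
            self.in_heading, self.skip_depth = True, 0
        elif tag in VOID_TAGS:
            return  # void elements never get a matching end tag
        elif self.skip_depth:
            self.skip_depth += 1
        elif is_ignored(tag, classes):
            self.skip_depth = 1
        elif tag == "a" and self.current is not None:
            if is_diff_link(attrs.get("href") or ""):
                self.current["diffs"] += 1

    def handle_endtag(self, tag):
        if tag == "h3":
            self.in_heading = False
        elif tag not in VOID_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.current is None or self.skip_depth:
            return
        if self.in_heading:
            self.current["heading"] += data
        else:
            self.current["text"].append(data)


def summarize(html):
    """Return (heading, word count, diff count) for each counted section."""
    parser = SectionCounter()
    parser.feed(html)
    parser.close()
    return [(s["heading"].strip(), count_words("".join(s["text"])), s["diffs"])
            for s in parser.sections if not s["ignored"]]


def header_for(words, diffs, already_present, threshold=450):
    """Return the template text to write, or None if no header is needed."""
    if already_present or words > threshold:
        return "{{ArbCom evidence length header|word=%d|diff=%d}}" % (words, diffs)
    return None
```

Here summarize would be given the rendered HTML, and header_for decides what, if anything, to write into the matching section of the wikitext. Matching HTML sections back to wikitext sections is not shown.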

Discussion

I hope this will be relatively uncontroversial, but I'm certainly open to suggestions of how the above logic might be flawed or produce undesired results. The intention here is to lighten the load on the arbitration clerks. We do not anticipate automated notifications of users who exceed the word count; IMO this is better done by a person. GoldenRing (talk) 12:49, 25 June 2019 (UTC)
