User:Improv/PWPD

Intent

Create a subset of Wikipedia articles, with or without images, for use on a particular set of embedded devices, along with software to browse them.

Issues

  • What content should be included?
    1. There are constraints on both the total size and the professionalism of the content.
    2. There are other projects that attempt to do the job, but they have a poor idea of what should even be on Wikipedia, much less on a static dump of it.
    3. Getting unvandalised versions of content is important. If "vetted" versions can be made, that would be even better.
    4. This is an ongoing project: I expect people to update the version on their devices every so often.
    5. Featured and Frontpaged articles have completely dropped the ball when it comes to keeping things encyclopedic. They're not at all useful for this project.
    6. My first intuition is to start with Portal:List, cutting out portals that are less encyclopedic (like Television, Video Games, Pokémon, Nudity, and Pornography) and using the remaining portals as pathways to appropriate, well-written content. I need to find a good way to do this. Perl will probably be my friend.
    7. I will probably make two versions of the content dumps: one with images and one without.
      • I intend to exclude fair-use images.
  • How should that content be acquired from Wikipedia?
    1. My initial thought is that combining database dumps with wget (for images) will be appropriate.
    2. I should research better ways to get images. Automagically handling license issues is important.
  • What format should the content take on the devices?
    1. Database? I need to see what (if any) databases are available on the system when the prototype hardware is ready. Postgres would be ideal.
    2. I need to write software to browse whatever format I choose. I don't think this will be too hard -- I can probably reuse the Wikiparser from POUND, modifying it slightly to be more compliant with MediaWiki's syntax.
  • What formalities need to be observed to keep this legal?
    1. Avoiding fair use gets me part of the way there.
    2. Do I just need a list of contributors from the history pages, or do I need more?
    3. Getting the disclaimer right is important. I don't want to put the company giving me the prototype at legal risk either.
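
As a rough illustration of the selection step in the notes above (dropping the less encyclopedic portals and excluding fair-use images), here is a minimal sketch. It is written in Python rather than the Perl mentioned above, and everything in it is an assumption made for illustration: the names select_portals, free_images and Image are hypothetical, the license labels are made up, and real portal and license data would have to come from the database dumps.

```python
from dataclasses import dataclass

# Portals the notes single out as less encyclopedic. Treating them as a
# hand-maintained exclusion set is an assumption of this sketch.
EXCLUDED_PORTALS = {"Television", "Video Games", "Pokémon", "Nudity", "Pornography"}


@dataclass(frozen=True)
class Image:
    """Hypothetical record for one image and its license label."""
    filename: str
    license: str  # illustrative labels only, e.g. "GFDL", "PD", "fair-use"


def select_portals(portals, excluded=EXCLUDED_PORTALS):
    """Return the portals to use as pathways, skipping excluded names.

    Matching is case-insensitive; the input order is preserved.
    """
    blocked = {name.casefold() for name in excluded}
    return [p for p in portals if p.casefold() not in blocked]


def free_images(images):
    """Keep only images whose license label is not fair use."""
    return [img for img in images if img.license.casefold() != "fair-use"]


if __name__ == "__main__":
    portals = ["Biology", "Television", "History", "video games"]
    print(select_portals(portals))  # ['Biology', 'History']

    images = [Image("cell.png", "GFDL"), Image("logo.jpg", "fair-use")]
    print([img.filename for img in free_images(images)])  # ['cell.png']
```

Keeping the exclusion list as plain data means the list of portals can be revised by hand without touching the code, which suits a project that is meant to be re-run for periodic updates.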

Interested People

Feel free to add yourself - much of what I'm doing may be of use on other devices, and once the device I'm targeting is released (or if you have prototype hardware too), I'll be glad to have company. I hope that interested parties have something useful to contribute to the project - if you can help with any of the above, that would be fantastic.
