Threads older than 40 days may be archived by MiszaBot II.
Sorting within river basin categories
An editor has proposed and partially implemented a novel system of ordering entries in categories for river drainage basins. For an explanation of the system, see the headnote at Category:Thames drainage basin.
Although WP:SORTKEY does not claim to be an exhaustive list of possible sorting methods, the proposed system seems out of line with the guidelines there, because it relies on numbers (and letters) which are not part of article titles. It seems to me that the system makes it much more difficult to use the categories as navigational aids. For a discussion see Wikipedia_talk:WikiProject_Rivers#Sorting_in_drainage_basin_categories.
Any views?--Mhockey (talk) 21:32, 23 April 2016 (UTC)
- @Mhockey: Yes, but per WP:MULTI, please can we keep discussion in one place? --Redrose64 (talk) 22:56, 23 April 2016 (UTC)
We have a vast body of precedent that sub-cats may have shifts in topic, e.g. as discussed at Wikipedia_talk:WikiProject_Categories#A_big_problem_with_our_category_structures. For example, food- and drink-related organizations cannot be ingested, yet Category:Food- and drink-related organizations is an accepted sub-category of Category:Food and drink.
I recently asked James Michael DuPont (talk) why he set up Category:Open content companies as "not a subcat of Category:Open content". He replied that "a company is a subcat of company, not of content"… "Just because it is related to does not make it a sub category" and referred me to the mathematical article Subcategory. When I countered with the food example, he asked whether this is policy.
At present, the guidance WP:SUBCAT does not seem to cover the shifts in meaning to closely related topics. I propose that it be rewritten to match longstanding practice in English Wikipedia.
CN1, would you be able to help, please? – Fayenatic London 22:35, 5 May 2016 (UTC)
- I think it is purely a technical problem how you manage the categories. If your tool makes it hard to manage facets of the category system, then you will find it difficult. I propose that you have "people in open source" as a related category. Wikidata will provide some help eventually. James Michael DuPont (talk) 09:00, 1 June 2016 (UTC)
- I have no experience in writing guidelines for Wikipedia but here are some thoughts:
- I would understand that Category:Open content companies is not allowed into a category called Category:Open content works, or Category:Open content products but Category:Open content does not specify in that or any direction.
- It can be understood as two things: (1) content, which is open source or (2) the concept of "open content". This is also literally what the first paragraph of Open content says.
- (2)'s way of understanding the name is broader - let's call it a topic category.
- (1)'s way of understanding the cat is a register of open content objects e.g. films. - a register/object category.
- Compare Category:Stone and Category:Stone objects
- The difference between topic and register category is that the register never widens the scope of the category's subject/item/topic "X". In an all-pages category it lists all items of "X". In a cat with subcategories it breaks "X" down by sort keys (e.g. components, nationality, time period). It holds items of X.
- Topic categories go beyond the scope of their subject/item/topic "X" via intersections of "X" and another subject/item/topic which can be anything. It holds items about/to X, e.g. Category:Water and society or a one-topic-cat Category:Water. A type of topic category are eponymous categories: Wikipedia:Categorization#Eponymous_categories.
- CN1 (talk) 23:23, 7 May 2016 (UTC)
That kind of bizarrely rigid misunderstanding of categories comes up like spring weeds every year. The category system is created and maintained by editors, primarily to facilitate navigation, and to connect related articles and topics. It is not and never has been a strict classificatory hierarchy. We do no one any favors by imposing arbitrary roadblocks and deadends in this system just to satisfy some ultimately unrelated concept such as what is "mathematically" proper (I've also heard we "must" do it a certain way "because set theory"). Doing it that way, we'd end up with thousands of disconnected little category ladders you could only go up and down rather than a network that connects everything. postdlf (talk) 23:57, 7 May 2016 (UTC)
- What exactly do you think is a misunderstanding? Whether you like it or not, mathematics is the field that describes the logic (and provides the language for discussion) of categorization. If you have an idea for a different way of doing wp categorization (e.g. without WP:SUBCAT) then you can try to explain it - but, I suspect it'd end up as something that is less useful (because it would be less rigorous) and more time-consuming to maintain (more "flexibility" would mean more disagreements between editors) than the current system. We already have a system to provide navigation between articles about related topics (in an unstructured way) - normal links between articles; categorization is not intended just to duplicate that. If two categories are sufficiently related that it is useful to have a direct navigational link between them, but the two categories don't belong in a parent-child relationship, then just create a link between them (an extreme example being Category:Georgia (country) and Category:Georgia (U.S. state) where the "relationship" is just similarity of name). DexDor (talk) 22:01, 9 May 2016 (UTC)
- We've had this conversation before (and that discussion references this earlier one that you didn't participate in). postdlf (talk) 22:59, 9 May 2016 (UTC)
- Quoting Help:Category: "Categories are intended to group together pages on similar subjects. [..] Categories help readers to find, and navigate around, a subject area, to see pages sorted by title, and to thus find article relationships."
- In Wikipedia:FAQ/Categorization#What_is_the_purpose_of_categories.3F we read: "There are two main ways to use categories: lists and topics.".
- Topic categories are less rigid and more time consuming than lists, but one should not automatically take the line of least resistance.
- There are always disagreements but they are solved by discussions and new guidelines.
- CN1 (talk) 14:23, 12 May 2016 (UTC)
- Making categorisation more rigid will not make it more useful. Rathfelder (talk) 20:37, 11 May 2016 (UTC)
A big part of the category system already consists of topic categories. Fayenatic is right when he says that despite their importance and presence, they are mentioned very little in the category policy page and we need to formulate guidelines for how to use them or at least a description of them.
@Fayenatic london: The following is what I imagine for a paragraph in Wikipedia:Categorization#Category_tree_organization.
Topic categories
Topic categories are categories whose relationship to their subcategories isn't an is-a relationship, but a belongs-to relationship.
The presence of an is-a relationship can be objectively determined but it is not so easy to assess if a belongs-to relationship is justified.
Every belongs-to relationship has to be assessed individually.
This is why it's always a good first step to identify the nature of the subcategories in question, which means to ask if they are themselves a topic or a set category.
- Category:History — topic cat
- belongs-to relationship with other topic categories
- Category:Philosophy of history — links topics history & philosophy
- Category:Historic preservation — links topics history & preservation
- Category:History education — links topics history & education
- belongs-to relationship with set categories
- Category:Historians — links topic history & set category system branch people
- Category:History awards — links topic history & set category system branch awards
- Category:Historical eras — links topic history & set category system branch eras
- Relationships between topic categories are justified if the topic of the parent cat plays a large role in the topic of the sub cat.
- Relationships between a topic category and a set category of the same topic are always justified.
- Set categories have two traits: (I.) object type, which they register, and (II.) topic.
- The set category Category:Historians has the object type people and the topic history.
- Another example is Category:Food and Category:Food companies.
Questionable belongs-to relationship
- There are also harder cases.
- It was judged by a user that Category:September 11 attacks doesn't belong in Category:Presidency of George W. Bush.
- This is because one can hardly know if the presidency of George W. Bush played a large role in the attacks.
- Still, the article September 11 attacks is listed as a page in that category. This is a good way to mention related topics whose category is not suited for a belongs-to relationship.
- The best place to ask for opinions on questionable belongs-to relationships is Wikipedia:CFD.
CN1 (talk) 18:14, 29 May 2016 (UTC)
- An excessively rigid approach to categorization will not work because the vast majority of users will be unaware of it. It needs to relate to patterns of human understanding. Rathfelder (talk) 07:31, 30 May 2016 (UTC)
- As Fayenatic london said, it's simply unbearable to not have categorization described in the Wikipedia policy because otherwise people will ask: "Well, is this even a rule?"
- So far, only set categories (= object categories) are described.
- I wish for criticism, but please be specific about what you don't think fits the nature of topic categories or maybe even add something on your own.
- As the last example showed, I'm unable to postulate rigid rules for some cases, but it is possible for some others.
- Where it is not possible, examples and descriptions can help inexperienced users.
- CN1 (talk) 11:09, 30 May 2016 (UTC)
Order of categories on a page?
Is there any guidance to what order the categories should be on a particular page? For example, first the Eponymous categories and then alphabetical? Also, does this same guidance (alphabetical) apply for the categories that a category is subcatted in?Naraht (talk) 20:14, 9 May 2016 (UTC)
- As far as I know... We don't mandate any specific order. Many articles don't bother (so new categories are simply added after those added previously). Others are "organized" following some logical order (alphabetical is certainly logical... But there are other ways to do it). Suggest you look at other articles in the topic area and see what they do. Blueboar (talk) 22:24, 9 May 2016 (UTC)
- Helpful discussions:
Planned change to the implementation of sort keys
From the latest "Tech News":
Future changes
- Today, in categories, a page titled "11" comes before "2". We plan to reorder that: "2" will come before "11". Please tell the developers if the change will cause problems. [1]
Reposting here so that editors interested in category sorting are more likely to see it. -- John of Reading (talk) 03:25, 10 May 2016 (UTC)
- Yay! Good Ol’factory (talk) 03:52, 10 May 2016 (UTC)
- Sounds logical. We also demanded that in a program we had written for our firm. Debresser (talk) 12:56, 10 May 2016 (UTC)
- I suppose this means that instead of page names '+', followed by pagenames that begin with '0', '1', ... '9', 'A', 'B', we will see all pagenames that begin with a numeral grouped together under '#'?
- Evidently I don't know the status quo. Category:Deaths by year shows a big group of pages under '#' while Category:National Basketball Association seasons shows a big group under '1' followed by a big group under '2'. (The NBA isn't old enough to have any seasons defined by three-digit years, so numerical sorting makes no difference here.) --P64 (talk) 18:34, 11 May 2016 (UTC)
- @P64: Category:Deaths by year has a lot of pages grouped under "#" because those pages use {{Deaths in century}}, and the code for that template includes the line [[Category:Deaths by year|#{{#ifexpr:{{{1|5}}}<9|0|}}{{#expr:{{{1|5}}}+1}}]]. Ignoring everything from {{#ifexpr: on, the important thing is that |# immediately after the cat name, which forces transcluding pages to sort under "#". Forced sortkeys like this should not change behaviour; it's those that use default sort keys, such as if you used [[Category:Deaths by year]].
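For readers who want to see the reordering from the Tech News item concretely, here is a minimal Python sketch. It is not MediaWiki's code: the function names `uppercase_key` and `numeric_key` are made up for illustration, and the numeric key is only an assumption about what "2 before 11" means (runs of ASCII digits compared as numbers, everything else compared as uppercase text).

```python
import re

def uppercase_key(title):
    # Rough stand-in for the current behaviour: compare titles
    # character by character, ignoring case.
    return title.upper()

def numeric_key(title):
    # Hypothetical numeric sort key: split the title into alternating
    # text and digit runs, and compare the digit runs as integers.
    parts = re.split(r"([0-9]+)", title.upper())
    # re.split with one capturing group puts the digit runs at odd indexes.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

titles = ["11", "2", "100", "99"]
print(sorted(titles, key=uppercase_key))  # ['100', '11', '2', '99']
print(sorted(titles, key=numeric_key))    # ['2', '11', '99', '100']
```

Pages with a forced sort key such as |# are unaffected either way, since the key rather than the title decides where they sort.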
OK to switch English Wikipedia's category collation to uca-default?
In the 2015 Community Wishlist Survey, the 5th most popular proposal was numerical sorting in categories (for example, sort 99 before 100). The WMF Community Tech team is ready to implement this, but a pre-requisite for the change is that we must switch English Wikipedia's category collation from "uppercase" (a simple collation algorithm that sorts strings based on character values, but considers uppercase and lowercase letters the same) to "uca-default" (which is based on the Unicode Collation Algorithm (UCA), the official standard for how to sort Unicode characters). The most noticeable difference is that UCA groups characters with diacritics with their non-diacritic versions. So, for example, English Wikipedia currently sorts Aztec, Ärsenik, Zoo, Aardvark as "Aardvark, Aztec, Zoo, Ärsenik", but UCA collation would sort them as "Aardvark, Ärsenik, Aztec, Zoo" (with Aardvark, Ärsenik, and Aztec grouped under a single "A" heading, instead of under 2 separate headings). There are numerous other advantages to using UCA collation, but they are a bit technical to discuss, so I'll refer you to the documentation instead: [2][3][4]. If you would like to experiment with UCA collation, go to https://ssl.icu-project.org/icu-bin/collation.html. Set the collation to "und (type=standard)" (the default) and turn on numeric sorting in the settings. If anyone has any concerns or questions about switching to UCA, please reply here or in the Phabricator ticket. Thanks! Ryan Kaldari (WMF) (talk) 00:24, 25 May 2016 (UTC)
- Support as proposed above. — xaosflux Talk 00:59, 25 May 2016 (UTC)
- Support. Can't wait for numeric sorting to be implemented. — JJMC89 (T·C) 03:18, 25 May 2016 (UTC)
- Proper collation of diacritics is hardly a disadvantage. —Cryptic 05:18, 25 May 2016 (UTC)
- Support, with seconding Cryptic's comment above Goldenshimmer (talk) 05:41, 25 May 2016 (UTC)
- Comment Currently, all articles are being sorted just as the Unicode Collation Algorithm would do via the DEFAULTSORT parameter. So, diacritics/accent marks aren't currently an issue in articles. With UCA, fewer DEFAULTSORTs will be needed in non-biography articles in the future. However, this will alter most current non-biography talk pages, as |listas= is not set in those, so the uppercase algorithm currently applies to them. If I remember correctly, UCA handles every variant of dash/hyphen, single quote marks and a few others as separate characters, so DEFAULTSORT will still need to be set for those cases. Depending on what "switch" is set in the UCA algorithm, de Gaulle, De Gaulle, de-Gaulle and De-Gaulle will be sorted in different orders. Other wikis have already changed to UCA. French is one of them. Bgwhite (talk) 06:22, 25 May 2016 (UTC)
- Administrative note - Discussion moved from WP:VPT, since this is a better place to discuss category issues. עוד מישהו Od Mishehu 13:25, 25 May 2016 (UTC)
- Support of course!—Odysseus1479 17:58, 25 May 2016 (UTC)
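As an illustration of the Aztec/Ärsenik example in the proposal, the following Python sketch contrasts the two orderings. It is only an approximation under stated assumptions: `uppercase_key` mimics the "uppercase" collation by comparing uppercased code points, and `approx_uca_key` (a hypothetical helper, not the real UCA) simply strips combining marks so that accented letters sort next to their base letters. The actual algorithm uses multi-level weights from the Unicode collation tables and handles far more cases than this.

```python
import unicodedata

def uppercase_key(title):
    # Stand-in for the "uppercase" collation: case-insensitive
    # comparison by character value.
    return title.upper()

def strip_diacritics(text):
    # Decompose characters (NFD) and drop the combining marks,
    # so "Ä" becomes "A".
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def approx_uca_key(title):
    # Very rough approximation of UCA grouping: compare base letters
    # first, and use the original title only as a tie-breaker.
    return (strip_diacritics(title).upper(), title)

titles = ["Aztec", "Ärsenik", "Zoo", "Aardvark"]
print(sorted(titles, key=uppercase_key))   # ['Aardvark', 'Aztec', 'Zoo', 'Ärsenik']
print(sorted(titles, key=approx_uca_key))  # ['Aardvark', 'Ärsenik', 'Aztec', 'Zoo']
```

Both outputs match the orderings quoted in the proposal. For anything beyond a demonstration, the ICU collation demo linked above is the better reference.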