This page is used for discussions of the operations, technical issues, and policies of Wikimedia Commons. Recent sections with no replies for 7 days and sections tagged with {{Section resolved|1=--~~~~}} may be archived; for old discussions, see the archives; the latest archive is Commons:Village pump/Archive/2024/06.
Please note:
If you want to ask why unfree/non-commercial material is not allowed at Wikimedia Commons or if you want to suggest that allowing it would be a good thing, please do not comment here. It is probably pointless. One of Wikimedia Commons’ core principles is: "Only free content is allowed." This is a basic rule of the place, as inherent as the NPOV requirement on all Wikipedias.
Any answers you receive here are not legal advice and the responder cannot be held liable for them. If you have legal questions, we can try to help but our answers cannot replace those of a qualified professional (i.e. a lawyer).
Your question will be answered here; please check back regularly. Please do not leave your email address or other contact information, as this page is widely visible across the internet and you are liable to receive spam.
Purposes which do not meet the scope of this page:
Most of these categories contain no media of their own, but subcategories of characters (that are often played by multiple actors), and the structure is often circular in nature (e.g. the category "Whoopi Goldberg" has the subcategory "Whoopi Goldberg characters", which has the subcategory "Shenzi", which has the subcategory "Whoopi Goldberg"). Most if not all of these were made by the same IP user who created a huge amount of category spam in Category:Space Jam, Category:Mickey Mouse and a bunch of others.
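As an illustration of the circular structure described above, here is a minimal sketch of how such a loop could be detected. It assumes the subcategory relationships have already been collected into a plain dictionary; the function name `find_cycle` and the dictionary layout are hypothetical and are not part of any Commons tool.

```python
from typing import Dict, List, Optional


def find_cycle(subcats: Dict[str, List[str]], start: str) -> Optional[List[str]]:
    """Return a path start -> ... -> start if following subcategories
    leads back to the starting category, otherwise None."""
    seen = set()            # categories already entered by the search
    path: List[str] = []    # current chain of categories from `start`

    def dfs(cat: str) -> Optional[List[str]]:
        seen.add(cat)
        path.append(cat)
        for child in subcats.get(cat, []):
            if child == start:
                return path + [start]      # the chain loops back to the start
            if child not in seen:
                found = dfs(child)
                if found:
                    return found
        path.pop()
        return None

    return dfs(start)


# The example from the post above, written as parent -> subcategories.
example = {
    "Whoopi Goldberg": ["Whoopi Goldberg characters"],
    "Whoopi Goldberg characters": ["Shenzi"],
    "Shenzi": ["Whoopi Goldberg"],
}
cycle = find_cycle(example, "Whoopi Goldberg")
```

For the example data, `cycle` lists the three categories followed by the starting category again. A tree without such a loop yields `None`.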
I don't think this category tree structure is inherently invalid, but I feel it's mis-applied and excessive in most of these cases. I'd like to hear more people's thoughts on this before I take this to CfD though. ReneeWrites (talk) 19:19, 28 May 2024 (UTC)[reply]
The whole thing seems rather ambiguous and pointless. Like the parent is called "Film characters" but then the subcategories aren't even characters. Or maybe they are. Is a category like that supposed to be for "characters of Chris Rock" or "Characters played by Chris Rock"? It's not really clear. Then on top of it a lot of the sub-categories only contain one child category but no files, which I'm not really a fan of. --Adamant1 (talk) 19:49, 28 May 2024 (UTC)[reply]
I think this category structure is invalid, and these categories should be deleted. The purpose of categories on Commons is fundamentally to categorize media files. These categories don't organize media; instead, they attempt to represent abstract relationships between subjects. But that's what we have Wikidata for! We don't need to create a clumsy imitation of it on this site.
The same probably goes for the following categories, at a minimum:
Commons is not the place for this. Al Capone is not defined by Alec Baldwin and neither is Alec Baldwin defined by Al Capone. All of these categories should be deleted. The only place this data should be presented is in Wikipedia. Wikidata might hold the names of movies and their casts; however, that again is held in Wikipedia. We are not a repository of facts; we hold files, last time I looked. Only recently we had to go through this nonsense with film locations. Broichmore (talk) 12:20, 4 June 2024 (UTC)[reply]
The main question to solve is: where to place a picture of actor x playing the character y in the film z? In the three categories for each of these. Enhancing999 (talk) 17:27, 9 June 2024 (UTC)[reply]
Under the actor, the character (if we have such a category), and (if that character is not a subcat of the film) the film. If we have more than a handful of such images for the same actor in the same film, then we can make a subcat bringing the three together. - Jmabel ! talk23:56, 9 June 2024 (UTC)[reply]
Laughably, Liam Neeson apparently played Hitler in a mini series. If we had a picture of him dressed up as Hitler, I certainly don’t want to see him filed under Hitler. Yes, under category Liam Neeson, and if we had one, for the TV series. The latter should only have pix of the actual production, and in 121 years' time, stills from the show, and the film itself. Broichmore (talk) 17:02, 16 June 2024 (UTC)[reply]
Every single "[Character] interpreters" category is like that. You can find them all compiled in Category:Actors by role and a CfD I opened for it last month at Discussion: Category:Actors by role where I made a similar argument about how to subcategorize actors playing specific characters. I also argued for all the "[Character] interpreters" categories to be deleted. ReneeWrites (talk) 08:03, 17 June 2024 (UTC)[reply]
This is not that different from actors being subcategorized with characters they've played in films (see Category:Actors by role). It's actually the same problem. Broichmore put it really well in a comment he made earlier. To borrow his Al Capone example, Al Capone's not an appropriate subcategory of Alec Baldwin, nor is Alec Baldwin an appropriate subcategory of Al Capone. But once we have media on Commons of Alec Baldwin as Al Capone it can be added to both those categories. If we have enough media to justify creating a category for it on its own it can be named something like "Category:Alec Baldwin as Al Capone" and be made a subcategory for both of them. ReneeWrites (talk) 18:22, 20 June 2024 (UTC)[reply]
@M F Gervais: It is there and it is functional; however, because it is so big and unwieldy as a PDF, it takes a while to render, especially when it has to develop the image cache first:
Now because PDFs are typically multipage documents, they can need extra formatting if you are trying to display them through standard wiki formatting; see mw:help:images. PDFs should not be used if you want to display an image; please upload an image file per Com:File types — Preceding unsigned comment added by Billinghurst (talk • contribs) 07:59, 1 June 2024 (UTC)[reply]
Special:UncategorizedCategories is back over 1000 categories. If you can add appropriate parent categories to any of the many that have otherwise reasonable content, that would be very helpful. If you're not an admin, don't worry about the empty ones; one or another admin will eventually find those and delete them. - Jmabel ! talk 06:05, 5 June 2024 (UTC)[reply]
Now up to 1165 categories. I have the feeling almost no one is addressing this. I've done literally thousands, probably over 5000, and while I still try to do 50 or so per week, that is not enough to keep up. - Jmabel ! talk17:53, 10 June 2024 (UTC)[reply]
@Jmabel: I did a few today and about once a month a small portion.
Question Is it a good idea to have the numbers next to the categories, like in other pages? Then you can see at a glance which are empty and not bother about them (for non-admins like me) or be able to remove them in quick succession (for admins). JopkeB (talk) 07:50, 15 June 2024 (UTC)[reply]
Wouldn't that be nice? As it is, in case you haven't noticed, you can hover over the link and see whether there are files in the category. - Jmabel ! talk14:56, 15 June 2024 (UTC)[reply]
No I did not notice and I still do not see it. When I hover over a link, I see only the category name, for instance Category:Activité supplémentaire bibliothéquaire, but nothing else. JopkeB (talk) 07:31, 18 June 2024 (UTC)[reply]
Weird that you get a different behavior than I do. So you don't get a thing like this in the popup?
Category:Queensferry Parish Church, South Queensferry ⋅ actions ⋅ popups
44 bytes, 0 wikiLinks, 0 images, 0 categories, 1 week 4 days old
----
Queensferry Parish Church, South Queensferry
Category members (5 shown)
File:Queensferry Parish Church 01.jpg, File:Queensferry Parish Church 02.jpg, File:Queensferry Parish Church 03.jpg, File:Queensferry Parish Church 04.jpg, File:South Queensferry Townscape , Queensferry Parish Church - geograph.org.uk - 3034870.jpg
RFC: Automatic categorisation both bane and gain; work needed to identify source of categorisation
Hi. Having been involved in large amounts of tidying over the years, I think we are starting to get to an administrative burden from automatic categorisation where it is going wrong. Our use of complex and layered templates that directly apply categories, e.g. Template:Topic by country, or the inhalation of categories based on Template:Wikidata infobox, or through Modules, is requiring more and more time and more and more complex knowledge to resolve this (mis)categorisation where it goes wrong, or where it causes issues outside of our criteria.
We need some better technical solutions. We need a direct and overt ability to know the source of the categorisation, be it:
1. direct category in the page
2. template that has local data
3. template that is importing information from wikidata
Some of this sort of exists when one has Com:HotCat as a gadget, though the other two have no ready means to identify the source.
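To make the distinction between the sources concrete, here is a minimal sketch that separates categories written directly in a page's wikitext from those that must have arrived some other way (a template or module). It assumes the caller already has the page's wikitext and the list of categories the page actually ends up in; the function names and the sample page are hypothetical. It cannot tell a template with local data apart from one importing from Wikidata, and it does not skip category links inside comments or nowiki sections.

```python
import re
from typing import Dict, Iterable, List, Set

# Matches [[Category:Name]] and [[Category:Name|sort key]]; a leading colon
# ([[:Category:Name]]) is a plain link, not a categorisation, and is not matched.
CATEGORY_LINK = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]", re.IGNORECASE
)


def normalise(title: str) -> str:
    """Treat underscores as spaces and uppercase the first letter."""
    title = title.replace("_", " ").strip()
    return title[:1].upper() + title[1:]


def direct_categories(wikitext: str) -> Set[str]:
    """Categories that appear literally in the page source."""
    return {normalise(m.group(1)) for m in CATEGORY_LINK.finditer(wikitext)}


def split_by_source(wikitext: str, effective: Iterable[str]) -> Dict[str, List[str]]:
    """Label each effective category as 'direct' or 'indirect'."""
    direct = direct_categories(wikitext)
    result: Dict[str, List[str]] = {"direct": [], "indirect": []}
    for cat in effective:
        key = "direct" if normalise(cat) in direct else "indirect"
        result[key].append(cat)
    return result


# Hypothetical page: two categories typed by hand, one added by a template.
sample_wikitext = (
    "{{Wikidata Infobox}}\n"
    "[[Category:Example person|Person, Example]]\n"
    "[[category:example_occupation]]\n"
)
sample_effective = ["Example person", "Example occupation", "Example births"]
report = split_by_source(sample_wikitext, sample_effective)
```

In this made-up case the report lists "Example births" as indirect, which is the kind of category an editor would then have to trace back to a template or module.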
Categorisation is clearly something where automation is useful and it is not in itself the problem. When it is wrong, and needs a lot of work to resolve, then it moves from problem to big problem.
We also need a better means of getting categorisation fixes resolved for the points in #2 and #3. We need guidance for people on how best to address categorisation that has gone wrong when they don't know how to fix it. Some of that is that we need to review our documentation in the templates to ensure that they have guidance for the appropriate use of the template, and what it actually does, as well as guidance on the appropriate use of the parameters. Template designers/creators need to be involved in that space as an expectation, as do those who put templates through major rewrites. If it is hard to use and hard to understand then the community needs to challenge both its design and its purpose.
If we don't do something the categorisation issues are going to continue to multiply, and the rules that we have in place will be ignored and we will just have mess. I know that I am partly just stating the problem, and not necessarily the solution, however, at this point I am looking for comments about where others think we are, and some general thoughts on how we can address this at a higher level before drilling down into all the solutions. — billinghurstsDrewth00:22, 9 June 2024 (UTC)[reply]
It's probably a side thing, but I have a serious problem with categories being forced on us through infoboxes. Like there's a ton of people who are recipients of minor, non-notable awards that automatically get sorted into categories for said awards and their various sub-awards when it's not really useful to have things categorized down to that small of a level. You can't really do anything about it on our end either. Regardless, we shouldn't have how we categorize things dictated by other projects period. We certainly don't name categories based on standards set by Wikipedia editors, or keep files that violate the guidelines simply because of how other projects do things. -Adamant1 (talk) 00:34, 9 June 2024 (UTC)[reply]
Wikidata Infoboxes provide given name, surname, and birth and death dates, and "living people", which should presumably be uncontroversial. [Similarly, some gender info so it can do "men by name" and "women by name" as well as "people by name". - Jmabel ! talk 01:53, 9 June 2024 (UTC)] I'm not at all sure they should do any other automatic addition of categories, though there may be some others that are equally clear. I haven't really seen this thing with awards, but that may say something about what topics I work on. @Adamant1: can you give an example and (anyone) is there documentation somewhere about what categories infoboxes add? - Jmabel ! talk01:00, 9 June 2024 (UTC)[reply]
@Jmabel: I don't necessarily have an issue with infoboxes providing given name, surname, or birth and death dates. That's about it though. If you want an example of what I'm talking about, check out the subcategories in Category:Recipients of Russian military awards and decorations. Like categories for people that have won the various "X Years of Victory in the Great Patriotic War 1941–1945" medals. For instance Category:Heydar Aliyev, where there's like 30 categories for minor awards that I assume were all added by the infobox and can't be removed or edited. The whole thing is totally ridiculous overkill. --Adamant1 (talk) 01:11, 9 June 2024 (UTC)[reply]
The same way we decide anything else of the sort. It does seem odd for the decision to be hidden in a template. - Jmabel ! talk01:44, 9 June 2024 (UTC)[reply]
Interesting territory, and there I think that we need to take a bit of a step back. The first question has to be whether the category should exist here, prior to what and how it is populated. Only after that can we then discuss the means that we want things populated, and whether they are falling into a variation of Com:OVERCAT. I don't mind cats coming from WD data as long as it is sustainable and comparatively easy to manage and resolve. It is the deep/problematic dives that we need to resolve, either in the finding or in the fixing. — billinghurstsDrewth02:18, 9 June 2024 (UTC)[reply]
That's an excellent point by @Billinghurst. Fundamentally, we should be creating good categories and populating them in compliance with Commons category policies first and foremost, regardless of how this is done, be it manually or using templates and other tools. I agree very strongly with @Adamant1 that some of these categorization schemes (e.g. "recipients of X award") which clearly are really about storing data points about a topic in the form of categorization are not good form, as they aren't really about categorizing media, but trivial categorization of topics, which is not the purview of Commons. Josh (talk) 15:31, 10 June 2024 (UTC)[reply]
But as far as I can see it is not at all documented there; not even the mechanism (buried somewhere other than the code on that page) is documented. It's not at all clear where one would look to see what properties/categories are handled this way. - Jmabel ! talk01:48, 9 June 2024 (UTC)[reply]
I think Wikidata could be helpful for populating categories about video games, movies, television shows and animes. Adding the correct categories by hand is something of a tedious process. Trade (talk) 01:41, 9 June 2024 (UTC)[reply]
Wikidata Infoboxes provide given name, surname, and birth and death dates, and "living people", which should presumably be uncontroversial. I'd dispute that! Broad categories like "living people" or "2000 deaths" have limited utility on Commons. There are extraordinarily few situations where they are genuinely useful as a means of locating media. Omphalographer (talk) 02:00, 9 June 2024 (UTC)[reply]
Bollocks. The Commons category structure has been an untenable mess for years. A large part of the problem expressly lies with editors from Wikidata and Wikipedia who bring their baggage with them and fail to understand that Commons is a separate site with its own policies. A prime example of the Wikidata side of the problem is with the "Births in" categories. These editors have actively sandbagged a clear segregation from "People of" categories, resulting in a massive clusterfuck of superfluous categorization and a failure to understand what a meta category actually is, as opposed to what they personally think a meta category should be. In the few times where Commons admins have crossed paths with me in attempting to clean up this mess, I gained the impression that those admins had zero understanding of COM:CAT. However, let's not get bogged down with examples, because the problem's a lot bigger than any example.RadioKAOS (talk) 02:05, 9 June 2024 (UTC)[reply]
@RadioKAOS: I am very comfortable with us using WD data to categorise here. My issue primarily is how we fix it when it goes askew. Our categories, our categorisation, and decision-making how we use WD data to categorise here. We will always face the issue of implementation of decisions from contributors who edit elsewhere, so the issue isn't their ideas, it is the consensus they need to reach in its implementation, instead of unilateral implementation.
So for the moment, rather than stray into the "whataboutism" it would be nice if we focus on the issue, rather than inflate to a blame game. — billinghurstsDrewth02:27, 9 June 2024 (UTC)[reply]
@Billinghurst: Not to point fingers at Wikipedia users, but I think it gets to one root cause of the problem, which is that it seems like people from other projects use categories as a rudimentary way to store (or display) information about a subject. Not necessarily organize media related to it. Like with the example of categories related to awards, if you look at Category:Ivan Matyukhin there's 10 categories for awards that they have received but absolutely zero images in the category having to do with them.
So the categories are just being used as rudimentary ways to store and display biographical facts about Ivan Matyukhin, not to organize media related to the awards. And again not to point fingers, but I don't think that's something regular users of Commons would do on our end. Regardless, I think the problem could largely be solved if we were clearer about (and better enforced) the idea that categories are intended to group related pages and media. Not act as stand-ins for Wikidata items or something. But then we don't have the ability to do that if the categories are being automatically created and added by the infoboxes either. So... --Adamant1 (talk) 11:37, 9 June 2024 (UTC)[reply]
@Adamant1: Creation of a cat and the population of a cat are different and separate acts. For WD, they are also both happening here, not at WD, as they are in templates that we control. Someone has created the category and someone has added the code to Template:Wikidata infobox for the population to occur. The automation thereafter is due to having created the cat, and done the coding to add the cat, the population is from data at WD. If that is the issue, then can we please address that in a different thread. At this time, it is the ability to locate and identify from where the categorisation is taking place and resolving that. — billinghurstsDrewth01:21, 10 June 2024 (UTC)[reply]
@Billinghurst If I understand you correctly, it seems what you are saying is that it is not the automation per se that is the problem, but instead our process of having created these kinds of categories in the first place...if Category:Ivan Matyukhin exists and the 10 'Category:Recipient of...' categories exist, we can hardly blame the automated tool for adding those presumably accurate connections, but instead it rests on us as a community to have the deeper discussion and develop a consensus on how much of this kind of categorization we should have in the first place. Am I reading you correctly? Josh (talk) 15:41, 10 June 2024 (UTC)[reply]
@Joshbaumgartner: My original point is the fixing of problematic categorisation, which was the primary reason for my raising the issue. These are all categories that are created by us, and the coding in the templates is by us, either through WD infobox or other Commons templates. Finding how and where to fix things is becoming increasingly difficult, and I am looking for solutions there. We need to show how it gets there, and either how to fix it, or where to request the remedy, AND we cannot be relying on individuals. [So a clear means to identify auto-populated cats, and in the documentation in the template to show it autopopulates and where.]
My second point is that we own our categories and their creation. If we allow them to exist, then auto-population is okay, though the criteria in my first point needs to be met. Point 2 cannot exist in isolation. — billinghurstsDrewth04:02, 11 June 2024 (UTC)[reply]
I fixed a few cases when trying to work on categories stuck in Category:Non-empty category redirects. This concerned mostly categories on category pages (not files) and -- beyond the question which name to choose -- the categorization itself was rarely controversial. (There is some debate about the "old map" and "historical map" categories at Module_talk:Messtischblatt, categorization added for years).
Categories added by Template:Topic by country are actually relatively straightforward, but that template did lack documentation (somewhat improved yesterday). They can highlight problems in our category tree. Wikidata was rarely much of an issue. (I did blame it by error when a category was added with &html entities).
A search in the source text of Template: or Module: namespace usually finds the definition of a categorization. "|setscats= " in template documentation is meant to help. A general problem with categories added by templates is that everything needs to be refreshed if it's changed. Once one was identified a search with PetScan on subcategories of Category:Non-empty category redirects helped find other problematic uses. I noted some finds on User talk:RussBot/category redirect log. Enhancing999 (talk) 09:14, 10 June 2024 (UTC)[reply]
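The source-text search mentioned above can also be run through the MediaWiki search API. The sketch below only builds the request URL and does not send it. It assumes the standard namespace numbers for Template (10) and Module (828) and the `insource:` search keyword; the helper name `build_insource_search_url` is hypothetical. A literal search like this will miss category names that a template assembles from its parameters, so searching for a fixed fragment of the name may work better in those cases.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"
TEMPLATE_NS = 10    # Template: namespace
MODULE_NS = 828     # Module: namespace


def build_insource_search_url(text: str, limit: int = 50) -> str:
    """Build a search API URL that looks for `text` in the source of
    pages in the Template: and Module: namespaces.

    `text` should not itself contain double quotes."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f'insource:"{text}"',
        "srnamespace": f"{TEMPLATE_NS}|{MODULE_NS}",
        "srlimit": str(limit),
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


# Example: look for templates or modules that mention "Old maps".
url = build_insource_search_url("Old maps")
```

The resulting URL can be opened in a browser or fetched with any HTTP client; the JSON response lists the matching template and module pages.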
To me this means that if a template categorises other pages, then the template needs to specifically say that is its purpose, and give clear statements of what it is doing, i.e. where to expect to see results. Ideally I would like to see a complete list of categories that it populates, as that makes reverse finding useful. I would also like categories that are populated automatically to have a maintenance category that says they can be autopopulated by such-and-such template. Clarity is gold in these situations. If there is a master template for broad categorisation, then it should have a section for problems noted, and it should be identified for watching by a number of people. (fixing problems early before they propagate is also gold) — billinghurstsDrewth04:27, 11 June 2024 (UTC)[reply]
Not sure how practical that is. Potentially it could mean that one would have to edit every parent category (A of X, B of X, C of X) for each subcategory (NEW of X) instead of just a category.
Unless we find a central way to add them, this could mean that for 250 new categories one would have to edit every occurrence of several parent categories (All A of .., All B of .., All C of ..), possibly thousands. Enhancing999 (talk) 12:18, 11 June 2024 (UTC)[reply]
Thanks a lot @Billinghurst: for starting this RfC, I totally agree with your description of the problems that templates can create. So we need to:
inventory the problems
give solutions, how can we address these problems.
Agree Templates are often a great tool, for instance for the date categories and the template that is importing information from wikidata (as long as it is limited to the basic categories, like given name, surname, birth and death dates (useful to decide whether works of an artist are in PD), people/men/women by name).
But I am struggling too often with automatic categorisation by templates, and indeed Template:Topic by country is one of them (others are about photographers). Some of my problems:
The template is automatically adding parent categories that do not exist for that country, while a parent of it or another alternative category does exist, and/or there are not enough files or subcategories to justify creating the red one (and it is a lot of work to create new ones over and over again, which I consider part of the "administrative burden" Billinghurst is talking about).
Sometimes there is even a better child category for a country/location than the automatically added one (for instance for the photographer by location by date: the standard parent is the location, but sometimes "history of location" or even a category that groups all the photographers together for the location and/or date would be better).
Some templates make use of lists or other pages that I cannot find, they might be hidden, but anyway not documented (with links) in the template.
Though it is indeed probably a side thing, I agree with Adamant1 that there are editors who create categories, just because there is a Wikidata item or an EN-WP category/page with the same name, no matter whether we need them on Commons or not. And then it is a lot of work to put that right again. That also contributes to the administrative burden.
Suggestions for solutions:
Before you create a new template that is more complicated than a simple date template: present your proposal to the community (at least in plain English; you might of course also present (a part of) the proposed program) and ask for comment. Same for automatically adding new parent categories by a WD template.
Good documentation should be a basic feature in each template, before a new one is published or in use:
in plain English, like functional specifications; explaining what the template does (what actions), how it does it (mechanisms and, for instance, what lists/other things/links it uses), when to use it (in what kind of categories) and how to use it (what exactly you should do to make it work). Written with people in mind who know nothing or very little of programming, but are interested in templates. This should also be checked and done for existing templates.
technically, for editors who will solve problems when the creator is not available.
A procedure for when a template creates trouble:
Where to drop the problem?
Who is going to solve it? Especially when the original creator is not available (or refuses to solve it, which I have experienced as well).
Can we remove the template and add better parent categories (and often a navigation template) instead? Without the risk that the next editor will reverse it?
Question@Mike Peel: do you have a system-based solution for how we can readily identify the categories that are/can be populated from WD (and thinking as maintenance cats) if it isn't already. What is done at WD end, and what can be done at Commons end to be clearly overt? — billinghurstsDrewth00:56, 12 June 2024 (UTC)[reply]
Solution mode
So taking the next step, what exactly do we want to achieve?
Starting simple, what if anything do we want to achieve at
and without getting into the detail, where else are we looking to get information into place, or where might we need clear procedural change, or mention of expectations. — billinghurstsDrewth00:50, 12 June 2024 (UTC)[reply]
It looks like the guidelines are roughly speaking OK, perhaps just some additions. The main issues might be applying and enforcement. JopkeB (talk) 07:42, 14 June 2024 (UTC)[reply]
For applying these policies, this can be done manually or automatically. To support manual application, the best tools are well-written and easily-accessible guidelines for editors of all levels on how to do so correctly and efficiently. For automated application, good tools, such as templates, are needed. These tools should allow for manual override (e.g. a nocat parameter to suppress automated categorization on a given application) where applicable. Documentation should make it clear to users how to use these options.
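As a rough model of the manual override described above, the sketch below imitates a categorising template in Python. The categorisation rule, the function name `categories_added` and the accepted `nocat` values are all hypothetical; they do not describe how Template:Topic by country or any other existing template actually behaves.

```python
from typing import Dict, List

# Parameter values treated as "switch automatic categorisation off" (assumed).
TRUTHY = {"1", "y", "yes", "true"}


def categories_added(topic: str, country: str, params: Dict[str, str]) -> List[str]:
    """Return the categories a hypothetical template would add to a page
    such as '<topic> in <country>', honouring a nocat override."""
    if params.get("nocat", "").strip().lower() in TRUTHY:
        # Manual override: the template adds nothing and the editor
        # categorises the page by hand.
        return []
    # Hypothetical default rule: one topical parent, one geographic parent.
    return [f"{topic} by country", country]


default_cats = categories_added("Bridges", "France", {})
suppressed = categories_added("Bridges", "France", {"nocat": "yes"})
```

With no parameters the model adds both parents; with `nocat` set it adds none, which is the behaviour the documentation would need to spell out for editors.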
For enforcement, which I see basically as maintenance, automation is valuable in the form of good monitoring tools, such as automated flags for cases that are outside of the guidelines and spotting areas of inconsistent category organization and naming. However, actually addressing these situations is incumbent on human editors to do. As for the term 'enforcement', I associate that more with the involvement of authority, such as admin action to stop abuse or dealing with intentional disruption, while 'maintenance' doesn't necessarily imply that previous editors did anything wrong per se, but just that we are continuing to evolve and improve our categorization and so forth.
Ultimately, human editors are the key to successful categorization. Templates and other tools can be used by them to help increase their efficiency, but it is up to the human to ensure those tools are correctly applied. For example, applying a template and then saying that the categorization must be adhered to because that is what the template added is not appropriate. Thus the guidelines should focus on ensuring editors understand what the end goal is of categorization and provide them with tools on how to get there, with or without templates and gadgets. I would certainly like to start looking at some specific language being proposed for the above policies to get to the meat of the matter. Josh (talk) 14:55, 17 June 2024 (UTC)[reply]
Undoubtedly tools can play a role. But I think step one is to convince people who create (and/or adjust) templates that creating (or adjusting) a template is only half of the job. The other half consists of creating (or adjusting) good documentation (and testing according to a test plan). JopkeB (talk) 11:55, 18 June 2024 (UTC)[reply]
Updating Commons:Templates and Commons:Template documentation is certainly a good idea. Both of these have relatively little and out-of-date information, so both lack utility in their current guise. The implications of what is changed are significant however, with a great potential for unintended consequences, so new wording should be first proposed with significant input solicited before any actual adoption of a new and revamped template policy is published. I think it is an effort worth doing, however, and look forward to participating. Josh (talk) 14:30, 17 June 2024 (UTC)[reply]
On the Wikidata side, if we are going to allow any element of Commons categorization to be controlled by Wikidata properties, then there has to be a clear rule set agreed to on the WD side for each of those controlling properties to ensure that changes on the WD side do not damage our categorization scheme here. For example, existing properties such as 'instance of' and 'subclass of' are probably unworkable, as the WD scheme for these is well established but quite different from Commons categorization in many ways. Perhaps a new set of properties will need to be thought out and proposed on WD specifically to support Commons categorization. This could all be very useful, and a cross-project collaboration effort which brings both Commons and WD minds together may well be able to work out some good tools for this. Josh (talk) 14:30, 17 June 2024 (UTC)[reply]
Is this category for flags that are fictional? Or is it for flags for countries featured in creative works? There is no way to infer this from the category name alone Trade (talk) 22:21, 10 June 2024 (UTC)[reply]
As I've interpreted it, it's both - they're flags which are fictional, and which have appeared in fictional works. I'm not sure how you'd have one without the other. Omphalographer (talk) 05:00, 11 June 2024 (UTC)[reply]
Right, so are we putting both types of flags into the exact same category? This is just a mess to keep track of. Trade (talk) 18:22, 11 June 2024 (UTC)[reply]
What do you mean by "both types"? As far as I'm aware, there is (or should be) only one type of image in this category - depictions of flags which stem from fictional works, and which represent countries which only exist within those works of fiction. A typical example would be File:Gilead-Flag.gif, the flag of the fictional country of Gilead from The Handmaid's Tale. Omphalographer (talk) 18:31, 11 June 2024 (UTC)[reply]
There is nothing in the category nor its name to indicate that only flags from creative works should be featured. Trade (talk) 22:24, 12 June 2024 (UTC)[reply]
I could be totally off base here but I've done some work in the area and I think at least some of the problem is the ambiguity of the parent categories and how the whole thing is structured going up from there. For instance the category has both Category:Flags in fiction and Category:Special or fictional flags as parents. But then Category:Special or fictional flags is also a parent of Category:Flags in fiction. So it's just circular. Plus the Wikidata entry for Category:Special or fictional flags appears to be about "unofficial flag", which really has nothing to do with fictional flags to begin with. Regardless, it seems like this combines "special", "fictional", and "unofficial" flags into the same category and does it in a way where the categories are just circular. We should just pick a term, go with it, and make the parent categories actually lead somewhere meaningful. --Adamant1 (talk) 00:06, 13 June 2024 (UTC)[reply]
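For anyone who wants to check this kind of structure mechanically, here is a minimal sketch. It assumes a hand-written child-to-parents map that mirrors only the three categories named above; the map and the function names `ancestors` and `redundant_parents` are made up for illustration and are not part of any Commons tool. It flags a direct parent that is already reachable through another direct parent, which is the overlap described in the comment above.

```python
from collections import deque

# Hypothetical child -> parents map, copied by hand from the comment above.
PARENTS = {
    "Flags of fictional countries": ["Flags in fiction", "Special or fictional flags"],
    "Flags in fiction": ["Special or fictional flags"],
    "Special or fictional flags": [],
}

def ancestors(cat, parents):
    """Return every category reachable by walking upwards from cat."""
    seen = set()
    queue = deque(parents.get(cat, []))
    while queue:
        cur = queue.popleft()
        if cur in seen:
            continue  # already visited; this also keeps true loops from hanging
        seen.add(cur)
        queue.extend(parents.get(cur, []))
    return seen

def redundant_parents(cat, parents):
    """List direct parents of cat that are also ancestors of another direct parent."""
    direct = parents.get(cat, [])
    result = []
    for p in direct:
        if any(other != p and p in ancestors(other, parents) for other in direct):
            result.append(p)
    return result

print(redundant_parents("Flags of fictional countries", PARENTS))
```

With this toy map the only flagged parent is "Special or fictional flags", since it is both a direct parent and a grandparent via "Flags in fiction". A real check would need the parent map to be filled from the live category data rather than typed in.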
How about "Flags of countries from creative works"? This could then be a subcategory of "Flags from creative works", with it being a subcategory of "Symbols from creative works" Trade (talk) 21:57, 13 June 2024 (UTC)[reply]
The current name applies the adjective 'fictional' to the country, not the flag, which indicates that a fictional flag of a real country would not apply here. A fictional country is one from a fictional work (our scope would limit this further to notable fictional works), so any flag that is used by such a country would go here. However, renaming this category to "Flags of countries from creative work" could be interpreted either in the exact same way (expression: "(flags) of (countries from creative works)") or as Jmabel does as fictional flags of real countries (expression: "(flags of countries) from (creative works)") which is always a possibility when using a double-prepositional phrase. Thus I think the current category name is more clear. Josh (talk) 13:47, 17 June 2024 (UTC)[reply]
At this point Commons has hundreds if not more than a thousand fictional flags while flags from creative works make up less than a hundred files. One is very clearly an issue, the other is not. Trade (talk) 14:23, 19 June 2024 (UTC)[reply]
@Adamant1, Category:Special or fictional flags is a complete violation of the Selectivity Principle and Simplicity Principle. A special flag (presumably described by the linked WD item as a privately used, unofficial flag) is very different from a fictional flag. The former is very much a real flag, while the latter is patently not real. Since these alone are two different topics, they should be two distinct categories, not to mention any other hodge podge that is currently in this category. As for the names, "special flags" is a horrible category name, as 'special' is way too broad of a concept. We should focus on what the flags represent (and hence whether they fit in our scope), so 'flags of countries', 'flags of social movements', 'flags of companies', 'flags of individual people', etc. are all potentially good concepts for categories. I would say the OP category 'flags of fictional countries' also is fine as a concept for this reason. Josh (talk) 14:02, 17 June 2024 (UTC)[reply]
@Omphalographer: way back on your 05:00, 11 June 2024 remark about "I'm not sure how you'd have one without the other," it seems to me that most micronations are "fictional countries" without necessarily involving any fictional creative work. I don't think there is any escaping needing an explanatory headnote for any name we might come up with in this terrain. - Jmabel ! talk19:17, 17 June 2024 (UTC)[reply]
I think there is a meaningful distinction we can make between flags which are generally acknowledged as artifacts of fictional works (e.g. flags from books, movies, video games, etc) and flags which were created to be used in the real world to represent some entity, even if it's an entity of dubious existence like a micronation. Omphalographer (talk) 20:31, 17 June 2024 (UTC)[reply]
I'm rather confused. The general feedback seemed to me to amount to "logo detection isn't very useful." I was told by a couple of people when I asked informally, "Don't worry, it isn't like logo detection is the goal, this was just a side effect of work on something else that someone thought might be useful." And now you say that further work is proceeding on this front? What, exactly, put this on the front burner, especially given that we are constantly being reminded that dev has very limited resources for Commons? What is the problem we are trying to solve? - Jmabel ! talk 22:25, 11 June 2024 (UTC)[reply]
@Jmabel Our impression, to be fair, was quite the opposite: that it was something that could be useful in dealing with the third-most frequent rationale for requests for deletions (the first two being copyvios and FoP, which we found it was impossible to tackle in an automated way). There was more difficulty in defining how this could be implemented, but not on its usefulness. This is why we are re-opening the feedback period, to understand how it could be implemented. Sannita (WMF) (talk) 10:36, 13 June 2024 (UTC)[reply]
@Sannita (WMF) "third-most frequent rationale for requests for deletions (the first two being copyvios" - This doesn't make sense at all. The only reason we would delete a logo is because it's a copyvio, not because it's a logo. There are scores of logos which are in the public domain, either by age or by lack of creativity, while others get licensed under free licenses. I'm not sure why we should discourage people from uploading that specific content with such a warning, when those exact same rules apply to everything else. As it is, I tend to not support that implementation. And as JMabel mentioned, it's disheartening to see that resources were wasted developing such an apparently useless tool, when there are clearly established priorities (see the old wish lists, for instance). DarwinAhoy! 16:16, 13 June 2024 (UTC)[reply]
@Sannita (WMF), Jmabel, and DarwIn: I'll leave others to decide on the best or most suited UI for the logo detection. As for the feature, I am supportive of this, but conditionally. I suggest this feature should be mandatory for users who do not have the appropriate user rights, meaning users who are not admins/sysops, license reviewers, and/or autopatrolled. Users who are in these three tiers of user groups are free to upload logos and should not be slapped with this filter, since they are already aware of copyright issues and TOO considerations for logos. If possible, the feature should effectively block uses of "FileExporter" and other cross-wiki file transfer tools. And one more thing, I suggest the filter can prohibit new users (those who are not autoconfirmed) from uploading or importing logos (even photos showing logos that are non-de minimis/non-incidental). Hopefully, this will trim down at least a third or less (my guess) of deletion requests that contribute to the perennial backlogs. There are many more areas in Commons that also need attention and resolution, like Commons:Categories for discussion/Older (some open discussions were from before the lockdown era of 2020). JWilz12345(Talk|Contrib's.) 08:30, 14 June 2024 (UTC)[reply]
@JWilz12345: I think the plan is for this to become a secret feature. It has no effect on the upload itself and nobody but the uploader will know about the warning. Possibly, the same effect could have been achieved by merely editing the current interface and noting "if it's a logo, follow logo guidelines". Enhancing999 (talk) 08:43, 14 June 2024 (UTC)[reply]
Just my opinion, but having a specific warning to the uploader saying the image might be a logo seems rather pointless, if not borderline condescending towards users. People generally know what they are uploading images of. The less clear thing is what license to use in any specific instance and I don't really see how this deals with that. A better thing would probably just be a specific checkbox for logos that automatically adds a license and puts the image in a specific category for images that need reviewing on upload. Otherwise people are just going to ignore the warning just like they are already ignoring guidelines by uploading the image to begin with. What we really need is better ways to review and deal with problematic images on our end though. Not try to unload that on uploaders by over complicating the UploadWizard with a bunch of warnings, extra boxes, and the like. --Adamant1 (talk) 20:52, 15 June 2024 (UTC)[reply]
@Adamant1: anything related to copyright is already complicated enough. That's perhaps a price to pay for establishing/creating a free media repository site like Commons, or more so, Wikipedia itself way back more than 20 years ago. Something that founders Wales and Sanger likely did not foresee or anticipate. (Note: just a part of my thoughts, and not a representative of my general perspective on Wikimedia movement, which I still support in the context of mandating global FoP). JWilz12345(Talk|Contrib's.) 21:12, 15 June 2024 (UTC)[reply]
Thanks everyone for your comments! @Adamant1 about the checkbox, we thought of that option too, but ultimately decided against it, because we didn't want to clutter the UploadWizard too much and make it more complicated for legitimate uploaders to upload a legitimate logo or fall into the "I'll just ignore that" kind of case. Anyway, our scope is to get to a better and more seamless way of uploading media, but this will take more designing, prototyping, and testing, so it won't happen overnight.
To everyone, we're open to ideas for eventual moderation of logos in general, given that we don't want to unload a new bunch of work on volunteers without there being consensus. Sannita (WMF) (talk) 14:08, 20 June 2024 (UTC)[reply]
I just want to provide some context on @Sannita (WMF)'s post above ... what we're working towards here is an automatic process by which we reliably estimate the likelihood that an uploaded image will be deleted for any reason
If we had that process we'd be able to inform users that their upload is likely to be deleted (and why) during the upload process, which would be a better (and more educational) user experience than we have now. Also moderators would be able to find (and deal with) potentially problematic uploads much more easily
Our initial experiments with machine learning showed we can detect logos reliably, and they're a pretty common reason for DRs, so logo detection seemed like a promising place to start CParle (WMF) (talk) 14:36, 20 June 2024 (UTC)[reply]
There may be a misunderstanding here: being a logo is not a reason to delete. We have tens of thousands of logos legitimately on Commons. Laying aside logos that are PD because they are very old, or created by certain governments that don't claim copyrights, etc., an enormous number of logos are below the threshold of originality for copyright, especially in countries like the U.S. where that threshold is quite high. False positives -- discouraging or (worse) preventing upload of content that could legitimately be hosted on Commons -- is at least as bad, and arguably worse than false negatives, letting a "bad" file through. We can always delete a bad file; we cannot conjure a file we don't get to see. - Jmabel ! talk19:36, 20 June 2024 (UTC)[reply]
> being a logo is not a reason to delete
Absolutely, but being a logo is a signal that the upload is more likely to get deleted. We're not proposing to prevent logo uploads, just to alert the user if what they've uploaded looks like a logo, and attempt to educate them about the copyright implications (and also flag possible logos so that patrollers can check them) CParle (WMF) (talk) 10:56, 21 June 2024 (UTC)[reply]
I'm not sure logos are actually among the things where the highest percentage get deleted. But maybe they are. Do we have any available statistics on this? - Jmabel ! talk19:24, 21 June 2024 (UTC)[reply]
@Sannita (WMF): I may be missing something, but I don't readily see anything there that even suggests what percentage of logos are deleted, compared to what percentage of uploads in general. Is it there and I'm missing it, or is it just not there? - Jmabel ! talk18:22, 24 June 2024 (UTC)[reply]
June 16
How to find the source cat for why a given image is in a specific category tree?
Just correct it that requires me, and for other cases many other editors, to be able to quickly and easily see what the source of the category is. Hence my question.
(And I went through the cat tree but didn't check Demographics of the European Union somewhere in the branches above it and don't know how you found it). Prototyperspective (talk) 12:55, 16 June 2024 (UTC)[reply]
Right, you were hoping for a method there. I admit, I just randomly went up the tree from the in-file-categories to see which upwards category was most likely to lead towards "maps of the world". --Enyavar (talk) 13:09, 16 June 2024 (UTC)[reply]
Example (this is already part of the FastCCI gadget)
Yes, and I was looking for a method that isn't manually going through all the category's cats' cats etc. but a quick, reliable technical method. I found such a way today but I'm looking for something that does just that: the FastCCI tool can load all quality images anywhere in a branch of a given category and when clicking on any item of it, it loads how the file is placed in it – see the screenshot. It's just this feature that I'd like to use on a given file, e.g. by entering the file's URL into an additional searchbox shown on the category page or anything else. Prototyperspective (talk) 15:48, 16 June 2024 (UTC)[reply]
Perhaps the problem is that there aren't really clearly-defined 'source cats' (or more commonly referred to as a 'main category' or 'main topic') for topics in the category tree. We even have a maincat tag that has been added to some, but still no real agreement on what exactly makes one exact category a main category vs. any other. We have parent categories and sub-categories, but any category could conceivably be considered a main cat for all of its child tree, depending on the perspective of a given user. The reality is that categories are not exactly class-subclass or set-subset relationships. They may resemble that in many cases, but they are not limited to that. Categories are really not even trees so much as webs, so as desirable as it might be to have a 'source cat' with some kind of defined 'tree' of sub-cat branches under it, that just isn't how categories ultimately are structured, and your example shows a lot of the reasons why.
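To make the request concrete, here is a rough sketch of the kind of lookup being asked for. It is not how FastCCI works internally; it is just a breadth-first search upwards from a file's direct categories until a target category is reached. The category names other than "Maps of the world" and the `PARENTS` map are invented for illustration. In practice the parents of each category would have to be fetched, for instance through the MediaWiki API's `prop=categories` query, which is left out here.

```python
from collections import deque

# Hypothetical child -> parents map; a real tool would fill this from the wiki.
PARENTS = {
    "Category A": ["Category B", "Category C"],
    "Category B": ["Category D"],
    "Category C": ["Maps of the world"],
    "Category D": ["Maps of the world"],
}

def category_path(start_cats, target, parents):
    """Return one shortest chain of categories from a start category up to target, or None."""
    queue = deque([c] for c in start_cats)  # each queue entry is a path so far
    seen = set(start_cats)
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for parent in parents.get(path[-1], []):
            if parent not in seen:  # skip revisits so loops in the category web end
                seen.add(parent)
                queue.append(path + [parent])
    return None

print(category_path(["Category A"], "Maps of the world", PARENTS))
```

Because the search is breadth-first, it reports a shortest chain (here via "Category C"), though a file can sit under the target through several chains at once, and this sketch shows only one of them.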
Your example File:Krettnach Wegekreuz L138.jpg at the right illustrates the problem with rooting the category tree. It seems to present a singular linear path of categorization. The image in question is in 15 categories directly, but this tool only picks 1 to navigate up through. That makes sense as this tool is focused on quality images but even then this file is in 4 quality image categories...it picked 1. The same pick-one problem recurs at each level, leading to essentially a random category navigation exercise to a certain level. Josh (talk) 16:43, 17 June 2024 (UTC)[reply]
The main category/ies subject is certainly interesting but it's pretty much unrelated to this topic: this is about finding why a specific image is located in a specific category (such as the example file somewhere underneath "Maps of the world").
Yes, that's a good point but still not really what this topic is about: that one can check the source of categorization for a file (its category path) doesn't mean that it will or needs to be altered just if it doesn't seem to be right. For example, the "Maps of the world" cat contains a "Category:Maps of the world in art" that contains a lot of files one wouldn't expect to find anywhere underneath "Maps of the world". When one sees that it's just included due to the recurring standardized "xyz in art" subcategory, it won't need to be removed. It would be useful if for example tools/views that show files from many subcategories of a category like FastCCI could distinguish between several kinds of subcategories to e.g. exclude certain ones or give the user the option to do so but that's not what this is about and just something that could follow up on this.
This gets back to something I had wanted to pursue when we first introduced SDC, but most of the people pushing SDC were more or less antagonistic to categories, so they weren't interested in integrating the two well. Little by little some of this has made its way into Wikidata regardless, but not in a way that is particularly useful to us, because when it made its way into Wikidata, Commons' needs weren't particularly considered. In particular, distinctions in Wikidata like "subclass of" vs. "instance of" vs. "location" as a relation between items are all at one remove from categories, which are at best related to their parents (or, more precisely, to their parents' "main topics") with "category combines topics". It might yet be possible to piece this all back together and use structured data (I would hope in Wikidata rather than SDC, but I'd settle for either) to express the nature of each case of category inheritance. - Jmabel ! talk18:35, 18 June 2024 (UTC)[reply]
I think SDC can only ever be useful if it gets synced with categories and changed whenever categories are changed. For most files SDC are missing and easily half of those that have them have flawed and/or very incomplete data in there; the data is a bit hidden and little cared about so people don't notice if there's false or vandalizing data. An example issue is that "depicting" something is different from a file having something as a subject. Lastly, I don't think SDC for WMC files has any use at this point whatsoever so is just a time-drain that could and would better be populated by bots/scripts only that largely use well-maintained categories if at all since it mostly just duplicates metadata & maintenance. Not even Wikidata is well-maintained, for example the items for subjects as fundamental and large-order as "Past", "Present", and "Future" were heavily flawed before I fixed them and categories and their contents should and could be as query-able as Wikidata items (example where they're not: one can't do a petscan and then run WMC category operations on all the files in the results). Most studies are also not in Wikidata, just an arbitrary 1% or so subset of them and queries such as Scholia does for charts about studies on a given research topic etc. or by a specific author are not useful at this point either. Often there is data in categories not in fields that make WDInfoboxes auto-set categories and these infoboxes cause UI issues (reduce columns). Manually (really in 2024?) translated captions are frequently moved to other languages by vandals. WD item data could be useful to make the relation between categories more explicit; often they are phrased in a way that explains the relation but it could be more explicit.
I was thinking about how to make these more useful and maybe I can concretize some proposal soon. This is also related to this question here, as a tool that shows cat-paths for files could be useful for eliminating miscategorizations, better enabling the categories to be used by bots/scripts that populate structured data / Wikidata based on them.
Wouldn't this better be asked at the Scope talk page? In COM:SCOPE it does say "legitimately" in use. That would mean it depends on the legitimacy of the Wikidata item itself and the legitimacy of the use there (does it actually show what the Wikidata item is about? is there a much better file available that should be used instead?). Prototyperspective (talk) 12:53, 16 June 2024 (UTC)[reply]
It could be used to artificially cause a scope loop between Commons and Wikidata. AFAIK, Wikidata is the only Wikimedia website where an item can be in scope for no other reason than because it links to a page on Commons. A file about a person can be in scope on Commons if the file can be potentially useful to someone somewhere in a very broad educational sense, even if there is no page about that person on any other Wikimedia website. But that doesn't go so far as to include photos of non-notable persons in non-relevant contexts. Of course, a file about a person can be more easily assumed to be in scope on Commons if another Wikimedia website has a page about that person. (N.B.: that question is different and independent of the question of whether a file is in use or not on another Wikimedia website.) That makes sense based on the assumption that the other website has independent notability criteria. But that does not make sense when the only inclusion criterion being met on Wikidata by a person is that the Wikidata page links to a page on Commons. -- Asclepias (talk) 15:16, 16 June 2024 (UTC)[reply]
Wikidata has nothing to do with defining scope for Wikipedia or Commons. Look to the policies for both projects separately to ascertain what is in scope. There are differences, minor ones perhaps. Generally speaking, if it passes the scope test for Wikipedia, it will be okay for Commons too, as the latter is marginally more liberal. As far as we are concerned, Wikidata is a tool, and nothing more. Broichmore (talk) 16:35, 16 June 2024 (UTC)[reply]
I'd think yes. I noticed in a recent cleanup exercise some stuff got tossed that was in use there, but that seemed relatively marginal. Enhancing999 (talk) 22:29, 16 June 2024 (UTC)[reply]
It seems to me that the only times an image that is used to illustrate a Wikidata item would be appropriate to delete on a "scope" basis would be (1) the Wikidata item shouldn't exist, which should first be dealt with on Wikidata or (2) the image does not meet Wikidata's criteria for images illustrating an item, which should first be dealt with on Wikidata. And, no, I wouldn't defend the case where the only justification for the Wikidata item is that the Commons image exists and vice versa.
@Broichmore: are you saying (above) that it would be appropriate to delete an image that is legitimately in use on Wikidata, or just that the presence of a Wikidata item is insufficient to justify having other images of the subject in question? I'd agree with the latter: I believe there are cases where there are Wikidata items for every name on a war memorial or every person buried in a particular graveyard. That would merit hosting a single image of any of these people to support Wikidata, but would not mean that we would want an open-ended number of images of each such person. - Jmabel ! talk 23:44, 16 June 2024 (UTC)[reply]
As you're all too aware, criteria for uploading images on Commons centre on copyright (we want only PD), modified by notability, with the emphasis on the former.
In Wikipedia, it's possible to upload a non-PD item (using the fair use parameter), if there's no other image (of even dubious suitability) available. Even so, the item then needs to be degraded so it's worthless commercially. Only the most representative items are entertained: a pic of the ship, the person's face, an LP cover, an actual passport-style portrait of the item.
Wikidata says it wants an image of (the) relevant illustration of the subject; if available, use more specific properties (sample: coat of arms image, locator map, flag image, signature image, logo image, collage image); only images which exist on Wikimedia Commons are acceptable. In other words, only PD and no fair use.
Briefly, again, wanted images for the Wikipedia infobox and wikidata are passport only portraits, in that spirit I've been known to crop out portraits for that purpose. This example was used for both projects. That portrait will be replaced, only when better surfaces.
Wikipedia carries only one pic in its infobox, and so should Wikidata; however, the latter can carry more, perhaps images of a person's face from different stages of their life. There's no real limit. I don't think this has actually been tested by abuse yet; the tacit acceptance is to use only one image, perhaps two, following Wikipedia guidelines.
Wikidata's scope for inclusion is immense in comparison to WP and Commons. Briefly, an item should be represented by at least one valid sitelink to a page on Wikipedia, Wikivoyage, Wikisource, Wikiquote, Wikinews, Wikibooks, Wikidata, Wikispecies, Wikiversity, or Wikimedia Commons. Note that it refers to itself; it's a bit woolly. But it wants to tag everything useful in the world. Read the policy here.
As I see it, Wikidata has three uses for us, first our infobox, containing the variables that dictate licensing, secondly (for an artist, if there's no WP article); locations of work by time period, useful for correct attribution, and third, disambiguation of artists or place names. Broichmore (talk) 11:07, 17 June 2024 (UTC)[reply]
It doesn't help there that the user whose contributions are largely at issue keeps cycling back and forth between reasonably serious arguments, ad hominem remarks, and remarks that seem entirely intended to derail the discussion. But I stand by what I wrote above: circular justification doesn't cut it; a Wikidata item can justify keeping one photo on Commons that would otherwise be deleted, to function as an image for that item; and it isn't up to us on Commons to decide whether the item meets Wikidata's (generally lower) threshold of notability. - Jmabel 01:09, 17 June 2024 (UTC)
The "scope loop" breaks on the Wikidata side. Per d:Wikidata:Notability, "Category items with a sitelink only to Wikimedia Commons are not permitted, unless either a) there is a corresponding main item which has a sitelink to a Commons gallery or b) the item is used in a Commons-related statement, such as category for pictures taken with equipment (P2033)". Commons can expedite the cleanup of objects involved in these loops by ignoring media uses on Wikidata which only refer back to Commons. Omphalographer (talk) 23:56, 16 June 2024 (UTC)[reply]
Concur. WD item creation that is involved with the uploader/creator in sequence within the same timeframe, and otherwise fails their notability (articles or independent linkage) is not "in use" (image or category). If it is borderline, or where I am in doubt, I will nominate there for deletion with appropriate commentary, and then nominate the image here with a DR, so we can do some follow-up (our DR review takes longer than theirs). That is pretty much what I will do for a draft article where it is dodgy and reeking of COI at the WPs. — billinghurstsDrewth04:38, 17 June 2024 (UTC)[reply]
@Omphalographer: that seems more of a technicality about WD than something germane here. It's specifically about Category items. That just means that if you are making up (for example) an item about Person FOO and the only Category:Person FOO is on Commons, you can make an item about "Person FOO", but not about "Category:Person FOO". - Jmabel ! talk06:38, 17 June 2024 (UTC)[reply]
@Jmabel, actually, you could, if "Person FOO" is sustainable itself. The item for "Category:Person FOO" is allowed per criteria #3 as structurally required as the topic's main category (P910) for the former item. What is not allowed is if "Person FOO" does not exist on Wikidata, and you solely wanted to create an item for "Category:Person FOO" with the only sitelink being to the Commons category. That would fail all three of the criteria for notability there. Of course, as you stated, this is all applicable to Wikidata alone and has no real relevance to Commons, except that you probably shouldn't add {{Wikidata infobox}} to "Category:Person FOO" without a valid corresponding Wikidata item. If that category is a valid Commons category, we certainly should not delete it just because it does not support a Wikidata link. Josh (talk) 17:08, 17 June 2024 (UTC)[reply]
@Joshbaumgartner: No. I have no idea of your level of experience on Wikidata (and feel free to correct me if that experience is extensive, and you can provide examples of something I've never seen), but in my experience no one goes around creating a "category" item just to use it within Wikidata for a topic's main category (P910). Otherwise, every item would get a corresponding "category" item. Hell, every category item could get a corresponding, totally useless "Category:Category" item, etc. ad infinitum. Until a "category" item is justified by the existence of categories on some sister project other than Commons or Wikidata itself, it should not be created. - Jmabel ! talk19:29, 17 June 2024 (UTC)[reply]
@Omphalographer, note that only applies if attempting to sustain notability under criteria 1 (i.e. solely on the basis of a sitelink). If an item qualifies under criteria 2 or 3 (i.e. is a "clearly identifiable conceptual or material entity that can be described using serious and publicly available references" or "fulfills a structural need"), the exclusion of 'items with a sitelink only to Wikimedia Commons' is not applicable. Josh (talk) 16:58, 17 June 2024 (UTC)[reply]
For people, I think it'd be acceptable to declare as in scope those items with an identifier that may presume the subject is notable or relevant, such as those of National Libraries. Bedivere (talk) 20:41, 20 June 2024 (UTC)[reply]
@Omphalographer The "scope loop" ought to be broken in Commons policy too so that we can achieve the right result on a file here without having to wait for an action in another project. Consigned (talk) 23:21, 21 June 2024 (UTC)[reply]
@Consigned: strongly disagree. Legitimate use on a sister project should be sufficient reason to consider an image to be in Commons' scope, and we should not be in a position to dictate to other projects what we consider legitimate for them. Commons is, as much as anything else, a repository for images required by other projects. Wikidata, for example, does not host images itself, and completely relies on Commons to be its image host. - Jmabel ! talk02:03, 22 June 2024 (UTC)[reply]
You could argue that images are tangential, if not completely at odds, with the purposes of Wikidata as a project though. I don't think the same could be said for Wikipedia since encyclopedias historically included images, but there's nothing about tabled or linked data that has anything to do with media. In most cases it has absolutely nothing to do with Wikidata being a "knowledge base of structured data" or whatever. I don't think we are obligated to host an image of something, let alone specific types of media, just because someone creates a Wikidata property or item for it either. --Adamant1 (talk) 02:26, 22 June 2024 (UTC)[reply]
@Jmabel: If item on WD is within their scope and the notability, then yes, any image is reasonable. That said, there are many who upload an image, then create an item there as self-promotion, so in those cases it is not reasonable to retain. Solely having an item there, and having an image here used there, should NOT be the sole criteria, we need to have a little investigation. We cannot have each site be a blocker to the other resolving spam and self-importance. — billinghurstsDrewth12:38, 22 June 2024 (UTC)[reply]
@Billinghurst: agreed on the problem, but Wikidata, not Commons, is where you have to fight to say something on Wikidata is not valid use. Yes, every sister project is in this sense a "blocker" because we are here, among other things, to serve those projects. Again, we can consider something spammy, but it is up to Wikidata to make their own determination about their notability threshold, which is a low threshold. Having an item exist on Wikidata isn't an argument to keep 14 photos of that subject; it isn't an argument to create a category for that subject; but it is an argument for keeping the photo Wikidata is using. If Wikidata decides that the subject is above their threshold of inclusion, then they are entitled to have one photo of it, and Commons is the only place that can be hosted. - Jmabel ! talk 16:54, 22 June 2024 (UTC)[reply]
@Jmabel: Please do not make this into a Gordian knot or an ouroboros; WD is also here, at some level, to serve the projects, hence their first point in d:Wikidata:Notability. NOTABILITY 1a) For an item to be notable at Wikidata there are a number of hoops, and the main one is that a SITELINK exists. NO files at Commons are sitelinks at WD; sitelinks go only to galleries and categories, and that is the zone that we control. [Note that having an image is not a notability criterion.] 1b) The item has to meet a range of other criteria, including links to other items and links to the item. 2) We are all editors at Commons and Wikidata, so I am more than comfortable identifying an abuse of process across the two, and asking for the file to be deleted and for the WD item to be deleted. At this end, I have more investment in the deletion process and involve myself from both sides, and with both methods, DR and Speedy; whereas I leave things up to them for their processing. 3) If there is a wait for their processes to flow, then I nominate there, and DR here with a comment about awaiting resolution at WD, and we have enough lag that it will resolve one way or the other. Imperfect, though it gets rid of most dross. At that point, if they make the decision to keep, then so do we. It doesn't prevent us from having some rigour and challenging the dross. — billinghurst sDrewth 23:43, 22 June 2024 (UTC)[reply]
This is (or at least started out being) a discussion about deletion of files, not about categories. Of course we can get rid of a category even if it corresponds to a Wikidata item. Similarly, in theory, we can get rid of a category even if it corresponds to an article in the English-language Wikipedia, or in any number of Wikipedias, though we seldom do, even for categories with only one photo. But (barring copyright issues) we should not be the ones to get rid of a file that is being used by a sister project without first resolving that use on the project in question. And, yes (give or take a few people who are blocked) people here on Commons may participate in whichever of these sister projects they choose, but these projects are distinct overlapping communities that do not always reach the same consensus on everything. - Jmabel ! talk 00:16, 23 June 2024 (UTC)[reply]
Wikidata has what they concurred to be a "common sense" conflict of editing policy. They are regularly deleting conflicts of interest, and components of self-importance. An image uploaded here and an item created at WD by the same person, where neither has an editing history at Commons or WD, clearly aligns with the F10 criterion here, and there should be a deletion process. There is no 100% purity, so we should apply good sense in interpreting our clear principles here and there, and let the community manage attempts to abuse us. The UDR process is here to be used by anyone who thinks that there is a mistake, so we can have resolution as required. We are never going to be perfect, and we have a resolution process. — billinghurst sDrewth 01:47, 23 June 2024 (UTC)[reply]
Those seem like a bit of a special case. Those categories are less of an intersection by date, more photos of an event which is known primarily by its date. I would hesitate to generalize from there. Omphalographer (talk) 19:43, 17 June 2024 (UTC)[reply]
Having complete coverage of useful subjects is generally a good thing - that's why we have GLAM collaborations. While there is a need to avoid actual duplicates (and yes, a few uploaders need to be reminded of that fact), I don't think that's the main factor behind the increasing number of questionable categories. A great many subjects are extremely complex and will rightfully have a large number of files; our category system should be designed to allow users to navigate those files efficiently.
The key issues to me are:
How do we decide which intersection categories are useful and which are not (the reason I started this discussion)
What tools need to be created to enhance the category system and have it adequately serve the 100+ million files on Commons, and
For tools that are beyond the resources of Commons volunteers to create and maintain, how do we get the WMF to prioritize making them?
Without steering the discussion too far away from that first point, there seems to be general agreement that we need the ability to arbitrarily subdivide a category by certain parameters. In particular, we need a tool that can allow you to find all files in a category (or category tree) within a selected date range - and have it be available on the category pages as well as in the search function. This would eliminate a large swath of intersection categories - in particular, the ones that require the most labor to populate and provide the least benefit. Division by date, file type, resolution, and license could all be done with existing structured data; it would provide a great deal of functionality without getting into any areas likely to be controversial. Pi.1415926535 (talk) 06:08, 18 June 2024 (UTC)[reply]
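As a rough illustration of the date-range tool described above, here is a minimal Python sketch. It assumes each file is already available as a record with a title and a creation date; the `FileRecord` type, its field names, and the sample file titles are hypothetical and do not correspond to any existing Commons API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical record for one file in a category (not a real Commons type)."""
    title: str
    created: date


def files_in_date_range(files, start, end):
    """Return the files whose creation date falls within [start, end], inclusive."""
    return [f for f in files if start <= f.created <= end]


# Made-up sample data standing in for the contents of one topic category.
sample = [
    FileRecord("File:Snow A.jpg", date(2012, 1, 15)),
    FileRecord("File:Snow B.jpg", date(2012, 2, 3)),
    FileRecord("File:Snow C.jpg", date(2013, 2, 10)),
]

feb_2012 = files_in_date_range(sample, date(2012, 2, 1), date(2012, 2, 29))
```

The point of the sketch is that the files stay in a single topic category and the date restriction is applied at view time, rather than being encoded as separate intersection categories.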
The question of intersection categories is a constant one on COM:CFD. Essentially, nearly every topic category is an intersection category. Thus I don't think how we treat discussions on intersection categories is really that different from how we approach topic categories in general. There are sometimes calls to arbitrarily limit the amount of sub-categorization of a topic (essentially a process of intersecting categories), but these rarely gain consensus as it usually involves unintended consequences and lots of exceptions. Essentially, the basics are that there ought to be enough media to support an intersection category, that the intersection be sufficiently distinguishable to make a distinct category, and that it offer some meaningful sub-division of its parent categories. If any of these are in question, the issue is raised and discussed at COM:CFD (or other appropriate forum) and consensus for that particular use case is implemented. Basically, if a user creates a category, so long as it does not violate Commons category policies, until there is a consensus that it is not useful, it is presumed useful and kept.
As for tools we need, better and more accessible search tools that obviate the need to use categories as search criteria would be high on my wish list. There are some good SPARQL query tools and gadgets that are helpful for those users with the initiative to learn and apply them, but they are not available to the mainstream and most users I would presume are not interested in learning to code just to view the media they are looking for. User-friendly interfaces built directly into the Commons interface (not requiring opt-in gadgets or third-party sites) are a must to make these useful for more than a small group of users.
Clearly, involvement beyond volunteer contributors will be needed for at least some of these tools, and I have absolutely no clue how to push that cart. I'm not politically minded, nor am I steeped in the inner circle workings of the project, so I wouldn't even know how to start such a drive, but I'll gladly sign on to voice support for valuable tools. I added my name to several items on the wishlist a while ago...not sure if that did anything.
While I continue to look forward to better tools and development in the future and am eager to see how we can use them to best effect, I have been doing Wiki projects since 2005 and the idea that we shouldn't worry about making current systems work since there is a new tool just around the corner that will solve all problems is a song I have heard for going on 20 years now and one that all too often fails to deliver on its promises. Thus, my approach is to make the system we have now work as well as possible, and here that means making sure the categorization system provides the most robust and accessible system possible for finding media. I'll be pleased the day search tools obviate the need for it, but until that is shown to be true, I will be opposed to any attempt to limit categorization on the promise that search tools are the answer. Josh (talk) 17:03, 18 June 2024 (UTC)[reply]
Subcategories shouldn't be created simply because a category has a large number of files in it. If there are a lot of images related to something, that is what it is; we don't need to introduce artificial distinctions just to make categories smaller. Time-based category intersections in particular seem to have little value unless they're categorizing something which changes over time in a way that's significant to the topic. Photos of a person, perhaps (although breaking it down by month would still be excessive); photos of generic topics like weather, not so much. Omphalographer (talk) 15:01, 17 June 2024 (UTC)[reply]
Chronological categorization seems to be coming under increased scrutiny lately. As the number of hosted files continues to grow and topical categories get larger, there seem to be increasing efforts to diffuse topics by date, and by increasingly precise dates at that. Besides appearing to be a benign effort to diffuse bloated topic categories, there can be specific technical value to categorization by specific date for maintenance, curation, and some specific research efforts. However, the unintended consequence is that topic files become buried in layers of date-based categorization, frustrating most users looking for images of that topic who are not concerned with exactly when the image was taken. I think it is fair to conclude that the vast majority of users looking for images of snow in Massachusetts do not care if it happens to be January or February of 2012, or if it is February of 2012 or 2013. Maybe one might want to focus on more recent times vs. pictures of yore, and some may indeed be interested in January vs. February for seasonal differences (not caring which exact year), but most probably don't care about even the exact year, much less month or day. Requiring them to select one particular moment in time to see images is very frustrating for most. This also makes normal diffusion more difficult, as files moved to specific dates are less likely to correctly get diffused to more appropriate sub-categories (e.g. a pic already moved to Snow in Massachusetts in February 2012 is less likely to ever be correctly put into Snow in Boston as it no longer appears directly under Snow in Massachusetts).
The solution to this dilemma would be one where the files are available in the main topic category for users to look through without requiring they select a specific date to be limited to, while also permitting the images to be collated by date to whatever level of specificity is meaningful for the topic. We have the same issue with categorization by media type. Most users are not looking for images of a precise type to limit their browsing to, but such categorization does have a lot of technical value to specific users. What we have done is make the Category:Media types tree separate from the Category:Topics tree. Files can be added to the Category:Media types tree and be diffused to whatever exact media type specification they fit; however, they are not to be removed from the Category:Topics tree (i.e. the main topic categories for the file). The Category:Media types categories are to be marked "__HIDDENCAT__", which puts them in a separate list of categories (only visible to users who elect to see non-topical categories).
We could adopt this approach for date-based categorization, creating Category:Chronological categories as a separate non-topical tree. This would allow continued categorization by specific date without diffusing topical categories. For example, in the case of Massachusetts snow, all files regardless of date would be present under Category:Snow in Massachusetts, a topical (visible) category. The files themselves can be additionally categorized by date under Chronological categories to whatever level of precision those involved feel is warranted. They could be accessed from the main category via Snow in Massachusetts by period which would still appear under the main category for all users. Josh (talk) 17:44, 17 June 2024 (UTC)[reply]
Is there a reason specific dates can't be offloaded to structured data? I think that would be a better way to do things since we are already having issues with the number of needless categories in general. Really most categories could, and probably should, be offloaded to structured data at this point. --Adamant1 (talk) 19:54, 17 June 2024 (UTC)[reply]
To query for images from a certain date, just use inception date (wdt:P571) in a structured data query at https://commons-query.wikimedia.org/. We need to stop treating categories like a query language. They are ill-suited for that purpose. And if using structured data queries is too difficult, we should get the WMF to add more capabilities to Special:MediaSearch, for example, filtering by date. Nosferattus (talk) 20:48, 17 June 2024 (UTC)[reply]
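For readers unfamiliar with such queries, the following Python sketch builds the text of a SPARQL query of the kind described above. It is only an assumption about what a workable query might look like: it combines inception (P571, as mentioned above) with depicts (P180), the function name `build_date_query` is hypothetical, the QID in the usage line is an unchecked placeholder, and the query has not been run against the live query service.

```python
from datetime import date, timedelta


def build_date_query(depicts_qid: str, day: date) -> str:
    """Build a SPARQL query for files depicting an item with inception on one day.

    Hypothetical helper: it only builds the query text and sends nothing.
    """
    next_day = day + timedelta(days=1)
    lines = [
        "PREFIX wd: <http://www.wikidata.org/entity/>",
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>",
        "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>",
        "SELECT ?file ?date WHERE {",
        f"  ?file wdt:P180 wd:{depicts_qid} ;",  # P180 = depicts
        "        wdt:P571 ?date .",               # P571 = inception
        f'  FILTER(?date >= "{day.isoformat()}T00:00:00Z"^^xsd:dateTime &&',
        f'         ?date < "{next_day.isoformat()}T00:00:00Z"^^xsd:dateTime)',
        "}",
    ]
    return "\n".join(lines)


# "Q12345" is a placeholder QID used only to show the call shape.
query = build_date_query("Q12345", date(2009, 1, 1))
```

The resulting string could be pasted into the query service's editor; only files that actually carry both statements would be returned.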
@Nosferattus, structured queries are great, and you are right that categories are not queries! Unfortunately, requiring users to have SPARQL knowledge in order to search for files is, I fear, a bridge too far. We need tools to bridge that gap and bring query functionality to a broader user base before we can point to that as the answer to the problem. Josh (talk) 16:08, 18 June 2024 (UTC)[reply]
@Adamant1 That is absolutely the way it should be done, but right now the structured data just isn't there yet to make this an accessible option for a lot of users and use cases. I'm all for moving that forward. In the meantime, this issue will persist. Josh (talk) 16:05, 18 June 2024 (UTC)[reply]
I do think date categories are useful, especially when you are looking for other photographs of the same subject taken on a certain date or in a certain month, whether to harmonize their categories or create a new category, or to check whether a photo could indeed have been taken on a certain day (if there is snow on the other photos instead of rain or blue sky, you know that something is wrong).
I disagree with Omphalographer that subcategories shouldn't be created simply because a category has a large number of files in it. If a topic category has more than 200 files, it should be broken down into subcategories to keep a good overview. Those new subcategories should preferably be topic categories, not date categories. Exceptions might be longlists and subjects like all the pages of a book.
I agree with Josh that files in date categories should also be available/stay in topic categories, for the reasons he mentions.
I agree with Pi.1415926535 that we need a tool that can allow you to find all files in a category (or category tree) within a selected date range.
@JopkeB: We have a tool that allows you to find whatever you want without abusing categories. Want to see all the photographs of snow taken on January 1, 2009? Here you go: https://w.wiki/ARP$. Just click the Run button. Frankly though, I think the use case of needing to find images of a certain subject at a certain location on a certain date is entirely made up. Who has ever actually needed this? I certainly haven't. And if we're going to diffuse by location and date, why stop there? We could also diffuse by file type, license, aspect ratio, color vs. black and white, photographer, etc. The point is, categories are a poor substitute for search queries and are not what they were designed for. But when all you have is a hammer, everything looks like a nail. Fortunately, we now have other tools, so we don't have to keep abusing categories. And if writing queries is intimidating, just ask Magnus or the WMF to create whatever type of search interface you want. The data is there. We don't need categories to redundantly encode it. Nosferattus (talk) 14:39, 18 June 2024 (UTC)[reply]
@Nosferattus, we do diffuse by many of the things you list, such as photographer, color, file type, and a hundred other criteria, depending on the topic. Scoffing at those who aren't prepared to create SPARQL queries to view what they need is not the answer. Also, no, the structured data is not all there yet, so queries are rarely complete. In theory you are not wrong...in fact a good enough database with a good enough search interface may make categories completely obsolete. We aren't there on either the data or the interface yet. If it is as easy as you claim it is, then go forward, get WMF to create the interface and demonstrate how the average user can easily use it in lieu of category browsing and maybe then you will have a valid argument to not use categories. Until then, we need categories to work for the broad user base that continues to need them. Josh (talk) 16:25, 18 June 2024 (UTC)[reply]
You do have a good point that the data and interface are not complete. However, in my opinion, they are a lot more usable than our terrible category system which is only getting less and less useful by the day, mainly due to over-diffusion. Nosferattus (talk) 17:02, 18 June 2024 (UTC)[reply]
@Nosferattus And I agree that over-diffusion is a serious issue. This is why I've floated the idea, at least in the case of the date diffusion, to remove that from the topic category tree and make it a separate tree, leaving the files undiffused in the original topic category. I see search tools as a valuable adjunct to this, in fact. I am not personally well versed in our third-party interfaces, but I would like to build a template (or bit to include in existing templates) that invokes a search to identify any files in a non-topical category that are not still present in the related topic category. This would permit easier maintenance reversing incorrect removals from the topic category. Josh (talk) 17:09, 18 June 2024 (UTC)[reply]
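The maintenance check described above amounts to a set difference. A minimal Python sketch, assuming the member lists of both categories have already been fetched by some other means (the function name and the sample titles are hypothetical):

```python
def missing_from_topic(date_cat_files, topic_cat_files):
    """Return files that are in the date category but absent from the topic category."""
    return sorted(set(date_cat_files) - set(topic_cat_files))


# Made-up member lists for a date category and its related topic category.
date_cat = ["File:Snow A.jpg", "File:Snow B.jpg", "File:Snow C.jpg"]
topic_cat = ["File:Snow A.jpg", "File:Snow D.jpg"]

to_restore = missing_from_topic(date_cat, topic_cat)
```

Each title in `to_restore` would be a candidate for re-adding to the topic category; files present only in the topic category are ignored, since they need no repair.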
Thanks, @Nosferattus: for the link. But I am one of those for whom acquiring "SPARQL knowledge in order to search for files is ... a bridge too far," as Josh says. And I agree with him that this tool only shows files with the correct structured data, which is not good enough for me. JopkeB (talk) 09:44, 19 June 2024 (UTC)[reply]
@JopkeB, as you can probably guess, I'm in agreement with most of what you've written here, as many of these ideas are things that have come up in CfDs and other forums we've participated in. I am not a fan of placing hard arbitrary lower or upper limits on category size because I find topics and structures where very large or small categories do make sense within the scope of the topic and the available media. Of course, truly bloated categories do need to be diffused into meaningful sub-categorization, preferably by multiple different criteria. They don't even necessarily need to get to 200 files to warrant this in many cases. I do see you accept some exceptions though, which is good. One issue is stating 'subcategories should preferably be topic categories, not date categories.' I think I get what you are trying to say and I agree, but the root of this discussion is based on the fact that 'date categories' are 'topic categories' at the moment, at least structurally. The main effect of this is that, per COM:OVERCAT, it is in fact required to remove a file from the main topic category when adding it to a date category. This is why I'm floating the idea of breaking date cats away from the topic tree and making them their own tree, akin to media type cats, thus reversing that requirement and making it a requirement to keep the media in the topic cat when adding it to a date cat. Josh (talk) 16:38, 18 June 2024 (UTC)[reply]
I'm not a big user of date categories, though I've had to work on several of them as part of being a broad-based maintainer. Commons doesn't have an exact analogue to Enwiki's non-diffusing categories, but we do have major category trees (see major category policy) which are similar in effect, in that categories under one major category do not diffuse categories in another major category (e.g. a file in a Media types category does not diffuse from the related Topics category; the file should exist separately in both). This is why I presented the idea of creating Category:Chronological categories as a major category and making all 'by date' categories fall under this, prohibiting diffusion of the original topic category. I understand that nuking date categories would be your first choice, but appreciate your understanding that at least making them non-diffusing is an improvement that should be made if they are to be retained. Josh (talk) 17:21, 18 June 2024 (UTC)[reply]
I also support this idea. Whether we ultimately keep these date-based subcategories or not, infusing their contents back into parent categories is a clear step in the right direction. Omphalographer (talk) 17:21, 18 June 2024 (UTC)[reply]
Support the idea in the absence of a better option like moving dates to structured data. The whole thing just seems circular though. We don't fully embrace SDC, so it's not properly implemented, naturally leading to lower adoption rates, etc. I'd like to at least see a realistic plan with some implementable steps to move in that direction, along with better guidelines and enforcement around these kinds of things. Although admittedly both are tangential to the current problem and I have no issue with Josh's idea in the midterm. --Adamant1 (talk) 21:20, 18 June 2024 (UTC)[reply]
I absolutely agree with the suggestion of always keeping/automatically adding some other category (by default, the parent one, if no better one is available) besides date categories. COM:OVERCAT has been misused and abused a lot here, unfortunately, either through lack of good sense or through some obsession with pigeonholing everything people see in a cat - and pushing files into date cats while removing the parent cat is only one of those situations. Darwin Ahoy! 17:50, 19 June 2024 (UTC)[reply]
@Ghouston You were more visionary I guess. It is indeed a problem that isn't always apparent until it has grown into a real monster. Regardless, I've seen it really come to a head in a lot of different topics over the last year or two and while we can't go back to 2013 and fix it then, we can do something now. I've created Category:Chronological categories as a base and a corresponding CfD to get input over there, but it sounds like there is a lot of support for not allowing diffusion by date to remove the files from non-date topical categories. Josh (talk) 17:42, 21 June 2024 (UTC)[reply]
What to do in this case
While we continue to discuss broader issues, I'd like to get some consensus as to what should be done in this specific case. AnRo continues to create hyperspecific categories such as Category:Spring 1951 in Boston, many of which have very few files, and they have not responded here nor at their talk page since this discussion began. To my mind, this is now an administrative matter - disruptive editing and a refusal to communicate - that needs action. Pi.1415926535 (talk) 18:03, 18 June 2024 (UTC)[reply]
This is just exacerbating the situation. There is some good discussion going on about how we can improve the system, but no matter what systemic changes we make, a disruptive effort by someone seemingly unwilling to participate in discussion is always going to be a problem. Unfortunately, some specific further action on that front might be needed. They don't even seem to be employing a consistent naming structure to these categories. Josh (talk) 22:59, 19 June 2024 (UTC)[reply]
As part of a Wikimedia France project on sign languages, I uploaded better files for a whole category of 100 files, and I have the authorisation to remove the old files. I see Wikidata uses these files 83 times (see https://glamtools.toolforge.org/glamorous.php ). I would like Wikimedia projects to migrate to the new .webm files, and delete the old .ogv files. How should I proceed?
Done: I added the `other_versions` value to the {{tl|Elix}} template. Maybe it will help bots to detect those pages, remove them, and put a redirect in place. Yug (talk) 18:37, 18 June 2024 (UTC)[reply]
June 19
New changes to the "Depicts" step in UploadWizard available on Beta Commons
Hi all! I wanted to announce that on Beta Commons a new version of the "depicts" step of UploadWizard is available for testing.
A brief note about the changes:
basic "depicts" annotations (and other statements set up in campaigns) without qualifiers or references can now be added on the same page where the user is entering captions, locations, etc
the separate extra page for adding structured data is removed from UploadWizard
qualifiers and references can still be added on individual File pages as before (and that will take only one extra click)
The reason for us doing this is that we're hoping that by simplifying depicts annotations we'll make it easier to spot copyvios (and in particular FoP violations). The drawback from a user perspective that we already know of is mostly for users who might be uploading multiple images at once with non-depicts statements and/or qualifiers and references, and copying those from the first image to all other images in the upload. This functionality is no longer available - as far as we can tell it's not used much, but if there are people using it then we'd like to hear from those users who use it.
Why does the Main subjects visible in this work (optional) label have a “pointer” cursor? Clicking it doesn’t seem to do anything (it doesn’t focus the input field).
Why will this only be available until Monday afternoon?
It'll be available until Monday afternoon because we have to revert it by Monday evening, otherwise it automatically goes to production. If there's demand for more testing we can re-enable it on beta on Tuesday (until the following Monday) CParle (WMF) (talk) 14:09, 19 June 2024 (UTC)[reply]
I think this change has a lot of potential!
Regarding adding more structured data with "one extra click", I think there should be a button for that on the last page of the wizard (next to each uploaded photo).
Why do you call it "Main subjects visible in this work" when the values are used in a depicts statement? I think inconsistent terminology creates confusion.
I think the created depicts statements could be automatically marked as prominent ("The most prominent subject(s) depicted in this work").
Do you (or do you plan to) suggest categories based on the chosen Wikidata items?
There's a link to the File page for each image on the last page, and that's where you can click the "structured data" tab and add extra structured data. We actually talked on the team about adding another link directly to the structured data tab, but weren't convinced that the extra clutter on the page was worth it
We're calling it "Main subjects visible in this work" because we weren't sure if new uploaders would understand "depicts" (we had a lot of discussion about the exact wording). Open to other suggestions if you have them
I'm not sure this is really necessary, but if there's community demand for it we can certainly do it
It's an interesting idea, but likely to be a lot of work to figure out how to do that (and it's not on our roadmap atm)
@Sannita (WMF) I usually upload dozens of photos at once, and I used to upload them in batches so that it was very easy to put the "depicts" with the "apply to all" or "copy to all" function. If this is no longer available, and we'll have to pick depicts for every single file, I believe I'll stop adding them entirely. The extra field will be only additional, useless clutter, and I would frankly prefer if it wasn't there at all, or was hidden in another tab, as before. So, to me, in such a workflow, it's an unwanted change, and one for the worse. But I really don't get how you think this change will help with spotting copyvios, to start with. Darwin Ahoy! 17:37, 19 June 2024 (UTC)[reply]
@DarwIn you actually can still copy "depicts", just not other statements (or qualifiers/references)
Freedom of panorama copyvios are the most common reason for deletion requests on Commons (see here https://phabricator.wikimedia.org/T340546). What we're hoping is that if someone uploads a picture of, for example, Burj Khalifa, then they'll add depicts:Burj Khalifa, and we'll be able to spot the FoP violation automatically
Note that we don't propose to take any automatic action right now, but we're working towards better automatic detection of files that are likely to get deleted CParle (WMF) (talk) 13:58, 20 June 2024 (UTC)[reply]
@Sannita (WMF): I tested out the changes and it seems like an improvement. Two tangential issues though:
Why is UploadWizard only setting the depicts statement? Why not also set inception, media type, and copyright status? You can get those for free from the existing fields.
Phabricator bug T261764 makes using the depicts interface error-prone and confusing, especially for newbies. Is there any chance that the Wikidata folks could help fix that?
I started this gallery in my user space. Before I move it to the public space (ideally a more comprehensive version), I would like to make sure that such a gallery complies with what is considered acceptable in this project. Otherwise I would gladly leave it where it is. There is no corresponding category, after all. But from the reader's point of view, I see multiple practical purposes for such a gallery; for example, it can serve as an aid to identify species that they have seen in a garden. Stilfehler (talk) 16:57, 19 June 2024 (UTC)[reply]
Is it correct to use the "Extracted from" template for images that are merely filtered and aren't a crop or "extracted" in any meaningful way? (And if it *is* okay, what's the difference between that and *any* derivative image?)
Since I didn't receive a response or any form of engagement the first time round, I figured it would be more productive to ask here whether or not this is okay. Ubcule (talk) 19:03, 19 June 2024 (UTC)[reply]
The template documentation of "extracted from" explicitly says that it is for cropped images. However, the two examples are probably not derivative works either in the sense of Commons:Derivative works (you do not seem to be claiming to have added distinctly copyrightable content). They are probably simply "retouched pictures". -- Asclepias (talk) 19:27, 19 June 2024 (UTC)[reply]
@Asclepias: - Okay, so we're both agreed that the use of "extracted from" for such images is misleading and incorrect regardless.
Sometimes a source is just a source. IMO, the simple normal link in the source field does just fine. Optionally, a thumbnail can be displayed in the other versions field (with or without particular format). -- Asclepias (talk) 21:59, 19 June 2024 (UTC)[reply]
@Jmabel and Asclepias: - The problem with {{Other version}} is that it doesn't make clear which is the original or "parent" version. Particularly if I'm uploading a modified version of an existing image, I like to be clear that mine is the "derivative" version (in the more general sense) and to be able to make that relationship obvious in a manner that's clear and consistent for both users and automated processing.
Ditto simply displaying a thumbnail in the "other versions" field- it has no semantic meaning.
@Asclepias: - I honestly haven't come across anyone complaining about my use of {{Derived from}} until now, and I've been editing here for a long time. I'm still not 100% convinced that "derivative" in this sense *was* necessarily required to include/imply "distinctly copyrightable" input?
Merely reminding that the documentation of the template "Derived from" says that it is specifically for derivative works, which has a precise meaning legally and in the Commons official guideline on the matter. If you want to use the template in a broader sense than what that says, do what you want. But then you could hardly complain when Mewhen123 uses the template "Extracted from" in a broader sense than what it says. -- Asclepias (talk) 00:39, 20 June 2024 (UTC)[reply]
@Jmabel and TheDJ: - Apparently you are technically correct that- by its own definition- the Commons' {{Retouched picture}} template covers anything "which [..] has been digitally altered from its original version". By that definition, this would count anything up to and including even (e.g.) a re-saved JPEG with no visible difference as "retouched"(!)
Regardless, I suspect that the majority of people would take "retouched" to mean an image which had undergone more serious and active modification of the fundamental content itself beyond (e.g.) trivial brightness, contrast, colour balance tweaks etc., even if that wasn't actually the case (Note this redirect on English Wikipedia).
You'll understand why I might feel this to be misleading, whether or not it falls within Commons' (own) definition.
Regardless, it seems odd- and frustrating- then that we don't at least have a proper "most general case" root template to indicate that one file is based on or derived (in the more general sense) from another "parent" image and nothing more than that.
And, as TheDJ already indicated above, hacking such information non-systematically into "freeform" fields is no use to automated systems, Wikipedia's included. Ubcule (talk) 13:47, 20 June 2024 (UTC)[reply]
June 20
Category:Engraved illustrations from Iconographic Encyclopedia of Science, Literature and Art, Published in 1851
It looks like there was an attribution error in the source that was used for the upload. There was probably a disk image in which these illustrations were mixed with illustrations from ESBE (the Brockhaus and Efron Encyclopedic Dictionary). The descriptions need to be fixed by bot. I'll take on the task, but first we need to verify whether they really are all from the Iconographic Encyclopedia --Butko (talk) 15:34, 20 June 2024 (UTC)[reply]
The comment in support of this proposal was: "I would like to replace it with this icon being more modern and slightly more accessible" by Manjiro5
As the template talk page is unlikely to get much traffic to comment on this, but the template is widely used, I am putting it out here on the VP for comment before we move forward with the edit request. Josh (talk) 17:33, 21 June 2024 (UTC)[reply]
Who actually belongs in this category? Looking at the images placed here it looks completely random and quite frankly arbitrary. Most of the people located here are not even working in media Trade (talk) 08:42, 22 June 2024 (UTC)[reply]
Personal opinion: It is limited value trash, though it is long existing trash that seems to have been accepted and have subsidiary categories. — billinghurstsDrewth12:33, 22 June 2024 (UTC)[reply]
Depending on the definition, every person who is notable enough for their own category on Commons could be in that category, making it completely useless. GPSLeo (talk) 13:32, 22 June 2024 (UTC)[reply]
My preference is for Category:People. Looking at the setup of both categories it's where you can reasonably find what you're looking for. When I think celebrities I think actors, athletes, even influencers. All of those are categorized elsewhere. This is like having two sock drawers but one doesn't even have any socks. ReneeWrites (talk) 16:13, 22 June 2024 (UTC)[reply]
Comment I think that it can be argued that the category should have no people listed directly as they would fall into a specific subsidiary category, and probably no images. If they are that much of a celebrity they get their own category which is linked to wikidata. If they are not that much of a celebrity, they are failing. — billinghurstsDrewth14:15, 22 June 2024 (UTC)[reply]
Hi, I support deletion of this category, although it is useful as a honeypot to catch copyright violations and vanity images. Yann (talk) 16:19, 22 June 2024 (UTC)[reply]
Is there something like TinEye or Google Image Reverse Search for Wikimedia Commons?
Does Wikimedia Commons have a function where I can input image files and search if any similar-looking images have been uploaded on Commons? --MaplesyrupSushi (talk) 02:24, 23 June 2024 (UTC)[reply]
I don't see a duplicate search button there, is enabling a gadget needed for that? I just have a firefox addon by which one can right click to reverse image search. However, the proper approach to this would be, as suggested earlier, create a script/bot that scans through all the WMC files to find likely copyvios and e.g. put them all in a category for them to be reviewed rather than a huge time-sink of reverse searching individual images manually. Prototyperspective (talk) 09:11, 23 June 2024 (UTC)[reply]
Any reason not to just use Google Image itself and look for images on Commons?
It might help if you said what is the issue you are trying to solve for which you want this tool. There may be a way to solve it other than the specific way you are suggesting. - Jmabel ! talk18:17, 23 June 2024 (UTC)[reply]
If I understood you correctly, that is only about duplicate files within Commons but I was saying there needs to be a Tineye/GoogleImage reverse search bot that identifies files that are likely copyvios needing manual review.
I see, thanks, the Category:Duplicate besides being a hidden cat, is put there by a human. So it doesn't help anyone on this thread.
Interestingly the very first such image I looked at, was wrongly catted, as a duplicate. This file is anything but a duplicate, I have reverted that edit. - Broichmore (talk) 10:33, 24 June 2024 (UTC)[reply]
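For exact (byte-identical) duplicates within Commons, a lookup by SHA-1 checksum is one option. The following is only a minimal sketch: it assumes the standard MediaWiki `list=allimages` query with its `aisha1` parameter, the helper names are made up for illustration, and it merely builds the query URL rather than sending a request. It will not find resized, recompressed or otherwise altered copies, so it does not replace a TinEye-style reverse search.

```python
import hashlib
from urllib.parse import urlencode

# Commons API endpoint (standard MediaWiki action API).
API = "https://commons.wikimedia.org/w/api.php"


def sha1_hex(data: bytes) -> str:
    """Return the hexadecimal SHA-1 checksum of the file contents."""
    return hashlib.sha1(data).hexdigest()


def duplicate_query_url(data: bytes) -> str:
    """Build a query URL listing Commons files whose SHA-1 matches `data`.

    Hypothetical helper: only byte-identical uploads would be returned.
    """
    params = {
        "action": "query",
        "list": "allimages",
        "aisha1": sha1_hex(data),
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


if __name__ == "__main__":
    # In practice, read the bytes from a local file:
    # data = open("my_image.jpg", "rb").read()
    print(duplicate_query_url(b"example file contents"))
```

Opening the printed URL in a browser shows any matching files in the JSON response; an empty `allimages` list means no byte-identical copy was found.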
@MaplesyrupSushi: For the kind of artwork you're doing, it's essential to attribute the artist(s), the publisher, and the true originating owner museum (the true source of the image); that's step 1.
Next, search through Commons, using different search terms and variations on the artist's name, until you have the category for the particular artist complete. Should you have missed an image, it's not a problem.
This is because 18th- and 19th-century lithographs and engravings are all unique. Even if they come from the same book and edition, they differ in condition and hand painting, which is why provenance and ownership are important. They should not be overwritten by anything other than the exact same print, from the same book, in the same museum.
Ranjeet Singh and his bolster image discussions are proof of that.
Your problem is not really solved by using Google and TinEye; very often, searching for the exact original title using simple SQL syntax will uncover more of the same item, here and on the internet. As the images are unique, as described, the checksums differ too, and they can be too difficult for even Google to cope with.
Variations of the same print are something we want! Degraded duplicates of the same image are not.
I have to say that engravings from Victorian newspapers, and 18th- and 19th-century lithographs on Commons, are routinely uploaded with scant regard for any of the above. If they were obviously modern images they would be deleted on sight.
As we discussed before, quality of the source of the image is critical, I can't think of a worse source than the Panjab Digital Library, and because most of the images are PD, there's a multitude of real honeys out there. Broichmore (talk) 20:23, 23 June 2024 (UTC)[reply]
@MaplesyrupSushi I forgot to mention two concepts to you, that enable search in commons.
Firstly, all these files (we have paintings, engravings, historical photos) need the Artwork template. The essential fields to fill in are: the artist; the title (as given by the publisher or artist); the date (of publication or creation); in the description field, all of the available captions or at least the relevant keywords from them; and the institution owner. Nice-to-haves are the medium and dimensions.
Secondly, the museum's accession number (for the British Library, that's the shelfmark number). Any other given numbers can just go into the description field.
This last item, the accession number, is critical, because Commons has bots that occasionally scrape museum sites and look for that number; if a bot can't find it, it will upload a duplicate. The minimum cats are the artist, and engraver (or publisher if not available). Nice-to-haves are the museum source collection, and the most pertinent descriptive cat.
Sikh images and Indo, Pakistani original art can be a problem, for these images you can only fill out what you know, but I think you'll find that on finding an image, you will find variants elsewhere that will provide the bits and pieces you need. Be aware that some sites strip this information out, they are in the business of selling prints for decorative purposes, academia means nothing to them, they don't pay out against copyright claims unless challenged. Their defence being that to the best of their knowledge the items are PD, something which they don’t highlight on their sites.
It goes without saying that licencing is essential for uploading, for your stuff that will come with PD USA. The uploader is not the author, the license is always an art one.
I have been working on indexing Commons photo images using image hashes. Current status is that I have 70% of jpg/png/webp/tiff images indexed and there is a REST API for querying hashes. However, there is currently no web interface for querying Commons images by files or URLs, as my focus has been on indexing and publishing the database and not yet on the web UI (see proposal, github). In any case, if you have an immediate need, I could add a minimal web form interface to Toolforge for querying Commons images using files/URLs. --Zache (talk) 12:16, 24 June 2024 (UTC)[reply]
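For anyone unfamiliar with image hashes, here is a minimal sketch of one common variant, the difference hash (dHash). It is not necessarily the hash used in the index described above. It assumes the image has already been converted to grayscale and downscaled to a small grid (9 by 8 here), and the function names are illustrative only. Similar images give hashes that differ in few bits, so comparison comes down to a Hamming distance.

```python
def dhash(pixels):
    """Difference hash of a small grayscale image.

    `pixels` is a list of rows of brightness values. Each output bit
    records whether a pixel is brighter than its right-hand neighbour,
    so a uniform brightness change leaves the hash unchanged.
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Toy 9x8 image: brightness increases from left to right.
original = [[x * 10 + y for x in range(9)] for y in range(8)]
# The same image, uniformly brightened.
brighter = [[value + 10 for value in row] for row in original]
# The image mirrored horizontally (a genuinely different picture).
mirrored = [list(reversed(row)) for row in original]

print(hamming(dhash(original), dhash(brighter)))  # small distance: likely the same image
print(hamming(dhash(original), dhash(mirrored)))  # large distance: different image
```

Unlike a checksum, such a hash tolerates re-encoding and small tonal changes, which is why it suits a "find similar-looking uploads" query.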
Sometimes links to existing pages lead to a Creating page, as if the page did not exist. Currently I have a case like that in Template:GDR propaganda navbar, namely the military link. It should lead to Category:GDR Propaganda; military. But when I click on it, I am told, that I can create the page. (Not sure what would happen if I do.) The other links also turn black, when they lead to the current page. This one does not. Can that be fixed? Watchduck (quack) 17:14, 23 June 2024 (UTC)[reply]
This strange way of creating a "street (Berlin)" has one bad effect. If I type "cat:Schillerpromenade", UploadWizard/HotCat will show me both Schillerpromenade (Berlin-Neukölln) and Schillerpromenade (Berlin-Oberschöneweide) as well as Schillerpromenade (Berlin), which confused me (why is there a cat for Berlin when there are also two more specific cats?). RZuo (talk) 20:31, 23 June 2024 (UTC)[reply]
"Having the same name as" is probably kind of a Berlin specialty, as Berlin is probably one of the world's few capitals (if not the only one) where certain street names are used more than once.
The only practical usefulness of this scheme is (IMO) that it can be sorted into parent categories which refer to the name itself (such as Category:Schiller streets in Germany) as a whole, instead of putting each street category into it (usually with the result that some streets are in and some not). Regards --A.Savin12:18, 24 June 2024 (UTC)[reply]
Plenty of these in NYC (where numbered streets in Manhattan and Brooklyn are completely unrelated; I believe the ones in the Bronx do relate to the Manhattan grid). I suppose it's technically not a capital, but it's a bigger city than Berlin.
Seattle, where I live, is loaded with streets that are distinguished only by a directional, and may or may not be closely related. First Avenue S is a continuation of First Avenue, but First Avenue NW and First Avenue NE are not. Burke Ave N is basically a line on the map and refers to 7 unconnected streets that fall on that line; there is a Ravenna Ave NE (in two unconnected pieces), Ravenna Pl NE, and an (also discontinuous) NE Ravenna Boulevard (none of which intersect each other, and one of which -- Ravenna Ave NE -- is mostly not in the Ravenna neighborhood). There's a rather famous corner of Pike Place and Pike Street in the heart of Pike Place Market, but it's beaten for sheer confusion by the corner of Bellevue Avenue E, Bellevue Place E, and Bellevue Court E. - Jmabel ! talk18:56, 24 June 2024 (UTC)[reply]
June 24
No sound for uploaded videos (mp4 files)
I've uploaded recently and noticed the mp4 versions don't have sound, but the webm versions do.
I am using MacOS and Safari, and have tested in Chrome too. I also had a friend test on their Linux machine who confirmed the mp4 files had no sound when streamed or downloaded.
These were originally mp4 files, converted to webm using HandBrake before uploading. The originals play fine for me, but the mp4's on Commons seem to have lost their audio. — Preceding unsigned comment added by Jimmyjrg (talk • contribs)
It may be worth noting that on 19 June, @Jimmyjrg began uploading many video tutorials claiming "own work" and releasing as CC-BY 4.0, when indeed these are derivative works which depict live Wikipedia pages, including text, WMF logos, and images, some of which are in the Public Domain. The latter PD status clearly overrides any ownership, license claim, or Creative Commons licensing. I am concerned, but given their veteran status, perhaps Jimmyjrg can independently rectify the license status for these videos (I count 27 as of this writing). I am unsure of which templates/licenses are applicable, apart from {{Copyright by Wikimedia}}. Elizium23 (talk) 04:42, 24 June 2024 (UTC)[reply]
@Elizium23: I'm not sure what you are saying here. I haven't looked at the videos. Insofar as the videos incorporate (without licensing) images that are not public domain the licenses need to be indicated, but a video can incorporate public domain content to whatever degree its creator wishes. That is the nature of public domain (although a few countries recognize "moral rights" independent of copyright that require giving credit for some public-domain content). Public-domain status is not "viral". I could make a video that consisted of juxtaposing a series of 100 public-domain images each shown for one second; my video would certainly be eligible for copyright, even though no one image in the video is copyrightable. In other words, to paraphrase but reverse you: PD status of images in a video does not override a copyright claim for the video as a whole. - Jmabel ! talk05:22, 24 June 2024 (UTC)[reply]
@Jmabel, you're correct, of course. I certainly did have that backwards. I've struck out the offending remark. But I stand by my claim that these videos are derivative, and incorporate plenty of content that belongs to the WMF and to other Wikipedians, and all content demands attribution by their applicable, existing CC licenses. Elizium23 (talk) 05:28, 24 June 2024 (UTC)[reply]
Hi @Elizium23: thanks for bringing this to my attention. It looks like I have completely misunderstood how licensing works for instructional videos. I'll reach out to the WMF team members I've been talking with about this project, and I'll amend the licenses asap. Jimmyjrg (talk) 00:01, 25 June 2024 (UTC)[reply]
@Koavf I uploaded webm videos, but on the Commons file under 'Transcode status' there are links to mp4 files to stream or download. These are the ones I have found have no sound. Jimmyjrg (talk) 23:56, 24 June 2024 (UTC)[reply]
Why are we doing this? (Reasons why WMC is useful)
In the context of discussions relating to Commons:Media knowledge beyond Wikipedia I created an initial version of a list of reasons why Wikimedia Commons is useful or ways it could be used. You're invited to participate at Commons:Why Wikimedia Commons is useful and add any use cases/reasons that are currently not on the page.
Adding more concrete illustrative examples, barriers, or types of usefulness would also be helpful.
I think such a list could make WMC more useful than it already is, communicate the value of it to relevant people (e.g. people considering freeing their collections or contributing), and raise awareness/activities of users that eventually raise WMC's value and our work here to the world. Things can also be discussed on its talk page. Prototyperspective (talk) 14:26, 24 June 2024 (UTC)[reply]
A new research report on Cross-wiki uploads has been published
Hello all! I'm happy to announce the publication of the UX research report called "Cross-Wiki Uploads on En and Ar Wikipedias". The research, conducted in collaboration with the Structured Content team, was aimed at understanding how users were interacting with the Visual Editor upload tool. We hypothesized that the UI may contribute to users uploading files as "own work" when the work is not theirs. What we found is, indeed, users are erroneously uploading files as "own work".
Some of the findings of the report are:
14 of 16 users interviewed from English and Arabic wikis uploaded others' work as their own, and only a few of those files had been moderated. So the problem is much larger than documented.
This is partly because users interpret "own work" differently, so many believe they have the authority to upload when, according to copyright rules, they do not. This is also because the UI does not present alternative options in a way that users understand (the text on the UI is very confusing to them).
There is widespread confusion about what is/isn't allowed to be uploaded, what constitutes copyright, who holds the copyright, and how that relates to Creative Commons licenses. The image policy is not accessible or known to users.
Interestingly, we found that most uploaders were either marketers (editing/uploading on behalf of another entity such as their employer), or they were self-promoters (creating pages about themselves, unaware of the "notability" requirement).
@Sannita (WMF) good research! It is timely, considering that I opened a similar issue regarding some problematic cross-wiki uploads (also through the regular VP forum; I think it is still here, not archived as of this comment).
Two thoughts of mine. Re: "The 'upload page on Wikipedia' statement is the most confusing of all," I think it is due to the general public perception that Wikipedia and Wikimedia Commons are the same, a single platform for educational content, even though they are different websites that happen to be under the umbrella of the Wikimedia Foundation. No matter how we try to educate people about the differences in policies and scopes of the two websites, a layman will still treat both as identical, or as one website. Even the anti-Wikipedia group ADAGP, which opposes FoP, did not use "Wikimedia Commons" in their presentation to the EU Parliament in 2015; they used "Wikipedia" instead. Perhaps the upload forms should use understandable, layman's language.
Re: "Participants don't know what to do with files that are not their own creation," I guess a massive copyright education campaign is needed. Explanations should not use excessive legalese or technical terms. For Wikimedia groups and affiliates, reforms to copyright laws must be pushed to simplify things. Much of the complexity of copyright for an average layman is due to the complex copyright laws. JWilz12345(Talk|Contrib's.)16:42, 24 June 2024 (UTC)[reply]
10 out of 11 English users were promotional - 5 self-promotion and 5 marketing. Only one of those 10 was notable enough to be in scope, and even then it appears there were copyright issues. The problem here is not the UI; the problem is that the entire cross-wiki upload system makes it much easier for spammers without providing much benefit to anyone else. At minimum, cross-wiki upload needs to be turned off from the User: and Draft: namespaces (where most of the spam comes from). Pi.1415926535 (talk) 18:52, 24 June 2024 (UTC)[reply]
I would also say that before investing much time, and therefore money, in improvements for the cross-wiki upload, we should discuss whether we want to give cross-wiki upload a chance or whether we want to block it entirely for non-approved (autopatrol or similar on other wikis) accounts. GPSLeo (talk) 18:59, 24 June 2024 (UTC)[reply]
I second @GPSLeo's suggestion. Cross-wiki uploading had good intentions, but is easily abused. Perhaps only allow cross-wiki uploading for users who are in one of these user groups: admins/sysops, license reviewers, and autopatrolled users. The cross-wiki feature should be treated as a right with a burden of responsibility on the uploader. I'm not in favor of axing the feature entirely. JWilz12345(Talk|Contrib's.)02:44, 25 June 2024 (UTC)[reply]
We really, really need some process and tools such that, when files are cross-wiki uploaded and don't remain on the page they were uploaded for (e.g. because the edit adding them was reverted, the page was deleted, or the user never completed an edit adding the image), those files can be identified and deleted from Commons with a minimum of hassle. Right now, there's no good way to spot those images, and deleting them will usually require a DR. That's a lot of overhead to get rid of a file which the uploader may not have ever expected to persist on Commons. Omphalographer (talk) 19:11, 24 June 2024 (UTC)[reply]
Cross-wiki upload should probably just be blocked outright at this point, as it's clearly an issue that has no easy, implementable solution. At least not in the short term. IMO there really needs to be a clearer separation between the projects for something like cross-wiki uploading to work. It's never going to be "fixed" if everyone using it just thinks Commons is a glorified subdomain of Wikipedia though. --Adamant1 (talk) 22:28, 24 June 2024 (UTC)[reply]
Just about the logo question: maybe the study sees it too much from a Commons perspective, despite studying English Wikipedia and people contributing to articles there: personally, I think the default recommendation should be to upload it for "fair use" directly at enwiki. Enhancing999 (talk) 22:46, 24 June 2024 (UTC)[reply]
Yup. Possibly combined with a "logo" branch in a Wizard that tries to work out what is the relevant country, then describes the relevant threshold of originality and asks whether it is clearly above (keep it local), clearly below (go to Commons) or just plain unclear (keep it local, which is safer). Also, logo + "own work" is almost always wrong, though of course it is very occasionally correct. - Jmabel ! talk23:08, 24 June 2024 (UTC)[reply]
It doesn't seem like English Wikipedia wants to host images of logos going by the number we have, but that would be the better solution. Although it would then screw over the ability of other projects to use the files in a lot of cases. So maybe it's not the best way to go about this. --Adamant1 (talk) 23:16, 24 June 2024 (UTC)[reply]
en-wiki routinely hosts logos that are over the threshold of originality, if they have an article about the organization in question. - Jmabel ! talk23:29, 24 June 2024 (UTC)[reply]
MIDI file transcoding
Hi, I've uploaded some MIDI files of chords to Commons (e.g. File:3-4A set class on C.mid) which I need for en:List of set classes, and I gather that eventually Ogg and MP3 audio files are generated. How long do I have to wait? Or, is there some arcane incantation I need to perform that I've missed? Or do I need to add it to a queue somewhere? Is there some documentation somewhere that I can read about it if so? I have so far not found anything. — Jonathanischoice (talk) 22:06, 24 June 2024 (UTC)[reply]
Electrical equipment in the background
Can anyone say what is the equipment in the background here? I'd like to add further description and appropriate categories. - Jmabel ! talk23:32, 24 June 2024 (UTC)[reply]
@Glrx: Yeah, that's a bit like what it looked like to me as well, though you have a much more closely matching example. If it's correct, it raises a question of what two Seattle City Light employees were doing posing in front of telephony equipment, rather than equipment involved in the generation or transmission of electricity. But if we can be confident that is the case, I guess we don't have to solve the "why". - Jmabel ! talk05:01, 25 June 2024 (UTC)[reply]
June 25
Unable to use CropTool
I'm unable to use CropTool at File:1983. Febrero, 22. Recibimiento del cardenal José Alí Lebrún.jpg. When clicking directly from the toolbar, I'm directed to CropTool's main page, being asked to enter the file's URL or name, after which I'm confirmed that the file exists but I'm not able to crop it. The tool works perfectly in any other file I have found.