Wikipedia:Edit filter noticeboard

Welcome to the edit filter noticeboard
Filter 953 — Flags: disabled; Pattern modified

Last changed at 06:53, 15 February 2019 (UTC)

Filter 614 — Pattern modified

Last changed at 09:01, 14 February 2019 (UTC)

Filter 881 — Flags: disabled

Last changed at 19:34, 12 February 2019 (UTC)

Filter 231 — Pattern modified

Last changed at 08:31, 12 February 2019 (UTC)

Filter 963 (new) — Actions: none; Flags: enabled,private; Pattern modified

Last changed at 19:31, 11 February 2019 (UTC)

Filter 384 — Pattern modified

Last changed at 08:09, 11 February 2019 (UTC)

This is the edit filter noticeboard, for coordination and discussion of edit filter use and management.

If you wish to request an edit filter, please post at Wikipedia:Edit filter/Requested. If you would like to report a false positive, please post at Wikipedia:Edit filter/False positives.

Private filters should not be discussed in detail here; please email an edit filter manager if you have specific concerns or questions about the content of hidden filters.



Setting filter 957 to disallow?

I noticed there was a lot of vandalism that involved just removing the article lead, so I created Special:Abusefilter/957. There was one false positive which I solved by tightening the filter, and since then I haven't seen any edits caught that aren't clearly vandalism (/test edit), with a few hundred hits since then. I'll be monitoring for a few more days but it looks good for setting to disallow to me. Galobtter (pingó mió) 06:52, 16 January 2019 (UTC)

I've created MediaWiki:Abusefilter-disallowed-testedit for this, which can be used as a disallow when there's a reasonable chance of an edit being a test. Galobtter (pingó mió) 08:06, 16 January 2019 (UTC)
@Galobtter: Good filter. I'm seeing nothing but garbage in the log. My only (hypothetical) concern would be redirects and attack pages. In theory, that should be covered by new_size > 500, but perhaps out of an abundance of caution !(lcase(added_lines) rlike "#redirect|{{((db-(attack|g10))|wi)}}" from Filter 3 should be in this one as well. Suffusion of Yellow (talk) 07:00, 17 January 2019 (UTC)
I was thinking of adding that but I actually did see a couple of hits where someone used a redirect to vandalize, but since there don't appear to be any more, and as the last condition it should cost very little, I've added that. Galobtter (pingó mió) 07:15, 17 January 2019 (UTC)
@Galobtter: I thought we had a filter that tagged new users creating redirects from existing pages. But it seems Filter 28 has been disabled since 2010 (!) for Terrible run time + not catching much useful + filter is overwrought now. Perhaps it could be revived, in more efficient form. Suffusion of Yellow (talk) 07:32, 17 January 2019 (UTC)
MediaWiki automatically tags all edits that convert a page to a redirect with "mw-new-redirect". Galobtter (pingó mió) 07:43, 17 January 2019 (UTC)
That would explain why I thought there was a filter. Easy enough to narrow down to IPs, etc. from Special:RecentChanges if desired, in any case. Suffusion of Yellow (talk) 07:57, 17 January 2019 (UTC)
 Done with the custom message MediaWiki:Abusefilter-disallowed-article-lead-removed. Galobtter (pingó mió) 07:15, 20 January 2019 (UTC)
Back to log only per Special:AbuseLog/23036967, Special:AbuseLog/23039986, Special:AbuseLog/23040599 (granted, two of the three are copyright violations, but that really isn't what this was meant to be for); I've made the filter more focused on blankings of the lead, by limiting the added_lines length - this'll catch most of the vandalism that I originally created the filter for, such as pure blanking of the lead. The vandalism that won't be caught now already is or should be caught by other filters, or is probably very difficult to catch for an automated system. Galobtter (pingó mió) 06:46, 21 January 2019 (UTC)
Filter tuned to reduce FPs, and set to disallow again. Galobtter (pingó mió) 13:39, 23 January 2019 (UTC)
@Galobtter: Any chance this filter could be refined to catch edits like Special:Diff/881907474? Before you made the change on January 23, similar edits to Today's Featured Article were being disallowed, which was great because that was the only thing that Special:AbuseFilter/951 was doing (the other thing 951 was supposed to do is now handled by 944). If there are just too many FPs, no worries. MusikAnimal talk 21:33, 5 February 2019 (UTC)
MusikAnimal, no version of this filter could've caught that specific edit because the filter also checks if bolded text is removed (i.e. the lead '''William Dowling Bostock'''), which is because there are legitimate reasons to remove an infobox by itself. I suppose there aren't many reasons to remove the Use English, Use DMY dates, or the short description template, which does give me an idea for what would probably be a related new filter, though. Galobtter (pingó mió) 05:31, 6 February 2019 (UTC)
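
The exemption discussed earlier in this thread (skip edits that turn the page into a redirect or tag it as an attack page) can be sketched outside the AbuseFilter language. This is only a minimal Python illustration, not the actual text of Filter 957 or Filter 3: the helper name `is_exempt` is hypothetical, and the pattern is adapted from the one quoted above, with the braces escaped for Python's `re` module.

```python
import re

# Adapted from the pattern quoted in the thread:
#   #redirect|{{((db-(attack|g10))|wi)}}
# Braces are escaped here, and matching is done on lowercased text,
# mirroring lcase(added_lines) in the quoted condition.
EXEMPT_PATTERN = re.compile(r"#redirect|\{\{(?:db-(?:attack|g10)|wi)\}\}")


def is_exempt(added_lines: str) -> bool:
    """Return True if the added text looks like a redirect or an attack-page tag.

    Hypothetical helper, for illustration only.
    """
    return EXEMPT_PATTERN.search(added_lines.lower()) is not None


if __name__ == "__main__":
    for sample in ["#REDIRECT [[Example]]", "{{db-attack}}", "Plain prose."]:
        print(repr(sample), is_exempt(sample))
```

In a real filter this check would sit last among the conditions, as noted above, so that it only runs on edits that already matched everything else.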

Does filter 380 need to be private?

  • Filter 380 (hist · log) ("Multiple obscenities")

Username Needed suggested at WP:EF/FP/R that this filter should be made public, and I'm inclined to agree. It's similar in purpose to Filter 384, and I don't see anything LTA-related in the regex. I can see an argument for keeping the condition on line 2 private, but really, anyone going through the trouble of gaming that will also know how to keep their vandalism subtler than "fuck fuck fuck". Is there something I'm missing? Suffusion of Yellow (talk) 20:59, 30 January 2019 (UTC)

Already done by Galobtter as I was typing this. Suffusion of Yellow (talk) 21:04, 30 January 2019 (UTC)

"Repeated attempts to vandalise" breaks filter privacy

Any hit on the "Repeated attempts to vandalise" filter allows the otherwise hidden filter hit to be moved into the public filter log. Should we make the filter private? [Username Needed] 10:23, 1 February 2019 (UTC)

@Username Needed: can you provide a couple of examples? — xaosflux Talk 13:13, 1 February 2019 (UTC)

[1] and [2]. I wasn't stating them to avoid WP:BEANS. [Username Needed] 13:44, 1 February 2019 (UTC)

Thanks for pointing that out, I've made it private. Galobtter (pingó mió) 17:10, 1 February 2019 (UTC)

Phone number tag for edits

Moved from Village pump (proposals)

Recently I reported to oversight a piece of vandalism that had a phone number in it. I wonder, could we make a tag for edits like "possible phone number"? Could the system recognize something that looks like a phone number? As far as I can figure there is no reason to put a phone number in any article or for that matter any page. Richard-of-Earth (talk) 05:35, 6 February 2019 (UTC)

@Richard-of-Earth: To avoid WP:BEANS, I suggest bringing this up at WP:EFN instead --DannyS712 (talk) 05:51, 6 February 2019 (UTC)
Counterexamples include many of the articles in Category:Telephone numbers in the United States. Anomie 12:44, 6 February 2019 (UTC)
I don't think publicly tagging the edits would be a good idea (drawing attention to potentially non-public information), but a private filter might be a good idea. There would likely be many false positives and false negatives, so log-only is probably the best implementation. --AntiCompositeNumber (talk) 13:28, 6 February 2019 (UTC)
@Richard-of-Earth, Anomie, and AntiCompositeNumber: I created private filter Special:AbuseFilter/962 to get a feel for how much this may hit on and determine if it is useful. It's having to run a regex on each edit, so we may want to scale it back to just certain namespaces if it gets costly. — xaosflux Talk 15:59, 6 February 2019 (UTC)
Ain't archiving going to consume a lot of the false positives.....? WBGconverse 17:25, 6 February 2019 (UTC)
@Winged Blades of Godric: if so, we can easily exempt page titles containing "/" outside of article space. — xaosflux Talk 17:58, 6 February 2019 (UTC)
Not that (how's that even possible?!), I was talking about adding wayback-machine links to stale sites. See their date-time format. WBGconverse 18:03, 6 February 2019 (UTC)
@Winged Blades of Godric: the WBC 14 digit numbers shouldn't collide with this, if you see FP's let me know though. — xaosflux Talk 18:09, 6 February 2019 (UTC)
Eh bad glance-over; currencies are valued:-( WBGconverse 18:29, 6 February 2019 (UTC)
Wow, great that you guys who clearly know more about this than I do are jumping on it. @Xaosflux: How do I get access to this filter? Richard-of-Earth (talk) 20:07, 6 February 2019 (UTC)
@Richard-of-Earth: we always need more people via.....WP:RFA...... - alternatively special access to private filters can be gained by Wikipedia:Edit filter helper access. — xaosflux Talk 20:13, 6 February 2019 (UTC)

Note, I've disabled this - need to make a better regex for it, just don't have time today. — xaosflux Talk 20:40, 6 February 2019 (UTC)
@Richard-of-Earth: for your specific concern, are the numbers usually in a specific format? (examples: (212)555-1212; +12125551212; 212-555-1212; 44 20 7499 9000;) — xaosflux Talk 14:29, 7 February 2019 (UTC)
In this case there were no dashes or spaces, just a big number. It was something like a "call XXXXXXXXXX for a good time" type of message. There is software that detects such things. Whenever I get texts, the software highlights phone numbers. Perhaps there is a white paper or some kind of article somewhere about what criteria are used to detect them.

I think the tag should be "Possible Phone Number" as you cannot really be sure if a given number is an actual phone number. The idea I had was not to detect vandalism per se, but to detect vandalism that needs Oversight RevisionDelete. So perhaps it could be narrowed to edits that get removed as vandalism. I was thinking it would be used by the anti-vandalism editors, but now that I think about it, the editors with the oversight bit could just use it directly and quickly RevisionDelete then. That would make it useful even if the filter cannot be public.

Perhaps other filters could be created for other personal information like emails and addresses. Perhaps we should ask the Oversight people what they would find useful. Richard-of-Earth (talk) 19:43, 7 February 2019 (UTC)

It runs in my mind that email addresses are already disallowed, or at least tagged. Nyttend (talk) 00:31, 8 February 2019 (UTC)
This doesn't seem like a good idea. After all, there are many different formats of phone numbers; NANP XXX-XXX-XXXX is only a small part of the world, and XX-XX-XXXX-XXXX cited by Xaosflux is only one other type. For example, according to Telephone numbers in India, some phone numbers are just five digits long (e.g. 58888), without any hyphen or space dividers, and that's in a country with one of the world's largest number of anglophones, not some small state where almost nobody speaks English. I don't see how one could possibly tag for something like this without a significant number of false positives or without missing a large percentage of potential phone numbers. Nyttend (talk) 00:14, 8 February 2019 (UTC)
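
The trade-off raised in this thread can be made concrete with a rough Python sketch. It is not the private filter's pattern, which is not public. The function name `possible_phone_numbers` and the 10 to 13 digit bounds are assumptions chosen for illustration: they accept the example formats listed above, skip 14-digit Wayback Machine timestamps, and miss short numbers such as the five-digit one mentioned.

```python
import re
from typing import List

# Candidate spans: an optional "+", then digits mixed with common separators
# (spaces, parentheses, dots, hyphens), ending on a digit.
CANDIDATE = re.compile(r"\+?[\d(][\d\s().\-]*\d")

# Assumed bounds, for illustration only: 10 to 13 digits in total.
MIN_DIGITS, MAX_DIGITS = 10, 13


def possible_phone_numbers(text: str) -> List[str]:
    """Return spans whose digit count falls within the assumed bounds."""
    found = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())  # strip separators before counting
        if MIN_DIGITS <= len(digits) <= MAX_DIGITS:
            found.append(match.group())
    return found


if __name__ == "__main__":
    samples = ["(212)555-1212", "+12125551212", "44 20 7499 9000",
               "58888", "20190206180300"]
    for sample in samples:
        print(sample, "->", possible_phone_numbers(sample))
```

Any digit-count heuristic like this will still flag unrelated numbers of similar length and will miss real numbers outside the bounds, which is why log-only was suggested above.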

Setting Filter 231 to disallow?

Filter 231 currently only warns and tags, because there are ostensible reasons for typing 50+ characters in a row without spaces, but having checked I think 200+ diffs/abuselog entries of mostly saved changes, the closest to a false positive I've found is a bit of spam (and most .onion urls are blacklisted too), so I think it's reasonable to set this to disallow. Galobtter (pingó mió) 08:56, 12 February 2019 (UTC)

If it's 99% all vandalism (or spam), then I support disallowing. I suggest keeping the custom message, though it does need to be reworded to state the edit was disallowed. Thanks for doing the tedious work of checking for false positives! MusikAnimal talk 17:50, 12 February 2019 (UTC)
Maybe that's okay in 99.999999999999999999999999999999999999999999999999999999999999% of cases, but I can imagine someone occasionally writing out a big number, e.g. 10,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 is a working redirect to an article that contains that precise string. If we're disallowing flagged edits, we should consider not flagging if the string is exclusively digits, or at least if it's exclusively a combination of digits and commas or periods. Nyttend (talk) 00:00, 13 February 2019 (UTC)
The filter won't catch that string. Nihlus 00:17, 13 February 2019 (UTC)
Okay, good then. I saw that I wasn't warned, but I thought maybe it was because I'm an admin. Nyttend (talk) 00:29, 13 February 2019 (UTC)
No filter (or even human) can be 100% accurate, so the goal of filters is to have a very low FP rate, not zero; but anyhow, as Nihlus says, the filter wouldn't flag that as it only counts long strings of alphabetical or numerical characters (i.e., the filter is actually more like "Long string of characters containing no spaces or punctuation"). Galobtter (pingó mió) 08:17, 13 February 2019 (UTC)
See, if I'd known that "no spaces or punctuation" were the situation, I wouldn't have objected; I don't envision someone writing 99.999999999999999999999999999999999999999999999999999999999999% except to make a point (otherwise you could write 99.9999999999999999999999999999999999999999999999999%, just 49 uninterrupted characters instead of 60, and it equals 100% anyway), but the interrupted-by-punctuation is definitely necessary. Thank you for helping me understand better. Nyttend (talk) 01:36, 14 February 2019 (UTC)
Nein nein nein nein nein nein nein nein nein...? I'm sooo tempted to Godwin this thread. Suffusion of Yellow (talk) 01:56, 14 February 2019 (UTC)
@Suffusion of Yellow: ? --DannyS712 (talk) 01:59, 14 February 2019 (UTC)
@DannyS712: Godwin's law: As an online discussion grows longer, the probability of a comparison involving Nazis or Hitler approaches 0.999... Suffusion of Yellow (talk) 02:10, 14 February 2019 (UTC)
@Suffusion of Yellow: I know what godwin's law is, I just didn't understand how it was relevant (now I get it................) --DannyS712 (talk) 02:11, 14 February 2019 (UTC)
MusikAnimal, regarding the custom message, I've been thinking about it and (while I don't think it matters too much) I think the default disallow message works fine here; the edits are basically all (bad-faith) vandalism and so the message doesn't need to be any softer than usual, and I don't know if we should be explaining to vandals why their edit was blocked and thus how to get around the filter (and the should-be-rare good faith editor should report the issue rather than removing the offending bit of their edit). Galobtter (pingó mió) 20:16, 14 February 2019 (UTC)
@Galobtter: Yeah you're right. Let's just go with the standard disallow message, then. Thanks again! MusikAnimal talk 01:16, 15 February 2019 (UTC)
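
The "no spaces or punctuation" behaviour described in this thread can be illustrated with a short Python sketch. It is an approximation based only on the comments above, not the actual conditions of Filter 231: the name `has_long_run` is hypothetical, and the threshold of 50 and the restriction to ASCII letters and digits are assumptions.

```python
import re

# Assumption: a "long string" is 50 or more consecutive ASCII letters or digits.
# Commas, periods, and spaces all interrupt a run.
LONG_RUN = re.compile(r"[A-Za-z0-9]{50,}")


def has_long_run(text: str) -> bool:
    """Return True if the text contains an uninterrupted alphanumeric run of 50+."""
    return LONG_RUN.search(text) is not None


if __name__ == "__main__":
    big_number = "10," + ",".join(["000"] * 20)  # digit groups separated by commas
    print(has_long_run(big_number))              # commas break every run
    print(has_long_run("99." + "9" * 60 + "%"))  # 60 digits in a row after the point
    print(has_long_run("99." + "9" * 49 + "%"))  # 49 digits in a row after the point
```

Under these assumptions a number written with thousands separators never trips the check, while a run of 60 nines does, which matches the distinction drawn in the discussion.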

New User Script

Those involved with edit filters may be interested in User:Suffusion of Yellow/filter-highlighter, just released by @Suffusion of Yellow. --DannyS712 (talk) 00:51, 13 February 2019 (UTC)

Regex check

Hi, having a brain fart here - can someone take a look at Special:AbuseFilter/960 for what my error is? — xaosflux Talk 21:48, 16 February 2019 (UTC)

@Xaosflux: I think it's actually the missing parenthesis around 'User:'+user_name+'/' that's the problem. Though I suppose changing the regex to \.(js|css)$ would prevent the occasional FP. Suffusion of Yellow (talk) 22:02, 16 February 2019 (UTC)
There's also the variables new_content_model and old_content_model, which would avoid flagging edits like Special:Diff/883058281 but I'm not sure they're ever filled in unless the model is being changed. Don't seem to contain anything at Special:Abusefilter/examine, at least. Suffusion of Yellow (talk) 22:17, 16 February 2019 (UTC)
Actually, just tried with a "real" filter and seems that they are filled in for all edits. But old_content_model will be empty for page creations. Suffusion of Yellow (talk) 22:23, 16 February 2019 (UTC)
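
The effect of anchoring the suggested pattern \.(js|css)$ can be shown with a small Python sketch. The contents of Filter 960 are not quoted in this thread, so the unanchored pattern below is a hypothetical contrast case, and the page titles and the helper name `is_script_page` are made up for illustration.

```python
import re

ANCHORED = re.compile(r"\.(js|css)$")   # pattern suggested in the thread
UNANCHORED = re.compile(r"\.(js|css)")  # hypothetical looser variant, for contrast


def is_script_page(title: str, pattern: "re.Pattern[str]" = ANCHORED) -> bool:
    """Return True if the page title matches the given pattern."""
    return pattern.search(title) is not None


if __name__ == "__main__":
    # Hypothetical page titles.
    for title in ["User:Example/common.js", "User:Example/style.css",
                  "User:Example/data.json"]:
        print(title, is_script_page(title), is_script_page(title, UNANCHORED))
```

With the anchor, a title ending in ".json" no longer matches, since ".js" is not at the end of the string. Checking the content model variables mentioned above would be a separate, stricter test.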