Steinkauz had deleted Post #3763838, but after looking into it myself I believe that post is the better quality one, and deleted the other duplicate instead.
Steinkauz, how did you determine which of the two posts to delete? The Pixiv original has the same size as the one you deleted, and when the two are overlaid, #3763838 seems to have fewer artifacts.
You're right, the artifacts are minimally better, but I think that is due to the smaller filesize/compression/whatever. Sorry for the mistake I made there; luckily it wasn't a huge one.
Here is how I compare posts:
Open the posts in new tabs via the "Original:" link
Switch between the tabs with Ctrl+Tab (this shows tinting or different cropping)
If they look the same, I zoom in and compare them again (Ctrl+mousewheel/plus key)
And if they still are the same, I take the picture with the bigger filesize. If not, the biggest/best file is mostly the way to go.
Edit: System told me pixiv cuts "useless" data out of files to keep them smaller, or something like that. That's why the pixiv file is sometimes smaller than the nijie one, for example, even though they are the same.
Pixiv, nijie, and yande.re may all alter the metadata of originals and may also apply lossless compression to JPGs and PNGs, all of which will cause the md5 to differ and may provide an absolutely bit-for-bit identical image which is nevertheless quite different in filesize. These would be the "baddest" duplicates around as they are utterly identical.
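To illustrate the point about lossless recompression, here is a minimal sketch using only the Python standard library. It does not read real image files: `pixels` is a made-up stand-in for decoded image data, and zlib is used because PNG relies on the same DEFLATE compression. The helper `content_md5` is a hypothetical name, not something the site provides.

```python
import hashlib
import zlib

# Hypothetical stand-in for decoded pixel data (not a real image).
pixels = bytes(range(256)) * 64

# Two lossless encodings of the same data at different compression levels.
fast = zlib.compress(pixels, 1)
small = zlib.compress(pixels, 9)

# Both decode back to exactly the same bytes.
same_content = zlib.decompress(fast) == zlib.decompress(small) == pixels

# The md5 of the encoded bytes differs even though the content is identical.
md5_fast = hashlib.md5(fast).hexdigest()
md5_small = hashlib.md5(small).hexdigest()


def content_md5(encoded: bytes) -> str:
    """Hash the decoded data instead of the encoded bytes."""
    return hashlib.md5(zlib.decompress(encoded)).hexdigest()


print(same_content, md5_fast == md5_small)
print(content_md5(fast) == content_md5(small))
```

Hashing the decoded data rather than the file is one way such "utterly identical" duplicates could be spotted, assuming a tool that decodes both files is available.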
I find rapidly flipping between tabs of two images in the same browser window tends to highlight any differences well.
Steinkauz said:
And if they still are the same, I take the picture with the bigger filesize.
I don’t think that’s smart. (Edit: I may actually have misunderstood what you meant.) If they’re the same, you should keep the smaller one. I used to think that bigger file size = better quality, but after having worked with a bunch of duplicate cases I’ve changed my mind and think that file size doesn’t really mean much. I’d say you should try to keep the one with the best quality, and if they’re 100% the same, then the smaller file size is better because it takes up less space.
I think it depends. You can't get data back from a stripped file.
Steinkauz said:
I think it depends. You can't get data back from a stripped file.
But is that data worthwhile? Compression exists for a reason, having smaller sizes is better.
ALAKTORN said:
But is that data worthwhile? Compression exists for a reason, having smaller sizes is better.
Ever heard of the term "bigger is better"? There's a reason for it.
I think the handful of GB the bigger duplicates may take are nothing compared to everything else that happens on the channel.
If the file was losslessly recompressed there is no direct problem with it, but frequently these files will have had metadata stripped, which may be worth preserving.
Also if you are losslessly recompressing or stripping metadata from files it is likely the md5s of even the same file processed with the same tools and settings on two different systems will differ, promoting the proliferation of yet more duplicates as opposed to sticking with whatever seems to be the original.
gudnamsedd said:
Ever heard of the term "bigger is better"? There's a reason for it.
That’s about something else.
Steinkauz said:
I think the handful of GB the bigger duplicates may take are nothing compared to everything else that happens on the channel.
That’s not a good point.
System said:
If the file was losslessly recompressed there is no direct problem with it, but frequently these files will have had metadata stripped, which may be worth preserving. Also if you are losslessly recompressing or stripping metadata from files it is likely the md5s of even the same file processed with the same tools and settings on two different systems will differ, promoting the proliferation of yet more duplicates as opposed to sticking with whatever seems to be the original.
This is a good point. If bigger file sizes could mean fewer duplicates, then I guess they’re worth having. I’m not sure how well this theory holds up in practice, however.
We did test it - stripping the metadata and compressing files down never gave the same md5 as files which came from another source which had the same processing applied. So each stripped version will likely cause proliferation of identical duplicates.
Plus messing with originals often leads to someone reuploading it from the same source... for a few % difference it seems a lot of hassle.
I am not 100% sure, so I am asking for the sake of not making a mistake. We have two identical pictures:
A has 1000x1000, 500KB
B has 700x700, 700KB
Which one to keep? Higher resolution for quality? Bigger filesize for metadata?
I’d go higher resolution, if it’s of better quality. If it's an upscale, then no.
Usually I would do the same, unless one picture is only slightly higher in resolution but half the filesize; then I would take the one with the larger filesize.
You fail to mention the file type. In the case of PNG the higher resolution would probably always be preferable, for JPGs there would almost certainly be visual differences.
In the case of a JPG it is more likely someone has downscaled a higher resolution original but saved it at a higher quality setting, creating a lower resolution image with larger file size.
This would imply keeping the higher resolution one.
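The rule of thumb from the last few replies can be sketched roughly as follows. This is only an illustration of the reasoning in this thread, not an official policy: the `Post` class, the `is_upscale` flag (which a human reviewer would have to set after looking at the images) and the `prefer` function are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Post:
    name: str
    width: int
    height: int
    size_kb: int
    is_upscale: bool = False  # hypothetical flag set by a human reviewer


def prefer(a: Post, b: Post) -> Post:
    """Keep the higher-resolution post, ignoring upscales where possible."""
    # Drop upscales; if both are flagged, fall back to comparing both.
    candidates = [p for p in (a, b) if not p.is_upscale] or [a, b]
    # File size is deliberately not used: a downscaled JPG saved at a
    # higher quality setting can be larger than the original.
    return max(candidates, key=lambda p: p.width * p.height)


a = Post("A", 1000, 1000, 500)
b = Post("B", 700, 700, 700)
print(prefer(a, b).name)
```

For the A/B example given above, this picks A, the higher-resolution file, despite its smaller filesize.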
We have some evident deficiencies in the moderation process... the pool of unapproved posts continues to grow, so it would seem further measures are needed to encourage organised approval rather than ad hoc approval as seems to be the only sort going on at present.
I'm not really sure how best to approach this short of deleting unapproved posts or being a nuisance to users who aren't moderating but can (like maybe stripping them of moderation capability if they do not use it).
Do we have any clear guidelines on which posts to approve? If we do, I managed to miss them. If not, we need them. For my part, I seldom approve posts - ad hoc only, as you already stated.
Do you have any numbers at hand on how many unapproved posts per day we have? If it's just a massive backlog, maybe we could manage it with some combined force.
How I would imagine a moderation process:
Guidelines to follow, like: posts must have an artist/source tag or, if not known, request tags; parenting. Also a couple of non-automated tags, something between 5 and 10 should be sufficient.
If a post isn't approved, it goes to a second pool while the uploader receives a kind message to fix the deficits if he doesn't want to get his post deleted. I don't know what amount of time would be fair, but after an adequate procedure the post receives its final rating.
Not sure which of these are realizable/wise.
At the moment, we have only the "hide from moderation queue" button, but what exactly happens to these posts? The eternal abyss of unapproval?
Also, some kind of motivation is needed, if ad hoc moderation isn't enough.
Steinkauz said:
Do we have any clear guidelines on which posts to approve? If we do, I managed to miss them. If not, we need them. For my part, I seldom approve posts - ad hoc only, as you already stated.
Same here - for me it's a combination of being afraid to approve something that doesn't meet the guidelines I missed, and not having access to the whole interface.
I can only say that approving is boring and takes a long time. Don’t know what to do.
Actually Steinkauz gave me an idea, maybe we should have a warning message for people who don’t tag their uploads with at least 5 general-type tags, or maybe even refuse the upload.
Also that discussion we had about adding questions/guidelines to aid tagging at upload… that should happen. Like I remember someone proposing a “how many people are in the picture?” question with then tags listed for the answer like “1girl, 1boy, 4boys, 6+girls” or whatever.
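As a rough illustration of the warning idea, here is a minimal sketch of such a check. It is not how the site actually works: the `(name, type)` tag representation, the `MIN_GENERAL_TAGS` constant and the `upload_warning` function are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

# Threshold suggested in this thread; an assumption, not a site setting.
MIN_GENERAL_TAGS = 5


def upload_warning(tags: List[Tuple[str, str]]) -> Optional[str]:
    """Return a warning if the upload has too few general-type tags."""
    # Each tag is a hypothetical (name, type) pair, e.g. ("1girl", "general").
    general = [name for name, tag_type in tags if tag_type == "general"]
    if len(general) < MIN_GENERAL_TAGS:
        return (
            f"Only {len(general)} general tags found; "
            f"please add at least {MIN_GENERAL_TAGS}."
        )
    return None


print(upload_warning([("1girl", "general"), ("some_artist", "artist")]))
```

Whether this should only warn or actually refuse the upload is the open question above; the sketch only covers the warning case.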
http://chan.sankakucomplex.com/wiki/show?title=howto%3Atag_checklist
This list is quite useful for people who don't know much about tagging.
Adding a link to that when uploading a picture could be a start.
If I may, I'd like to segue into talking about this proliferation of redundant and erroneous tagging. I'm unsure what recourse to take.
Primarily this concerns the large number of "Furry" related images receiving misrepresentative tagging; along with the sheer volume of posts, it has very quickly become overwhelming.
Are new Uploaders/Contributors supervised and assessed to ensure they are not unintentionally causing problems or issues (like perhaps this mass of unapproved posts that has arisen lately)? If not, I think they should be, particularly the mass uploaders who contribute hundreds if not thousands of images in a short timeframe.
The number of pending posts is at 3374 as of the last count (you can track it from the pagination of the moderation view too), not an unmanageable figure given the volume of posts and user activity on the site.
What prompted me to make this post was noticing that the number was up 200 on the last time I checked a few weeks ago (at which point we tried to move matters forward a little by posting about it and adding proper pagination to the moderation interface). So the situation has deteriorated slightly in spite of that.
I'm not sure if we do have a clear approval guide up anywhere. It is however a fairly undemanding process - something like:
1. Is the post acceptable content? (not low quality, illegal or otherwise)
2. Does the post have an acceptable standard of tagging?
3. Is the post a duplicate?
The hold up on a properly performed approval of a problematic post is basically checking it is not a duplicate and dealing with it if it is poorly tagged.
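The three checks could be sketched along these lines. This is only an illustration of the checklist above, not an existing tool: the function name `approval_blockers` and its boolean inputs are hypothetical, and each input still requires a human judgment.

```python
from typing import List


def approval_blockers(acceptable_content: bool,
                      well_tagged: bool,
                      is_duplicate: bool) -> List[str]:
    """Return the reasons a post cannot be approved yet (empty if none)."""
    blockers = []
    if not acceptable_content:
        blockers.append("unacceptable content")
    if not well_tagged:
        blockers.append("poor tagging")
    if is_duplicate:
        blockers.append("duplicate")
    return blockers


# A post passes only when no blockers remain.
print(approval_blockers(True, False, True))
```

An empty list means the post can be approved; anything else tells the moderator what has to be dealt with first.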
Encouraging proper tagging and duplicate checking from uploaders is a slightly different issue, but with clear bearing on this. A tag checklist suitable for inclusion on the upload page sounds like it should be a priority.
Razat, you can perhaps proceed to janitor level if you are interested in a more advanced role.
Zeninth, the approach has generally been to let the uploads finish and then correct the tags with aliases and mass edits.
It is much more obvious what the problems are when there are a lot of posts exhibiting them, although of course by then the "damage" is done.
We can also enforce custom rules on them once we know what the issues are, to replace and rewrite tags before they are added. But to do this we need to know what tags are at issue.
I do have a list of the tags compiled. Should I list them here or send them via PM?
Also, I'll look for any other possible tags that I may have overlooked while I double check what tags I have already found.
PM them over and I'll take a look, thanks.
High res PNG vs. lower res JPG. Should the latter be deleted?
I tend toward deleting #3670720, but would keep it as a legitimate_variation (transparent vs. non-transparent).
^I added the tag.
Question: What makes Post #3717222 a legitimate variation? Both it and its parent look 100% the same to me. Its parent has fewer artifacts and so I’d tend to delete the lower quality post.
It's tinted, meaning in this case the color is slightly brighter than in the other one. I would delete it, but the rules say to keep it. I also noticed that these brightness differences are always the same: the picture with the higher filesize is darker while the one with the lower filesize is brighter.
To be honest, I'm not even 100% sure if System also wants to keep these.
How do you notice that? I honestly see no difference in brightness. O_o

System
1 year ago
Moderation Issues
Any issues relating to moderation - including but not limited to post approval, flagging and deletion, and abusive edits or other untoward activity, can be brought up here.