# Should the accuracy and originality scores be kept different; should citation suggestions and accuracy reviews be kept separate, in the reviews section?

It has been suggested in the answer above the relevant discussion, by Ron, that in the refereeing/reviews section, there should be 2 questions for each paper.

In the first question, people would vote on the originality and in its reviews talk about its originality, whether the work is not actually new, whether there are citations that are worth citing in the paper, etc. This risks deteriorating into a pile of "Cite Me!"s.

In the second question, people would vote on the accuracy and in its reviews talk about its accuracy. Nothing much to explain here.

However, in the relevant discussion, Ryan Thorngren raised the point: "Where would the physics explanation of the paper go? Would that need a separate question? Where: in reviews, or in Q&A?"

That is a valid point; after all, we don't have a real reason to have the 2 questions. When I upvote a question, I upvote it thinking "I want more people to see this question", and it helps the question become more noticeable. Sure, inside my brain, I am involuntarily thinking "Ok, this has _____ much accuracy and _____ much originality", but that need not be done by the system; I can do the thinking myself, I don't need the system to help me. I am upvoting the question to bump its prominence higher up, and the system helps do that.

However, Ron's point still holds about not mixing "Cite me!"s with accuracy-reviews. We can't ban "Cite me!"s, they are still legitimate, and potentially important, at least occasionally. Even if we use comments for them, as suggested by Ryan Thorngren, they will still remain visible.

I don't know which point is more important. Even if both points are equally important, it should be noted that it may be technically difficult to incorporate Ron's suggestion about having 2 questions for each paper,...

After writing this, I think I am no longer neutral: my own vote would be "let each paper have only 1 question".

Would this question not better be set up as a polling question, with two different answers advocating both possibilities which we can vote on?

Your question is a nice summary of the points brought up so far; it deserves an upvote anyway, and my mouse was already hovering over the corresponding button...

All that would be needed to make it a poll is changing the last two paragraphs a bit and writing the two polling answers.

@Dilaton Done.

I hope we did not annoy Ron (?), for some reason his About me is back to the P.SE text at present ... :-/

To Dilaton: That's the site which syncs the profiles automatically, relax! I trust you guys, I expect hostile review, that's what I am asking this site to do. So I don't get upset when people disagree with me, only when people censor other people.

+ 4 like - 0 dislike

I am adding an answer, because there are about 4 people here, so you don't want a vote in this case, and votes are meaningless anyway. You know who everybody is, and what they think, so you should just achieve consensus through discussion.

The idea for "two questions" is simply to have two separate votes, and two separate locations for the two different kinds of (significant) editorial comments, so that you segregate comments about previous related work from comments about the paper itself.

The comments about the paper itself should strictly consist of positive and negative reviews regarding the physics content, which are not duplicative of the paper's contents. A positive review should consist of a nontrivial application or extension, not meriting a separate paper, but interesting enough to give perspective on the importance of the work, or perhaps a more pedagogical explanation of an obscure point, which explains the applications of the result, and gives confidence that the calculation is correct, because it explains a hard step.

A negative review can consist of a counterexample to a claimed theorem, a flaw in an argument, an experimental incongruity, a naturalness comment, a misunderstood thing in the paper, anything that would normally go on a negative referee report. Both types of refereeing answers can be voted on normally. These things go in the "review" section as answers.

For the "originality" question, the idea is to simply aggregate the claims of prior art in one place, and use these claims to come up with an originality score. This is produced by community consensus, by people voting +1 when they feel the paper's score is too low, and voting "-1" when they feel it is too high (or voting 0). This is the normal process on these types of sites, and the community usually converges to good scores.

Then the reputation is the accuracy score weighted by the originality score, something like exp(originality/5) times accuracy, or whatever reputation function people think is fair (I think perhaps exp((originality/5)^{1/3}) times accuracy, with sign of course, will be fair).
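To make the two candidate formulas concrete, here is a minimal Python sketch. The function names are made up for illustration, and reading "with sign" as "take a cube root that keeps the sign of a negative originality score" is an assumption, not something settled in this thread.

```python
import math


def reputation_simple(originality: float, accuracy: float) -> float:
    """exp(originality / 5) * accuracy."""
    return math.exp(originality / 5.0) * accuracy


def signed_cube_root(x: float) -> float:
    """Real cube root that keeps the sign of x (assumed reading of 'with sign')."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def reputation_damped(originality: float, accuracy: float) -> float:
    """exp((originality / 5)^(1/3)) * accuracy, using the signed cube root."""
    return math.exp(signed_cube_root(originality / 5.0)) * accuracy
```

In both versions a negative accuracy score gives negative reputation, and the cube-root version grows much more slowly with the originality score than the plain exponential does.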

Part of the idea is to provide authors reputation simply from their published work, so if they come here, they automatically get reputation from publications, they don't need to do mickey-mouse stuff on the site to get reputation. But also, it is designed to be a reasonable substitute for journal refereeing which does not require anything else. It's sufficient to produce a number which is roughly similar to the number journals use when deciding whether to publish a paper or not to publish, minus the political aspects.

There is no third criterion; originality and accuracy are completely sufficient to produce a reasonable outcome for all published papers.

A straight "vote up/down" on papers doesn't work well at all, as the vote mixes up originality/quality and therefore it can also be "I don't like the author", "I don't like the institution", "I don't like the subject matter", "I don't like the font", or whatever other superficial criteria people use. It will simply reinforce dogma regarding papers, and in those cases where you have accurate original papers, you need to make sure that people show you why it is not original specifically, or why it is not correct specifically. If they don't have to do this, they will make up reasons for upvoting/downvoting, which are different from these two criteria, which are the only ones you should use. Period.

I am sure that if you don't separate the criteria like this, into two separate votes, with two separate kinds of comments, papers which are both original and correct will get political downvotes, without any specific criticism, simply because they are unpopular. They will be downvoted, without anyone showing a mistake, and without anyone showing prior art. This is the tragedy of politics that requires high-power editors at major journals. The high-power guys are just there to simply ignore all aspects of criticism except originality and correctness.

With specific criteria questions, you will downvote on originality if you think there is a missing citation, or the result is folklore. You will downvote on content if you think there is a serious mistake deserving a downvote, and hopefully you will also say what that mistake is! The separation essentially requires you to be specific about why you are downvoting the paper (or upvoting): the natural thing is that you downvote a paper's accuracy and also upvote a specific criticism, or you upvote a specific missing citation and downvote the paper's originality.

Any general comments of a non-refereeing nature, like explanation of the results, can go into the QUESTION for the accuracy section. You can summarize the whole content of the paper itself in the question, feel free! The question text is otherwise just a link. But if you have a neutral comment, neither a criticism nor an extension/significant-reworking, it should just be added to the question community wiki style, or as a comment, so it doesn't give you rep, and it has no bearing on the refereeing content--- it's just explanation of the paper.  This also provides a way to summarize old nonfree literature.

The result in this way can be extremely similar to a (good) journal's refereeing, if people are careful about how they vote. If people are careless about how they vote, however, you might need to strictly enforce downvoting on published literature, so that every downvote comes with an associated upvote of one or more specific criticisms of the paper, and each downvote on originality comes with an upvote on a missing citation. This type of thing can make it so that if a criticism is retracted (for example, someone posts a criticism in good faith, and later it is shown to be wrong), then all the downvotes which came from this criticism are automatically retracted when the criticism is deleted. This is a major project, and it might not be necessary; the same result can come just from asking people to vote with sincerity and downvote with respect, and generally I trust the community on this. People who are free usually do a good job on evaluating literature.
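For what the strict version might look like, here is a rough Python sketch of downvotes that only exist while a criticism backs them. The class and method names are hypothetical, nothing like this exists in Q2A as far as this thread says, and the rule that a user's downvote survives as long as at least one of the criticisms they upvoted is still standing is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    # criticism id -> set of users whose accuracy downvote is tied to it
    downvotes_by_criticism: dict = field(default_factory=dict)

    def downvote(self, user: str, criticism_id: str) -> None:
        """Record a downvote only together with the criticism it endorses."""
        self.downvotes_by_criticism.setdefault(criticism_id, set()).add(user)

    def retract_criticism(self, criticism_id: str) -> None:
        """Deleting a criticism drops every downvote that depended on it."""
        self.downvotes_by_criticism.pop(criticism_id, None)

    def downvote_count(self) -> int:
        """Count distinct users with at least one live criticism behind their downvote."""
        voters = set()
        for users in self.downvotes_by_criticism.values():
            voters |= users
        return len(voters)
```

The same structure would work for originality, with missing citations in place of criticisms.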

The originality/quality separation is absolutely essential. Without it, you will never get original material, you get quora-like or stackexchange-like rehashes of old crap. Rehashes get cheap upvotes, original material gets hardly anything, because hardly anyone can evaluate it and be sure it is correct. Rehashes are next to zero effort, while original material is extremely hard to produce. Nobody will bother with original material unless original material is rewarded much more. This is the only way to get professional level contributions, to separate out original contributions and reward them more.

The two criteria really are the entire content of the refereeing process, when it is done right. When it is done wrong, there are a million other things to consider, like "does this suit my journal's audience?", "What will be the impact factor?", "How many people care about this topic?" and so on and so on. These are all bogus criteria, and journals do best when they are not a part of the refereeing process. All you need is ORIGINALITY/ACCURACY, and that's it.

If you are under the impression that crackpots score high on originality, don't be. It is extremely rare that you hear an original crackpot idea. The only crackpot ideas that score high on originality are those that have an extremely good chance of not being crackpot at all.

answered Mar 11, 2014 by (7,720 points)
I have some dumb questions concerning the practical calculation and assignment of the reputation to the different people involved in the review process. I'd like to recapitulate how I understand it from the above explanations and maybe you can correct what I am getting wrong?

• So we have two questions for each paper getting reviewed, one for originality and one for accuracy.
• On both questions and their answers, people would up and downvote (hopefully in a reasonable way as described in the above answer) as usually done on Q&A sites.
• The current scores of both questions (O and A) would then be used to calculate the final score S of the paper according to a formula such as
$S = \exp(O/5)\times \frac{1}{3} A$?

My questions now are:

Who would earn the "overall reputation" of the paper S, the (leading) author?

Would the reputation earned from the answers to the two questions simply go to the people who posted them?

What about the reputation corresponding to the individual current scores of the two questions?

You see, I am quite confused about these details ... :-/

@Dilaton I was under the impression that Ron was talking about some overall score displayed, and not the actual reputation gained by the author.

Okay, I see now that the point is not to discuss the paper's science, but to evaluate it. I'm not sure exactly why this is the goal, but if it is, I'm fine with the idea of the proposed system in two separate pieces.

I believe it can (and should) be implemented much less clunkily than having two questions in the question feed. Perhaps each question can have an originality and a correctness "tab", and you can access either one by clicking the link to the question in the feed.

The question's priority could be determined by the combined metric of value we decide on.

I agree that "two questions" is a hack, but I wanted to make no changes to Q2A software, or minimal changes, so that the site doesn't fork Q2A. Tabbing questions like this might be a serious extension, but this is indeed the preferred way to do it. This feature might be generally useful to Q2A people anyway.

The reason that refereeing is the goal is that this is what is needed now. The arxiv and related internet sites allow free publication of papers, but journals are still being used for refereeing and ensuring accuracy. I think it's time to take away this function from them, because they are doing something free sites can do much quicker and better. This also allows this website to fill a niche that is empty right now, and it is first, so it can succeed big.

The rep gain from the reply is not enough if your goal is to replace the traditional system. There is no way a political author would be able to respond to detailed criticism of the guts of a paper, they wouldn't know what's going on in the paper well enough.

I think the best bet is to give the reputation to the author that comes here to rebut the criticism, or make comments. This is probably the author that is most involved. If more than one comes, then you can split it equally between all that come, but best to cross that bridge when you come to it.

@RonMaimon Wouldn't the rep gain from the replies suffice? The fact that they did not come to rebut does not necessarily mean that their name is on the paper only for political reasons.

+ 4 like - 0 dislike

Let me formulate some questions and concerns to contribute to this discussion from the technical point of view, and to get a better understanding of your ideas. I do not vote on your requests because I will not be able to contribute as a scientist to your site.

1) I would prefer to use only one question, because otherwise we would have to establish a logical relation between questions, a concept absolutely not foreseen in Q2A.

2) If I understand you correctly, an imported question should look like this (do not yet comment design details, only the concept):

This is not yet implemented, it's only photoshopped! The text includes a link to the paper on ArXiV. Then two voting inputs are possible, one for originality and one for accuracy. Every user may give only one vote on each of them. Using a formula (the easiest part to implement!), a score value is computed from the vote totals for each criterion and displayed within the question. The usual vote panel (grayed in this image) will be disabled or removed in this category. Is this what you want to have?

3) The asking user will be a robot user, which is used to make automatic imports from ArXiV. Editing this question will be possible only for users who have earned that permission through the corresponding points. Is this OK? Are edits wanted in such questions?

4) Imports will be made using OAI-PMH, which seems to be the best method for both tasks: importing all existing papers through bulk metadata access, and importing new paper metadata daily. For the first task, ArXiV recommends this protocol to prevent crawling of the archive; for the second, it has the advantage of supporting datestamps, so that daily harvesting requests can set the “from” date to the date of the last harvest. This way we can always be sure to get a complete set of all new papers. I have started to study this protocol in detail; it's new to me.

5) There will be many thousands of review questions in this category if we import all of them since 1994. What is your idea for navigating within these categories? We do not have a search engine like the one ArXiV provides. Don't ask me to implement one; creating a new framework instead of Q2A will take years.

6) I agree that if we want to be the reviewing instance for the physics fields of ArXiV it is required to import all relevant papers as proposed. However, would it be an alternative for the beginning to import only selected papers by copying a link into a special plugin, which imports this paper as a question?

7) The daily import of new papers would completely fill the list on the Q&A page “Recent Activity”. I could import these papers silently. Would that be OK?

8) My development time does not cost anything except the time you have to wait until we are able to go online. However, I am not willing or able to provide more time than I currently do (about 10-20 hours per week). I am currently crawling through the Q2A code to find the hooks for these features. I cannot yet tell you how long it will take, but it could be about two months to complete this development. Do you want to wait that long?
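As an illustration of the daily harvesting described in point 4, here is a minimal Python sketch that only builds the request URL and does not contact any server. `verb`, `metadataPrefix`, `set` and `from` are standard OAI-PMH request arguments; the base URL, the `physics` set name and the `arXiv` metadata prefix are assumptions that should be checked against ArXiV's own OAI-PMH documentation, and the function name is made up.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoint; check ArXiV's current OAI-PMH documentation before use.
ARXIV_OAI_BASE = "http://export.arxiv.org/oai2"


def list_records_url(last_harvest: Optional[date] = None,
                     set_spec: str = "physics",
                     metadata_prefix: str = "arXiv") -> str:
    """Build a ListRecords request, optionally limited to records since last_harvest."""
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": set_spec,
    }
    if last_harvest is not None:
        # Daily harvest: only ask for records changed since the last run.
        params["from"] = last_harvest.isoformat()
    return ARXIV_OAI_BASE + "?" + urlencode(params)
```

Calling it without a date would correspond to the initial bulk import. A real harvester would also have to follow the `resumptionToken` that OAI-PMH uses to split large result lists, which this sketch leaves out.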

answered Mar 11, 2014 by (0 points)
1. That would be good, of course, having 2 questions was just an idea to implement if the 2 votes per question was not possible.
2. Yes, that is right, I like the design too, however, it should be "Review the originality in the comments, and the accuracy/quality in the answers", since we don't need to mix them up.
3. Yes, I think edits are necessary here, to summarise the paper.
4.
5. I think we can just have a link on the "Reviews" category to the ArXiV search engine on ArXiV itself.
6. Will having all ArXiV physics paper links take up too much space on the server? Will it be expensive? If so, I guess we should be able to compromise, at least for now, until we start thinking about Ron's idea on academic advertising of papers. Then again, there will be more academic advertising if we have all physics papers on the ArXiV. If there are no such monetary problems, I think it is better to have all the ArXiV physics papers.
7. Yes, that is fine, as long as the papers still appear in the category activity itself.
8. I don't think that would be an issue, but we could implement Dilaton's idea of hiding the reviews section temporarily.

If there is a hierarchical tag system, you can put the papers into subcategories.

My idea for this was to go through the papers manually myself, and sort them into tag-categories one by one for about 5,000 papers, that should only take a few days. From these, you can have a bot find the subject matter by looking for citations to papers already categorized, and placing them in the category next to the existing categorized paper.
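A rough Python sketch of how such a bot could spread categories from the hand-sorted seed. The function name and the data layout (a dict from each paper to the list of papers it cites) are assumptions for illustration, and picking the most common category among the cited papers is just one possible rule.

```python
from collections import Counter


def propagate_categories(citations: dict, seed: dict) -> dict:
    """Give each uncategorized paper the most common category among the
    already-categorized papers it cites, repeating until nothing changes."""
    categories = dict(seed)  # copy, so the hand-made seed is left untouched
    changed = True
    while changed:
        changed = False
        for paper, cited in citations.items():
            if paper in categories:
                continue
            counts = Counter(categories[c] for c in cited if c in categories)
            if counts:
                categories[paper] = counts.most_common(1)[0][0]
                changed = True
    return categories
```

Papers that cite nothing already categorized stay unassigned, so they would still need a human to place them.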

There are only about 100,000 physics papers in all, so with 1000 users changing paper categories, it will take a week to get the papers into the proper categories, so long as the seed is competently made, and the tree can grow (and shrink).

Then the display can be by the tags, so you can go through the papers hierarchically. We can outsource our search to Google, since the papers will link back to the arxiv link; you can also make sure that there is an easy way to find the page for a specific paper on arxiv (or wherever).

+ 2 like - 0 dislike

Let each paper have two separate questions for originality and accuracy.

answered Mar 10, 2014 by (1,975 points)
+ 1 like - 2 dislike

Let each paper have only one question.

answered Mar 10, 2014 by (1,975 points)

This is only possible if you have two separate votes, for originality and correctness, otherwise the site will not function. This is just how journals evaluate contributions, and you need these two separate criteria.

The discussions of the content of the paper go in the QUESTION ITSELF, in the "quality" section, after the link to the paper, this should be editable by anyone. This is for neutral discussions about the content of the paper, and it can provide a free summary of the paper, if the paper is not freely available.

