How to do bad statistics, and how NOT to do it, Part II (DaVinci) by alexs1320


iamutopian · @alexs1320 · Feb 22 '19 (edited)

$24.35

How to do bad statistics, and how NOT to do it, Part II (DaVinci)

Two weeks ago, I wrote a post to explain why it's **fundamentally wrong** to use the Average as a relevant parameter.

It's about 20-25 "screens" long and everything is explained in detail: [link](https://steemit.com/iamutopian/@alexs1320/good-bad-and-ugly-statistics-how-not-to-do-it).

Surprisingly, I need to explain it even further.

Grab some coffee, it will be a long post. Again.

Data were taken from the **Official Utopian Review Sheet**
Date: **14. Feb - 21. Feb 2019**
Number of translations: **25**, not perfect for everything but more than enough to prove the point

### How are the scores given?
---------
The [official Questionnaire](https://review.utopian.io/) consists of 6 questions:
* Accuracy
* Number of Mistakes
* How consistent the translation is
* Quality of post itself
* Legibility of translated text
* Number of translated words

### What is the distribution of ranks for each category?
-----------
This is very much expected: a strong grouping around the recommended score "Very Good".

![](https://cdn.steemitimages.com/DQmNmH4xowux77qWEzEvpTCNP7k3p1qdk2AgbguuYneJPFL/image.png)

This is the catch that I already explained in my previous post. A distribution **can't be a normal distribution** when you have "the wall" on one side, in this case: No Errors and 1 Error. Teams GER, ITA and ESP had 3 translations (in total) with Ranks 3 or 4.

![](https://cdn.steemitimages.com/DQmXGGeu5FuP4rm3LTLjEiN5K5QDCeUcbeCAuKMfdaAHQAP/image.png)
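The effect of "the wall" can be sketched in a few lines. This is only an illustration: the 25 ranks below are made up (they are not the values from the review sheet), and `skewness` is a small helper written for this example. The point is that a "mean ± 2 deviations" reading, which only makes sense for a normal distribution, reaches below the best possible rank.

```python
import statistics

# Hypothetical ranks for one question (1 = best); NOT the real review sheet data.
# 25 values piled up against the "wall" at rank 1.
ranks = [1] * 14 + [2] * 8 + [3] * 2 + [4] * 1

mean = statistics.mean(ranks)
median = statistics.median(ranks)
sd = statistics.pstdev(ranks)

def skewness(values):
    """Population skewness: 0 for symmetric data, > 0 for a long right tail."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * s ** 3)

skew = skewness(ranks)

# If the data were normal, about 2.5% of the values would lie below this bound.
lower_bound = mean - 2 * sd

print(f"mean={mean:.2f} median={median} sd={sd:.2f} skew={skew:.2f}")
print(f"mean - 2*sd = {lower_bound:.2f} (the best possible rank is 1)")
```

With data like this the median sits on the wall, the mean is pulled away from it by a few bad ranks, and the skewness is clearly positive.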

How consistent were the translations? One may think that these are two separate questions, but this is a classic case of underlying factors. Accuracy is correlated with Errors and with Consistency. Very intuitive, just as tall people have long legs and long arms as well.

![](https://cdn.steemitimages.com/DQmdeDiNyMm9yjKJiRA3v9p8YwapU7aiS8R6eJ6YmMpssrK/image.png)

Of course, no surprise for Legibility. It's also correlated.

![](https://cdn.steemitimages.com/DQmYU1gY7gXdh33oUYJTCBCa6V3RifodwHHkqAnksPTYSTR/image.png)

**As you can see, we have 4 questions related to "quality" in our questionnaire. In reality, it's a single question.**
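A minimal sketch of how such a correlation between questions could be checked, assuming made-up answers for 10 translations (not the real sheet data). Ranks are ordinal and full of ties, so the sketch uses a Spearman correlation, computed here as the Pearson correlation of tie-averaged ranks. The helper names `average_ranks` and `spearman` are only for this example.

```python
import statistics

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = statistics.mean(rx), statistics.mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical answers (1 = best); NOT the real review sheet data.
accuracy    = [1, 1, 2, 2, 2, 1, 3, 2, 1, 2]
consistency = [1, 1, 2, 2, 1, 1, 3, 2, 1, 2]
word_count  = [3, 1, 2, 1, 3, 2, 2, 3, 1, 1]

rho_quality = spearman(accuracy, consistency)
rho_words = spearman(accuracy, word_count)
print(f"accuracy vs consistency: {rho_quality:.2f}")
print(f"accuracy vs word count:  {rho_words:.2f}")
```

When two questions give a coefficient close to 1, the second one adds almost no new information. A question like word count, which measures something else, stays close to 0.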

The only post that was not almost perfectly correlated was [this post](https://steemit.com/utopian-io/@silviu93/node-js-translation-report-7-1211-words). Everything was excellent - but there were 6 mistakes (Rank 4).

Rank 3 or 4 in the Error Category usually means Rank 2 or 3 in Accuracy. I guess there were only typos in this post.

Maybe we should change something concerning the typos, @imcesca, @silviu93?
I personally don't like this system where everything is a mistake, and maybe this case is a perfect example of how to lose a lot of points for no real reason.

**Quality of post is telling us something interesting: there are two grouping points**: Very Good and Sufficient.
Sufficient was given only by two teams: **Dutch and Serbian**.
@alexs1320, @scienceangel, @misslasvegas, @altrosa - maybe we are too harsh on our teams?

![](https://cdn.steemitimages.com/DQmc5W4iHpkv9e2T9LP6zAMvjThnvdwo5a7bHN7FCNQZ7tu/image.png)

Word Count: only the Serbian team occasionally translates 2000+ words, due to Cyrillic/Latin translations - nothing unusual here.

![](https://cdn.steemitimages.com/DQmUBirvjRDTt6L7WXSCFBDNdV3yzGP3sGQ7Mts9vLWz56a/image.png)

##### Short conclusion: there is nothing wrong with Ranks (not Points, Ranks)

### Ranks vs Points
---------
The system of translating Ranks into Points is a bit odd:
* 0 negative points for Rank 1 (Excellent)
* a lot of negative points for Ranks 2-3
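Here is a small sketch of why a non-linear Rank-to-Points table matters. The penalty values are hypothetical (the real Utopian point values are not used here). They only mimic the pattern described above: nothing is lost for Rank 1, and the first steps away from it are the most expensive.

```python
# Hypothetical penalty table (points subtracted per rank); NOT the real Utopian values.
PENALTY = {1: 0, 2: 8, 3: 14, 4: 18, 5: 20}

def total_penalty(ranks):
    """Sum of the points lost over all questions of one review."""
    return sum(PENALTY[r] for r in ranks)

# Cost of each single step between neighbouring ranks (1->2, 2->3, ...).
steps = [PENALTY[r + 1] - PENALTY[r] for r in range(1, 5)]

# Two hypothetical reviews with the same average rank over 4 quality questions.
review_a = [2, 2, 2, 2]   # "Very Good" everywhere
review_b = [1, 1, 1, 5]   # Excellent, with one terrible answer

print("cost per step:", steps)
print("average rank:", sum(review_a) / 4, "vs", sum(review_b) / 4)
print("points lost: ", total_penalty(review_a), "vs", total_penalty(review_b))
```

Both reviews have the same average rank, but they lose a different number of points. With a table like this, statistics done on Points and statistics done on Ranks can tell two different stories.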

This is what the 25 scores look like when we calculate the sum of all 4 "quality parameters":

![](https://cdn.steemitimages.com/DQmViAGdoKFuNBfmYq57hBeMSLeobZuX5ULppksKHiTmCk1/image.png)

Pay attention! This is **not a histogram**, this is a bar plot!

I know it's a bit stupid to make a histogram with only 25 cases, but anyway...

![](https://cdn.steemitimages.com/DQmQ6oK8Gg699pvj1eDx1dbmLoFfzC6fNcCc429Pm6HET2B/image.png)

There is "a wall" on the left side of the distribution for several questions, so this is the result that was expected:

![](https://cdn.steemitimages.com/DQmQ6oK8Gg699pvj1eDx1dbmLoFfzC6fNcCc429Pm6HET2B/image.png)

The 5 lowest scores were given in the Spanish team (2x, different translations, different moderators) and in the Italian, German and Arabic teams.

#### How word count is distorting the reality
-------------
Besides the average score being irrelevant, without normalization to the word count **it's doubly irrelevant**.

Here is why:

![](https://cdn.steemitimages.com/DQmfGcEpCqKfJRS93trM5uaiWAiYf6aqRw2DMfz4vWdQSzo/image.png)

And the same scores normalized to 2000+ words:

![](https://cdn.steemitimages.com/DQmQTbkFZbYTdejxsZ1fyiHRZYHXKYx6nixoFRrm2kcsAyr/image.png)

Why is this important? Because the differences become smaller, of course:
* Not-Normalized: Average 75.6 , Deviation 8.0 , Median 77.0
* Normalized: Average 83.7 , Deviation 7.5 , Median 85.0

*I know it's wrong to use Average and Deviation, but people like those two parameters for some reason.
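The normalization step can be sketched like this. Everything in the block is an assumption made for illustration: the bracket names, the points per bracket and the five scores are invented, and the real questionnaire may weight the word count differently. The idea is only to re-score every translation as if it had been 2000+ words long, so that the remaining differences come from quality alone.

```python
import statistics

# Hypothetical points awarded per word-count bracket; NOT the real Utopian values.
WORD_POINTS = {"<1000": 0, "1000-1500": 5, "1500-2000": 10, "2000+": 15}
MAX_WORD_POINTS = WORD_POINTS["2000+"]

# Hypothetical (final score, word-count bracket) pairs.
reviews = [
    (70, "1000-1500"),
    (82, "2000+"),
    (68, "<1000"),
    (77, "1500-2000"),
    (75, "1000-1500"),
]

def normalize(score, bracket):
    """Re-score a translation as if it had been 2000+ words long."""
    return score - WORD_POINTS[bracket] + MAX_WORD_POINTS

raw = [score for score, _ in reviews]
normalized = [normalize(score, bracket) for score, bracket in reviews]

for name, data in (("raw", raw), ("normalized", normalized)):
    print(f"{name:>10}: median={statistics.median(data)} "
          f"deviation={statistics.pstdev(data):.1f}")
```

In this toy data set most of the spread in the raw scores comes from the length of the text, not from its quality, and it shrinks once the word count is taken out.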

(Not) surprisingly, everything is fine once the points are normalized.

Let's do average scores for teams (*there are at least 3 reasons why this is pointless, but anyway, it's the norm to do it):

![](https://cdn.steemitimages.com/DQmUz58apCJqdGCpWTCUwrgzmVrSqLLWEcHpbYHU6K2Qw62/image.png)

Let's make it more dramatic. Such an inequality :D

![](https://cdn.steemitimages.com/DQmTtcxBYjNYeDjSY3RnErh364QEQ29xcA5mbCvnUQiMcvb/image.png)

Now, let's normalize the scores to the word count:

![](https://cdn.steemitimages.com/DQmRmUiGkhX71xT7URkktZyAhwTeRnFiYqQbWMgjc7gGJY7/image.png)

As you can see, perfect equality.
Several translations contained a lot of errors and that's basically it...

## In Conclusion
------------
* **Always normalize** the data
* Don't consider **points** if the relation to ranks is not linear-ish
* Don't use the average if the distribution is not a **normal distribution**
* There is **no need to have multiple questions** if the answers are highly correlated

The only question that is "controversial" is the distribution of ranks for "quality of post".

Dutch and Serbs, maybe we are too harsh:

![](https://cdn.steemitimages.com/DQmc5W4iHpkv9e2T9LP6zAMvjThnvdwo5a7bHN7FCNQZ7tu/image.png)

The difference is only 3 points, so - who cares. 

There was also 1 post with **all** Excellent scores, except the **Rank 4** in Errors, a bit unusual.

## There is no need to cry "woooolf!!!" we are all doing a good job.

## Enjoy the weekend or at least be productive :D

| Languages | Moderators (Proofreaders) | Translators |
| -------- | -------- | -------- |
| 1. Spanish | @alejohannes, @marugy99 | @elpoliglota, @samuellmiller, @acrywhif, @isabella394, @cremisi, @navx, @thatmemeguy, @dedicatedguy, @rositaumce |
| 2. Greek | @ruth-girl, @dimitrisp | @trumpman, @katerinaramm, @lordneroo |
| 3. Italian | @mcassani, @imcesca | @filippocrypto, @viki.veg, @silviu93, @robertbira, @akireuna, @jacksartori, @deusjudo |
| 4. Chinese | @sunray, @aafeng | @susanli3769, @victory622, @breathewind, @aaronli, @hannahwu, @itchyfeetdonica |
| 5. German | @egotheist, @infinitelearning | @louis88, @future24, @laylahsophia, @sooflauschig, @supposer, @achimmertens |
| 6. Arabic | @dr-frankenstein, @libanista | @fancybrothers, @fatimamortada, @rabihfarhat, @hazem91 |
| 7. Polish | @villaincandle, @jestemkioskiem | @j4nke, @koscian, @ribson, @shake1, @fuzeh, @apocz, @yasminafly, @froq |
| 8. Vietnamese | @carlpei | @adam.tran, @lantracy, @lecongdoo3, @lenancie |
| 9. Dutch | @misslasvegas, @minersean | @altrosa, @dragonsandsnakes, @tokentattoo, @anouk.nox |
| 10. Slovenian | @fbslo | @nedy |
| 11. Serbian | @scienceangel, @alexs1320 | @nikolanikola, @svemirac, @hidden84 |
| 12. French | @leyt | @yassinebad, @ahmedess, @roxane |
| 13. Portuguese | @leurop, @portugalcoin | @mrprofessor, @martusamak, @d4rkflow, @leodelara, @warnas |
| 14. Hebrew | @nv21089 | @leurbanexplorer, @amit9202, @simba |
| 15. Yoruba | @zoneboy | @fatherfaith, @jubreal, @mcyusuf |
| 16. Russian | @tata-natana | @erikaflynn, @vezirbek, @bell1982 |
| 17. Filipino | @ruah | @toffer, @marou27, @josephace135, @timliwanag, @dandalion |
| 18. Turkish | @sargoon, @mugurcagdas | @gokhandogru |
| 19. Korean | @joeypark | @dakeshi |
| **19 LT** | **30 LM** | **79 Translators** |

@alexeygrigurko · Feb 25 '19

omg


@imcesca · Feb 22 '19

$0.08

Very interesting post, thank you.

Regarding the post by @silviu93 that you singled out, you are very correct in your assumption that something didn’t match.
That something mostly had to do with me traveling to Austria and surprisingly not being able to (1) use international roaming and (2) find some decent WiFi that allowed me to review/notify my team of my predicament until the post was 6 days old. When I finally managed to find some network, it was the spa’s WiFi, which meant I had very limited time to actually use my phone. After completing the review I rushed through the scoring, actually assuming the post would miss its payout. I have now checked back on the post and was surprised to see I was wrong on the last account (and I’m glad for Silviu on that account, since he’s been rather unlucky with payouts lately).
Nevertheless, your assumption was correct: most of the mistakes in that translation batch were missed spaces or extra commas (which is probably the most common mistake in the Italian team: we don’t use the Oxford comma in Italian, but it’s easy to forget to take it off when the string is a simple list of words left in English). In general, though, I do agree with you that 6 mistakes should affect the other scores, too.

I also agree that 5 out of 6 questions basically judge the same exact thing when a translation is well made. And frankly, I have never reviewed translations that were not well made, which I believe is the whole point of this project. I have never seen this as a learning environment but rather as a collection of already-skilled individuals. Sure, some of the translators have gotten better with time than they were at the beginning: practice will do that for everything and everyone. But the whole point in recruiting them was that they were already good at this.
I don’t particularly hate the new questionnaire and I actually like it more than its predecessor. But I find it redundant and in some aspects inadequate (why the gap in mistakes count? why give up on the major/minor mistake breakdown?). I had repeatedly voiced my opinion in writing back when we were brainstorming, both in comments and posts, but was unable to participate in the vocal chat due to work engagements. It seems like the only thing that mattered, in the end, was the vocal chat though, and whatever opinion wasn’t voiced over there didn’t really matter. So I’ve just given up on the issue and simply use what tools I’ve been given.


@alexs1320 · Feb 22 '19

Don't worry, it's maybe 10 points = 2-3 $ = a good coffee  :)


@roguescientist84 · Feb 22 '19

$0.07

Thanks for this! I want to study a good book on statistics, or a udemy class but now is not the time. So your expert delivery is appreciated in the meantime.  I have been traveling for work a lot but hopefully I will put some time on the EM420 this weekend before I leave again for 2 weeks.


@alexs1320 · Feb 22 '19

$0.05

Good luck with the microscope :D

A good book on statistics...
I would always recommend the help pages of Statsoft Statistica: http://documentation.statsoft.com/
They're wonderful.

The first task is to master descriptive statistics and to develop the right way of thinking: to know what to look for and how to avoid the most common pitfalls. Here are some rules:
* never discard any data before you are 100% sure, try to explain them first
* don't make "tarballs" of everything
* check the distribution of values
* try to understand correlations
* check if your variables are continuous
* check if the process is linear

Otherwise, you can conclude, using statistics, that **lions are not dangerous at all**.
I mean, on average, how many people are eaten by lions?

You can also conclude that the average body temperature of a human is about 30 degrees, as there are always some people who are not alive.
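The body temperature joke takes three lines to reproduce. The numbers are made up for the illustration (7 people at 36.6 °C, 3 at a room temperature of 20 °C):

```python
import statistics

# Hypothetical body temperatures in degrees Celsius; invented for this example.
alive = [36.6] * 7
not_alive = [20.0] * 3          # assumed room temperature
temperatures = alive + not_alive

mean = statistics.mean(temperatures)      # lands between the two groups
median = statistics.median(temperatures)  # stays with the majority

# Nobody in the sample is within 3 degrees of the "average person".
nobody_has_the_mean = all(abs(t - mean) > 3 for t in temperatures)

print(f"mean={mean:.1f} median={median} nobody_has_the_mean={nobody_has_the_mean}")
```

The average describes a person who does not exist in the data, which is exactly why checking the distribution comes before calculating anything.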

**Ok, those are trivial, but correlations between variables, man... That's the least intuitive thing.**

You think you know someone, but if you don't know how they behave in non-correlated scenarios - you don't know them. You cannot predict their reactions.

The same is true in "the opposite direction". In the 21st century, prejudice is taboo to such an extent that it's forbidden to think.

The reality is that if you know how to recognize several key factors, you can predict the behavior of people with basically 95+ %. IQ, conscientiousness, emotional stability, openness to ideas = You are gifted with the [awareness](https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1744-6570.1991.tb00688.x)

**People, master statistics, it will save you about 20.000 $, 3 liters of tears and 5 years of your life.**


@scienceangel · Feb 22 '19

$0.09

>People, master statistics, it will save you about 20.000 $, 3 liters of tears and 5 years of your life.

Now I'm curious.


@steem-ua · Feb 24 '19

#### Hi @alexs1320!

Your post was upvoted by @steem-ua, new Steem dApp, using UserAuthority for algorithmic post curation!
Your post is eligible for our upvote, thanks to our collaboration with @utopian-io!
**Feel free to join our [@steem-ua Discord server](https://discord.gg/KpBNYGz)**


@tykee · Feb 24 '19

$10.11

There are lots of visuals to relate here. I can see you put a lot of effort in practice, and I appreciate that.
I missed the first post of the series. However, I glanced through it to get more info.

Your view about the questionnaire isn't a bad one. The questionnaire isn't perfect, and that is why we always try to improve it whenever we can. As you have said, most of those questions are centered around quality, but I think some of those questions are different.
Don't forget, suggestions for improvement are always welcome!

There are a few analyses/visuals I don't understand in the post. I think it would have been clearer if you had added more text to give a more detailed explanation.

With the information in the post, I think this post is also about the translation post by silviu93 and the review given to it. The mistake does not look nice, and such mistakes shouldn't persist.

As you have said, "we are all doing a good job". Thank you!

Please note that while the CM hasn't changed the footer, I am not scoring #iamutopian posts based on the questionnaire. They have their own metric, and that will be the case until we go live with the new guidelines and new questionnaire, which will be comprehensive enough to reflect these types of posts.

To view those questions and the relevant answers related to your post, [click here](https://review.utopian.io/result/1/13131314).

----
Chat with us on [Discord](https://discord.gg/uTyJkNm).
[[utopian-moderator]](https://join.utopian.io/)


@alexs1320 · Feb 24 '19

$0.05

Concerning the questionnaire, I had a very radical solution - but it wasn't accepted.

The basic premise is that **translation must be as good as possible**.
Clients must get the top quality, no discussion about that.

As all the translations will be perfect when finished - all the translations are equal. Scores will be equal.

As a consequence, moderators are forced to get the best translators, in order to make their own work easier.

The level of quality goes up, and there are no sparks between translators and moderators, as they are collaborators and not rivals. There is no bad blood between the teams concerning the scores (as all scores are equal). Of course, if translators are bad - they will be replaced.

------------

In the current system, there are several things which are not very logical:

* There is a mistake. Neither the moderator nor the translator notices it. **The mistake still exists, it's not fixed - but the score is perfect**.
* There is a mistake. Either the moderator or the translator notices it. **The mistake no longer exists, it's fixed - but the score is not perfect**. It's 8 points lower.
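
The two cases above can be put side by side in a small sketch. Only the 8-point penalty comes from the comment; the maximum score of 100 and the function names are assumptions made for illustration, not the real Utopian scoring code.

```python
# Hypothetical scoring model. Everything except the 8-point penalty
# is an assumption made for illustration.
MAX_SCORE = 100          # assumed maximum, not the real Utopian scale
PENALTY_PER_NOTICED = 8  # from the comment: a noticed mistake costs 8 points


def review_score(noticed_mistakes: int) -> int:
    """The score depends only on the mistakes that were noticed."""
    return MAX_SCORE - PENALTY_PER_NOTICED * noticed_mistakes


def remaining_mistakes(actual: int, noticed: int) -> int:
    """Noticed mistakes get fixed; unnoticed ones stay in the text."""
    return actual - noticed


# Case 1: one mistake, nobody notices it.
case_unnoticed = (review_score(0), remaining_mistakes(1, 0))
# Case 2: one mistake, it is noticed and fixed.
case_fixed = (review_score(1), remaining_mistakes(1, 1))

print(case_unnoticed)  # (100, 1)
print(case_fixed)      # (92, 0)
```

Under these assumptions the translation that still contains the mistake gets the higher score, which is the illogical part.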

Or... We can see that all the parameters concerning quality are telling us that the translation is "so-so", but there is only one error. How? If it's bad - make it better. Imagine having 200-300 strings, 1 mistake and the quality is "meh... so-so". It's not very logical to me.

I also can't understand how some people are constantly making 10 mistakes per 1000 words. My answer is already 300 words long. Imagine 3-4 errors in it, made while writing in one's native language. How is it possible - I don't understand.
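
A back-of-the-envelope check of that figure, as a minimal sketch (the function name is hypothetical; the rate and the word count are the ones quoted above):

```python
def expected_errors(errors_per_1000_words: float, word_count: int) -> float:
    """Scale an error rate given per 1000 words to a text of word_count words."""
    return errors_per_1000_words * word_count / 1000


# 10 mistakes per 1000 words, applied to a 300-word comment.
print(expected_errors(10, 300))  # 3.0
```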

👍 tykee, yu-stem

`author`	alexs1320
`permlink`	re-tykee-re-alexs1320-how-to-do-bad-statistics-and-how-not-to-do-it-part-ii-davinci-20190224t145532603z
`category`	iamutopian
`json_metadata`	{"tags":["iamutopian"],"app":"steemit/0.1"}
`created`	2019-02-24 14:55:51
`last_update`	2019-02-24 14:55:51
`depth`	2
`children`	3
`last_payout`	2019-03-03 14:55:51
`cashout_time`	1969-12-31 23:59:59
`total_payout_value`	0.036 HBD
`curator_payout_value`	0.011 HBD
`pending_payout_value`	0.000 HBD
`promoted`	0.000 HBD
`body_length`	1,673
`author_reputation`	150,945,165,388,638
`root_title`	"How to do bad statistics, and how NOT to do it, Part II (DaVinci)"
`beneficiaries`	`[]`
`max_accepted_payout`	1,000,000.000 HBD
`percent_hbd`	0
`post_id`	80,330,130
`net_rshares`	79,084,532,531
`author_curate_reward`	""

properties (23)vote details (2)

voter	weight	wgt%	rshares	pct	time
tykee	0 B		47,751,079,889	100%
yu-stem	0 B		31,333,452,642	100%

@tykee · Feb 25 '19 (edited)

$0.08

Your idea is good, regarding the quality of translations. However, I believe that within any group of professionals there are people who will be superior. Thus, there is no way the quality of work would be equal.

Some people have years of experience in translation, while some don't. Yet, anyone can get better if given the chance. As I have mentioned in my comment above, things are not really perfect now, but it would be better to look for a less strict way.

Also, anyone who provides much lower quality could be replaced, since we want good translations. We shouldn't nurture poor work.

👍 alexs1320

`author`	tykee
`permlink`	re-alexs1320-re-tykee-re-alexs1320-how-to-do-bad-statistics-and-how-not-to-do-it-part-ii-davinci-20190225t095126532z
`category`	iamutopian
`json_metadata`	{"tags":["iamutopian"],"app":"steemit/0.1"}
`created`	2019-02-25 09:51:27
`last_update`	2019-02-25 09:52:45
`depth`	3
`children`	2
`last_payout`	2019-03-04 09:51:27
`cashout_time`	1969-12-31 23:59:59
`total_payout_value`	0.062 HBD
`curator_payout_value`	0.020 HBD
`pending_payout_value`	0.000 HBD
`promoted`	0.000 HBD
`body_length`	598
`author_reputation`	233,202,435,251,808
`root_title`	"How to do bad statistics, and how NOT to do it, Part II (DaVinci)"
`beneficiaries`	`[]`
`max_accepted_payout`	1,000,000.000 HBD
`percent_hbd`	10,000
`post_id`	80,364,876
`net_rshares`	135,912,457,390
`author_curate_reward`	""

properties (23)vote details (1)

voter	weight	wgt%	rshares	pct	time
alexs1320	0 B		135,912,457,390	100%


@utopian-io · Feb 26 '19

Thank you for your review, @tykee! Keep up the good work!

properties (22)

`author`	utopian-io
`permlink`	re-re-alexs1320-how-to-do-bad-statistics-and-how-not-to-do-it-part-ii-davinci-20190224t135308866z-20190226t151751z
`category`	iamutopian
`json_metadata`	"{"app": "beem/0.20.17"}"
`created`	2019-02-26 15:17:51
`last_update`	2019-02-26 15:17:51
`depth`	2
`children`	0
`last_payout`	2019-03-05 15:17:51
`cashout_time`	1969-12-31 23:59:59
`total_payout_value`	0.000 HBD
`curator_payout_value`	0.000 HBD
`pending_payout_value`	0.000 HBD
`promoted`	0.000 HBD
`body_length`	57
`author_reputation`	152,955,367,999,756
`root_title`	"How to do bad statistics, and how NOT to do it, Part II (DaVinci)"
`beneficiaries`	`[]`
`max_accepted_payout`	1,000,000.000 HBD
`percent_hbd`	10,000
`post_id`	80,445,403
`net_rshares`	0

@utopian-io · Feb 24 '19

$0.08

Hey, @alexs1320!

**Thanks for contributing on Utopian**.
We’re already looking forward to your next contribution!

**Get higher incentives and support Utopian.io!**
 Simply set @utopian.pay as a 5% (or higher) payout beneficiary on your contribution post (via [SteemPlus](https://chrome.google.com/webstore/detail/steemplus/mjbkjgcplmaneajhcbegoffkedeankaj?hl=en) or [Steeditor](https://steeditor.app)).

**Want to chat? Join us on Discord https://discord.gg/h52nFrV.**

<a href='https://steemconnect.com/sign/account-witness-vote?witness=utopian-io&approve=1'>Vote for Utopian Witness!</a>

👍 alexs1320

`author`	utopian-io
`permlink`	re-how-to-do-bad-statistics-and-how-not-to-do-it-part-ii-davinci-20190224t180930z
`category`	iamutopian
`json_metadata`	"{"app": "beem/0.20.17"}"
`created`	2019-02-24 18:09:33
`last_update`	2019-02-24 18:09:33
`depth`	1
`children`	0
`last_payout`	2019-03-03 18:09:33
`cashout_time`	1969-12-31 23:59:59
`total_payout_value`	0.058 HBD
`curator_payout_value`	0.019 HBD
`pending_payout_value`	0.000 HBD
`promoted`	0.000 HBD
`body_length`	591
`author_reputation`	152,955,367,999,756
`root_title`	"How to do bad statistics, and how NOT to do it, Part II (DaVinci)"
`beneficiaries`	`[]`
`max_accepted_payout`	1,000,000.000 HBD
`percent_hbd`	10,000
`post_id`	80,337,391
`net_rshares`	128,402,778,503
`author_curate_reward`	""

properties (23)vote details (1)

voter	weight	wgt%	rshares	pct	time
alexs1320	0 B		128,402,778,503	100%