maemo.org - Talk (https://talk.maemo.org/index.php)
-   Community (https://talk.maemo.org/forumdisplay.php?f=16)
-   -   The Testing is half empty (https://talk.maemo.org/showthread.php?t=41179)

Jaffa 2010-01-18 23:13

Re: The Testing is half empty
 
Quote:

Originally Posted by fms (Post 480766)
Yes, I remember the meeting. But I do not seem to remember any of the points proposed by Flandry being agreed at that meeting. This is kinda troubling, too.

For cross-reference, the summary of the meeting and the discussion with the most involved developers (i.e. maemo-developers) are here:

http://lists.maemo.org/pipermail/mae...er/022381.html

As VDVsx says, he's going to push through the agreed changes and improvements; but the person most familiar with the code (X-Fade) has been involved in other things (which hopefully are now mostly in the past).

Texrat 2010-01-18 23:20

Re: The Testing is half empty
 
Quote:

Originally Posted by fms (Post 480658)
Well it has long been agreed that 5 votes is usually sufficient. In fact, it was agreed in an IRC meeting months ago.

The number of testers should be driven by the degree of confidence you want vis-a-vis results. I can look at the statistical tables later and provide a good guideline. But regardless, I don't think it should be a guess or good feeling (not saying 5 or 10 are).

But roughly (this is for example only and in no way scientific) it would look something like 1 tester = 50% confidence that test results sufficiently address likely defects; 2 testers = 70%; 3 testers = 80%; 4 testers = 87%; 5 testers = 91%, etc (tends to be logarithmic).
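To make the shape of that curve concrete, here is a minimal sketch under a much simpler model than the statistical tables mentioned above. It assumes each tester independently has the same chance of catching a given defect; the function name `detection_confidence` and the 50% per-tester figure are illustrative assumptions, not anything agreed in this thread. Under that assumption the numbers come out as 50%, 75%, 87.5%, then roughly 94% and 97%, so they climb faster than the rough figures quoted above but show the same diminishing returns.

```python
def detection_confidence(testers: int, p_single: float = 0.5) -> float:
    """Chance that at least one of `testers` catches a given defect.

    Assumption (not from the thread): testers are independent and each
    one catches the defect with the same probability `p_single`.
    """
    if testers < 0:
        raise ValueError("testers must be non-negative")
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    # The defect slips through only if every tester misses it.
    return 1.0 - (1.0 - p_single) ** testers


if __name__ == "__main__":
    for n in range(1, 6):
        print(f"{n} tester(s): {detection_confidence(n):.1%}")
```

Real testers are not independent (popular features get tested by everyone, obscure ones by nobody), so this is an upper bound on what extra votes buy rather than a guideline.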

Jaffa 2010-01-18 23:31

Re: The Testing is half empty
 
Quote:

Originally Posted by Texrat (Post 480999)
But roughly (this is for example only and in no way scientific) it would look something like 1 tester = 50% confidence that test results sufficiently address likely defects; 2 testers = 70%; 3 testers = 80%; 4 testers = 87%; 5 testers = 91%, etc (tends to be logarithmic).

That's fascinating. Do you know of any articles/papers on this WRT software quality?

geneven 2010-01-18 23:43

Re: The Testing is half empty
 
I'd just like to say that anyone who sees the fMMs thread sees a thrilling example of how great software can be and is being developed here. It seems to me that it has almost nothing to do with officially established procedures, but is due to the common sense of one developer. The same goes for mymenu and, some time ago, the work of the liqbase guy.

I hope these are the models you are using for deciding how best to handle software here. Maybe you see more sides of this issue than I do, but these are shining examples. Of course, the developers I mentioned above aren't the only heroes out there, but I want this message short.

Flandry 2010-01-19 00:21

Re: The Testing is half empty
 
FWIW both the required karma and quarantine are stored in fields of a repository record. Here is where the check is made. This is the schema for the data record.

Should be easy enough to change those two values should someone with access actually care to do so...

Texrat 2010-01-19 00:34

Re: The Testing is half empty
 
Quote:

Originally Posted by Jaffa (Post 481022)
That's fascinating. Do you know of any articles/papers on this WRT software quality?

I'll look for some. I've only applied it to product quality (the concept is called AQL or Acceptable Quality Level) but I'll see if I can find something relevant to our use.

VDVsx 2010-01-19 01:36

Re: The Testing is half empty
 
Quote:

Originally Posted by Flandry (Post 481081)
FWIW both the required karma and quarantine are stored in fields of a repository record. Here is where the check is made. This is the schema for the data record.

Should be easy enough to change those two values should someone with access actually care to do so...

What are the benefits of changing these values?
You get a lot more apps in Extras, but the quality will be lower for sure.
In the beginning I was also a bit against the quarantine time, but had to change my mind after seeing skilled testers find big blockers in apps with 10+ thumbs during the quarantine period.

Texrat 2010-01-19 01:58

Re: The Testing is half empty
 
ooo... found good stuff on software quality!

Typical metrics:
http://www.scribd.com/doc/7010681/So...uality-Metrics

Formal software testing (outside the scope of most projects here, but I found good material in its 732 pages):
http://digi.physic.ut.ee/tanel/books...g.eBook-KB.pdf

So far I've come across vague statements asserting that more testers can increase confidence levels in results, but nothing matching what I suggested yet. However, the following describes a methodology for building a software test plan and may be useful:

http://www.lucas.lth.se/publications...erssonCdoc.pdf

Flandry 2010-01-19 02:21

Re: The Testing is half empty
 
Quote:

Originally Posted by VDVsx (Post 481154)
What are the benefits of changing these values?
You get a lot more apps in Extras, but the quality will be lower for sure.
In the beginning I was also a bit against the quarantine time, but had to change my mind after seeing skilled testers find big blockers in apps with 10+ thumbs during the quarantine period.

It leads to better software, faster.

Going back to the original point of this thread--we should do it because it was what was proposed and, i gather, agreed upon at the IRC meeting. In any case, the examples you gave show that the present system doesn't work, not that it does. There should not have been 10 thumbs up if there were blockers. I would venture to guess that the 10 thumbs up were popularity votes and not from the "testers group" that i advocated we adopt.

Anyway, my interest in this is that when an update for an app is ready, especially a trivial update, and it improves upon the current Extras version, making it go through the same level of scrutiny as the original version is a waste of time and discourages thorough testing in the cases where it really is called for. In other words, it leads to people being careless and cavalier in testing and missing blockers.

I would even go so far as to say that requiring 10 tests may be less secure than requiring 5, for three reasons: the tester is more likely to be complacent when there are 9 others to pick up the slack, there are more tests to get done, and the dev is more likely to get fed up with the process and recruit the "testers" in less than helpful ways (which seems to be a fairly common practice). I have resisted telling people to "try and thumb up" because i believe in following the mandated procedure, but i don't happen to believe that this one is very effective.

I'd be fine with just reducing the karma and quarantine to 50% in the case of an app already in Extras. 5 days and 5 tests is more than enough, especially if the dev and the testers are both coming into it as a positive thing (a chance to scare out bugs) and not an onerous and unrealistic burden. Five or even one real test is immensely better than ten cursory tests.
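As a rough sketch of what that rule could look like in code: the names below (`PromotionRule`, `rule_for`, `can_promote`) are hypothetical and are not the actual repository record fields or the real check, and the baseline of 10 votes and 10 days is only inferred from "5 days and 5 tests" being 50%.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromotionRule:
    required_karma: int    # thumbs-up votes needed
    quarantine_days: int   # minimum days spent in testing


# Assumed baseline, inferred from "5 days and 5 tests" being half of it.
BASELINE = PromotionRule(required_karma=10, quarantine_days=10)


def rule_for(already_in_extras: bool,
             baseline: PromotionRule = BASELINE) -> PromotionRule:
    """Halve both thresholds for updates to apps already in Extras."""
    if not already_in_extras:
        return baseline
    return PromotionRule(
        required_karma=max(1, baseline.required_karma // 2),
        quarantine_days=max(1, baseline.quarantine_days // 2),
    )


def can_promote(karma: int, days_in_testing: int,
                already_in_extras: bool) -> bool:
    """True when a package meets both thresholds for its case."""
    rule = rule_for(already_in_extras)
    return (karma >= rule.required_karma
            and days_in_testing >= rule.quarantine_days)
```

With these assumed values, an update with 5 votes after 5 days would pass, while a brand-new app with the same numbers would still wait.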

RevdKathy 2010-01-19 07:57

Re: The Testing is half empty
 
Reading this and the karma thread has made me wonder if we need a smarter tool than 'thumbs up' for these things. Especially now that we have a lot more people on board who don't actually know what a 'thumbs up' should signify. I'm going to be a pain and open a thread on that, which will relate to both these issues.


All times are GMT.