June progress report
Today, in addition to working on a number of UI improvements, I took some time to create a video to report on my progress one month into this project. So bust out the popcorn and enjoy!
Also, to commemorate this milestone, I bundled up the current trunk files as “version 0.1”. I invite you to try it out! I’d love to get your feedback. 🙂
Konstantin 3:56 am on July 7, 2010
Hey,
I just came across your project on WP Tavern for the first time, and I’m impressed!
Couple of questions:
Will your plugin be i18n ready, though? (always an issue for non-English-native customers)
Do you use WordPress’ Settings API for the admin pages?
(I love plugins and themes that integrate their option pages the way it is intended…)
mitcho/芳貴 2:26 pm on July 7, 2010
I’d definitely like to make sure it’s i18n-compatible. Being Japanese myself, that’s something I hope to ensure down the line. If you test it out and see places where i18n might be an issue, I’d love to get your feedback. 😀
No, I’m not using the settings API right now, but I’ll look into it. Thanks for the tip! 🙂
Eoin 12:07 pm on July 7, 2010
Looks brilliant Mitcho! One thing you didn’t cover in the video — how is this better than using, say, Google Website Optimizer? I can think of a couple of reasons, but I was wondering what your thinking and motivation is.
mitcho/芳貴 2:29 pm on July 7, 2010
The advantages, in my mind, are robustness, extensibility, and ownership:
1. It’s all PHP, server-side selection, so there’s a lot more you can realistically do in terms of different treatments rather than just trying out different strings, say. You could even try out different behavior, or different algorithms for something.
2. I’m trying to build it out to be very extensible, using WordPress-style hooks galore. This means you’ll be able to easily create new metrics. Someone could create a plugin to enable a page view time metric, or a bounce rate metric, or a WP eCommerce checkout revenue metric. All of this is easily possible due to the hooks.
3. At the end of the day, you own all the data yourself, not Google. If you want to run more detailed statistics, you can do that too.
What did you see as the advantages?
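To make point 1 concrete, here is a minimal Python sketch of server-side variant selection. It is not ShrimpTest's actual PHP code: the function name `assign_variant`, the visitor ID, and the hash-based scheme are all assumptions made for illustration. The idea is only that the server, not the browser, decides which treatment a visitor sees, and that the same visitor keeps seeing the same one.

```python
import hashlib


def assign_variant(visitor_id: str, experiment_id: str, variants: list[str]) -> str:
    """Deterministically map a visitor to one variant of an experiment.

    Hypothetical helper: hashing the experiment and visitor IDs together
    gives a stable, roughly uniform choice without storing any state.
    """
    key = f"{experiment_id}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    index = int(digest, 16) % len(variants)
    return variants[index]


# Example: the chosen variant could be any server-side behavior, not just a string.
variants = ["control", "variant_1", "variant_2"]
chosen = assign_variant("visitor-42", "donations", variants)
```

Because the choice depends only on the two IDs, repeated page loads by the same visitor land on the same variant, which is one simple way to keep an experiment consistent.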
Tal Galili 12:52 pm on July 7, 2010
Hi Mitcho,
I was sent here through a friend, and just finished watching your Demo.
Here are a few comments:
1) Wonderful! I am very excited about your project and think it is great!
2) Instead of a Z score, another possible thing to report is the (let’s say 95%) confidence interval for the test results.
3) There is an entire field which I am guessing you have heard of, called partial factorial experiments. The methodology they offer can allow smart allocation of interaction of variants (but that is more than simply A/B testing)
4) I would also hope this will be implemented into theme features (as widgets, header navigation bar options and so on).
5) There is also the issue of multiple comparisons once someone starts doing many such tests. That would raise more issues, but I am not sure how the solutions for this can be easily explained to people in the scope of such a project. If you’d like to correct for this issue, I would suggest implementing the BH (FDR-based) method, since it is relatively easy to program and scales with the number of tests being performed while still keeping the FDR at 0.05 (not the type I error, mind you).
6) I also love YARPP – glad to see what you are giving to the rest of us – amazing stuff!
7) Rounding down the numbers in the results might prove to be helpful to people (using the SD as a guide for how much to round down can help).
Good luck and thanks again!
Cheers,
Tal
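As a rough illustration of the confidence interval suggested in point 2, here is a minimal Python sketch. It assumes the metric is a conversion rate and uses the normal-approximation (Wald) interval for the difference of two proportions. The function name and the example counts are made up for illustration, and this is not necessarily how ShrimpTest computes its statistics.

```python
import math


def conversion_diff_interval(conv_a, n_a, conv_b, n_b, z_crit=1.96):
    """Wald interval for the difference in conversion rates (B minus A).

    conv_* are conversion counts, n_* are visitor counts.
    z_crit=1.96 corresponds to an approximate 95% confidence level.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    diff = p_b - p_a
    # Standard error of the difference, without pooling the two rates.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, (diff - z_crit * se, diff + z_crit * se)


# Hypothetical counts: 50/1000 conversions for control, 70/1000 for the variant.
diff, (low, high) = conversion_diff_interval(50, 1000, 70, 1000)
```

If the interval contains zero, the data are still compatible with no difference between the two variants at that confidence level. The approximation is only reasonable when both groups have a fair number of conversions and non-conversions.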
mitcho/芳貴 2:34 pm on July 7, 2010
2) Yup, that’s coming.
3) I’m aware of that methodology, but I’m starting this project just with simple A/B/C testing without getting into multi-factor ANOVA. There’s also an argument that such methodologies aren’t actually particularly beneficial, and that trying to run experiments which are orthogonal is normally good enough. Here’s a good academic paper which touches on this topic: http://exp-platform.com/hippo.aspx
4) Hope so! If it’s not in ShrimpTest itself, ShrimpTest will easily provide the hooks and functions so that custom plugins and themes can do this.
5) I’m not familiar with that and I’d love to hear more about this. Do you have a reference or a paper explaining these methods?
Thanks for the feedback! 🙂
Tal Galili 3:16 pm on July 7, 2010
Hi Mitcho
3) Legit – but only when you don’t have any interactions (which might not always be the case). Still I agree this is not urgent compared to all there is to accomplish.
4) Wonderful 🙂
5) Read here:
http://en.wikipedia.org/wiki/False_discovery_rate
On the concept of FDR as a way of correcting for multiple comparisons.
The first reference (Benjamini, Yoav; Hochberg, Yosef (1995). “Controlling the false discovery rate: a practical and powerful approach to multiple testing”) is the original article that introduced this concept to modern-era statisticians.
My pleasure!
BTW, are you an R user?
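For readers who want to see how little code the BH method needs, here is a minimal Python sketch of the Benjamini-Hochberg step-up procedure. The function name and the example p-values are assumptions for illustration. The rule: sort the m p-values, find the largest rank k with p_(k) <= k * q / m, and reject the k hypotheses with the smallest p-values.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Controls the false discovery rate at level q (for independent tests).
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up: find the largest rank whose p-value is under its threshold.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank

    # Reject everything at or below the cutoff rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values from ten experiments.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
decisions = benjamini_hochberg(p)
```

Note the step-up behavior: a p-value that misses its own threshold can still be rejected if a larger p-value further down the sorted list meets its threshold.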
New Tool: A/B Test Plugin for WordPress! | Adjoozey 5:54 pm on July 8, 2010
[…] can see a demo of the plugin on the ShrimpTest site. Erlewine takes you through setting up an A/B test for text, using […]
If you’ve found A/B split testing with Wordpress difficult or frustrating, this new awesome plugin may be the answer – http://bit.ly/bi6mMW 5:12 am on July 11, 2010
[…] testing with WordPress difficult or frustrating, this new awesome plugin may be the answer – https://shrimptest.wordpress.com/2010… 12 hours […]
Joen 5:46 am on July 13, 2010
Hello Mitcho,
My name is Joen Asmussen, and I’m a usability helper for Automattic. Matt asked me to stop by and give my few cents. And let me start by saying: this is super duper impressive. It’s an incredible project. Great work.
Secondly, right now this is very advanced software for advanced users. Usability improvements, I believe, will have to be iterative as the software unfolds. As such, please consider my advice “ideas” and implement or ignore them at your leisure.
1:
The shortcode .. what if you want to test something other than text, such as two different images, or even three different images? Example: [ab donations=””][/ab]. Is this within the scope of the project?
Could this be simplified thusly:
[ab id=”donations”][/ab][ab id=”donations”][/ab]
or even with a separator:
[ab id=”donations”]|[/ab]
?
2:
I would suggest the shortcodes be accompanied by helpful buttons in both the visual and HTML editors. One visual editor button could be “Insert A/B test”, which spawned a dialog a la:
Insert A/B test
Name: Donate/pay conversion
Control text: Donate to me!
Variant 1: Pay me!
Variant 2: Buy me a coffee
[+] Add variant
3.
Good default settings are supremely important for usable software, especially in the learning process. Since this is complex, I can only point out that “Conversions” as the default “Metric type” seems far simpler than “Manual (PHP required)”.
Also, do you think pre-filling the goal URL with, say, ?goal= makes sense?
4.
Props to the experiments overview. Looks great. Might I suggest a column that displays whether an experiment is active or unpublished? The rollover “activate” is kinda hard to find unless you know where to look for it.
5. Have you considered adding “What’s this?” link buttons next to tricky concepts, such as the Z-score? Are you using the WordPress Help tab already? May be worthwhile.
6. It may be helpful to color-code significant z-score results with a green background.
Joen 5:48 am on July 13, 2010
Okay so I shouldn’t have written HTML examples in my post, they got stripped. I hope the gist of my message is intact, though.
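To illustrate the separator idea from point 1, here is a minimal Python sketch that splits the body of an [ab] shortcode on "|" into a control and its variants. This is purely hypothetical: it is not how ShrimpTest or WordPress parses shortcodes, and the `id` attribute, the "|" separator, and the function name are taken from the suggestion above or invented for the example.

```python
import re

# Matches [ab id="..."]...[/ab]; DOTALL lets the body span several lines.
AB_PATTERN = re.compile(r'\[ab id="(?P<id>[^"]+)"\](?P<body>.*?)\[/ab\]', re.DOTALL)


def parse_ab_shortcodes(text):
    """Map each experiment id to its list of variants (the first is the control)."""
    experiments = {}
    for match in AB_PATTERN.finditer(text):
        variants = [part.strip() for part in match.group("body").split("|")]
        experiments.setdefault(match.group("id"), []).extend(variants)
    return experiments


# Hypothetical post content using the variant texts from the dialog mock-up.
post = 'Please [ab id="donations"]Donate to me!|Pay me!|Buy me a coffee[/ab] today.'
parsed = parse_ab_shortcodes(post)
```

One design caveat with a separator: variant content that itself contains "|" would need an escaping rule, which is one argument for separate attributes or repeated shortcodes instead.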
mitcho/芳貴 12:18 am on July 19, 2010
Thanks for your comments Joen! My goal indeed is to try to lower the bar for doing A/B testing, although my priority thus far has been to just get the functionality all working. But that makes your feedback right now particularly valuable. 🙂
Let me respond to your points:
1. Different images in the shortcode would work fine now, though they currently require the HTML editor. You can also add more than one alternate variant parameter to the [ab] shortcode so you can test multiple variants, not just one, against the control.
2. I completely agree. I would like to tackle this soon, so I’ve made a ticket for it: http://plugins.trac.wordpress.org/ticket/1159
3. You’re right that “conversions” is a much simpler metric type, and ought to be the default. I’ve gone ahead and fixed this.
4. There’s a column right next to it which has each experiment’s status. Could be clearer, though.
5. I’m ashamed to say I’ve been slow on documentation, including user-facing docs. Just created a ticket to remind myself: http://plugins.trac.wordpress.org/ticket/1160
I will be adding help via the Help tab, but the “What’s this” type buttons are a good idea as well. Do you know if the WordPress core uses these at all? If so, I’d like to follow the same styling.
6. Thanks! See this thread for some of my (and others’) thoughts on the matter: https://shrimptest.wordpress.com/2010/07/15/i-implemented-a-function-to-compute-p-va/#comment-53
Joen 8:26 am on July 19, 2010
Excellent stuff.
Also, if you have some specific questions you need some usability advice on, feel free to get in touch with me.
Joe Fletcher 7:05 pm on August 4, 2010
Perfect, I was looking for an easy way to test various aspects of my sites. As a designer, I’m always faced with design decisions and normally need to make an educated guess as to which call out, for example, would be most effective. This will help to rapidly create a few options and then decide based on results rather than guessing.
One thing I’m interested to try with this is to test graphics vs. text links. Is “Follow me on Twitter” more effective as a slick button or a simple text link? That sort of thing.
mitcho/芳貴 12:07 am on August 5, 2010
Re: graphic vs. text links. Great idea. This would be pretty easy with ShrimpTest. If it’s in the content, you can just use the [ab] shortcode… if it’s in the theme, use a “manual” variant type experiment. The current version of ShrimpTest should work fine for this. Let me know if you give it a shot! 🙂