Was Bem's "Feeling the Future" paper exploratory?

Well, according to the published version of the conference presentation, the hypothesis was fixed after 101 had been conducted.
Where does it say that?

So yes, there is a problem if 101 forms part of Experiment 5 (though of course, it's the composition of Experiment 5 that's not very clear).
Suppose it doesn't. What then?

If 101 does form part of Experiment 5, the hypothesis wasn't fixed in advance, and instead there would have been 6 hypotheses to choose from (3 levels of valence and 2 levels of arousal).
What makes you certain that there were only 6?
 
Well, according to the published version of the conference presentation, the hypothesis was fixed after 101 had been conducted. So yes, there is a problem if 101 forms part of Experiment 5 (though of course, it's the composition of Experiment 5 that's not very clear). If 101 does form part of Experiment 5, the hypothesis wasn't fixed in advance, and instead there would have been 6 hypotheses to choose from (3 levels of valence and 2 levels of arousal).

The problem with Experiment 5 is that the number of trials doesn't match that in 102, and the types of images don't match those used in 101 (in 101: negative, neutral and positive valence, and low and high arousal; in Experiment 5: high-arousal negative and low-arousal neutral). Either some of these descriptions are wrong, or else neither 101 nor 102 was included in Experiment 5, and the whole of Experiment 5 was done after the conference presentation. But of course in that case, both 101 and 102 should have been mentioned in the "File Drawer" section of "Feeling the Future", which supposedly lists the unpublished studies Bem had done.
 
Diatom

That's my reading of the conference publication. Feel free to disagree if you have a different interpretation (but I should warn you I don't generally respond well to interrogation!).

If 101 didn't form part of Experiment 5, then I think it's plausible that the stated hypothesis for Experiment 5 was fixed in advance. However, if Experiment 5 didn't include either 101 or 102, then that would mean Experiment 5 was done after Experiment 6 - and Experiment 6 is described in "Feeling the Future" as a replication of Experiment 5!
 
The trials in Experiment 5 have to precede Experiment 6, as he says in Experiment 5 that he didn't include erotic trials until Experiment 6.

We know in any event that he did different analyses for Experiment 5 after the fact, because he says so directly in the footnote in "Feeling the Future".

We know he added subjects to experiment 6 after the fact because in Bem 2003 he specifically states they set the number at 100 and in Experiment 6 there are 150.

Given that he describes Experiment 5 as having been reported in Bem 2003, the fact that there is so much trouble matching the two must be seen as a weakness in the paper.

I'm not sure personally what to make of this apparent mixing and matching in the first place. Why are these experiments combined at all in Feeling the Future? If they were conceived as separate experiments shouldn't they be reported that way? If they were combined shouldn't he be explicit about how the combining took place? What reason could there be not to provide this detail, other than in a vague footnote?
 
If 101 does form part of Experiment 5, the hypothesis wasn't fixed in advance, and instead there would have been 6 hypotheses to choose from (3 levels of valence and 2 levels of arousal).

Valence:
  1. Positive
  2. Negative
  3. Neutral
  4. Positive + Neutral
  5. Positive + Negative
  6. Neutral + Negative
  7. Positive + Negative + Neutral
Arousal:
  1. High
  2. Low
  3. High + Low
There are 7 possible combinations of 3 levels of valence and 3 possible combinations of 2 levels of arousal. The set of possible analyses, each of which could be expressed as a distinct hypothesis, is the Cartesian product of the two sets of factor-level combinations enumerated above. That is, there are 21 of them, which is a lot more than the 6 you counted.

This assumes, of course, that the factors were fully crossed in the experiment.

ETA: Actually, there are even more hypotheses, because the direction of the effect could be asserted to be either positive or negative.
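
For anyone who wants to check the 21, here is a minimal sketch that enumerates the combinations listed above. It assumes, as stated, that the two factors are fully crossed. The variable names are just for illustration, and the doubling for direction is one reading of the ETA, not something taken from Bem's papers.

```python
from itertools import combinations, product

def nonempty_subsets(levels):
    """Every non-empty combination of factor levels, e.g. {positive, neutral}."""
    return [frozenset(c)
            for r in range(1, len(levels) + 1)
            for c in combinations(levels, r)]

valence = ["positive", "negative", "neutral"]
arousal = ["high", "low"]

valence_sets = nonempty_subsets(valence)   # 7 combinations of 3 levels
arousal_sets = nonempty_subsets(arousal)   # 3 combinations of 2 levels

# Cartesian product: one candidate analysis per (valence set, arousal set) pair.
analyses = list(product(valence_sets, arousal_sets))
print(len(valence_sets), len(arousal_sets), len(analyses))  # 7 3 21

# Assumption: each analysis could be stated with either direction of effect.
directional = [(v, a, d) for (v, a) in analyses for d in ("positive", "negative")]
print(len(directional))  # 42
```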
 
We know he added subjects to experiment 6 after the fact because in Bem 2003 he specifically states they set the number at 100 and in Experiment 6 there are 150.

I think it's accepted that Experiment 6 is a combination of 103 (n=50) and 201, 202 and 203 (n=100).
 
Either some of these descriptions are wrong, or else neither 101 nor 102 was included in Experiment 5, and the whole of Experiment 5 was done after the conference presentation.

Actually, I think it's clear that the description of the protocol in "Feeling the Future" is wrong. The description has "strongly arousing negative picture pairs or neutral control picture pairs: positively arousing (i.e. erotic) picture pairs were not introduced until Experiment 6 ..." But earlier on the instructions to participants are quoted, and they say "Most of the pictures range from very pleasant to mildly unpleasant, but ... some of the pictures contain very unpleasant images ..." It seems clear from that that the strongly arousing negative pictures were in the minority, and that the other pictures included strongly arousing positive ones and weakly arousing negative ones. That is consistent with the protocol for 101 described in the earlier paper.

Given that the description of Experiment 5 is incorrect in that respect, I wouldn't be surprised if it's also incorrect in stating that all the subjects performed 48 trials, in which case there would be no reason to exclude 102. Experiment 5 could essentially have been 101 and 102 combined, which would have been the natural assumption anyway (though there is still a discrepancy, because that would give 110 subjects, whereas there are said to be only 100 for Experiment 5).
 
I'm not sure personally what to make of this apparent mixing and matching in the first place. Why are these experiments combined at all in Feeling the Future? If they were conceived as separate experiments shouldn't they be reported that way? If they were combined shouldn't he be explicit about how the combining took place? What reason could there be not to provide this detail, other than in a vague footnote?

I think it's accepted that Experiment 6 is a combination of 103 (n=50) and 201, 202 and 203 (n=100).

Wrt Experiment 6, I wonder what would have been presented if, say, the combination of Experiments 103 and 203 was significant, but the other experiments individually not, and folding any combination of the others in with 103 + 203 resulted in non-significance. By my count, there are 53 possible ways that experiments from a pool of four could be combined (anybody want to check that count for me?).

Edit: Linda caught a mistake. The correct number is 54.
Edit2: Laird caught another mistake. The correct number is 51.
 
By my count, there are 53 possible ways that experiments from a pool of four could be combined (anybody want to check that count for me?).

Sure, I'll check that count. The possible combinations are:

4C4 + 4C3 + 4C2 + 4C1 + 3C3 + 3C2 + 3C1 + 2C2 + 2C1 + 1C1

= 1 + 4*3*2/(3*2*1) + 4*3/(2*1) + 4 + 1 + 3*2/(2*1) + 3 + 1 + 2 + 1

= 26

Have I made a mistake or have you? And where does the mistake lie?
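
The arithmetic in that sum can be checked mechanically. This is only a sketch of the sum exactly as written (nCk terms for n = 4 down to 1), using Python's `math.comb`. It says nothing about whether the sum counts the right thing.

```python
from math import comb

# Terms in the same order as the working: 4C4, 4C3, 4C2, 4C1, 3C3, 3C2, 3C1, 2C2, 2C1, 1C1
terms = [comb(n, k) for n in (4, 3, 2, 1) for k in range(n, 0, -1)]
total = sum(terms)

print(terms)  # [1, 4, 6, 4, 1, 3, 3, 1, 2, 1]
print(total)  # 26
```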
 
Sure, I'll check that count. The possible combinations are:

4C4 + 4C3 + 4C2 + 4C1 + 3C3 + 3C2 + 3C1 + 2C2 + 2C1 + 1C1

= 1 + 4*3*2/(3*2*1) + 4*3/(2*1) + 4 + 1 + 3*2/(2*1) + 3 + 1 + 2 + 1

= 26

Have I made a mistake or have you? And where does the mistake lie?

I wouldn't call your calculation a "mistake." More likely, I didn't explain the problem in sufficiently rigorous language. Rather than attempting to do so, let me give some examples of ways that the studies could be combined that you're not counting.

Say we have four experiments labeled A, B, C, and D, and denote studies that are combined for analysis as a set. So, for instance, {A, B} would denote that studies A and B have been combined. Then, denote a possible way of presenting an arbitrary combination of studies as a set of sets. For example, {{A, B}, {C}} would denote presenting studies A and B combined, C separately, and D not at all. That is an example of a "combination" (maybe we should call it a "super-combination") that you haven't counted. Another example would be {{A, B}, {C}, {D}}.
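
A brute-force sketch of that definition, for anyone who would rather count by machine than by hand: choose which studies get presented at all, then split the chosen studies into unordered groups. The function names are made up for this example. Under this reading of the problem the count comes out at 51.

```python
from itertools import combinations

def partitions(items):
    """Yield every way to split `items` into non-empty, unordered groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Either `first` forms a group of its own...
        yield [[first]] + smaller
        # ...or it joins one of the existing groups.
        for i in range(len(smaller)):
            yield smaller[:i] + [[first] + smaller[i]] + smaller[i + 1:]

def presentations(studies):
    """Yield each set of sets: a chosen subset of studies, grouped for analysis."""
    for r in range(1, len(studies) + 1):
        for chosen in combinations(studies, r):
            for grouping in partitions(list(chosen)):
                yield frozenset(frozenset(group) for group in grouping)

all_presentations = set(presentations(["A", "B", "C", "D"]))
print(len(all_presentations))  # 51

# The example {{A, B}, {C}} from the post is among them.
print(frozenset({frozenset("AB"), frozenset("C")}) in all_presentations)  # True
```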
 
Laird

That'll teach you to check someone's calculation for them when they hadn't told you what they were calculating! ;) What larks!
 
OK, so, given your more rigorous wording, here's my amended working:

[Edit: as I acknowledged in a later post, this working was wrong. I will not delete it from the historical record, but I have corrected it further below]

1-set-combos, both complete and incomplete (individual sets of four, three, two and one):

4C4 + 4C3 + 4C2 + 4C1

2-set-combos, complete (individual sets of three-and-one, two-and-two, one-and-three):

4C3*1C1 + 4C2*2C2 + 4C1*3C1

2-set-combos, incomplete (individual sets of two-and-one, one-and-two, one-and-one):

4C2*2C1 + 4C1*3C2 + 4C1*3C1

3-set-combos, complete (individual sets of two-and-one-and-one, one-and-two-and-one, and one-and-one-and-two):

4C2*2C1*1C1 + 4C1*3C2*1C1 + 4C1*3C1*2C2

3-set-combos, incomplete (one-and-one-and-one):

4C1*3C1*2C1

4-set-combos, complete (one-and-one-and-one-and-one):

4C1*3C1*2C1*1C1


Now combining into equivalent sets of combinations, and then dividing by the number of possible permutations of each set of combinations:

4C4 + 4C3 + 4C2 + 4C1 + (4C3*1C1 + 4C1*3C1) / 2 + 4C2*2C2 + (4C2*2C1 + 4C1*3C2) / 2 + 4C1*3C1 + (4C2*2C1*1C1 + 4C1*3C2*1C1 + 4C1*3C1*2C2) / (3*2) + 4C1*3C1*2C1 + 4C1*3C1*2C1*1C1

= 1 + 4*3*2/(3*2*1) + 4*3/(2*1) + 4 + (4*3*2/(3*2*1)*1 + 4*3) / 2 + 4*3/(2*1)*1 + (4*3/(2)*1 + 4*3*2/(2*1)) / 2 + 4*3 + (4*3/(2*1)*2*1 + 4*(3*2)/(2*1)*1 + 4*3*1) / (3*2) + 4*3*2 + 4*3*2*1

= 104

So, which of us is wrong, and why?


[Edit: here is the corrected version. Aside from a few minor slip-ups, the major thing I missed was dividing by the factorial of each number of subsets which shared a common size, as noted in post #238. I've marked all changes to the above working in bold green - each bolded, green item is either an addition or an amendment]

1-set-combos, both complete and incomplete (individual sets of four, three, two and one):

4C4 + 4C3 + 4C2 + 4C1

2-set-combos, complete (individual sets of three-and-one, two-and-two, one-and-three):

4C3*1C1 + 4C2*2C2/2! + 4C1*3C3

2-set-combos, incomplete (individual sets of two-and-one, one-and-two, one-and-one):

4C2*2C1 + 4C1*3C2 + 4C1*3C1/2!

3-set-combos, complete (individual sets of two-and-one-and-one, one-and-two-and-one, and one-and-one-and-two):

4C2*2C1*1C1/2! + 4C1*3C2*1C1/2! + 4C1*3C1*2C2/2!

3-set-combos, incomplete (one-and-one-and-one):

4C1*3C1*2C1/3!

4-set-combos, complete (one-and-one-and-one-and-one):

4C1*3C1*2C1*1C1/4!


Now combining into sets of equivalent combinations, and then dividing by the number of repetitions of equivalent combinations:

4C4 + 4C3 + 4C2 + 4C1 + (4C3*1C1 + 4C1*3C3) / 2 + 4C2*2C2/2 + (4C2*2C1 + 4C1*3C2) / 2 + 4C1*3C1 / 2 + (4C2*2C1*1C1 + 4C1*3C2*1C1 + 4C1*3C1*2C2) / (3*2) + 4C1*3C1*2C1 / (3*2) + 4C1*3C1*2C1*1C1 / (4*3*2)

= 1 + 4*3*2/(3*2*1) + 4*3/(2*1) + 4 + (4*3*2/(3*2*1)*1 + 4*1) / 2 + 4*3/(2*1)/2 + (4*3/(2)*2 + 4*3*2/(2*1)) / 2 + 4*3 / 2 + (4*3/(2*1)*2*1 + 4*(3*2)/(2*1)*1 + 4*3*1) / (3*2) + 4*3*2 / (3*2) + 4*3*2*1 / (4*3*2)

= 51
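
As a check on the corrected working, here is a sketch that evaluates each group of terms with `math.comb`, keeping the same grouping as the combined expression. The last lines are a separate cross-check, not part of the working above: they sum, over the number k of studies presented, the ways to pick k of the four times the number of ways to group k items (the Bell numbers 1, 2, 5, 15 for k = 1 to 4).

```python
from math import comb, factorial

one_set = comb(4, 4) + comb(4, 3) + comb(4, 2) + comb(4, 1)            # 15
two_complete = ((comb(4, 3) * comb(1, 1) + comb(4, 1) * comb(3, 3)) // 2
                + comb(4, 2) * comb(2, 2) // factorial(2))             # 4 + 3
two_incomplete = ((comb(4, 2) * comb(2, 1) + comb(4, 1) * comb(3, 2)) // 2
                  + comb(4, 1) * comb(3, 1) // factorial(2))           # 12 + 6
three_complete = (comb(4, 2) * comb(2, 1) * comb(1, 1)
                  + comb(4, 1) * comb(3, 2) * comb(1, 1)
                  + comb(4, 1) * comb(3, 1) * comb(2, 2)) // (3 * 2)   # 6
three_incomplete = comb(4, 1) * comb(3, 1) * comb(2, 1) // factorial(3)  # 4
four_complete = (comb(4, 1) * comb(3, 1) * comb(2, 1) * comb(1, 1)
                 // factorial(4))                                      # 1

total = (one_set + two_complete + two_incomplete
         + three_complete + three_incomplete + four_complete)
print(total)  # 51

# Cross-check: choose k studies to present, then group them in Bell(k) ways.
bell = [1, 1, 2, 5, 15]
bell_check = sum(comb(4, k) * bell[k] for k in range(1, 5))
print(bell_check)  # 51
```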
 
Diatom

That's my reading of the conference publication. Feel free to disagree if you have a different interpretation (but I should warn you I don't generally respond well to interrogation!).
I'm not trying to interrogate you. I'm just trying to find out what your opinion is. I was completely wrong about your opinion regarding the data presented at the conference. Now it's becoming much clearer to me.
One thing that is still unclear to me is why you thought there were only 6 hypotheses. As Jay points out, at least 21 are obvious.
 
Edit: reworded several times "Now combining into... [etc]", it's tricky to get that wording right.

[Edit: and even then I still didn't get it right. Now corrected in green]
 
Wrt Experiment 6, I wonder what would have been presented if, say, the combination of Experiments 103 and 203 was significant, but the other experiments individually not, and folding any combination of the others in with 103 + 203 resulted in non-significance. By my count, there are 53 possible ways that experiments from a pool of four could be combined (anybody want to check that count for me?).
I get 54. But I'm not trying to be an ass.

Linda
 
2nd edit: amended final result from 102 to 106 (with correspondingly amended working).

And now from 106 to 104 (again with correspondingly amended working). This stuff ain't simple!

[Edit: but it's really not that difficult either. In any case, 51 is, after all, the correct answer, as we have all come to agree]
 
But I can think of another possibility that all of this misses. Perhaps, Jay, you would also like to include the possibility of sets-of-sets-of-sets, e.g. {{{{A},{B}},{C}},{D}}. If so, it's going to get a lot messier. You're going to have to *really* allow me some working!

Edit: or, hey, let's go to the extreme and allow for sets-of-sets-of-sets-of-sets, e.g. {{{{{{{{A}},B}},C}},D}}.
 
I'm not trying to interrogate you. I'm just trying to find out what your opinion is. I was completely wrong about your opinion regarding the data presented at the conference. Now it's becoming much clearer to me.

Actually, my opinion is evolving somewhat. It isn't the same today as it was yesterday.

One thing that is still unclear to me is why you thought there were only 6 hypotheses. As Jay points out, at least 21 are obvious.

At any rate there are 6 types of images in 101. To my mind that means 6 straightforward null hypotheses. (I know Bem likes one-tailed tests, so he specifies a direction, but I don't think that's entirely rational.) No doubt more elaborate hypotheses can be constructed, but I think that Bem's hypotheses were simple, and that if we're counting up possible alternatives we should be looking at equally simple ones.
 
At any rate there are 6 types of images in 101. To my mind that means 6 straightforward null hypotheses. (I know Bem likes one-tailed tests, so he specifies a direction, but I don't think that's entirely rational.) No doubt more elaborate hypotheses can be constructed, but I think that Bem's hypotheses were simple, and that if we're counting up possible alternatives we should be looking at equally simple ones.
Actually it is even worse than Jay makes it out.

Valence and arousal are rated on a 9-point scale. How do you get from a 9-point scale to 2 levels of arousal? You can place the cut-off anywhere. You can leave out the middle altogether. And that wouldn't even change the "simple hypotheses".
You can also have more than just 2 levels and the hypotheses would still seem equally simple.
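
To put a rough number on that, here is a sketch that counts the ways to turn a 9-point scale into "low" and "high". It assumes cut-offs sit on whole scale points, which is purely an assumption for illustration: nothing in the thread says how the ratings were binned, and cut-offs between scale points would only add to the count.

```python
from itertools import combinations

scale = range(1, 10)  # scale points 1..9

# "low" = ratings <= low_max, "high" = ratings >= high_min, with low_max < high_min.
splits = list(combinations(scale, 2))

contiguous = [(lo, hi) for lo, hi in splits if hi - lo == 1]  # single cut-off, nothing dropped
with_gap = [(lo, hi) for lo, hi in splits if hi - lo > 1]     # middle of the scale left out

print(len(splits), len(contiguous), len(with_gap))  # 36 8 28
```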

What's more, the IAPS images are rated on a third scale as well. For all we know, Bem could have used this third scale instead of one of the two that ended up in the paper. How would we know?

Finally, I don't see why the combinations that Jay presented are "more elaborate". Bem says that the "negative pictures were drawn from the negative/high arousal category". Why not negative from any arousal category? Surely that would be even simpler but it is a combination not among the 6 "straightforward NHs".
I do agree that some mathematical combinations are too weird to justify openly. In that regard, I would like to discuss the "differential recall index" that is highlighted in the blog you linked earlier.
 
Diatom

Sorry, but I find all the purely speculative suggestions about how Bem could have cheated a bit boring, unless they can be supported by some evidence that he did actually cheat.

Of course, I realise some people are approaching this from the point of view that he must have cheated, and working from there. But in that case, what's the point of all these elaborate scenarios? If he wanted to cheat, he could just edit the files containing the results, and save himself all the trouble. It would have been a lot simpler and a lot safer. And quite frankly, if he had been cheating, I think he'd have been a damn sight less sloppy about what he published!
 