Quote:
>>Effect of first event in inning, both leagues, 1984-1991

>>Leadoff   Prob of >= 1 run   Avg. number of runs
>>-------------------------------------------------------------
>>single    .428               .855
>>walk      .432               .865
>I'm assuming that over eight years of data, the difference is statistically
>significant.

IS the difference statistically significant?

Over 8 years of data,
there have been (8 years)*(162 games/year/team)*(26 teams) = 33696 leadoff
innings.  We are given an event x (e.g., x = leadoff single scoring).
Event x has a TRUE probability p of occurring each chance
(e.g. p = about 0.43).  Using the DeMoivre-Laplace Theorem,
if we have n chances (in our case n=33696) we can compute the
probability of observing x between a fraction "a" and a fraction "b"
of all possible occurrences.

Assume that p = 0.43.  Given 33696 possible occurrences, what is
the probability of observing leadoff singles scoring between fraction
"a" and fraction "b" of the time?

a       b       probability (%)
-------------------------------
0       0.428   22.9
0.428   0.429   12.6
0.429   0.430   14.5
0.430   0.431   14.5
0.431   0.432   12.6
0.432   1       22.9

So if the TRUE probability of a leadoff single scoring is 0.43, then
there is a 22.9 % chance that we will observe the leadoff single
scoring a fraction of 0.428 or less.

Similarly, if the true probability of a leadoff walk scoring is 0.43, then
there is a 22.9 % chance that we will observe the leadoff walk
scoring a fraction of 0.432 or more.

There is therefore a (0.229)*(0.229) = 5.2 % chance that both leadoff
singles score 0.428 or less and leadoff walks score 0.432 or more.
Of course, there is also a 5.2 % chance that both leadoff
singles score 0.432 or more and leadoff walks score 0.428 or less.

RESULT:
If the probability of leadoff walks scoring is the same as the
probability of leadoff singles scoring, there is approximately
(5.2 + 5.2) = 10.4 % chance that we would observe a 0.004 (or more)
discrepancy over an 8-year observation period.
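
The table and the result above can be reproduced with the normal approximation to the binomial, which is what the DeMoivre-Laplace theorem provides. What follows is only a minimal Python sketch, assuming p = 0.43 and n = 33696 as in the post; the function names are illustrative and do not come from the original calculation.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_fraction_between(a, b, p, n):
    """Approximate P(a <= observed fraction <= b) for n independent
    trials with success probability p (normal approximation)."""
    sigma = math.sqrt(p * (1.0 - p) / n)  # std. dev. of the observed fraction
    return normal_cdf((b - p) / sigma) - normal_cdf((a - p) / sigma)

p, n = 0.43, 33696  # values assumed in the post
edges = [0.0, 0.428, 0.429, 0.430, 0.431, 0.432, 1.0]
for a, b in zip(edges, edges[1:]):
    print(f"{a:<6} {b:<6} {100 * prob_fraction_between(a, b, p, n):5.1f} %")

# Chance of one rate landing at 0.428 or below, and the post's combined
# figure for both directions (treating the two observations as independent).
tail = prob_fraction_between(0.0, 0.428, p, n)
both_directions = 2 * tail * tail
print(f"tail = {tail:.3f}, both directions = {both_directions:.3f}")
```

The combined figure depends directly on the choice of n, so a different count of leadoff chances would change the conclusion.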

----------
Dan Simon

Reference: A. Papoulis, "Probability, Random Variables, and
Stochastic Processes," McGraw-Hill, 1984, pages 42-49.

Quote:
>>Effect of first event in inning, both leagues, 1984-1991

>>Leadoff   Prob of >= 1 run   Avg. number of runs
>>-------------------------------------------------------------
>>single    .428               .855
>>walk      .432               .865
>I'm assuming that over eight years of data, the difference is statistically
>significant.

It is. Back of the envelope estimates based on a little over 300,000
lead off hitters have a standard error of just under .001. So about
all the values in Sherri's column 1 are significantly different. What
confuses the issue and why I said it looks like noise is that other
factors are present. Recall the question has to do with whether a
walk or single or error is desirable. I'm pretty sure that if we did
a big study we'd determine that 1 and 2 hitters do indeed walk more
relative to getting singles than 7-8-9 hitters. Managers not named
Chuck Tanner have figured out that guys who do walk a lot do make
pretty good top of the lineup hitters. 1 and 2 hitters score more
because the best hitters in the lineup follow directly. That doesn't
mean it is better to get a walk to lead off an inning though. As
Roger said, it's one of those 3 decimal place effects. It is of
minor importance and can likely be attributed to other factors we
can't/haven't taken into account.
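
As a rough check of the back-of-the-envelope figure, the standard error of an observed proportion is sqrt(p*(1-p)/n). A minimal sketch, assuming p is about 0.43 (taken from the table) and n = 300,000 (the post says only "a little over 300,000"); both numbers are assumptions for illustration:

```python
import math

def standard_error(p, n):
    """Standard error of a proportion observed over n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)

# Both inputs are illustrative assumptions, not exact counts from the data.
se = standard_error(0.43, 300_000)
print(f"standard error ~ {se:.4f}")
```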

All of which is in rough agreement with

Quote:
>My hypothesis: more runs score with a leadoff walk than a leadoff
>single because walks are relatively more likely to come in the middle
>of the lineup with good hitters up.  (In other words, with the 7-8-9
>sequence of batters, the #7 hitter is going to have a higher ratio
>of singles to walks than the #3 hitter.)

Gerry

Quote:

>>>>        The average pitcher gives up a hit an inning. If a pitcher gives
>>>> up a single to lead off the inning, there is a good chance there won't
>>>> be another hit that inning. OTOH, if the pitcher leads off yielding a
>>>> walk, the batting team still has that hit they are likely to pick up
>>>> later in the inning.

>>> This is the gambler's fallacy.  The chance that a coin will land heads
>>> does not depend on the previous flip.  Likewise, the chance that a team
>>> will get a hit in the current inning does not depend on what happened in
>>> the earlier part of the inning.

>>        No, it is not gambler's fallacy; the chance of a team getting two hits
>>in an inning is less than the chance of them getting one hit in an inning.

> Exactly -- this is the gambler's fallacy in its pure form.  Yes, the
> chance of getting two hits is smaller -- but the team doesn't NEED two hits!
> It only needs one, because it already HAS one.   The conditional probability
> of two hits total, GIVEN that there's already been one, is HIGHER than
> the raw probability of getting one, as we know from split-stats.

You aren't arguing what I'm arguing. I'm arguing that from the start of
an inning, the chances of the batting team getting two hits in that inning is
far less than getting just one hit in that inning. This argument is NOT
incorrect. It might not be relevant, but it isn't incorrect.

Brandon Cope

Quote:

>>>>>    The average pitcher gives up a hit an inning. If a pitcher gives
>>>>> up a single to lead off the inning, there is a good chance there won't
>>>>> be another hit that inning. OTOH, if the pitcher leads off yielding a
>>>>> walk, the batting team still has that hit they are likely to pick up
>>>>> later in the inning.
>>>> This is the gambler's fallacy.  The chance that a coin will land heads
>>>> does not depend on the previous flip.  Likewise, the chance that a team
>>>> will get a hit in the current inning does not depend on what happened in
>>>> the earlier part of the inning.
>>>    No, it is not gambler's fallacy; the chance of a team getting two hits
>>>in an inning is less than the chance of them getting one hit in an inning.
>> Exactly -- this is the gambler's fallacy in its pure form.  Yes, the
>> chance of getting two hits is smaller -- but the team doesn't NEED two hits!
>> It only needs one, because it already HAS one.   The conditional probability
>> of two hits total, GIVEN that there's already been one, is HIGHER than
>> the raw probability of getting one, as we know from split-stats.
>    You aren't arguing what I'm arguing. I'm arguing that from the start of
>an inning, the chances of the batting team getting two hits in that inning is
>far less than getting just one hit in that inning. This argument is NOT
>incorrect. It might not be relevant, but it isn't incorrect.

No, Brandon, that is *not* what you were arguing.  You were arguing the
"two hits less likely than one" interpretation of the gambler's fallacy.

Now you come around and claim that you only meant to say that two hits
are less likely than one -- a fact so stunningly irrelevant that you
couldn't possibly have meant it wrt the "explanation" you proposed above.

Sorry, it doesn't wash.

Roger

|>
|>
|>>       The average pitcher gives up a hit an inning. If a pitcher gives
|>> up a single to lead off the inning, there is a good chance there won't
|>> be another hit that inning. OTOH, if the pitcher leads off yielding a
|>> walk, the batting team still has that hit they are likely to pick up
|>> later in the inning.
|>
|> This is the gambler's fallacy.  The chance that a coin will land heads
|> does not depend on the previous flip.  Likewise, the chance that a team
|> will get a hit in the current inning does not depend on what happened in
|> the earlier part of the inning.
|
|       No, it is not gambler's fallacy; the chance of a team getting two hits
|in an inning is less than the chance of them getting one hit in an inning.
|

No. No. No. No. No.  The chance of a team getting one hit in a given inning
is about the same as the chance of a team getting two hits in an inning given
that the leadoff batter already got a hit.  The only reason why the odds are
a little different is due to the fact that the guy who got a hit may be
out while running the bases.

All this discussion reminds me of a story about a guy who called up the
airlines and asked "What are the odds that there will be a bomb on the plane?"
The airlines said "about a 1000 to 1".  He then asked what are the odds of
two bombs being on the same plane.  The airlines said "about a million to 1".
"Oh, I like those odds better" he said.  He then brought a bomb on the plane
so he could get the million to one odds.

--
Warren Usui

uunet!lcc!aardvark

Quote:

>    You aren't arguing what I'm arguing. I'm arguing that from the start of
> an inning, the chances of the batting team getting two hits in that inning is
> far less than getting just one hit in that inning. This argument is NOT
> incorrect. It might not be relevant, but it isn't incorrect.

True.  However, the chance of getting a second hit in an inning
after getting a lead-off hit is NOT lower than the chance of
getting one hit in an inning (any more than the chance of tossing
heads is lower if you just tossed heads; it's still 50-50 to toss
another).  Based on probability alone, you should have the same
chance to get another hit after a lead-off single as there was
to get a single hit in the first place.  Actually, since batting
averages seem to go up a bit with men on base, the chance of
getting that second hit is actually somewhat higher than
getting the first hit.
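
The point about independence can be illustrated with a toy model. The sketch below assumes every plate appearance is an independent trial that ends in a hit with probability 0.25 and otherwise in an out (no walks, errors, or outs on the bases); these are simplifying assumptions for illustration only, not a model of real play. Under that model a leadoff hit leaves three outs still to be made, so the chance of at least one more hit equals the chance of at least one hit from a fresh start.

```python
import random

def hit_before_three_outs(p_hit, rng):
    """Return True if a hit occurs before three outs are recorded."""
    outs = 0
    while outs < 3:
        if rng.random() < p_hit:
            return True
        outs += 1
    return False

def exact_prob(p_hit):
    """P(at least one hit before three outs) for independent trials."""
    return 1.0 - (1.0 - p_hit) ** 3

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = 200_000

# From the start of the inning: three outs remaining, no hits yet.
fresh = sum(hit_before_three_outs(0.25, rng) for _ in range(trials)) / trials

# After a leadoff hit: still three outs remaining, so the same
# experiment is simply repeated; the earlier hit changes nothing.
after_leadoff_hit = sum(hit_before_three_outs(0.25, rng) for _ in range(trials)) / trials

print(fresh, after_leadoff_hit, exact_prob(0.25))
```

Real batting averages with men on base would shift the second number slightly, as noted above; the toy model deliberately leaves that out.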

--
Jim Mann

Sherri's table again

Quote:
>>>Effect of first event in inning, both leagues, 1984-1991

>>>Leadoff   Prob of >= 1 run   Avg. number of runs
>>>-------------------------------------------------------------
>>>single    .428               .855
>>>walk      .432               .865
>>I'm assuming that over eight years of data, the difference is statistically
>>significant.
>IS the difference statistically significant?
>Over 8 years of data,
>there have been (8 years)*(162 games/year/team)*(26 teams) = 33696 leadoff

off the game. Maybe I'm wrong. Two teams play each game (so divide
the above by 2). About 18 half innings are played on average (so
multiply by 18) per game. I think your "n" is off by a factor of 9.
I think the whole mess yields standard error around .0005 for col 1
in the table. If the table refers only to leading off the game
then cancel what I just wrote.
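
The correction to "n" can be written out as arithmetic. A minimal sketch, assuming the table covers every half-inning rather than only the first of each game, and taking the figure of 18 half-innings per game from the paragraph above:

```python
years, games_per_team, teams = 8, 162, 26

team_games = years * games_per_team * teams  # the "n" used in the quoted post
games = team_games // 2                      # two teams play each game
half_innings = games * 18                    # about 18 half-innings per game

factor = half_innings / team_games
print(team_games, half_innings, factor)
```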
Gerry

Quote:

>> No, Brandon, that is *not* what you were arguing.  You were arguing the
>> "two hits less likely than one" interpretation of the gambler's fallacy.

>    Yes, I did say this, but I was looking at it from the *start* of the
>inning, not *after* the first batter reached base. This ain't the gambler's
>fallacy (though it is largely irrelevant). From the start of an inning, there
>is a slightly better chance of a team scoring if the first runner reaches
>via walk than hit, since in most cases they will only get 1 hit/inning.

Don't bother with him, Roger.  He not only doesn't know enough
mathematics to figure out the right answer on his own, he doesn't know
enough mathematics to realize that he is wrong when the correct answer
is handed to him on a silver platter.  The scary thing is that he
*still* probably understands probability theory better than your
average ballplayer/manager/GM/announcer/baseball fan.

Quote:
>Excuse me for arguing philosophically rather than with statistical
>analysis, which you silly stat heads thrive on.

Sorry, Brandon.  Your argument isn't a "philosophical" one either.
It's simply wrong.  It isn't even irrelevant.  And your insistence on
it looks *really* stupid to anybody who actually understands
probability theory.  (Which doesn't necessarily mean that you won't be
able to find people to agree with you.)

Quote:
>What does irk me is that you and the other stat
>heads are so damned rude to anyone who doesn't fall down and kiss your feet.

Aw!  I think I hurt his feelings.  Shame on me.  (Did anybody see me
demanding that he kiss my feet?  Frankly, I'd rather keep uneducated
idiots like him as far away from me as possible.)

Brandon states that 2 + 2 = 5.  I tell him (by e-mail, I think) that
he's wrong.  He tells me "It's not your newsgroup, I can argue that 2
+ 2 = 5 if I want to."

And I wasn't even rude, the first time.

<sigh>
-Valentine
--

Clemens four Cy Young!

Quote:

>    *This* is the person that line was meant for.

Well, then at least have the guts to name names.  *I* knew it was
meant for me, but you probably misled a few of the innocent
bystanders.

Quote:
>Nope, you didn't hurt my feelings. I simply refuse to listen to anyone
>who is rude and arrogant, regardless of who is correct.

A sensible attitude.  If you don't like somebody, stick your fingers
in your ears and yell "I'M NOT LISTENING" at the top of your lungs.

I believe I ignored your first post.  (Knew *somebody* would correct
you.)  And didn't I respond by e-mail to the second with something
like: "You are wrong.  It is the gambler's fallacy.  Give it up."  I
don't remember the exact wording, but I don't think I was particularly
rude.  If that is being arrogant and rude, I certainly don't apologize
for it.

My later posts and replies were *quite* arrogant, rude, and
condescending.  But you are a pig-headed idiot who doesn't understand
third grade math.  Not to mention that I was rather surprised when you
took offense at that first letter.  (Do you blow up at *everybody* who
corrects you in absolute terms?)

Quote:
>If you are wrong, ignoring you doesn't matter, and if you are right, agreeing
>with you will only lead you to continue your crass, small-minded attacks on
>others, since you will only get more self-righteous.

Impossible.  I'm already as arrogant as I can get.  Though I have been
known to change my mind on occasion when I *was* wrong.  I think you
took offense at my tone.  I didn't say "I think you are wrong", or
"You might be wrong".  I said "You are wrong".  And you were wrong.  I
didn't "think" it, I *knew* it.  I don't often state in such flat-out
terms that I am right unless I *am* right.

Quote:
>Crude persons do not deserve to be listened to

No, you aren't crude.  Just piggishly stubborn.  It's okay to be wrong
occasionally.  Happens to all of us.  It isn't generally considered an
insult to point that out.  And I think you overreacted to a
not-particularly-offensive letter.

Of course, I might be mis-remembering what I wrote in that first letter.
Did you save a copy?  Do you care to edify the group as to its contents?

Please grow up.  You'll save yourself a lot of headaches if you don't
take everything as a personal attack.

-Valentine
--

Clemens four Cy Young!

Quote:

>>        Yes, I did say this, but I was looking at it from the *start* of the
>>inning, not *after* the first batter reached base. This ain't the gambler's
>>fallacy (though it is largely irrelevant). From the start of an inning, there
>>is a slightly better chance of a team scoring if the first runner reaches
>>via walk than hit,

> WHOA!  There's your conditional probability right there: the "if."
> Which means that you're *not* taking it from the start of the inning;

I'll admit to making a big logical mistake, but I still don't think it
was the gambler's fallacy.

Quote:
>>However, this would indicate that a leadoff walk should score *much* more often
>>than a leadoff single, so 1 hit/inning theory doesn't seem to hold very well,

> Why not? It's empirically proven.  League batting average is around .250,
> which means 3 outs per hit, or one hit per inning, on average.

I didn't mean that pitchers don't give up a hit an inning :-)

What I meant (and poorly said) was that if my argument were sound (which
it wasn't), a walk followed by a hit (or anything else) would score more
often than a single followed by anything else.

My mistake was misusing the "1 hit an inning" fact.
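
One way to see why the "1 hit an inning" figure is easy to misuse: an average of one hit per inning does not mean that each inning contains exactly one hit. A minimal sketch, assuming independent at-bats with a .250 chance of a hit and ignoring walks and every other way of reaching base (a simplification not found in the thread), gives the distribution of hits before the third out:

```python
from math import comb

def prob_hits_in_inning(k, p_hit=0.25, outs=3):
    """P(exactly k hits before the final out) under independent at-bats
    (a negative binomial distribution)."""
    return comb(k + outs - 1, k) * p_hit**k * (1.0 - p_hit) ** outs

# Truncate at 60 hits; the remaining probability mass is negligible.
dist = [prob_hits_in_inning(k) for k in range(60)]
mean_hits = sum(k * pk for k, pk in enumerate(dist))

for k in range(4):
    print(k, round(dist[k], 3))
print("mean hits per inning ~", round(mean_hits, 3))
```

Under these assumptions the mean comes out at one hit per inning, yet hitless innings are the single most common outcome.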

Quote:
>>not upset about knowing that. What does irk me is that you and the other stat
>>heads are so damned rude to anyone who doesn't fall down and kiss your feet.

> Huh?  When you first posted what you did, I responded with a humorous
> analogy -- the bomb joke.  That was simply to make obvious the problem
> in your argument.  Nothing to do with statistics; people who hate statistics
> but know their logical fallacies would do the same.  I don't want you to
> kiss my feet, though simple courtesy would be just fine.  I don't recall
> saying anything *** about you -- or, for that matter, of invoking
> statistics.  My argument was strictly on the philosophical side of
> probability theory.

Sorry, that line wasn't meant for you. Of the 3-4 stat heads on the
newsgroup who constantly get on my case, you are probably the most polite.

Brandon Cope

Quote:

>>What does irk me is that you and the other stat
>>heads are so damned rude to anyone who doesn't fall down and kiss your feet.

> Aw!  I think I hurt his feelings.  Shame on me.  (Did anybody see me
> demanding that he kiss my feet?  Frankly, I'd rather keep uneducated
> idiots like him as far away from me as possible.)

> Brandon states that 2 + 2 = 5.  I tell him (by e-mail, I think) that
> he's wrong.  He tells me "It's not your newsgroup, I can argue that 2
> + 2 = 5 if I want to."

> And I wasn't even rude, the first time.

*This* is the person that line was meant for.

Nope, you didn't hurt my feelings. I simply refuse to listen to anyone
who is rude and arrogant, regardless of who is correct. If you are wrong,
ignoring you doesn't matter, and if you are right, agreeing with you will only
lead you to continue your crass, small-minded attacks on others, since you will
only get more self-righteous.

Crude persons do not deserve to be listened to (if you consider me in
that group, *please* put me in your kill-file -- I wouldn't want to hear from
you until you learn not to be obnoxious, anyway).

Brandon Cope

Quote:

>>>    Yes, I did say this, but I was looking at it from the *start* of the
>>>inning, not *after* the first batter reached base. This ain't the gambler's
>>>fallacy (though it is largely irrelevant). From the start of an inning, there
>>>is a slightly better chance of a team scoring if the first runner reaches
>>>via walk than hit,

>> WHOA!  There's your conditional probability right there: the "if."
>> Which means that you're *not* taking it from the start of the inning;

>    I'll admit to making a big logical mistake, but I still don't think it
>was the gambler's fallacy.

Then what do you think the gambler's fallacy is?  I assure you as
an innocent bystander that you committed it.

Quote:
>>>not upset about knowing that. What does irk me is that you and the other stat
>>>heads are so damned rude to anyone who doesn't fall down and kiss your feet.

I've followed this thread.  You posted a mistake.  You were corrected.
You stubbornly held to the original mistake.  After this goes back and
forth three times, you lash out like this.  Who's being rude?  You appear
to have thrown out the first insult in this thread.
--

the university of chicago law school, chicago, illinois 60637
ajax crypto-analyzer -- warning! remove all chicken bones first

Quote:

>>>    Yes, I did say this, but I was looking at it from the *start* of the
>>>inning, not *after* the first batter reached base. This ain't the gambler's
>>>fallacy (though it is largely irrelevant). From the start of an inning, there
>>>is a slightly better chance of a team scoring if the first runner reaches
>>>via walk than hit,

>> WHOA!  There's your conditional probability right there: the "if."
>> Which means that you're *not* taking it from the start of the inning;

>    I'll admit to making a big logical mistake, but I still don't think it
>was the gambler's fallacy.

When I first read Brandon's set of articles, a.k.a. the "Cope Hypothesis"
:-) I thought that he had fallen to the gambler's fallacy. But after
reading his last response, I think I see what he is saying, and I
believe that it is *not* the gambler's fallacy.

What I think he's saying, and please correct me if I'm wrong, is that
if we look at what happens in each inning *after the fact*, then we
can see that in a certain number of innings, there will be 1 hit in
that inning, in a certain number there will be 2, 3, and so forth.

Now, in those that we know (since we are looking a posteriori) have
1 hit, those with a leadoff hit are less likely to produce a scored run
than are those in which the leadoff player gets on base by a walk,
because we *know* that a hit will follow in this case, and because
we *know* a hit will not follow in the first case. This is not the
gambler's fallacy, obviously.

In those innings with two hits, the ones with leadoff walks are again
more likely to score (presumably) since two hits will follow, while
the ones with leadoff singles will score less frequently since only
one hit will follow. And so forth.

However, this argument is probably irrelevant, because in many of the
innings with a leadoff walk there are no hits at all; the
comparison then becomes P(1 hit|walk) vs. P(2 hits|leadoff hit), which
are roughly the same (presumably).

If I (or Brandon or anyone else) were to say that the P(1 hit|walk)=
P(another hit|walk) is significantly different from P(2 hits|leadoff hit)=
P(another hit|leadoff hit) then I (or the other) would be committing
the gambler's fallacy. But not so here (though, admittedly, the
argument is mostly irrelevant).