Red Sox MLE's: A Test of 1 vs. 3 Year Park Factors?

Post by Eric M. Va » Fri, 02 Feb 2001 08:42:46


Seldom has a ballpark changed apparent character as severely as the Pawtucket
Red Sox' McCoy Stadium did last year.

The Red Sox MLE's in the back of STATS' Green book are based on an "m" of about
.800, which we can divide by .82 to get the ratio of offense level in major
league, to minor league & park: about 0.97.  I believe that represents a 3-year
park factor.

The actual number *for last year alone*, however, was essentially 1.225 (that's
using the actual scoring per inning in both parks; and the ERA of all the other
teams in each league, rather than total league R/G).  That raises "m" to almost
1.00 on the nose.  IOW, if you use this 1-year park factor, PawSox MLE's are
very close to their actual minor league numbers.
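
The relation between "m" and the offense ratio described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the post; the function and variable names are made up for the example, and the .82 is simply the league multiplier quoted above.

```python
# League-quality multiplier quoted in the post (the "divide by .82" step).
LEAGUE_FACTOR = 0.82

def m_from_offense_ratio(ratio):
    """m = (major-league offense level / minor-league-and-park level) * .82"""
    return ratio * LEAGUE_FACTOR

def offense_ratio_from_m(m):
    """Invert the relation to recover the ratio implied by a published m."""
    return m / LEAGUE_FACTOR

# Ratio implied by the Green book's m of about .800
stats_ratio = offense_ratio_from_m(0.800)
# m implied by the 1-year ratio of 1.225
one_year_m = m_from_offense_ratio(1.225)

print(f"ratio implied by m = .800: {stats_ratio:.3f}")
print(f"m implied by ratio 1.225:  {one_year_m:.4f}")
```

The first figure comes out a little under 0.98 and the second just over 1.00, which is the gap between the 3-year and 1-year views being discussed.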

OK, these are small sample sizes, but what happened last year?

Alcantara     PA    BA   OBP    SA
---------
Pawtucket    332  .308  .364  .662
STATS MLE         .281  .326  .566
Boston        48  .289  .333  .578

Burkhart      PA    BA   OBP    SA
--------
Pawtucket    437  .255  .392  .504
STATS MLE         .233  .339  .423
Boston        95  .288  .442  .493

Veras         PA    BA   OBP    SA
-----
Pawtucket    233  .211  .258  .294
STATS MLE         .192  .225  .258
Boston       176  .244  .278  .299

The MLE's nailed Alcantara, but Burkhart and Veras -- who had the bigger sample
sizes -- actually hit better at Boston than at Pawtucket, in a pattern
consistent with Fenway's park effects.

Two factors worth mentioning:

First, unlike Burkhart and Veras, Alcantara accumulated his PA coming off the
bench, rather than as a regular over a short stretch of time.

Second, there's a lot of evidence that Pawtucket was much tougher on lefty
hitters last year than on righties: Dernell Stenson and Curtis Pride had huge
home / road splits; Burkhart was much better as a RH hitter in Pawtucket but
much better as a lefty in Boston; and PawSox lefty short men Sang-Hoon Lee and
Tim Young had much better ERA's at home than on the road.  That might explain
some of why Burkhart fared relatively better than Alcantara in Boston (but not
Veras, of course).

Burkhart looks like he has a decent chance to make the team as a backup 1B.  It
will be interesting to see whether his performance is as STATS projects, using
3-year factors, or as I have, using 1-year factors.  

Even likelier, if this turns out to be an aberration caused by weather, look for
Dernell Stenson's numbers at Pawtucket to take a huge leap upwards (from .268 /
.349 / .487).  If Stenson has a "breakthrough" season and hits his way into the
BoSox lineup, you read it here first: much of the actual breakthrough was
last year (a 26% increase in adjusted RC/27), but it was masked by the park.

(Side note: can anyone ever remember a team having *three* legitimate rookie
prospects playing the same position in the same year, as the Sox do at 1B with
Stenson, Burkhart, and Juan Diaz?  I mean, *two* is obviously rare.)

--
----
Eric M. Van

". . . from that day forward she lived happily ever after.  Except for the dying
at the end.  And the heartbreak in between." - Lucius Shepard.

 
 
 

Red Sox MLE's: A Test of 1 vs. 3 Year Park Factors?

Post by Dan Szymborsk » Fri, 02 Feb 2001 13:55:48


says...

Quote:
> Seldom has a ballpark changed apparent character as severely as the Pawtucket
> Red Sox' McCoy Stadium did last year.

> The Red Sox MLE's in the back of STATS' Green book are based on an "m" of about
> .800, which we can divide by .82 to get the ratio of offense level in major
> league, to minor league & park: about 0.97.  I believe that represents a 3-year
> park factor.

> The actual number *for last year alone*, however, was essentially 1.225 (that's
> using the actual scoring per inning in both parks; and the ERA of all the other
> teams in each league, rather than total league R/G).  

> That raises "m" to almost
> 1.00 on the nose.  IOW, if you use this 1-year park factor, PawSox MLE's are
> very close to their actual minor league numbers.

Based on *3* players averaging just over *100* major league at-bats this
year?  You very darn well know better than this, Eric; the noise for data
this small and of this type is loud enough to completely invalidate
almost any conclusion of this nature.

What you're doing is the equivalent of using your bathroom scale to weigh
a dime and a quarter and deciding that they in fact weigh the same.

I'm not sure where you're getting the extreme park factor for Pawtucket
anyway; Pawtucket home games had 718 runs scored in them and Pawtucket
road games had 718 runs scored in them and no amount of needless minor
fiddling (using innings instead of games, subtracting both from
league totals) would make it into an extreme park.  

--
Dan Szymborski

"Experience is a revelation in the light of which we renounce our errors
of youth for those of age."
     - Ambrose Bierce

 
 
 

Red Sox MLE's: A Test of 1 vs. 3 Year Park Factors?

Post by Eric M. Va » Sat, 03 Feb 2001 00:13:07

Quote:


> says...
> > Seldom has a ballpark changed apparent character as severely as the Pawtucket
> > Red Sox' McCoy Stadium did last year.

> > The Red Sox MLE's in the back of STATS' Green book are based on an "m" of about
> > .800, which we can divide by .82 to get the ratio of offense level in major
> > league, to minor league & park: about 0.97.  I believe that represents a 3-year
> > park factor.

> > The actual number *for last year alone*, however, was essentially 1.225 (that's
> > using the actual scoring per inning in both parks; and the ERA of all the other
> > teams in each league, rather than total league R/G).

> > That raises "m" to almost
> > 1.00 on the nose.  IOW, if you use this 1-year park factor, PawSox MLE's are
> > very close to their actual minor league numbers.

> Based on *3* players averaging just over *100* major league at-bats this
> year?

No, no, no!  Based on a 1-year park factor.  I then present the tiny amount of
data and say that it fits the 1-year park factor better.  Of *course* you can't
conclude that that means the extreme 1-year park factor is real and not noise of
some sort.  But it makes it slightly more likely, doesn't it?

Quote:
> You very darn well know better than this, Eric; the noise for data
> this small and of this type is loud enough to completely invalidate
> almost any conclusion of this nature.

> What you're doing is the equivalent of using your bathroom scale to weigh
> a dime and a quarter and deciding that they in fact weigh the same.

> I'm not sure where you're getting the extreme park factor for Pawtucket
> anyway; Pawtucket home games had 718 runs scored in them and Pawtucket
> road games had 718 runs scored in them

Whoa! I have 612 runs scored at home and 738 scored on the road.  PawSox scored
323 runs in  590 home innings and 390 runs in 635 road innings; they allowed 289
runs in 639 home innings and 348 in 609 road innings.
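
For anyone who wants to check the home/road figures, here is a minimal Python sketch using only the run and inning totals quoted just above. The equal home/road weighting in the last step is an assumption made for illustration (roughly half of a hitter's games are at home); it is not necessarily how the park factor quoted in this thread was derived, and the variable names are invented for the example.

```python
# Run and inning totals as quoted in the post (Pawtucket, last season).
home_runs = 323 + 289        # runs scored + runs allowed at home
home_innings = 590 + 639     # PawSox batting innings + opponents' batting innings
road_runs = 390 + 348        # runs scored + runs allowed on the road
road_innings = 635 + 609

home_rate = home_runs / home_innings   # runs per half-inning at home
road_rate = road_runs / road_innings   # runs per half-inning on the road

# Raw home/road scoring ratio (the "run index" for the park).
run_index = home_rate / road_rate

# ASSUMPTION: weight home and road equally to get a factor that applies
# to a player's full-season line rather than to home games only.
half_weighted = (1 + run_index) / 2

print(f"home runs {home_runs}, road runs {road_runs}")
print(f"run index:            {run_index:.3f}")
print(f"half-weighted factor: {half_weighted:.3f}")
```

The totals reproduce the 612 and 738 above. The raw ratio comes out at roughly 0.84 and the half-weighted factor at about 0.92, in the same neighborhood as the .923 figure used in this thread but not identical to it, so the actual calculation presumably includes some further adjustment.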

Where's your data from?  Mine's from downloading the score of every game from
CNN/SI.  I just double-checked with ESPN's game log -- there are two differences
-- ESPN omits a 1-0 game at Syracuse 4/14 (part of a doubleheader) but adds a
2-2 tie at Richmond 5/27.  Both are in error (the "tie" was suspended and
finished the next day).

Quote:
> and no amount of needless minor
> fiddling (using innings instead of games,

Using innings instead of games lowers the PAF from .932 to .923.  Worth the
effort, IMHO.

Quote:
> subtracting both from
> league totals) would make it into an extreme park.

--
----
Eric M. Van

". . . from that day forward she lived happily ever after.  Except for the dying
at the end.  And the heartbreak in between." - Lucius Shepard.

 
 
 

Red Sox MLE's: A Test of 1 vs. 3 Year Park Factors?

Post by Dan Szymborsk » Sat, 03 Feb 2001 02:53:43


says...
.

Quote:

> > Based on *3* players averaging just over *100* major league at-bats this
> > year?

> No, no, no!  Based on a 1-year park factor.  I then present the tiny amount of
> data and say that it fits the 1-year park factor better.  

But how the data fits in a sample this size is irrelevant.  Given the
error rate inherent in MLEs over 600 at-bats (a standard error in the
60-70 range for OPS), 100 per player is minuscule.
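
To put a rough number on that, here is a small Python sketch. It ASSUMES the error shrinks with the square root of at-bats, which is the usual sampling rule of thumb and not something measured in this thread; the function name is invented for the example.

```python
import math

def scale_standard_error(se_at_ref, ref_ab, target_ab):
    """Rescale a standard error from ref_ab at-bats to target_ab at-bats.

    ASSUMPTION: error is proportional to 1 / sqrt(at-bats).
    """
    return se_at_ref * math.sqrt(ref_ab / target_ab)

# The 60-70 point OPS standard error at 600 at-bats, rescaled to 100 at-bats.
low = scale_standard_error(60, 600, 100)
high = scale_standard_error(70, 600, 100)

print(f"implied standard error at 100 AB: {low:.0f} to {high:.0f} OPS points")
```

Under that assumption the error at 100 at-bats is about two and a half times the 600 at-bat figure, which is large next to the differences between the two sets of MLEs.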

Quote:
> Of *course* you can't
> conclude that that means the extreme 1-year park factor is real and not noise of
> some sort.  But it makes it slightly more likely, doesn't it?

Not really.  You can't see anything in that noise.

The best practical way to test 1-year vs. 3-year in park factors would be
to test the year 0-year 1 results with both.    

Quote:
> > You very darn well know better than this, Eric; the noise for data
> > this small and of this type is loud enough to completely invalidate
> > almost any conclusion of this nature.

> > What you're doing is the equivalent of using your bathroom scale to weigh
> > a dime and a quarter and deciding that they in fact weigh the same.

> > I'm not sure where you're getting the extreme park factor for Pawtucket
> > anyway; Pawtucket home games had 718 runs scored in them and Pawtucket
> > road games had 718 runs scored in them

> Whoa! I have 612 runs scored at home and 738 scored on the road.  PawSox scored
> 323 runs in  590 home innings and 390 runs in 635 road innings; they allowed 289
> runs in 639 home innings and 348 in 609 road innings.

Hmmm.  Something screwy's going on here.  It seems my home/road player
data is off, which really ticks me off since my source is usually pretty
good.

Still wouldn't get you an m of 1.00, though.  You're still looking at an
m around 0.90 and an M around 0.95.  

Quote:
> Where's your data from?  Mine's from downloading the score of every game from
> CNN/SI.  I just double-checked with ESPN's game log -- there are two differences
> -- ESPN omits a 1-0 game at Syracuse 4/14 (part of a doubleheader) but adds a
> 2-2 tie at Richmond 5/27.  Both are in error (the "tie" was suspended and
> finished the next day).

> > and no amount of needless minor
> > fiddling (using innings instead of games,

> Using innings instead of games lowers the PAF from .932 to .923.  Worth the
> effort, IMHO.

Eh.  Needless precision.  There's not a single player that would have a
different evaluation as a result of using .932 instead of .923.

[...]

--
Dan Szymborski

"Experience is a revelation in the light of which we renounce our errors
of youth for those of age."
     - Ambrose Bierce

 
 
 

Red Sox MLE's: A Test of 1 vs. 3 Year Park Factors?

Post by Eric M. Va » Sat, 03 Feb 2001 06:37:55

Quote:


> says...
> .

> > > Based on *3* players averaging just over *100* major league at-bats this
> > > year?

> > No, no, no!  Based on a 1-year park factor.  I then present the tiny amount of
> > data and say that it fits the 1-year park factor better.

> But how the data fits in a sample this size is irrelevant.  Given the
> error rate inherent in MLEs over 600 at-bats (a standard error in the
> 60-70 range for OPS), 100 per player is minuscule.

> > Of *course* you can't
> > conclude that that means the extreme 1-year park factor is real and not noise of
> > some sort.  But it makes it slightly more likely, doesn't it?

> Not really.  You can't see anything in that noise.

I dunno.  The difference here, practically, is profound -- either Morgan
Burkhart is much better than Dante Bichette or he's not nearly as good.

Quote:
> The best practical way to test 1-year vs. 3-year in park factors would be
> to test the year 0-year 1 results with both.

You do, however, run into the natural improvement of young hitters with age.

Quote:

> > > You very darn well know better than this, Eric; the noise for data
> > > this small and of this type is loud enough to completely invalidate
> > > almost any conclusion of this nature.

> > > What you're doing is the equivalent of using your bathroom scale to weigh
> > > a dime and a quarter and deciding that they in fact weigh the same.

> > > I'm not sure where you're getting the extreme park factor for Pawtucket
> > > anyway; Pawtucket home games had 718 runs scored in them and Pawtucket
> > > road games had 718 runs scored in them

> > Whoa! I have 612 runs scored at home and 738 scored on the road.  PawSox scored
> > 323 runs in  590 home innings and 390 runs in 635 road innings; they allowed 289
> > runs in 639 home innings and 348 in 609 road innings.

> Hmmm.  Something screwy's going on here.  It seems my home/road player
> data is off, which really ticks me off since my source is usually pretty
> good.

> Still wouldn't get you an m of 1.00, though.  You're still looking at an
> m around 0.90 and an M around 0.95.

How do you figure that?  AL ERA, excluding Sox, was 4.975; IL ERA excluding
Pawtucket was 4.40.  Pawtucket park factor was 0.923.  4.975 / 4.40 / .923 * .82
= 1.005.
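
Spelled out step by step, the calculation above looks like this in Python. It is just the arithmetic from this post with the intermediate ratio made visible; the variable names are chosen for the example.

```python
# Inputs as quoted in the post.
al_era_ex_boston = 4.975       # AL ERA, excluding the Red Sox
il_era_ex_pawtucket = 4.40     # IL ERA, excluding Pawtucket
pawtucket_park_factor = 0.923  # 1-year park factor for McCoy Stadium
league_factor = 0.82           # league-quality multiplier

# Offense level in the majors relative to the IL *and* the park.
offense_ratio = al_era_ex_boston / il_era_ex_pawtucket / pawtucket_park_factor

# The MLE multiplier "m".
m = offense_ratio * league_factor

print(f"offense ratio: {offense_ratio:.3f}")
print(f"m:             {m:.3f}")
```

The intermediate ratio is the 1.225 figure from the start of the thread, and multiplying by .82 lands essentially on 1.00.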

Maybe you mistook the .923 as the run-rate factor?   Stats' style of reporting
park factors would have given Pawtucket an 84 for R.

Quote:

> > Where's your data from?  Mine's from downloading the score of every game from
> > CNN/SI.  I just double-checked with ESPN's game log -- there are two differences
> > -- ESPN omits a 1-0 game at Syracuse 4/14 (part of a doubleheader) but adds a
> > 2-2 tie at Richmond 5/27.  Both are in error (the "tie" was suspended and
> > finished the next day).

> > > and no amount of needless minor
> > > fiddling (using innings instead of games,

> > Using innings instead of games lowers the PAF from .932 to .923.  Worth the
> > effort, IMHO.

> Eh.  Needless precision.  There's not a single player that would have a
> different evaluation as a result of using .932 instead of .923.

I suppose it's a philosophical difference -- if I cut this corner at *every*
juncture, might *all* the errors add up?  So why not be a little more accurate,
wherever I can?  It may make no practical difference, but I sleep better <g>.

Quote:

> [...]

> --
> Dan Szymborski

> "Experience is a revelation in the light of which we renounce our errors
> of youth for those of age."
>      - Ambrose Bierce

--
----
Eric M. Van

". . . from that day forward she lived happily ever after.  Except for the dying
at the end.  And the heartbreak in between." - Lucius Shepard.

 
 
 

Red Sox MLE's: A Test of 1 vs. 3 Year Park Factors?

Post by Dan Szymborsk » Sat, 03 Feb 2001 10:55:27


says...

Quote:


> > says...
> > .

> > > > Based on *3* players averaging just over *100* major league at-bats this
> > > > year?

> > > No, no, no!  Based on a 1-year park factor.  I then present the tiny amount of
> > > data and say that it fits the 1-year park factor better.

> > But how the data fits in a sample this size is irrelevant.  Given the
> > error rate inherent in MLEs over 600 at-bats (a standard error in the
> > 60-70 range for OPS), 100 per player is minuscule.

> > > Of *course* you can't
> > > conclude that that means the extreme 1-year park factor is real and not noise of
> > > some sort.  But it makes it slightly more likely, doesn't it?

> > Not really.  You can't see anything in that noise.

> I dunno.  The difference here, practically, is profound -- either Morgan
> Burkhart is much better than Dante Bichette or he's not nearly as good.

.80 is a little low for the 3 year park factors, anyway.  .85 is closer
instead of the .83 I've been doing.

Using .85 gets you Burkhart as 243/353/449.  .95 gets you 254/371/477.

But even that's not right.  STATS does *not* include HBP in their MLEs
while you're including them by posting Pawtucket numbers.

.85 with HBP is actually 243/368/449.  .95 is 254/385/477.

For the record here, Chris Dial and I have tested the MLEs that I have
done for AAA and AA over the years and the park/league factors I've been
using have produced more accurate results.  I don't think the STATS
problem comes from wrong use of league factors, but from poor design decisions.
They actually seem to be using m lower than .80 (are you still forgetting
M like you did with your first RC/27 calcs?)

Quote:
> > The best practical way to test 1-year vs. 3-year in park factors would be
> > to test the year 0-year 1 results with both.

> You do, however, run into the natural improvement of young hitters with age.

True, but the improvement wouldn't be relevant to a comparison since
we're doing both in consecutive years.

Quote:

> > > > You very darn well know better than this, Eric; the noise for data
> > > > this small and of this type is loud enough to completely invalidate
> > > > almost any conclusion of this nature.

> > > > What you're doing is the equivalent of using your bathroom scale to weigh
> > > > a dime and a quarter and deciding that they in fact weigh the same.

> > > > I'm not sure where you're getting the extreme park factor for Pawtucket
> > > > anyway; Pawtucket home games had 718 runs scored in them and Pawtucket
> > > > road games had 718 runs scored in them

> > > Whoa! I have 612 runs scored at home and 738 scored on the road.  PawSox scored
> > > 323 runs in  590 home innings and 390 runs in 635 road innings; they allowed 289
> > > runs in 639 home innings and 348 in 609 road innings.

> > Hmmm.  Something screwy's going on here.  It seems my home/road player
> > data is off, which really ticks me off since my source is usually pretty
> > good.

> > Still wouldn't get you an m of 1.00, though.  You're still looking at an
> > m around 0.90 and an M around 0.95.

> How do you figure that?  AL ERA, excluding Sox, was 4.975; IL ERA excluding
> Pawtucket was 4.40.  Pawtucket park factor was 0.923.  4.975 / 4.40 / .923 * .82
> = 1.005.

> Maybe you mistook the .923 as the run-rate factor?   Stats' style of reporting
> park factors would have given Pawtucket an 84 for R.

Did you remember to take out pitchers' hitting numbers from the IL, which
is half NL teams that *do* end up with a lot of pitcher ABs?  When you do
so, you get a league adjustment of .86 instead of .82.  Then it's simply
1/.92 and then 1.09*.86 or .94 (which is a little higher than .90, of
course).
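
The same arithmetic in Python, for comparison with the other calculation in this thread. This is a sketch of the two-step version described above (invert the park factor, then apply the league adjustment); the names are invented for the example, and the .82 line is included only to show what the unadjusted league factor would give.

```python
park_factor = 0.92                  # 1-year Pawtucket park factor, rounded
park_adjustment = 1 / park_factor   # about 1.09

league_adj_no_pitchers = 0.86       # league adjustment with pitchers' hitting removed
league_adj_original = 0.82          # league adjustment without that correction

m_no_pitchers = park_adjustment * league_adj_no_pitchers
m_original = park_adjustment * league_adj_original

print(f"park adjustment:          {park_adjustment:.3f}")
print(f"m with .86 league factor: {m_no_pitchers:.3f}")
print(f"m with .82 league factor: {m_original:.3f}")
```

With .86 the result rounds to .94, and with .82 it comes out just under .90. Note that this version has no league ERA ratio in it, which is where it parts ways with the 1.005 calculation quoted above.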

Over a 3-year period, you'd *actually* have an m of .85.  I haven't
played with the STATS MLEs recently, but in the past, they weren't even
using actual park factors, but the old "add up the hitters runs and
pitchers runs and divide by league average" method despite the
availability of park factors.

Quote:

> > > Where's your data from?  Mine's from downloading the score of every game from
> > > CNN/SI.  I just double-checked with ESPN's game log -- there are two differences
> > > -- ESPN omits a 1-0 game at Syracuse 4/14 (part of a doubleheader) but adds a
> > > 2-2 tie at Richmond 5/27.  Both are in error (the "tie" was suspended and
> > > finished the next day).

> > > > and no amount of needless minor
> > > > fiddling (using innings instead of games,

> > > Using innings instead of games lowers the PAF from .932 to .923.  Worth the
> > > effort, IMHO.

> > Eh.  Needless precision.  There's not a single player that would have a
> > different evaluation as a result of using .932 instead of .923.

> I suppose it's a philosophical difference -- if I cut this corner at *every*
> juncture, might *all* the errors add up?  So why not be a little more accurate,
> wherever I can?  It may make no practical difference, but I sleep better <g>.

I still think you're doing the equivalent of using a scalpel to cut a
bagel.

--
Dan Szymborski

"Experience is a revelation in the light of which we renounce our errors
of youth for those of age."
     - Ambrose Bierce

 
 
 

Red Sox MLE's: A Test of 1 vs. 3 Year Park Factors?

Post by stat.. » Sun, 04 Feb 2001 03:55:27


Quote:
> I don't bother with the whole Bill James MLE adjustment (I
> know, now I'm being inconsistent about accuracy).  I just take
> RC/27 (which is going to differ from everyone else's, anyway,
> since I'm using my own Contextual Runs formula, which you
> should, too, if you want to be as accurate as we know how to
> be), adjust for park and league, and multiply by .82 to get an
> MLE figure.

You guys are pathetic.  You're concerned about the internal
accuracy of calculations whose external relevance to the players
in question is nebulous at best.  You'll post article after
article arguing about the integrity of tiny third decimal
adjustments to RC/Lu and yet you can't say with any degree of
confidence which of the players in question is likeliest to get a
hit in his next at bat.

Pitiful.  You're not arguing about the relative baseball skills
of the players.  You're simply arguing about the relative
statistical skills of stat fans.

Take it out of here...

cordially, as always,

rm

 
 
 

Red Sox MLE's: A Test of 1 vs. 3 Year Park Factors?

Post by ashb.. » Sun, 04 Feb 2001 06:35:01

Quote:

> yet you can't say with any degree of
> confidence which of the players in question is likeliest to get a
> hit in his next at bat.

Uh, whom are you going to nominate as someone who can?

Managers can't.  If they could, you'd see a *lot* more intentional
walks, and they'd be ones that aren't so strictly situational.
"Walk him.  Yes Einstein, I KNOW it walks in the tying run.  But
this guy Carter's about to knock one out of here, guar-on-teed,
game over.  C'mon, we'll get 'em in the top of the 10th for ya."

--

Thought for the moment:
I didn't cheat, I just changed the rules.