Methods of using multiple regression in football tendency analysis

Abstract

Methods are disclosed in which the user defines three or more categories of plays that an American football opponent may run and multiple regression techniques are used to estimate the probability that the opponent will run a play in each such category, under particular game conditions, based on data collected from the opponent's past games. The regression coefficients and game condition data are entered into a computer device, which calculates the predicted probabilities and sorts and displays the categories of plays according to such probabilities. The user may assign ratings or rankings to schemes that the user may execute, based on the expected effectiveness of each such scheme against each category of plays that the opponent may run, and such ratings or rankings may be combined with the predicted probabilities to recommend schemes to the user under various game conditions. If permitted by the rules, such predictions or recommendations may be used to assist the user in play calling during a game. The methods may also be used to enhance scouting reports, improve efficiency of practices, and/or develop more sophisticated play sheets.

Claims

1. A method of analyzing the play calling tendencies of an American football opponent, comprising: defining at least three distinct, non-overlapping categories of plays that the opponent might call, all such categories together being exhaustive of the universe of plays that the opponent might call; collecting information on the plays that the opponent actually called in prior games, assigning each such play to exactly one of the defined categories, and collecting information about the score, time remaining, down and distance, field position, personnel package, and other relevant information in the prior games when each such play was called; inputting all such information collected into a computerized statistical database; for each of the at least three defined categories of plays, performing a multiple regression wherein each observation is a play that the opponent ran in a past game, the dependent variable is a binary variable indicating whether the play that the opponent ran was in that category, and the explanatory variables reflect the game conditions during that play; using appropriate explanatory variables reflecting the game conditions, to estimate the probability that the opponent will run a play in that category given various game conditions; programming a computer to use the coefficients estimated by each such regression to model the probability that the opponent will run a play in the corresponding category under particular conditions during a game; entering sets of game conditions into the computer to use as or to calculate values for the explanatory variables of the models; and solving the models for each category of plays and displaying each category, along with the predicted probability that the opponent will run a play in that category, sorted by probability, from highest to lowest.

2. The method of claim 1 wherein the explanatory variables include one or more dummy variables reflecting the presence or absence in the game of key individual players on the opposing team.

3. The method of claim 1 further comprising inputting data about the game conditions into the computer during a game, before each play by the opponent, immediately after the conclusion of the preceding play; computing the predicted probability that the opponent will run a play in each category of plays; and displaying the categories of plays and predicted probabilities on a computer display in time to assist in the selection of a scheme by the user and for such scheme to be communicated to players on the field.

4. The method of claim 3 wherein, for plays that do not follow extended stoppages of play, the user inputs only the fact that the preceding play resulted in an incomplete pass or the field position after the play and relevant changes in the opponent's personnel package; and the computer calculates or estimates other game conditions.

5. The method of claim 1 further comprising identifying a set of schemes that the user is prepared to deploy against the opponent; assigning a numerical rating or ranking to each such scheme against each category of plays, based on the expected effectiveness of the scheme against plays in that category; identifying recommended schemes, based on the average rating or ranking of each scheme against all of the play categories, weighted by the predicted probability that the opponent will run a play in each category; and displaying the recommended schemes in order of weighted average rating or ranking, from highest to lowest.

6. The method of claim 5 further comprising displaying the recommended schemes on a computer display in time to assist in the selection of a scheme by the user and for such scheme to be communicated to players on the field.

7. The method of claim 6 wherein, for plays that do not follow extended stoppages of play, the user inputs only the fact that the preceding play resulted in an incomplete pass or the field position after the play and relevant changes in the opponent's personnel package; and the computer calculates or estimates other game conditions.

8. The method of claim 6 wherein the a priori ranks or ratings can be adjusted during the course of a game.

9. The method of claim 1 wherein the information collected includes information about the formation from which the opponent ran each play in prior games and that information is used to formulate additional explanatory variables used in the models.

10. The method of claim 1 wherein the coefficients and/or the predicted probabilities generated by the models are used in the preparation of scouting reports on the opponent.

11. The method of claim 1 wherein the coefficients and/or the predicted probabilities generated by the models are used in the preparation of play sheets to be used during a game against the opponent.

12. The method of claim 1 wherein the coefficients and/or the predicted probabilities generated by the models are used to allocate practice time.

13. The method of claim 1 wherein the coefficients and/or the predicted probabilities generated by the models are used to enable the scout teams to simulate the opponent's behavior more accurately.

14. The method of claim 1 wherein the opponent whose tendencies are analyzed is the user's team and the coefficients and/or the predicted probabilities generated by the models are used for self-scouting.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) FIG. 1 is a flow chart depicting the steps in an embodiment of the methods disclosed herein.

(2) FIG. 2 depicts an exemplary structure for a statistical database created and used according to the methods disclosed herein.

(3) FIG. 3A depicts the regression output for a particular category of plays.

(4) FIG. 3B depicts the regression output for another category of plays.

(5) FIG. 4A depicts an exemplary display of the probabilities generated by the multiple regression models, along with the game condition data that were used as or to compute the explanatory variables of the models.

(6) FIG. 4B depicts an exemplary display of the probabilities generated by the multiple regression models based on a different set of game condition data.

(7) FIG. 4C depicts an exemplary display of the probabilities generated by the multiple regression models based on another set of game condition data.

(8) FIG. 5 is a flow chart depicting the steps of an embodiment in which the methods disclosed herein may be used to recommend schemes that are expected to be effective against the categories of plays that the opponent is likely to run.

(9) FIG. 6 depicts an example of a matrix resulting from the user's assignment of a numerical rating or ranking to each scheme against each category of plays, based on the expected effectiveness of the scheme against plays in that category.

(10) FIG. 7 is an exemplary flow chart of a program for entering game condition data before each play.

DETAILED DESCRIPTION OF THE INVENTION

(11) The descriptions and examples of the methods disclosed herein are provided from the perspective of a user seeking to analyze the offensive play calling tendencies of an opponent, in order to develop and deploy effective defensive schemes. A skilled practitioner will recognize that the methods disclosed herein could similarly be used to analyze the opponent's defensive tendencies to facilitate the user's development of an offensive game plan or to assist in offensive play calling. There is no intention to limit the scope of the claims to the prediction of the opponent's offensive plays or to the facilitation of defensive play calling or game preparation.

(12) FIG. 1 is a flow chart depicting an embodiment of the methods disclosed herein. In the initial step 10, films, videos, and/or other records of the opponent's past games are reviewed to collect information needed in steps 20, definition of categories of plays that the opponent has run in past games, and 30, construction of a computerized statistical database. As noted in FIG. 1, the information collected in step 10 is of the type commonly used to develop game plans, scouting reports, and practice schedules.

(13) In step 20 of the method, the user defines a plurality of categories of offensive plays that the opponent might run. The categories should be defined so as to be exhaustive and non-overlapping, meaning that every play that the opponent might run will fall within the definition of exactly one category.

(14) In one experiment, the ten categories, described in Table 1, were used:

(15) TABLE 1

Play Category        Definition
Zone Blocking        Simple handoff with zone blocking, typically with double team at point of attack
Trap/Pull Blocking   Simple handoff or pitch with interior lineman or linemen (including tight end) trap blocking or pulling
Option Run           Handoff with extended mesh point or trailing pitch man
Draw/Delay           Delayed handoff or quarterback draw after showing pass
Reverse              Handoff or pitch to a receiver, including jet sweep, end around, reverse, flea flicker, etc.
RPO                  Quarterback roll out or bootleg
Screen               Any screen pass
Play Action          Play action pass (except where a fake handoff precedes an RPO or screen pass)
Short Dropback       Less than 3-step dropback from shotgun; less than 5-step dropback from under center
Deep Dropback        More than 2-step dropback from shotgun; more than 4-step dropback from under center

Note that these definitions provide unambiguous rules for resolution where a play seems initially to fall into more than one category—e.g., when a screen pass is preceded by play action.

(16) The play categories should be defined with sufficient precision to be useful for preparing scouting reports, planning practices, and play calling, i.e., selection of schemes likely to be effective against plays that the opponent is expected to run. As noted above, the ability to predict only two categories of offensive plays (“run” and “pass”) is likely to be of extremely limited value. Each category should be defined, however, to ensure that the statistical database contains sufficient observations of plays in that category; otherwise, the regression equation for that category will not yield meaningful coefficients. It is recommended that the categories be defined so as to provide at least three observations in each category. (Out of a total of 333 plays from five games included in the database created for the experiment described herein, three (0.9%) were in the Draw/Delay category.)

(17) While the embodiments described herein reflect particular definitions of the categories, a skilled practitioner will recognize that useful results may be obtained using different categories. There is no intention to limit the scope of the claims to the particular categories described herein.

(18) Referring again to FIG. 1, in the next step 30 of the method, a computerized database is constructed using information collected in steps 10 and 20. The database software allows for mathematical manipulation of the data and includes or readily interfaces with statistical software that can be used for MLR.

(19) FIG. 2 depicts an exemplary structure for the statistical database 100. Based on the review of the opponent's past games, each offensive play is included in the database 100 as a separate observation 110. (In the experiment described herein, five of the opponent's games were reviewed, yielding 333 observations.) Only plays that provide information about the opponent's play calling tendencies are included in the database. Thus, while punts, field goal attempts, kneel downs, spikes, and fumbled snaps are technically offensive plays, they would not be included in the database. Plays that are called back due to penalty would be included, so long as the type of play the opponent intended to run is apparent.

(20) Each observation 110 in the database 100 includes an entry in each of the fields 120. Each of the fields 120 corresponds to one of the categories of plays defined by the user; in the experiment described herein, there were ten such fields 120, corresponding to the ten defined play categories. For each observation 110, one of the fields 120 will have a value of 1, indicating that the opponent ran a play in that category; all other fields 120 will have a value of 0. The values in each of the fields 120 will be used as the dependent variable in one of the regression equations estimated in accordance with the method disclosed herein.
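By way of non-limiting illustration, the category fields 120 for a single observation might be populated as in the following Python sketch. The category labels are taken from Table 1; the function name and the dictionary layout are assumptions made for this example and are not part of the disclosed database.

```python
# Category labels from Table 1; the storage layout is an illustrative assumption.
CATEGORIES = [
    "Zone Blocking", "Trap/Pull Blocking", "Option Run", "Draw/Delay",
    "Reverse", "RPO", "Screen", "Play Action", "Short Dropback", "Deep Dropback",
]


def category_fields(category):
    """Return one 0/1 field per defined category for a single observation."""
    if category not in CATEGORIES:
        # Categories are exhaustive and non-overlapping, so every play
        # must map to exactly one known label.
        raise ValueError(f"unknown play category: {category!r}")
    return {name: int(name == category) for name in CATEGORIES}


# Hypothetical observation: the opponent ran a screen pass.
row = category_fields("Screen")
```

Each field produced this way can serve directly as the 0/1 dependent variable of the regression for its category.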

(21) As shown in FIG. 2, in addition to fields for play categories 120, the statistical database 100 comprises a plurality of additional fields 130 and 140 for each observation 110. Fields 130 and 140 contain data that will be used as explanatory variables in the regressions and/or to calculate such variables, in accordance with the method disclosed herein.

(22) The fields 130 contain, for each observation 110, data relating to the opponent's play selection that is known (or can be accurately approximated) immediately after the conclusion of the preceding play, e.g., the score of the game, the time remaining, down and distance, field position, and the personnel in the opponent's huddle.

(23) The fields 140 contain, for each observation 110, data relating to the opponent's formation when the play begins, which, due to the opponent's ability to execute various shifts and motions, cannot be known until the ball is snapped. The data contained in the fields 140 may include, for example, the number of running backs in the backfield, whether tight ends are lined up strong right or left, the number of receivers split right or left, whether the quarterback is under center or in the shotgun, etc.

(24) Referring again to FIG. 1, in the next step 40 of the method, the data in the statistical database are used to estimate a regression equation for each category of plays. The dependent variable in the regression for each category is a dummy variable with the value 1 for an observation if the opponent ran a play in that category and the value 0 otherwise. The explanatory variables in each regression are constructed from the data contained in the statistical database reflecting the conditions when each play was run; the same explanatory variables are used in each regression. In the experiment described herein, ten regressions were run, one corresponding to each defined category of plays.
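The estimation step can be sketched as follows, under stated assumptions. The sketch fits one ordinary least squares regression per category using only the Python standard library; the toy data, the category names, and the helper functions `solve_linear_system` and `fit_ols` are invented for illustration. In practice a statistical software package would be used, and the experiment described herein used ten categories and ten explanatory variables.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_ols(x_rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(row) for row in x_rows]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Toy data (invented): one explanatory variable and three play categories.
dd_values = [3.33, 8.0, 1.0, 5.0, 10.0, 2.0]
called = ["run", "pass", "run", "screen", "pass", "run"]
x_rows = [[dd] for dd in dd_values]

coefficients = {}
for category in sorted(set(called)):
    # Dependent variable: 1 if the play was in this category, 0 otherwise.
    y = [1.0 if play == category else 0.0 for play in called]
    coefficients[category] = fit_ols(x_rows, y)
```

Because the same explanatory variables are used in every regression and the 0/1 dependent variables sum to 1 for each observation, the fitted intercepts sum to 1 and the fitted slopes sum to 0 across categories.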

(25) While the methods disclosed herein reflect particular specifications of the regressions, a skilled practitioner will recognize that useful results may be obtained using different explanatory variables, including different functional forms or interactions of the data used to calculate the variables described herein. There is no intention to limit the scope of the claims to the particular regression specifications described herein.

(26) In the experiment described herein, ten explanatory variables, described in Table 2, were used:

(27) TABLE 2

Explanatory Variable   Description
ODYPR                  A variable intended to measure the team's perception of its opponent's relative vulnerability to passes vs. runs, calculated as [yards per pass]/[yards per rush] yielded by opponent over its preceding 10 games.
STR                    A variable reflecting the score and time remaining, calculated as: [scoring margin]/[minutes remaining in game]; the scoring margin is positive if the team is leading, negative if trailing, 0 if tied.
STR.sup.2              [STR].sup.2 or, if STR is negative, −[STR].sup.2; a quadratic form of STR appeared to improve the fit of the model.
2M1H                   A variable reflecting time remaining in the first half (0 if more than two minutes) and distance from the team's own goal line on a scale from 0 to 100 (increasing with distance from own goal and as less time remains in half).
DD                     A variable reflecting the down and distance situation, calculated as: [yards to go]/(4 − [down]); in other words, the average yards the team must gain on each play in order to make a first down on the third down play. (Examples: on 1st and 10, DD = 3.333; on 3rd and 1, DD = 1.) On 4th down, DD = [yards to go].
DD.sup.2               [DD].sup.2; a quadratic form of DD appeared to improve the fit of the model.
Own GL Prox            A variable measuring the team's proximity to its own goal line on a scale from 0 to 100 (100 being closest), calculated as: ([yardage from opponent's goal line].sup.2)/9,801. (The denominator is a scaling factor equal to 99.sup.2.)
Opp GL Prox            A variable measuring the team's proximity to its opponent's goal line on a scale from 0 to 100 (100 being closest), calculated as: ([yardage from own goal line].sup.2)/9,801. (The denominator is a scaling factor equal to 99.sup.2.)
TEs                    The number of tight ends on the field.
RBs                    The number of running backs on the field.
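Several of the variables in Table 2 are simple functions of the raw game condition data. The following Python sketch reproduces the formulas for STR, STR.sup.2, DD, and the two goal line proximity variables as they are stated in the table. The function names are assumptions made for this example, and the sketch assumes that the minutes remaining are greater than zero.

```python
def str_variable(scoring_margin, minutes_remaining):
    """STR: scoring margin (positive when leading) per minute remaining.

    Assumes minutes_remaining > 0; Table 2 does not say how the final
    instant of the game is handled.
    """
    return scoring_margin / minutes_remaining


def signed_square(value):
    """Quadratic form that keeps the sign of its input (STR.sup.2)."""
    return value * abs(value)


def dd_variable(down, yards_to_go):
    """DD: yards to go divided by (4 - down); on 4th down, the yards to go."""
    if down == 4:
        return float(yards_to_go)
    return yards_to_go / (4 - down)


def own_gl_prox(yards_from_opponent_goal):
    """Own GL Prox, using the formula exactly as stated in Table 2."""
    return yards_from_opponent_goal ** 2 / 9801


def opp_gl_prox(yards_from_own_goal):
    """Opp GL Prox, using the formula exactly as stated in Table 2."""
    return yards_from_own_goal ** 2 / 9801


# The two DD examples given in Table 2.
first_and_ten = dd_variable(1, 10)
third_and_one = dd_variable(3, 1)
```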

(28) In one embodiment, the explanatory variables of the MLR model include one or more dummy variables reflecting the presence or absence of one or more of the opponent's key players in the game. This would provide more refined predictions if, for example, the opponent is more likely to run certain categories of plays when the first-string running back is in the game than when he is being rested. Such dummy variables could also correct for anomalies in the statistical database caused by variations in the opponent's personnel. For example, if the database comprised data from the opponent's past six games but their starting quarterback missed two of those games due to injury, such a variable could provide insight into the effect of that player's absence on the opponent's play calling.

(29) FIG. 3A depicts the regression output 200 for a particular category of plays 210, deep dropback passes. The data shown are a subset of the regression output typically generated by commercially available statistical software packages, such as Excel or STATA.

(30) The regression statistics 220 provide diagnostic information about the regression equation as a whole. The R square, for example, is commonly cited as a measure of the “goodness of fit” or the percentage of variation in the dependent variable explained by the regression equation. Although the R squares generated by the methods described herein are lower than might be desirable for some applications of regression modeling techniques, high R squares are neither expected nor required for the methods to be of value in the current application. Because play callers intentionally try to be unpredictable—i.e., to introduce an element of random variation in their play calling—it is unsurprising that equations generated according to the claimed methods explain only limited percentages of the variations in plays called.

(31) In many applications of multiple regression techniques, the primary objective is to understand the impact, if any, of particular explanatory variables on the dependent variable. In such cases, the focus is on the magnitude and sign of the coefficients of, and the diagnostic statistics associated with, each explanatory variable; the R square of the equation is of less importance. FIG. 3A contains a list of the explanatory variables 230, the coefficients 240 of each explanatory variable, and diagnostic statistics 250 associated with each explanatory variable.

(32) Each of the coefficients 240 describes the mathematical relationship between the corresponding explanatory variable 230 and the dependent variable 210, i.e., the probability that the opponent will run a play in the category corresponding to dependent variable 210. If the sign of one of the coefficients 240 is positive, the probability that the opponent will run a play in that category increases as the value of the corresponding explanatory variable 230 gets larger; if the sign of the coefficient 240 is negative, the probability decreases as the value of the corresponding explanatory variable 230 gets larger.

(33) Referring to the diagnostic statistics 250, the t-statistic, in particular, is a commonly cited measure of the statistical significance of the effect of an explanatory variable on the dependent variable. A t-statistic that is larger in absolute value provides higher confidence that the corresponding explanatory variable has a meaningful effect on the dependent variable.

(34) As shown in FIG. 3A, STR was highly significant as an explanatory variable in this equation. The negative sign on the coefficient 240 of STR indicates that, when this opponent had a large lead late in the game, they were less likely to call deep dropback passes. While this conclusion may seem obvious, the methods disclosed herein enable the user to quantify this effect and to estimate its magnitude simultaneously with other relevant factors. Moreover, different opponents may exhibit this effect differently; some will deviate from their game plan sooner, or to a greater extent, than others. Described herein are disciplined, systematic, and quantitative methods, leveraging all available data about the opponent's past behavior, to supplement and enhance coaching intuition.

(35) While the coefficient 240 on STR.sup.2 is much smaller than the coefficient on STR and bears the opposite sign, it is also statistically significant, suggesting that the effect of STR on the dependent variable 210 gets slightly smaller as STR approaches large positive or large negative values.

(36) The coefficient 240 on OwnGLProx is negative, suggesting that, when the opponent is near its own goal line, it is less likely to call a deep dropback pass, perhaps because the risk of a sack or turnover is heightened given that field position. The coefficient 240 on OppGLProx is also negative, suggesting that the opponent is less likely to call a deep dropback pass when it is near the other team's goal line, perhaps because such plays typically involve deep pass patterns that are ineffective close to the other team's end line.

(37) Not surprisingly, the coefficient 240 on DD is positive, indicating that the opponent is more likely to call a deep dropback pass on late downs with longer yardage to go. Such a call is more likely, for example, on third and 10 than on first and 10 or third and 1. The coefficient 240 on 2M1H is also positive, indicating that the opponent is more likely to call a deep dropback pass during a two-minute drill in the first half.

(38) For comparison purposes, FIG. 3B depicts the regression output 200 for another category of plays 210, runs with zone blocking. This regression was estimated with the same explanatory variables 230 as in FIG. 3A but with a different dependent variable. Again, the coefficient 240 on STR is highly significant but bears the opposite sign, indicating that, when this opponent had a large lead late in the game, they were more likely to call this type of run. The coefficient 240 on DD is also significant but with a negative sign, indicating that this opponent is less likely to call a play in this category on late downs with long yardage to go.

(39) Referring again to FIG. 1, in the next step 50 of the method, the coefficients from the regression equations estimated in step 40 are programmed into a computer to create, for each category of plays, a model of the probability that the opponent will call a play in that category under particular game conditions. The computer is programmed to provide for entry of data used as or to compute the values of the explanatory variables of the equations and to perform all calculations necessary to compute such values.

(40) In the next step 60 of the method, data reflecting particular game conditions are entered into the computer, which calculates and uses a complete set of values of the explanatory variables to solve the regression model for each category of plays. The solutions represent, for each category, a probability that the opponent will run a play in that category under those particular game conditions.

(41) In the experiment described herein, five consecutive games played by a team were reviewed according to step 10. The data from those five games were used to construct the statistical database according to step 30 and that database was used to estimate regression equations according to step 40. The resulting regression coefficients were entered into a computer to create a regression model for each category of plays according to step 50. Data from each offensive play run by the team during its next game were used as or to compute, according to step 60, a complete set of values of the explanatory variables. According to step 70, those values were used to solve the models, generating, for each category of plays, a predicted probability that the team would run a play in that category; the computer was used to sort and display the categories and the associated probabilities.

(42) In step 80, the defined categories and the corresponding probabilities calculated in step 70 are sorted by probability, from highest to lowest, and displayed. FIG. 4A depicts an exemplary display 300 of the probabilities generated by the multiple regression models using the experimental data, along with the game condition data 310 that were used as or to compute the explanatory variables of the regression models. Each category of plays 320 and its associated probability 330 is displayed, sorted in order of probability from highest to lowest. (The user may choose not to display every category but a smaller number of categories with the highest predicted probabilities 330.) On the team's first offensive play of the game, on first and 10 with the game tied, the models predicted that the team was most likely to call a play in the “zone blocking” category. Note that, when each of a team's play calls is assigned to exactly one category, and the same values of the explanatory variables are used to solve the model for each category, the sum of the probabilities 330 generated for all categories will equal exactly 1 (100%).
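Steps 70 and 80 can be sketched in Python as follows. The coefficient values below are invented for illustration and are not taken from FIG. 3A or FIG. 3B; the function names and the single explanatory variable (DD) are likewise assumptions made to keep the example short.

```python
def predict_probabilities(coefficients, values):
    """Solve each linear model: intercept plus coefficient times value."""
    return {
        category: coefs[0] + sum(b * x for b, x in zip(coefs[1:], values))
        for category, coefs in coefficients.items()
    }


def ranked(probabilities):
    """Sort the categories by predicted probability, highest first."""
    return sorted(probabilities.items(), key=lambda item: item[1], reverse=True)


# Hypothetical coefficients: [intercept, coefficient on DD] for each category.
coefficients = {
    "Short Dropback": [0.30, 0.02],
    "Zone Blocking": [0.60, -0.05],
    "Deep Dropback": [0.10, 0.03],
}

# Game condition: first and 10, so DD = 10 / (4 - 1).
values = [10 / 3]

display_rows = ranked(predict_probabilities(coefficients, values))
for category, probability in display_rows:
    print(f"{category:<16}{probability:7.1%}")
```

In this toy example the intercepts sum to 1 and the DD coefficients sum to 0, so the displayed probabilities total 100% for any value of DD.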

(43) For comparison purposes, FIG. 4B depicts an exemplary display 300 of the probabilities 330 generated on the next play in the same experiment. On second and 8, the models predicted that the team was most likely to call a play in the “short dropback pass” category. For further comparison, in FIG. 4C, with the team trailing late in the game, the most likely category was predicted to be “deep dropback pass.”

(44) As reflected in FIG. 4C, linear probability models such as those described herein sometimes yield anomalous predicted probabilities smaller than zero. One possible response is simply to replace any negative predicted values with zero and rescale all of the resulting probabilities so that they total 100%. Another possible response is to add the absolute value of the most negative probability to all of the raw probabilities. A third possibility would be to set the negative values not to zero but to an arbitrarily small positive probability, such as 0.1%. Finally, a skilled practitioner would be familiar with logit or probit modeling techniques that generate probabilities without the possibility of returning negative values.
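The first and third responses described above can be sketched in Python as follows. The function name, the `floor` parameter, and the raw probabilities are assumptions made for this example; with `floor=0.0` negative values are replaced by zero, and with `floor=0.001` they are replaced by 0.1%, in each case followed by rescaling so that the probabilities total 100%.

```python
def replace_negatives_and_rescale(probabilities, floor=0.0):
    """Replace negative predictions with `floor`, then rescale to total 1."""
    adjusted = {
        category: (p if p >= 0 else floor)
        for category, p in probabilities.items()
    }
    total = sum(adjusted.values())
    if total <= 0:
        raise ValueError("no positive probability mass to rescale")
    return {category: p / total for category, p in adjusted.items()}


# Hypothetical raw output of a linear probability model (totals 1.0).
raw = {
    "Deep Dropback": 0.55,
    "Short Dropback": 0.35,
    "Screen": 0.15,
    "Zone Blocking": -0.05,
}

zeroed = replace_negatives_and_rescale(raw)               # first response
floored = replace_negatives_and_rescale(raw, floor=0.001)  # third response
```

Neither adjustment changes the ordering of the categories with positive raw probabilities, so the sorted display is affected only in the magnitudes shown.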

(45) In the experiment described herein, the probabilities generated by the regression models were compared with the plays actually called by the team studied over the course of an entire game. As shown in Table 3, the team called a play in the category predicted to be most likely (Prediction 1) almost half the time (37 out of 81 plays). The team called a play in one of the two categories predicted to be most likely two-thirds of the time (54 out of 81 plays).

(46) TABLE 3

Prediction   Frequency   Percentage   Cumulative %
1            37          45.68        45.68
2            17          20.99        66.67
3            5           6.17         72.84
4            11          13.58        86.42
5            8           9.88         96.30
6            0           0.00         96.30
7            0           0.00         96.30
8            0           0.00         96.30
9            3           3.70         100.00
10           0           0.00         100.00
Total        81          100.00       100.00
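The percentage columns of Table 3 follow directly from the frequency column. The short Python sketch below recomputes them from the frequencies reported in the table; the variable names are assumptions made for this example.

```python
# Frequencies from Table 3: how often the play actually called fell in the
# category ranked 1st, 2nd, ..., 10th by predicted probability.
frequencies = [37, 17, 5, 11, 8, 0, 0, 0, 3, 0]
total = sum(frequencies)

percentages = [round(100 * count / total, 2) for count in frequencies]

cumulative = []
running = 0
for count in frequencies:
    running += count
    # Cumulative share of plays captured by the top-k predictions.
    cumulative.append(round(100 * running / total, 2))

for rank, (count, pct, cum) in enumerate(
    zip(frequencies, percentages, cumulative), start=1
):
    print(f"{rank:>2}  {count:>3}  {pct:6.2f}  {cum:6.2f}")
```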

(47) In one embodiment, the methods disclosed herein may be used to provide predicted probabilities during a game, to assist the user with play calling, when the rules permit. In this embodiment, step 60 in FIG. 1, the entry of relevant game condition data, occurs immediately after the play preceding each play run by the opponent. Step 80, the display of play categories and probabilities, occurs in time to assist with the selection of a scheme and for such scheme to be communicated from the coaches' box to the sideline and/or from the sideline to players on the field. The display may be on a tablet computer or other portable device readily accessible to play callers.

(48) In another embodiment, the methods disclosed herein may be used to recommend schemes that are expected to be effective against the categories of plays that the opponent is likely to run under given game conditions. FIG. 5 is a flow chart depicting the steps of this embodiment. The probability that the opponent will run a play in each defined category is calculated according to steps 10, 20, 30, 40, 50, 60, and 70 of FIG. 1. In step 15, the user identifies a set of schemes from the game plan that the user is prepared to deploy against the opponent.

(49) In step 35, the user assigns a numerical rating or ranking to each identified scheme against each category of plays defined in step 20, based on the expected effectiveness of the scheme against plays in that category. (For purposes of this disclosure, it is assumed that a scheme with a higher rating or ranking is expected to be more effective; more effective schemes could just as easily be denoted with lower ratings or rankings.) FIG. 6 depicts an example of the resulting matrix 400, with m categories of plays 410 and n schemes 420. Each of the cells 430 in the matrix 400 contains the rating or ranking of one of the schemes 420 in the game plan reflecting the user's a priori assessment of the likely effectiveness of that scheme against plays in one of the categories 410. For example, “R.sub.2,3” is the a priori rating or ranking of scheme 2 against plays in category 3. The ratings or rankings are entered into the computer along with the MLRs.

(50) In step 75, after the probability is calculated for each category of plays, a weighted average rating or ranking for each scheme is calculated according to the following algorithm:

(51) $\overline{R}_{i} = \sum_{j=1}^{m} R_{i,j} P_{j}$
where $\overline{R}_{i}$ is the weighted average rating or ranking of scheme i, m is the number of defined categories of plays, R.sub.i,j is the a priori rating or ranking of scheme i against plays in category j, and P.sub.j is the predicted probability that the opponent will run a play in category j.
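The weighted average calculation of step 75 can be sketched in Python as follows. The scheme names, ratings, and probabilities are invented for illustration, and the function name is an assumption; higher ratings are taken to denote more effective schemes, consistent with the convention adopted in this disclosure.

```python
def weighted_ratings(ratings, probabilities):
    """For each scheme i, sum R[i][j] * P[j] over all play categories j."""
    return {
        scheme: sum(by_category[c] * p for c, p in probabilities.items())
        for scheme, by_category in ratings.items()
    }


# Hypothetical predicted probabilities P_j for three play categories.
probabilities = {"Zone Blocking": 0.5, "Short Dropback": 0.3, "Deep Dropback": 0.2}

# Hypothetical a priori ratings R_ij for two schemes in the game plan.
ratings = {
    "Scheme 1": {"Zone Blocking": 8, "Short Dropback": 4, "Deep Dropback": 2},
    "Scheme 2": {"Zone Blocking": 3, "Short Dropback": 7, "Deep Dropback": 9},
}

scores = weighted_ratings(ratings, probabilities)

# Recommended schemes, highest weighted average rating first.
recommended = sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Here Scheme 1 is recommended first because the opponent's most probable category is the one against which Scheme 1 is rated most highly.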

(52) In step 85, the schemes with the highest weighted average ratings or rankings will be displayed as recommended schemes, sorted in order of weighted average rating or ranking, from highest to lowest, under the given game conditions. The user may choose not to display every scheme but may select a smaller number of schemes with the highest weighted average ratings or rankings. The user may also choose whether to display the categories of plays and/or the predicted probability associated with each category.

(53) In still another embodiment, the a priori rankings or ratings can be adjusted during the game, perhaps in response to actual outcomes. The adjusted rankings or ratings will thereafter be used to generate recommended schemes.
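
The disclosure does not specify how the a priori ratings would be adjusted. Purely as an illustration, and not as the disclosed method, one simple possibility is to blend the prior rating with a score the user assigns to an observed outcome. The function name, the `weight` parameter, and the blending rule itself are all assumptions.

```python
def adjust_rating(prior_rating: float, observed_score: float, weight: float = 0.25) -> float:
    """Blend an a priori rating with an observed outcome score.

    This blending rule is an assumption for illustration; weight = 0 keeps the
    prior rating unchanged and weight = 1 replaces it with the observed score.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return (1.0 - weight) * prior_rating + weight * observed_score


# Hypothetical case: a scheme rated 4.0 before the game performs like a 2.0.
adjusted = adjust_rating(prior_rating=4.0, observed_score=2.0, weight=0.25)
print(adjusted)  # 3.5
```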

(54) The methods disclosed herein may be used to recommend schemes during a game, to assist the user with play calling, when the rules permit. As described above, step 60 of FIG. 5, the entry of relevant game condition data, occurs immediately after the play preceding each play run by the opponent. Step 85, the display of recommended schemes, occurs in time to assist with the selection of a scheme and for such scheme to be communicated from the coaches' box to the sideline and/or from the sideline to players on the field. The display may be on a tablet computer or other portable device readily accessible to play callers.

(55) In order to minimize the delay between the end of the preceding play and the display of the recommended schemes for the next play, it would be advantageous to streamline the entry of new game condition data. In one embodiment, a full set of game condition data will be entered only at the beginning of a drive (which, by definition, results from a score or other change of possession) or after some other event accompanied by an extended stoppage of play, such as a penalty or a time out. Absent such an extended stoppage of play, minimal entry of game condition data will be required.

(56) FIG. 7 is an exemplary flow chart of a program for entering game condition data before each play. Step 505, entering the play categories and the regression coefficients reflecting the opponent's tendencies, occurs before the game. In step 510, initial values for relevant game conditions (score, quarter, minutes and seconds remaining, down and distance, field position, etc.) are assigned before the first play. This step can include prompting the user for some values that will not change over the course of the game, such as the value of the variable ODYPR, described above.

(57) In step 515, the user is asked whether most of the game conditions can be automatically updated before the next play. The user will respond negatively if the next play is the first play of a drive or in the event of an extended stoppage in play, such as for a penalty, a time out, or the end of a quarter. If the user declines automatic updating in step 515, then, in step 520, the program will prompt the user with the current value of each game condition, allowing the user to enter new values for those conditions that need to be updated. In step 525, the user will be prompted to input relevant changes in the opponent's personnel package, if any. At that point, all game conditions necessary to solve the regression models will have been updated.

(58) Subsequent to the first play of a drive, and where there has been no extended stoppage of play, the user will select automatic updating when queried in step 515. If the user selects automatic updating, then, in step 530, the program will ask the user whether the previous play resulted in an incomplete pass. If not, in step 535, the program will prompt the user to update the field position resulting from the previous play, if any yardage was gained or lost. From that information, in step 540, the program will automatically calculate the new down and distance (although, if the ball is placed within a yard of the line to gain, the program will prompt the user to confirm whether a first down was made). Because the clock will be running, the program will also estimate the time remaining when the next play is initiated. (To improve the accuracy of such estimates, the user may be prompted as to whether the opponent is in “hurry up” mode.) The program then runs step 525, prompting the user to input any relevant changes in the opponent's personnel package.
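
The automatic calculation of down and distance in step 540 might look like the following sketch. It assumes that field position is stored as whole yards between the ball and the goal line the offense is attacking, and that a score, a safety, or a turnover on downs falls back to full manual entry. The `Situation` class and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Situation:
    down: int           # 1 through 4
    distance: int       # yards to the line to gain
    yards_to_goal: int  # yards from the ball to the goal line being attacked


def update_after_gain(
    prev: Situation, new_yards_to_goal: int
) -> Tuple[Optional[Situation], bool]:
    """Return (next situation, whether the user should confirm a first down).

    The next situation is None when automatic updating does not apply
    (score, safety, or turnover on downs), so full manual entry is needed.
    """
    if not 0 < new_yards_to_goal < 100:
        return None, False  # score or safety
    gained = prev.yards_to_goal - new_yards_to_goal  # negative for a loss
    remaining = prev.distance - gained
    # Ball spotted within a yard of the line to gain: ask the user to confirm.
    needs_confirmation = abs(remaining) <= 1
    if remaining <= 0:
        # First down; inside the 10-yard line the distance is goal to go.
        return Situation(1, min(10, new_yards_to_goal), new_yards_to_goal), needs_confirmation
    if prev.down >= 4:
        return None, needs_confirmation  # turnover on downs
    return Situation(prev.down + 1, remaining, new_yards_to_goal), needs_confirmation


# Hypothetical play: 1st and 10 with 65 yards to go, ball advanced to 61 yards out.
print(update_after_gain(Situation(1, 10, 65), 61))
```

The sketch leaves out the time-remaining estimate, which would depend on pace assumptions such as the "hurry up" prompt mentioned above.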

(59) If, in step 530, the user indicates that the previous play resulted in an incomplete pass, the program will, in step 545, automatically update the down (the distance to go and field position will be unchanged) and estimate the time remaining (which, since the clock will have stopped as soon as the play ended, can be estimated with reasonable accuracy). The program then runs step 525, prompting the user to input any relevant changes in the opponent's personnel package.
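
A minimal sketch of the incomplete-pass update in step 545 follows. The `GameState` class, the function name, and the assumed play duration of six seconds are hypothetical; the disclosure says only that the time remaining can be estimated with reasonable accuracy because the clock stops when the play ends.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class GameState:
    down: int
    distance: int
    yards_to_goal: int
    seconds_remaining: int  # clock reading at the snap of the previous play


def update_after_incomplete_pass(
    state: GameState, assumed_play_seconds: int = 6
) -> Optional[GameState]:
    """Advance the down; distance and field position are unchanged.

    Returns None after an incompletion on fourth down, because the resulting
    change of possession calls for full manual entry.
    """
    if state.down >= 4:
        return None
    return replace(
        state,
        down=state.down + 1,
        # The clock stopped at the end of the play, so only the play itself
        # (an assumed duration) is subtracted.
        seconds_remaining=max(0, state.seconds_remaining - assumed_play_seconds),
    )


# Hypothetical play: 2nd and 7 with 125 seconds on the clock at the snap.
print(update_after_incomplete_pass(GameState(2, 7, 45, 125)))
```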

(60) It will be apparent that, regardless of how the game condition data are updated, the update concludes with step 525. As soon as that input is received, in step 550, the program displays a complete set of the updated game conditions. In step 555, the program calculates any necessary explanatory variables, solves the regression models, and, in some embodiments, formulates recommendations to the user with regard to likely effective schemes. In step 560, the predicted play categories, corresponding probabilities, and/or recommended schemes are sorted and displayed.
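
Steps 555 and 560 can be sketched as follows. The sketch assumes, for illustration only, that each category's model has a logistic form with an intercept; the functional form actually used is governed by the regression specification given elsewhere in the disclosure. The category names, the variable name `yards_to_go`, and the coefficient values are all hypothetical.

```python
import math
from typing import Dict, List, Tuple


def predict_probability(
    intercept: float, coefficients: Dict[str, float], variables: Dict[str, float]
) -> float:
    """Evaluate one category's model, assuming a logistic form (an assumption)."""
    z = intercept + sum(coefficients[name] * variables[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


def rank_categories(
    models: Dict[str, dict], variables: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Solve every category's model and sort by probability, highest first."""
    predictions = {
        category: predict_probability(m["intercept"], m["coefficients"], variables)
        for category, m in models.items()
    }
    return sorted(predictions.items(), key=lambda item: item[1], reverse=True)


# Hypothetical coefficients for three play categories and one explanatory variable.
example_models = {
    "run": {"intercept": 0.0, "coefficients": {"yards_to_go": -0.2}},
    "short pass": {"intercept": -1.0, "coefficients": {"yards_to_go": 0.1}},
    "deep pass": {"intercept": -2.0, "coefficients": {"yards_to_go": 0.15}},
}
ranked = rank_categories(example_models, {"yards_to_go": 10.0})
for category, probability in ranked:
    print(f"{category}: {probability:.1%}")
```

Because each category is fitted by its own regression, the predicted probabilities in a sketch like this need not sum to exactly one.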

(61) After the predicted play categories, corresponding probabilities, and/or recommended schemes are displayed, in step 565, the program evaluates whether the time remaining in the game has reached zero. If so (and the score is not tied), the program terminates. Otherwise, the program returns to step 515, to begin the process of updating game condition data for the next play.
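
The branching of FIG. 7, as described in the preceding paragraphs, can be traced with a small sketch that records which steps run before each play. Representing each play as a dictionary of user responses is an assumption made so the control flow can be run without interactive prompts; the key names are hypothetical.

```python
from typing import Dict, List


def trace_fig7_steps(plays: List[Dict]) -> List[List[int]]:
    """Return, for each play, the FIG. 7 step numbers visited in order."""
    trace = []
    for play in plays:
        steps = [515]  # ask whether automatic updating applies
        if not play["auto_update"]:
            steps.append(520)  # manual review of every game condition
        elif play.get("incomplete_pass", False):
            steps += [530, 545]  # advance the down, estimate the clock
        else:
            steps += [530, 535, 540]  # new field position, then down and distance
        # Every path ends with personnel entry, display, solving, and output.
        steps += [525, 550, 555, 560, 565]
        trace.append(steps)
        # Step 565: stop once time has expired and the score is not tied.
        if play["seconds_remaining"] == 0 and not play["score_tied"]:
            break
    return trace


# Hypothetical sequence: start of a drive, an incompletion, then the final play.
example_trace = trace_fig7_steps([
    {"auto_update": False, "seconds_remaining": 900, "score_tied": True},
    {"auto_update": True, "incomplete_pass": True, "seconds_remaining": 894, "score_tied": True},
    {"auto_update": True, "incomplete_pass": False, "seconds_remaining": 0, "score_tied": False},
    {"auto_update": True, "incomplete_pass": False, "seconds_remaining": 0, "score_tied": False},
])
for steps in example_trace:
    print(steps)
```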

(62) Even if the use of computer technology in the manner described herein is prohibited on the sideline or in the coaches' box during games, the methods described herein can be used to enhance game preparation and improve play calling. Those methods can be used, for example, to identify, confirm, and quantify the significance of keys that should be included in the scouting report on an opponent.

(63) In one embodiment, the explanatory variables used in the regression models include data on the opponent's formation at the snap. Referring again to FIG. 2, such data would be included in the statistical database in fields 140. The explanatory variables could reflect, for example, the number of running backs in the backfield, whether tight ends or wide receivers are lined up close to the interior line, and whether the quarterback is under center or in the shotgun. (Because the offense, by using motions and shifts, can conceal its ultimate formation until a second or two before the snap, such variables cannot be used to provide real-time predictions/recommendations as described above.) In the experiment summarized in Table 3, the inclusion in the regression models of several variables reflecting the formation at snap improved the accuracy of the highest probability prediction to 53.09%.
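
One plausible reading of "accuracy of the highest probability prediction" is the share of plays for which the category with the highest predicted probability matched the category actually run. The sketch below computes that share under this assumed reading. The function name and the toy data are hypothetical and are unrelated to the experiment summarized in Table 3.

```python
from typing import Dict, List


def top_prediction_accuracy(
    predicted: List[Dict[str, float]], actual: List[str]
) -> float:
    """Fraction of plays where the most probable category was the one run."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need one non-empty prediction per observed play")
    hits = sum(
        1
        for probabilities, category in zip(predicted, actual)
        if max(probabilities, key=probabilities.get) == category
    )
    return hits / len(actual)


# Toy data: predicted probabilities for four plays and the categories actually run.
toy_predictions = [
    {"run": 0.6, "pass": 0.3, "screen": 0.1},
    {"run": 0.2, "pass": 0.7, "screen": 0.1},
    {"run": 0.5, "pass": 0.4, "screen": 0.1},
    {"run": 0.25, "pass": 0.35, "screen": 0.4},
]
toy_actual = ["run", "pass", "pass", "screen"]
toy_accuracy = top_prediction_accuracy(toy_predictions, toy_actual)
print(toy_accuracy)  # 0.75
```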

(64) The categories of plays may be defined to indicate the lateral direction, left or right, to which certain types of plays were run. Similarly, the explanatory variables may reflect data with a directional component (e.g., strong side, position relative to hash marks, the number of split receivers on each side of the formation). The resulting regression models may provide insight into the tendencies of an opponent to run particular types of plays to the left or right, to the strong or weak side, to the wide side or short side of the field, etc., under particular game conditions. Such information might enhance the value of scouting reports.

(65) The methods described herein could be used to assist in play calling even without using computer technology on the sideline or in the coaches' box by, for example, facilitating the development of more sophisticated play sheets. Such play sheets could take into account not only down, distance and field position, but also the score, time remaining, and personnel packages. Potentially, a number of different play sheets could be prepared for each game; the user would select a particular sheet to use during a given drive based on the game conditions at the outset of that drive.

(66) The methods described herein could also be used to improve the efficiency of practices. Those methods could be used, for example, to focus the user on the categories of plays that the opponent is most likely to run, ensuring the most productive allocation of practice time. Those methods could enable the scout teams to simulate the opponent's behavior more accurately.

(67) Finally, an important aspect of game preparation is self-scouting. The user could utilize the methods described herein to analyze its own play calling to ensure that it does not exhibit any predictable tendencies that could be exploited by opponents.

CPC classification: G06N7/01 (Physics); G06N20/00 (Physics); A63B71/0616 (Human Necessities)

International classification: A63B71/06 (Human Necessities); A63B43/00 (Human Necessities); G06N7/00 (Physics)
