Sequences of passes and players
Ouch, one week and no post from me - instead there has been a bunch of great posts from various people. Thanks for keeping this blog interesting.
After running around at CTIA and related events for the whole week, I decided to spend a few hours programming a small Python app to analyze sequences of passes (strings of players and strings of certain types of passes). I wanted to look past single passes to find out whether there is a dominant string of players in a team, or whether a team seems to use a certain sequence of passes, like dumps after short attacking passes. Again I used data from the WUGC open final: Canada vs. USA. A small set of data, I admit.
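For the curious, the core of such a count can be sketched in a few lines. This is only an illustration, not the app itself: it assumes each possession is stored as a list of player names in throwing order, and the function name `player_ngrams` and the sample data (players "A", "B", "C") are made up.

```python
from collections import Counter

def player_ngrams(possessions, n):
    """Count every run of n consecutive players within each possession."""
    counts = Counter()
    for players in possessions:
        # Slide a window of length n over the possession. Windows never
        # cross a turnover, because each possession is its own list.
        for i in range(len(players) - n + 1):
            counts[tuple(players[i:i + n])] += 1
    return counts

# Hypothetical possessions for illustration only (not the game data).
possessions = [
    ["A", "B", "A", "B"],
    ["C", "A", "B", "A"],
]

counts = player_ngrams(possessions, 3)
# Keep only strings seen at least twice, as in the tables of this post.
repeated = {seq: c for seq, c in counts.items() if c >= 2}
print(repeated)  # {('A', 'B', 'A'): 2}
```

The same function would work on strings of pass-type codes if each possession were a list of codes instead of names.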
Before diving into the results, a short note on the number of passes per team in that game. Canada passed 234 times and USA 164 times. Canada used proportionally more dumps than USA (29% vs. 18%), whereas USA used mid-length passes more than Canada (15% vs. 8%). Also, the maximum number of passes from one player to another was 6 for USA (Studarus to Lobue), whereas Al Nichols passed 13 times to Evan Wood for Canada. These differences are of course visible in the pass sequences too.
Let's first look at the strings of players. For USA, here are the most common 4-player strings (one-time strings are not displayed; rest assured there were plenty of different strings for both teams).
| Count | Sequence |
|---|---|
| 2 | Studarus - Dugan - Studarus - Lobue |
| 2 | Dugan - Studarus - Lobue - Burfeind |
And here are the most common 3-player strings
| Count | Sequence |
|---|---|
| 2 | Seidler - Husak - Cameros |
| 2 | Studarus - Dugan - Studarus |
| 2 | Dugan - Studarus - Lobue |
| 2 | Studarus - Lobue - Burfeind |
| 2 | Studarus - Lobue - Steets |
| 2 | Lobue - Steets - Leiserson |
| 2 | Glimme - Studarus - Lobue |
Here are the same tables for Canada. First 4 players
| Count | Sequence |
|---|---|
| 2 | Savage - Lugsdin - Hassell - Savage |
| 2 | Savage - Cruickshank - Lugsdin - Alexander |
| 2 | Ortiz - Wood - Cruickshank - Ortiz |
and then 3 players
| Count | Sequence |
|---|---|
| 5 | Wood - Nichols - Wood |
| 3 | Savage - Cruickshank - Lugsdin |
| 3 | Nichols - Wood - Nichols |
| 2 | Lugsdin - Ortiz - Savage |
| 2 | Cruickshank - Lugsdin - Wooldridge |
| 2 | Savage - Lugsdin - Hassell |
| 2 | Lugsdin - Hassell - Savage |
| 2 | Nichols - Pottinger - Harmer |
| 2 | Wood - Pottinger - Grant |
| 2 | Pottinger - Nichols - Wood |
| 2 | Wooldridge - Ortiz - Lugsdin |
| 2 | Savage - Alexander - Ortiz |
| 2 | Cruickshank - Savage - Lugsdin |
| 2 | Roberts - Ortiz - Lugsdin |
| 2 | Lugsdin - Savage - Alexander |
| 2 | Savage - Alexander - Savage |
| 2 | Cruickshank - Lugsdin - Alexander |
| 2 | Ortiz - Lugsdin - Alexander |
| 2 | Ortiz - Wood - Cruickshank |
| 2 | Wood - Cruickshank - Ortiz |
I guess the larger number of different 3-player strings with 2 or more instances for Canada is explained mostly by the higher number of Canadian passes in the game. It is, however, interesting to note that Wood seemed to play more give-n-go plays off Nichols (5) than the other way around (3).
Pass categories
Here is a breakdown of 2-pass strings for USA
| Count | Sequence |
|---|---|
| 21 | a - a |
| 12 | a - m |
| 12 | s - a |
| 11 | d - a |
| 9 | a - d |
| 9 | d - s |
| 7 | l - d |
| 7 | s - s |
| 6 | d - m |
| 5 | m - a |
| 5 | a - l |
| 5 | a - s |
| 4 | m - d |
| 3 | m - m |
| 3 | a - h |
| 3 | s - d |
| 2 | l - a |
| 2 | m - h |
| 2 | s - m |
| 2 | l - l |
| 2 | h - d |
and the same list for Canada
| Count | Sequence |
|---|---|
| 27 | a - d |
| 27 | a - a |
| 21 | d - a |
| 16 | d - d |
| 13 | s - a |
| 11 | a - m |
| 9 | a - s |
| 8 | d - s |
| 6 | m - a |
| 6 | d - m |
| 4 | s - d |
| 4 | s - h |
| 4 | s - s |
| 4 | d - l |
| 4 | l - d |
| 3 | a - h |
| 3 | l - l |
| 3 | l - a |
| 3 | m - d |
| 2 | s - m |
| 2 | l - s |
| 2 | d - c |
| 2 | d - h |
| 2 | a - l |
There are more dump passes in the high-ranking pass sequences for Canada than for USA, but that can be expected based on the higher number of dump passes Canada used in the game.
Finally, were there any player/pass-type strings which had more than one instance?
Yes, one for USA
| Count | Sequence |
|---|---|
| 2 | Glimme - a - Studarus - a - Lobue |
And 4 for Canada
| Count | Sequence |
|---|---|
| 2 | Nichols - d - Pottinger - s - Harmer |
| 2 | Nichols - a - Wood - a - Nichols |
| 2 | Savage - a - Alexander - s - Ortiz |
| 2 | Wood - d - Nichols - a - Wood |
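One way such mixed player/pass-type strings could be counted is sketched below. It assumes each possession is a list of `(thrower, pass_type, receiver)` tuples, with the receiver of one pass being the thrower of the next. That storage format, the function name `mixed_strings`, and the sample players are assumptions for illustration. The pass-type codes are simply the ones used in the tables above.

```python
from collections import Counter

def mixed_strings(possessions, n_passes=2):
    """Count strings like 'A - d - B - a - C' over n_passes consecutive passes."""
    counts = Counter()
    for passes in possessions:
        for i in range(len(passes) - n_passes + 1):
            window = passes[i:i + n_passes]
            parts = [window[0][0]]  # thrower of the first pass in the window
            for _thrower, pass_type, receiver in window:
                parts += [pass_type, receiver]
            counts[" - ".join(parts)] += 1
    return counts

# Hypothetical possessions for illustration only (not the game data).
possessions = [
    [("A", "d", "B"), ("B", "a", "A"), ("A", "d", "B")],
    [("A", "d", "B"), ("B", "a", "A")],
]

counts = mixed_strings(possessions)
print(counts["A - d - B - a - A"])  # 2
```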
Any suggestions on how to proceed from here, or should I stop investigating this approach?
October 4th, 2005 at 5:10 am
I think that only the most extreme trends would pop out of the data, since there is just so much noise. However, I do think it’d be interesting and useful to see the strings for the beginning of each possession, and to track whether the team was successful in scoring or not. Most teams call a sequence to start the play, so it’d be nice to know whether it worked or not (or, if you’re an opponent, it’d be nice to know what the other teams ran the most).
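A rough sketch of how the opening string of each possession could be tallied against the outcome is shown here. It assumes each possession is recorded as a pair of a player list and a scored flag. That format, the name `opening_strings`, and the sample data are hypothetical.

```python
from collections import defaultdict

def opening_strings(possessions, n=3):
    """Map each opening n-player string to [times_run, times_scored]."""
    stats = defaultdict(lambda: [0, 0])
    for players, scored in possessions:
        if len(players) < n:
            continue  # possession ended before n players touched the disc
        key = tuple(players[:n])
        stats[key][0] += 1
        stats[key][1] += int(scored)
    return dict(stats)

# Hypothetical possessions for illustration only: (players, scored?)
possessions = [
    (["A", "B", "C", "D"], True),
    (["A", "B", "C"], False),
    (["B", "A"], True),
]

print(opening_strings(possessions))  # {('A', 'B', 'C'): [2, 1]}
```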
Frankly, I’m surprised that there weren’t more 3 person strings with higher frequencies. Early 90s NY probably went Weiss-King-Dobyns 50% of the time.
October 5th, 2005 at 3:13 am
interesting idea, but the data is *way* too sparse to get anything meaningful. i like the analysis from a “defensive preparation” standpoint (ah, to think of the future when ultimate teams can pay people to scout…), but you’re going to need something on the order of a couple of tournaments’ worth of data to be able to say anything meaningful about a team.
on a team of 21 players, there are roughly 8000 possible 3-player sequences (you can’t pass to yourself, after all). without too rigorous of an analysis, let’s just say you’d need to see on the order of 1000-10000 strings to draw any real conclusions.
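The count can be checked quickly. The sketch below assumes the only restriction is that a player cannot throw to himself, so the first player can be anyone on the roster and each later player anyone except the previous one. The function name is made up for illustration.

```python
def count_player_strings(team_size, length):
    """Number of player strings in which no player passes to himself."""
    # First slot: any player. Every later slot: anyone but the previous player.
    return team_size * (team_size - 1) ** (length - 1)

print(count_player_strings(21, 3))  # 8400
```

So the exact figure under that assumption is 21 × 20 × 20 = 8400, in line with the rough estimate.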
as for jim’s comment, you don’t expect to see anything with that high of a frequency, unless the offense can run 2- or 3- man weave the length of the field, or huck it right off the pull with high success; even if A-B-C is the bread and butter of the offense, you’ll get just as many X-Y-A, X-A-B, B-C-X and C-X-Y strings, so the frequencies are never going to get *too* large.
October 5th, 2005 at 7:45 am
Jim, I am planning to add some more features to my small app, including analysing just the start or end of a possession, selecting only successful or unsuccessful possessions, and separating offensive-lineup from defensive-lineup possessions. Another addition could be wildcards in the search, to track down whether there is an indirect connection between two players (like Jim - “*” - Alex, where * could be anyone), or even patterns of play (whether a team has more give-n-go type plays, A-*-A or A-*-*-A, than other teams).
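A minimal sketch of what the wildcard search might look like, assuming possessions are stored as lists of player names. The function names and the sample players other than Jim and Alex are hypothetical, and this is not how the app currently works.

```python
WILDCARD = "*"

def matches(pattern, window):
    """True if every non-wildcard slot in pattern equals the player in window."""
    return all(p == WILDCARD or p == w for p, w in zip(pattern, window))

def count_pattern(possessions, pattern):
    """Count windows matching a pattern such as ('Jim', '*', 'Alex')."""
    n = len(pattern)
    return sum(
        matches(pattern, players[i:i + n])
        for players in possessions
        for i in range(len(players) - n + 1)
    )

def count_give_and_go(possessions, gap=1):
    """Count A-*-A style windows: any player who gets the disc back
    after `gap` intermediate players (gap=2 gives A-*-*-A)."""
    n = gap + 2
    return sum(
        players[i] == players[i + n - 1]
        for players in possessions
        for i in range(len(players) - n + 1)
    )

# Hypothetical possessions for illustration only (not the game data).
possessions = [
    ["Jim", "Sam", "Alex", "Sam"],
    ["Jim", "Alex", "Pat", "Alex"],
]

print(count_pattern(possessions, ("Jim", "*", "Alex")))  # 1
print(count_give_and_go(possessions, gap=1))             # 2
```

Dividing the give-n-go count by the total number of windows would make the figure comparable between teams with different numbers of passes.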
Stephen, you are correct, one game is by no means an adequate dataset. I’ll try with Team USA stats this weekend - let’s see if there is something interesting.