• Python Iterators

  • Part 1 : The Basics

  • Part 2 : Generators

  • Part 3 : itertools

    The itertools module and its documentation provide two types of resources:

    1. a collection of functions that you can import and use directly
    2. a set of recipes that you can copy and paste into your programs
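
    As a rough illustration of the difference, the sketch below uses three functions imported from the module (chain, count, islice) and one recipe (take) copied from the recipes section of the itertools documentation. It assumes Python 3, and the sample data is made up for the example.

```python
from itertools import chain, count, islice

# Functions: import them from the module and call them directly.
letters_and_digits = list(chain("abc", "123"))

# count() is an infinite iterator; islice() takes a finite slice of it.
first_evens = list(islice(count(0, 2), 5))


# Recipe: take() is listed in the itertools documentation, not in the
# module itself, so it has to be copied into your own program.
def take(n, iterable):
    """Return the first n items of the iterable as a list."""
    return list(islice(iterable, n))


print(letters_and_digits)
print(first_evens)
print(take(3, "iterators"))
```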
  • Appendixes

  • PEP 289

    PEP: 289
    Title: Generator Expressions
    Author: python@rcn.com (Raymond Hettinger)
    Status: Final
    Type: Standards Track
    Created: 30-Jan-2002
    Python-Version: 2.4
    Post-History: 22-Oct-2003

    Abstract

    This PEP introduces generator expressions as a high performance,
    memory efficient generalization of list comprehensions [1]_ and
    generators [2]_.

    Rationale

    Experience with list comprehensions has shown their widespread
    utility throughout Python. However, many of the use cases do
    not need to have a full list created in memory. Instead, they
    only need to iterate over the elements one at a time.

    For instance, the following summation code will build a full list of
    squares in memory, iterate over those values, and, when the reference
    is no longer needed, delete the list::

    sum([x*x for x in range(10)])

    Memory is conserved by using a generator expression instead::

    sum(x*x for x in range(10))

    Similar benefits are conferred on constructors for container objects::

    s = Set(word  for line in page  for word in line.split())
    d = dict( (k, func(k)) for k in keylist)

    Generator expressions are especially useful with functions like sum(),
    min(), and max() that reduce an iterable input to a single value::

    max(len(line)  for line in file  if line.strip())

    Generator expressions also address some examples of functionals coded
    with lambda::

    reduce(lambda s, a: s + a.myattr, data, 0)
    reduce(lambda s, a: s + a[3], data, 0)

    These simplify to::

    sum(a.myattr for a in data)
    sum(a[3] for a in data)

    List comprehensions greatly reduced the need for filter() and map().
    Likewise, generator expressions are expected to minimize the need
    for itertools.ifilter() and itertools.imap(). In contrast, the
    utility of other itertools will be enhanced by generator expressions::

    dotproduct = sum(x*y for x,y in itertools.izip(x_vector, y_vector))

    Having a syntax similar to list comprehensions also makes it easy to
    convert existing code into a generator expression when scaling up
    application.

    Early timings showed that generators had a significant performance
    advantage over list comprehensions. However, the latter were highly
    optimized for Py2.4 and now the performance is roughly comparable
    for small to mid-sized data sets. As the data volumes grow larger,
    generator expressions tend to perform better because they do not
    exhaust cache memory and they allow Python to re-use objects between
    iterations.

    BDFL Pronouncements

    This PEP is ACCEPTED for Py2.4.

    The Details

    (None of this is exact enough in the eye of a reader from Mars, but I
    hope the examples convey the intention well enough for a discussion in
    c.l.py. The Python Reference Manual should contain a 100% exact
    semantic and syntactic specification.)

    1. The semantics of a generator expression are equivalent to creating
      an anonymous generator function and calling it. For example::

      g = (x**2 for x in range(10))
      print g.next()

      is equivalent to::

      def __gen(exp):
          for x in exp:
              yield x**2
      g = __gen(iter(range(10)))
      print g.next()

      Only the outermost for-expression is evaluated immediately, the other
      expressions are deferred until the generator is run::

      g = (tgtexp  for var1 in exp1 if exp2 for var2 in exp3 if exp4)

      is equivalent to::

      def __gen(bound_exp):
          for var1 in bound_exp:
              if exp2:
                  for var2 in exp3:
                      if exp4:
                          yield tgtexp
      g = __gen(iter(exp1))
      del __gen

    2. The syntax requires that a generator expression always needs to be
      directly inside a set of parentheses and cannot have a comma on
      either side. With reference to the file Grammar/Grammar in CVS,
      two rules change:

      a) The rule::

        atom: '(' [testlist] ')'

      changes to::

        atom: '(' [testlist_gexp] ')'

      where testlist_gexp is almost the same as listmaker, but only
      allows a single test after ‘for’ … ‘in’::

        testlist_gexp: test ( gen_for | (',' test)* [','] )

      b) The rule for arglist needs similar changes.

      This means that you can write::

      sum(x**2 for x in range(10))

      but you would have to write::

      reduce(operator.add, (x**2 for x in range(10)))

      and also::

      g = (x**2 for x in range(10))

      i.e. if a function call has a single positional argument, it can be
      a generator expression without extra parentheses, but in all other
      cases you have to parenthesize it.

      The exact details were checked in to Grammar/Grammar version 1.49.

    3. The loop variable (if it is a simple variable or a tuple of simple
      variables) is not exposed to the surrounding function. This
      facilitates the implementation and makes typical use cases more
      reliable. In some future version of Python, list comprehensions
      will also hide the induction variable from the surrounding code
      (and, in Py2.4, warnings will be issued for code accessing the
      induction variable).

      For example::

      x = "hello"
      y = list(x for x in "abc")
      print x    # prints "hello", not "c"

    4. List comprehensions will remain unchanged. For example::

      [x for x in S]    # This is a list comprehension.
      [(x for x in S)]  # This is a list containing one generator
                        # expression.

      Unfortunately, there is currently a slight syntactic difference.
      The expression::

      [x for x in 1, 2, 3]

      is legal, meaning::

      [x for x in (1, 2, 3)]

      But generator expressions will not allow the former version::

      (x for x in 1, 2, 3)

      is illegal.

      The former list comprehension syntax will become illegal in Python
      3.0, and should be deprecated in Python 2.4 and beyond.

      List comprehensions also “leak” their loop variable into the
      surrounding scope. This will also change in Python 3.0, so that
      the semantic definition of a list comprehension in Python 3.0 will
      be equivalent to list(<generator expression>). Python 2.4 and
      beyond should issue a deprecation warning if a list comprehension’s
      loop variable has the same name as a variable used in the
      immediately surrounding scope.

    Early Binding versus Late Binding

    After much discussion, it was decided that the first (outermost)
    for-expression should be evaluated immediately and that the remaining
    expressions be evaluated when the generator is executed.

    Asked to summarize the reasoning for binding the first expression,
    Guido offered [5]_::

    Consider sum(x for x in foo()). Now suppose there's a bug in foo()
    that raises an exception, and a bug in sum() that raises an
    exception before it starts iterating over its argument. Which
    exception would you expect to see? I'd be surprised if the one in
    sum() was raised rather the one in foo(), since the call to foo()
    is part of the argument to sum(), and I expect arguments to be
    processed before the function is called.
    
    OTOH, in sum(bar(x) for x in foo()), where sum() and foo()
    are bugfree, but bar() raises an exception, we have no choice but
    to delay the call to bar() until sum() starts iterating -- that's
    part of the contract of generators. (They do nothing until their
    next() method is first called.)

    Various use cases were proposed for binding all free variables when
    the generator is defined. And some proponents felt that the resulting
    expressions would be easier to understand and debug if bound immediately.

    However, Python takes a late binding approach to lambda expressions and
    has no precedent for automatic, early binding. It was felt that
    introducing a new paradigm would unnecessarily introduce complexity.

    After exploring many possibilities, a consensus emerged that binding
    issues were hard to understand and that users should be strongly
    encouraged to use generator expressions inside functions that consume
    their arguments immediately. For more complex applications, full
    generator definitions are always superior in terms of being obvious
    about scope, lifetime, and binding [6]_.

    Reduction Functions

    The utility of generator expressions is greatly enhanced when combined
    with reduction functions like sum(), min(), and max(). The heapq
    module in Python 2.4 includes two new reduction functions: nlargest()
    and nsmallest(). Both work well with generator expressions and keep
    no more than n items in memory at one time.

    Acknowledgements

    • Raymond Hettinger first proposed the idea of “generator
      comprehensions” in January 2002.

    • Peter Norvig resurrected the discussion in his proposal for
      Accumulation Displays.

    • Alex Martelli provided critical measurements that proved the
      performance benefits of generator expressions. He also provided
      strong arguments that they were a desirable thing to have.

    • Phillip Eby suggested “iterator expressions” as the name.

    • Subsequently, Tim Peters suggested the name “generator expressions”.

    • Armin Rigo, Tim Peters, Guido van Rossum, Samuele Pedroni,
      Hye-Shik Chang and Raymond Hettinger teased out the issues surrounding
      early versus late binding [5]_.

    • Jiwon Seo single handedly implemented various versions of the proposal
      including the final version loaded into CVS. Along the way, there
      were periodic code reviews by Hye-Shik Chang and Raymond Hettinger.
      Guido van Rossum made the key design decisions after comments from
      Armin Rigo and newsgroup discussions. Raymond Hettinger provided
      the test suite, documentation, tutorial, and examples [6]_.

    References

    .. [1] PEP 202 List Comprehensions
    http://www.python.org/dev/peps/pep-0202/

    .. [2] PEP 255 Simple Generators
    http://www.python.org/dev/peps/pep-0255/

    .. [3] Peter Norvig’s Accumulation Display Proposal
    http://www.norvig.com/pyacc.html

    .. [4] Jeff Epler had worked up a patch demonstrating
    the previously proposed bracket and yield syntax
    http://python.org/sf/795947

    .. [5] Discussion over the relative merits of early versus late binding
    https://mail.python.org/pipermail/python-dev/2004-April/044555.html

    .. [6] Patch discussion and alternative patches on Source Forge
    http://www.python.org/sf/872326

    Copyright

    This document has been placed in the public domain.

    • Why study iterators?

      The design of the Python language is closely tied to the ideas of iterators and iteration.

      Other authors have noticed this too.

    • Iterable types

      Python has many standard data types that are iterable

      • lists
      • tuples
      • strings
      • dictionaries
      • sets
      • file handles
    • And you, as the programmer, can make any of your classes iterable by defining a few special methods. We’ll show you how to do this later.
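
      As a preview, here is a minimal sketch of one way to do it, assuming Python 3. The Countdown class is a hypothetical example, not something from the standard library. Its `__iter__` method is written as a generator function, which is only one of several possible approaches.

```python
class Countdown:
    """Hypothetical example class: iterates from `start` down to 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Because __iter__ is a generator function, every call returns
        # a fresh iterator, so the object can be looped over repeatedly.
        n = self.start
        while n > 0:
            yield n
            n -= 1


for value in Countdown(3):
    print(value)
```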

    • Definitions

      First, what do we mean by the terms iterable and iterator?

      The official Python Glossary says

    • Explicit vs. implicit iteration

      Your programs can perform iteration in one of two ways

      • explicitly, by using a while or for statement
      • implicitly, by calling a function that expects an iterable object as one of its arguments
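
      The two styles can be compared side by side. This is a small sketch, assuming Python 3, with a made-up list of words; sum, max, str.join and set are just a few of the built-ins that iterate over their argument for you.

```python
words = ["iter", "next", "for"]

# Explicit iteration: a for statement walks the list.
total_explicit = 0
for word in words:
    total_explicit += len(word)

# Implicit iteration: sum() consumes the iterable it is given.
total_implicit = sum(len(word) for word in words)

# Other built-ins that iterate over their argument behind the scenes.
longest = max(words, key=len)
joined = ", ".join(words)
unique = set(words)

print(total_explicit, total_implicit, longest, joined)
```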
    • Explicit iteration with loop statements

      Let’s start by looking at Python’s two loop statements

      • the while statement
      • the for statement

      These let us iterate explicitly.

    • Iterating with a while statement

      There are two ways to iterate with a while loop:

      1. using a loop counter variable
      2. using the iter and next functions
    • Iterating with a for statement

      The for statement greatly simplifies iteration by hiding all the try/except machinery. It’s still there but you don’t have to think about it.

      Behind the scenes, the for statement calls iter() on the container object. The function returns an iterator object that defines the method next() which accesses elements in the container one at a time. When there are no more elements, next() raises a StopIteration exception which tells the for loop to terminate.

      Classes — Python 2.7.13 documentation. (2017). Docs.python.org. Retrieved 6 July 2017, from https://docs.python.org/2/tutorial/classes.html#iterators

      It’s important to remember that for loops do not count; they iterate.
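
      To make the distinction concrete, the sketch below writes the same loop twice, assuming Python 3 and a made-up list of names. The first version counts positions the way a C or Java loop would; the second simply iterates over the values.

```python
names = ["Ada", "Grace", "Linus"]

# Counting style: generate index numbers, then look each value up.
by_index = []
for i in range(len(names)):
    by_index.append(names[i].upper())

# Iterating style: the for statement hands over each value directly.
by_value = []
for name in names:
    by_value.append(name.upper())

print(by_index == by_value)
```

      Both loops build the same list, but the second one also works for iterables that cannot be indexed, such as sets and file handles.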

    • Summary

      Python’s iter and next functions allow you to iterate over collections of data with a fine level of control. But you almost never need to use them. The simple for statement usually does what you need.

    • Generator expressions

      Generator expressions were described in PEP 289 and accepted into Python version 2.4.

    • Generator functions

    • The itertools functions

    • The itertools recipes

    • The more-itertools package

      Author: Erik Rose
      Install from: https://pypi.python.org/pypi/more-itertools

      This package contains the recipes from the itertools module plus some additional functions and recipes.

      I strongly suggest installing this package so you have easy access to the itertools recipes.

    • “The use of iterators pervades and unifies Python”

      9. Classes — Python 2.7.13 documentation. (2017). Docs.python.org. Retrieved 6 July 2017, from https://docs.python.org/2/tutorial/classes.html#iterators

    • “Iterators are the “secret sauce” of Python 3. They’re everywhere, underlying everything, always just out of sight. Comprehensions are just a simple form of iterators. Generators are just a simple form of iterators. A function that yields values is a nice, compact way of building an iterator without building an iterator.”

      Classes & Iterators - Dive Into Python 3. (2017). Diveintopython3.net. Retrieved 6 July 2017, from http://www.diveintopython3.net/iterators.html

    • iterable

      An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, and objects of any classes you define with an __iter__() or __getitem__() method. Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), …). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop. See also iterator, sequence, and generator.

      Docs.python.org. (2017). Glossary — Python 3.6.2rc1 documentation. [online] Available at: https://docs.python.org/3/glossary.html#term-iterator [Accessed 6 Jul. 2017].

    • iterator

      An object representing a stream of data. Repeated calls to the iterator’s __next__() method (or passing it to the built-in function next()) return successive items in the stream. When no more data are available a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its __next__() method just raise StopIteration again. Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes. A container object (such as a list) produces a fresh new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container.

      Docs.python.org. (2017). Glossary — Python 3.6.2rc1 documentation. [online] Available at: https://docs.python.org/3/glossary.html#term-iterator [Accessed 6 Jul. 2017].
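
      The difference between the two definitions shows up when you try a second pass. The following sketch assumes Python 3; the list of numbers is made up for the example.

```python
numbers = [1, 2, 3]

# A list is an iterable: each pass gets a fresh iterator.
first_pass = list(numbers)
second_pass = list(numbers)

# An iterator is good for one pass only.
it = iter(numbers)
same_object = iter(it) is it      # an iterator returns itself from iter()
first_drain = list(it)
second_drain = list(it)           # already exhausted, so nothing is left

print(first_pass, second_pass)
print(first_drain, second_drain)
```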

    • Iterating with a loop counter variable

      This style of iteration is commonly found in languages like C or Java. It works like this

      • You use a counter variable to index into the iterable.
      • You increment the counter during each iteration.
      • The counter increments until the subscript operator raises an IndexError exception.

      index = -1
      while True:
          try:
              index += 1
              value = iterable[index]
          except IndexError:
              break
          else:
              pass  # code block that uses value

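      This method only works if the iterable supports the subscript [ ] operator with integer positions. Some iterables, like sets, do not support subscripting at all, and dictionaries are indexed by key rather than by position.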
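
      Here is a runnable sketch of the counter technique, assuming Python 3. The function name collect_by_index is hypothetical. It works for a list and a string, and it fails with a TypeError when handed a set.

```python
def collect_by_index(indexable):
    """Gather every item by counting up until IndexError is raised."""
    collected = []
    index = 0
    while True:
        try:
            value = indexable[index]
        except IndexError:
            break
        collected.append(value)
        index += 1
    return collected


print(collect_by_index(["a", "b", "c"]))
print(collect_by_index("xyz"))

set_error = None
try:
    collect_by_index({"a", "b", "c"})
except TypeError as error:
    # Sets do not implement the subscript operator.
    set_error = error
    print("cannot index a set:", error)
```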

    • The iter and next functions.

      You might not be familiar with these two functions. Here’s what they do

      iter(o[, sentinel])
          Return an iterator object.
      next(iterator[, default])
          Retrieve the next item from the
          iterator by calling its __next__()
          method (named next() in Python 2).
          If default is given, it is
          returned if the iterator is
          exhausted, otherwise StopIteration is
          raised.

      As you can see, they are designed to be used together.
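
      A short sketch of both signatures, assuming Python 3. The in-memory text stream stands in for a real file and its contents are made up for the example.

```python
import io

it = iter(["a", "b"])
first = next(it)
second = next(it)
third = next(it, "done")   # the iterator is exhausted, so the default is returned

# Two-argument form: iter(callable, sentinel) calls the callable
# repeatedly and stops as soon as it returns the sentinel value.
stream = io.StringIO("one\ntwo\n\nthree\n")
lines_before_blank = list(iter(stream.readline, "\n"))

print(first, second, third)
print(lines_before_blank)
```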

    • Using iter and next

      Here’s a while loop that works for any iterable object. You use iter to create an iterator object, then use the next function to read values from that iterator. The iterator raises a StopIteration exception when it is exhausted.

      it = iter(iterable)
      while True:
          try:
              value = next(it)
          except StopIteration:
              break
          else:
              pass  # code block that uses value
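
      The same loop, wrapped in a function so it can be run, might look like the sketch below. It assumes Python 3, and collect_with_next is a hypothetical name. Unlike the counter version, it accepts sets, dictionaries and generator expressions.

```python
def collect_with_next(iterable):
    """Gather every item using iter() and next() in a while loop."""
    it = iter(iterable)
    collected = []
    while True:
        try:
            value = next(it)
        except StopIteration:
            break
        else:
            collected.append(value)
    return collected


print(sorted(collect_with_next({3, 1, 2})))
print(collect_with_next({"a": 1, "b": 2}))       # iterating a dict yields its keys
print(collect_with_next(x * x for x in range(3)))
```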
    • Syntax

      for value in iterable:
          pass  # code block that uses value
    • A counting for loop

      Use the enumerate function when you need to know how many times you’ve gone through a loop.

      NOTE: by default, enumerate starts counting at zero. This is consistent with list and tuple indexes which also start at zero.
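
      A brief sketch of enumerate in use, assuming Python 3 and a made-up list of colors. The optional start argument changes the first count when zero is not what you want.

```python
colors = ["red", "green", "blue"]

# Default behaviour: counting starts at zero.
pairs = list(enumerate(colors))

# start=1 gives human-friendly numbering.
numbered = [f"{count}. {color}" for count, color in enumerate(colors, start=1)]

print(pairs)
print(numbered)
```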

    • Syntax

      Generator expressions mimic the syntax of list comprehensions. This is by design.

      (output-expression for variable in input-expression)

      The generator expression can include an optional if clause. The output-expression is evaluated only when the condition is True.

      (output-expression for variable in input-expression if condition)
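
      Both forms are sketched below with concrete values, assuming Python 3. Passing the generator to list() is only done here to display the results.

```python
# (output-expression for variable in input-expression)
squares = (n * n for n in range(5))

# (output-expression for variable in input-expression if condition)
even_squares = (n * n for n in range(10) if n % 2 == 0)

squares_list = list(squares)
even_squares_list = list(even_squares)

print(squares_list)
print(even_squares_list)
```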

    • Lazy evaluation

      Generator expressions are lazy. Values are produced one at a time when they are needed. Generating a million values takes no more memory than generating one or two.
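
      One way to observe the laziness is to log each computation, as in this sketch. It assumes Python 3, and noisy_square is a hypothetical helper. The sizes reported by sys.getsizeof vary between Python builds, so only their relative order is meaningful.

```python
import sys

calls = []  # records every value that actually gets computed


def noisy_square(n):
    """Hypothetical helper that logs each call before squaring."""
    calls.append(n)
    return n * n


lazy_squares = (noisy_square(n) for n in range(1_000_000))
calls_before = list(calls)      # nothing has been computed yet

first = next(lazy_squares)
second = next(lazy_squares)
calls_after = list(calls)       # only two values were computed

# The generator object stays small however long the range is, while
# a list has to hold a reference to every element it contains.
generator_size = sys.getsizeof(lazy_squares)
list_size = sys.getsizeof([n * n for n in range(10_000)])

print(calls_before, calls_after)
print(generator_size < list_size)
```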

    • Limitations

      A generator expression is a single expression, so statements cannot appear inside it. In particular, it cannot contain

      • continue statements
      • break statements
      • try/except statements
      • while statements
      • else statements
    • for count, value in enumerate(iterable):
          pass  # code block that uses count and value
    • Continue statements

      A generator expression cannot contain a continue statement. However, an if clause has the same effect: when the condition is false, the output-expression is not evaluated and no value is produced for that item.
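
      The sketch below, assuming Python 3 and a made-up list of lines, shows a loop that skips blank lines with continue next to a generator expression that filters them with an if clause.

```python
lines = ["alpha", "", "   ", "beta"]

# A for loop that skips blank lines with continue.
lengths_loop = []
for line in lines:
    if not line.strip():
        continue
    lengths_loop.append(len(line))

# The same filtering expressed with an if clause.
lengths_genexp = list(len(line) for line in lines if line.strip())

print(lengths_loop, lengths_genexp)
```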

    • Break statements

      A generator expression cannot contain a break statement. But breaking out of an enclosing for statement achieves the same effect

      for value in generator-expression:
          if condition:
              break
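
      A concrete version of that pattern, assuming Python 3, is sketched below. The threshold of 50 is arbitrary. Note that the value which triggers the break has already been taken from the generator, so iteration resumes with the value after it.

```python
squares = (n * n for n in range(1_000_000))

small_squares = []
for value in squares:
    if value > 50:
        break               # stop early; the remaining values are never computed
    small_squares.append(value)

# The generator is still usable. 64 was consumed by the loop above,
# so the next value it produces is 81.
next_value = next(squares)

print(small_squares, next_value)
```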
With reference to the file Grammar/Grammar in CVS,\n two rules change:\n\n a) The rule::\n\n atom: '(' [testlist] ')'\n\n changes to::\n\n atom: '(' [testlist_gexp] ')'\n\n where testlist_gexp is almost the same as listmaker, but only\n allows a single test after 'for' ... 'in'::\n\n testlist_gexp: test ( gen_for | (',' test)* [','] )\n\n b) The rule for arglist needs similar changes.\n\n This means that you can write::\n\n sum(x**2 for x in range(10))\n\n but you would have to write::\n\n reduce(operator.add, (x**2 for x in range(10)))\n\n and also::\n\n g = (x**2 for x in range(10))\n\n i.e. if a function call has a single positional argument, it can be\n a generator expression without extra parentheses, but in all other\n cases you have to parenthesize it.\n\n The exact details were checked in to Grammar/Grammar version 1.49.\n\n3. The loop variable (if it is a simple variable or a tuple of simple\n variables) is not exposed to the surrounding function. This\n facilitates the implementation and makes typical use cases more\n reliable. In some future version of Python, list comprehensions\n will also hide the induction variable from the surrounding code\n (and, in Py2.4, warnings will be issued for code accessing the\n induction variable).\n\n For example::\n\n x = \"hello\"\n y = list(x for x in \"abc\")\n print x # prints \"hello\", not \"c\"\n\n4. List comprehensions will remain unchanged. 
For example::\n\n [x for x in S] # This is a list comprehension.\n [(x for x in S)] # This is a list containing one generator\n # expression.\n\n Unfortunately, there is currently a slight syntactic difference.\n The expression::\n\n [x for x in 1, 2, 3]\n\n is legal, meaning::\n\n [x for x in (1, 2, 3)]\n\n But generator expressions will not allow the former version::\n\n (x for x in 1, 2, 3)\n\n is illegal.\n\n The former list comprehension syntax will become illegal in Python\n 3.0, and should be deprecated in Python 2.4 and beyond.\n\n List comprehensions also \"leak\" their loop variable into the\n surrounding scope. This will also change in Python 3.0, so that\n the semantic definition of a list comprehension in Python 3.0 will\n be equivalent to list(<generator expression>). Python 2.4 and\n beyond should issue a deprecation warning if a list comprehension's\n loop variable has the same name as a variable used in the\n immediately surrounding scope.\n\nEarly Binding versus Late Binding\n=================================\n\nAfter much discussion, it was decided that the first (outermost)\nfor-expression should be evaluated immediately and that the remaining\nexpressions be evaluated when the generator is executed.\n\nAsked to summarize the reasoning for binding the first expression,\nGuido offered [5]_::\n\n Consider sum(x for x in foo()). Now suppose there's a bug in foo()\n that raises an exception, and a bug in sum() that raises an\n exception before it starts iterating over its argument. Which\n exception would you expect to see? 
I'd be surprised if the one in\n sum() was raised rather the one in foo(), since the call to foo()\n is part of the argument to sum(), and I expect arguments to be\n processed before the function is called.\n\n OTOH, in sum(bar(x) for x in foo()), where sum() and foo()\n are bugfree, but bar() raises an exception, we have no choice but\n to delay the call to bar() until sum() starts iterating -- that's\n part of the contract of generators. (They do nothing until their\n next() method is first called.)\n\nVarious use cases were proposed for binding all free variables when\nthe generator is defined. And some proponents felt that the resulting\nexpressions would be easier to understand and debug if bound immediately.\n\nHowever, Python takes a late binding approach to lambda expressions and\nhas no precedent for automatic, early binding. It was felt that\nintroducing a new paradigm would unnecessarily introduce complexity.\n\nAfter exploring many possibilities, a consensus emerged that binding\nissues were hard to understand and that users should be strongly\nencouraged to use generator expressions inside functions that consume\ntheir arguments immediately. For more complex applications, full\ngenerator definitions are always superior in terms of being obvious\nabout scope, lifetime, and binding [6]_.\n\n\nReduction Functions\n===================\n\nThe utility of generator expressions is greatly enhanced when combined\nwith reduction functions like sum(), min(), and max(). The heapq\nmodule in Python 2.4 includes two new reduction functions: nlargest()\nand nsmallest(). 
Both work well with generator expressions and keep\nno more than n items in memory at one time.\n\n\nAcknowledgements\n================\n\n* Raymond Hettinger first proposed the idea of \"generator\n comprehensions\" in January 2002.\n\n* Peter Norvig resurrected the discussion in his proposal for\n Accumulation Displays.\n\n* Alex Martelli provided critical measurements that proved the\n performance benefits of generator expressions. He also provided\n strong arguments that they were a desirable thing to have.\n\n* Phillip Eby suggested \"iterator expressions\" as the name.\n\n* Subsequently, Tim Peters suggested the name \"generator expressions\".\n\n* Armin Rigo, Tim Peters, Guido van Rossum, Samuele Pedroni,\n Hye-Shik Chang and Raymond Hettinger teased out the issues surrounding\n early versus late binding [5]_.\n\n* Jiwon Seo single handedly implemented various versions of the proposal\n including the final version loaded into CVS. Along the way, there\n were periodic code reviews by Hye-Shik Chang and Raymond Hettinger.\n Guido van Rossum made the key design decisions after comments from\n Armin Rigo and newsgroup discussions. Raymond Hettinger provided\n the test suite, documentation, tutorial, and examples [6]_.\n\nReferences\n==========\n\n.. [1] PEP 202 List Comprehensions\n http://www.python.org/dev/peps/pep-0202/\n\n.. [2] PEP 255 Simple Generators\n http://www.python.org/dev/peps/pep-0255/\n\n.. [3] Peter Norvig's Accumulation Display Proposal\n http://www.norvig.com/pyacc.html\n\n.. [4] Jeff Epler had worked up a patch demonstrating\n the previously proposed bracket and yield syntax\n http://python.org/sf/795947\n\n.. [5] Discussion over the relative merits of early versus late binding\n https://mail.python.org/pipermail/python-dev/2004-April/044555.html\n\n.. 
[6] Patch discussion and alternative patches on Source Forge\n http://www.python.org/sf/872326\n\n\nCopyright\n=========\n\nThis document has been placed in the public domain.\n"}],"tree":{"_id":"7dae783b8bac43fb84000023","name":"Python Iterators","publicUrl":"python-iterators"}}