Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Adding a scoped index parameter to foreach #8130

Closed
jamesqo opened this issue Jan 24, 2016 · 21 comments
Closed

Proposal: Adding a scoped index parameter to foreach #8130

jamesqo opened this issue Jan 24, 2016 · 21 comments
Labels
Area-Language Design Discussion Feature Request Resolution-Won't Fix A real bug, but Triage feels that the issue is not impactful enough to spend time on.

Comments

@jamesqo
Copy link
Contributor

jamesqo commented Jan 24, 2016

I realize that there are already several other issues (#1651, #1752, and #5194) covering this, but they didn't address exactly what I had in mind so I'm making a new one here.

Background

A common pattern when working with IEnumerable<T> is to capture the index during each iteration of the loop. For example, LINQ has many overloads that take a Func<T, int> as well as a Func<T>, such as Select and Where. Languages like Python solve this problem with a method called enumerate, which returns the index and value of the current item in the collection:

for index, item in enumerate(items):
    # do something with index and item

Problem

At the moment, there's no way to capture the index in a foreach loop without 1) introducing a local variable in the outer scope and 2) incrementing it manually after each iteration. For example:

int index = 0;
foreach (var item in GetItems())
{
    Console.WriteLine($"Index: {index}. Item: {item}.");
    index++;
}

// index is still visible here after the loop has completed

Aside from the boilerplate, this can be problematic in functions that have multiple foreach loops in the same scope: a new index variable with a new name has to be initialized for each loop.

int i = 0;
foreach (var item in items) { /* Use i */ i++; }

int index = 0;
foreach (var item in items2) { /* Use index */ index++; }

/* Running out of variable names... we have to keep coming
up with new ones like idx or ind, which creates ambiguity */

Other Solutions

For-loops

A traditional for-loop works and scopes correctly, but

  • can only be used for collections that support indexers, and
  • requires a significant amount of boilerplate

I'm sure you've all seen plenty of examples of for-loops already, so I'll leave it to you to weigh their pros and cons.

foreach + Select / other LINQ-based variations

This also works well and scopes properly:

foreach (var tuple in items.Select((item, index) => Tuple.Create(item, index)))
{
    var item = tuple.Item1;
    var index = tuple.Item2;
    // use item and index
}

But let's be honest, this is not as readable as it could be. It's not immediately clear what Item1 and Item2 mean, and the Select expression makes the code look like an overstretched diving board (see the similarity?). Language support for tuples may improve the problem somewhat, but it'd still have to look like this.


Proposal: Add a scoped index parameter to foreach

With the new syntax, the code example above now looks like this:

foreach (var item, index in GetItems())
{
    Console.WriteLine($"Index: {index}. Item: {item}.");
}

The compiler-generated code is equivalent to

{ // Extra braces to add scope
    int outer = 0;
    var e = GetItems().GetEnumerator();

    while (e.MoveNext())
    {
        int index = outer; // Handle closures
        var item = e.Current;

        Console.WriteLine($"Index: {index}. Item: {item}.");
        outer++;
    }
}

Benefits

  • Less boilerplate. The need to initialize and increment an index variable after every loop has been eliminated.
  • No indexers required. This means that types like IEnumerable, ICollection, or IReadOnlyCollection can be used with the new syntax.
  • Scoping. The scope of index is limited to inside the loop, which circumvents the problem above with multiple loops in the same scope.
  • Immutability. index is readonly, preventing accidental assignments and making the code easier to read.

Type Inference

  • By default, the type of index is inferred to be int.
  • index may also be prefixed with var or int, e.g.
foreach (var item, var index in items)

for those who prefer to spell their types out explicitly.

  • Since we support collections with counts greater than int.MaxValue (via LongCount), the type of index may also be explicitly specified as long:
foreach (var item, long index in largeCollectionOfItems)

This ensures that we keep performant code for the vast majority of cases, but support using larger numbers when they are needed.

@orthoxerox
Copy link
Contributor

I like this idea, we just need to make sure this won't conflict with the destructuring assignment syntax.

@SolalPirelli
Copy link

If tuples are added as first-class citizens to the language, this could be replaced by an extension method on IEnumerable<T>, without the need for special syntax.

@jamesqo
Copy link
Contributor Author

jamesqo commented Jan 24, 2016

@orthoxerox Probably not, the existing consensus in so far in #6400 seems to lean towards using let.

@jamesqo
Copy link
Contributor Author

jamesqo commented Jan 24, 2016

@SolalPirelli Good point. I assume you mean something like this:

static void ForEach(this IEnumerable<T> source, Action<T, int> action)
{
    int count = 0;
    foreach (var item in source) action(item, count++);
}

I agree that would be less work than implementing a new language feature. It would be kinda strange to newcomers though, since you would have some people writing foreach and others writing ForEach. It's better to provide just one syntax for consistency.

ninja edit Eric Lippert has a blog post on this: https://blogs.msdn.microsoft.com/ericlippert/2009/05/18/foreach-vs-foreach/

@alrz
Copy link
Member

alrz commented Jan 24, 2016

What's wrong with GetItems().Select((item,index) => { ... })?

@SolalPirelli
Copy link

@jamesqo No, I meant an IEnumerable<T * int> Indexed(this IEnumerable<T> source) method, which would then be iterated using foreach(var (item, index) in stuff.Indexed()) { ... }.

@jamesqo
Copy link
Contributor Author

jamesqo commented Jan 24, 2016

@alrz Because Select has to return something (the return type can't be void), so your code would end up looking like this:

items.Select((item, i) =>
{
    // work
    return 0;
});

edit Also not to mention Select is lazy so you'd have to iterate through it to see any side effects.

@jamesqo
Copy link
Contributor Author

jamesqo commented Jan 24, 2016

@SolalPirelli Hmm, seems interesting. Would it be possible to omit the parentheses in the tuple declaration?

@alrz
Copy link
Member

alrz commented Jan 24, 2016

@jamesqo @SolalPirelli That was proposed in #6067 but it seemingly dismissed. Then you have to do this:

foreach(var tuple in list.Indexed()) {
  let (var item, var index) = tuple;
  // ...
}

@Blecki
Copy link

Blecki commented Jan 24, 2016

/* Running out of variable names... we have to keep coming
up with new ones like idx or ind, which creates ambiguity */

You could try more meaningful variable names than 'index'. In your example, 'indexItems1' and 'indexItems2' would be a better name. Further, if you're not using the value of 'indexItems1' after the loop you could just use index over and over. Or you could do this.

foreach (var indexedValue in SomeList.Select((e, i) => Tuple.Create(e,i)))
     ....

@jamesqo
Copy link
Contributor Author

jamesqo commented Jan 24, 2016

@Blecki See my updated post. Your solution would definitely work, I just don't think it's readable as it could be.

In your example, 'indexItems1' and 'indexItems2' would be a better name.

Well, I guess it's a matter of opinion. But let's be honest, would you rather have code that looks like this:

int indexItems1 = 0;
foreach (var item in items1)
{
    // do something...
    indexItems1++;
}

int indexItems2 = 0;
foreach (var item in items2)
{
    // do something...
    indexItems2++;
}

or this:

foreach (var item, index in items1)
    // do something...

foreach (var item, index in items2)
    // do something...

if you're not using the value of 'indexItems1' after the loop you could just use index over and over

Yeah, but if you accidentally forget to reset index to 0 then you may run into problems, e.g.

int i = 0;
foreach (var item in items1) { /* */ i++; }

// Forgot to reset i = 0...
foreach (var item in items2)
{
    // Whoops! Now i starts at items1.Count() and continues from there.
    i++;
}

@gafter
Copy link
Member

gafter commented Jan 25, 2016

As the OP points out, this is morally similar to existing issues (#1651, #1752, and #5194), but I'll leave this open for a bit for discussion.

@paul1956
Copy link
Contributor

If you are going to do this there needs to be a "Finally" so I can access the index.

@Blecki
Copy link

Blecki commented Jan 25, 2016

@jamesqo, I can't argue with the readability. I avoid tuples because the names 'item1' and 'item2' are meaningless.

@techman2553
Copy link

How about a mashup of both for and foreach:

foreachindex(var item in collection, int i = 0, i++)
{
    Console.WriteLine("item={0}  index={1}", item, i);
}

This gives you the freedom to change the behavior of the index if you wanted to for some reason:

foreachindex(var record in records, long pos = 0, pos += 20)
{
    Console.WriteLine("record={0}  byte position={1}", record , pos);
}

In fact, you wouldn't even need "foreachindex", just overload the foreach function to accept the additional arguments:

foreach(var target in targets, double dist = 1000, dist -= 50.55 )
{
    Console.WriteLine("target={0}  distance={1} feet", target, dist);
}

@GeirGrusom
Copy link

public struct IndexedElement<T>
{
    public T Value { get; }
    public int Index { get; }
    public IndexedElement(T value, int index)
    {
        Value = value;
        Index = index;
    }
    public static IndexedElement<T> Map(T value, int index)
    {
        return new IndexedElement<T>(value, index);
    }
}

void Main()
{
    var items = new List<string> { "Foo", "Bar" };

    foreach (var item in items.Select(IndexedElement<string>.Map))
    {
        Console.WriteLine(item.Index);
    }
}

@jamesqo
Copy link
Contributor Author

jamesqo commented Jan 26, 2016

@techman2553 I'm not convinced, the whole point of this proposal was to eliminate the boilerplate and the need for a state variable. Your suggestion seems to just move it to a different place (inside the foreach loop).

@GeirGrusom Your solution is good, but there's still a need to write out Select(IndexedElement<T>.Map) every time you want the index, which aside from being repetitive can create overhead (since LINQ uses iteration). Also, it'd be preferable to capture the index/value in 2 variables, rather than wrapping them in one.

@jamesqo
Copy link
Contributor Author

jamesqo commented Jan 26, 2016

The general consensus so far seems to be "we don't need this." If we choose not to go ahead with this proposal, I think we should reconsider the earlier suggestion about using the new destructuring syntax in loops. That would allow us to do something like this:

foreach (let (thing, i) in things.Index())
{
    // do something with thing and i
}

Although it's definitely a bit more verbose than my proposal above, it would support the scenario with no new syntax additions to the language (at least, after tuples/destructuring assignments are added in C# 7).

@RichiCoder1
Copy link

Correct me if I'm wrong, but wouldn't it be:

foreach ((var item, var index) in things.Index())
{
    // do something with item and index
}

@alrz
Copy link
Member

alrz commented Jan 26, 2016

@RichiCoder1 You can't use patterns in foreach.

@gafter gafter added the Resolution-Won't Fix A real bug, but Triage feels that the issue is not impactful enough to spend time on. label Jan 26, 2016
@gafter gafter closed this as completed Jan 29, 2016
@fowl2
Copy link

fowl2 commented Apr 28, 2020

For reference, 'de-structuring foreach' did end up being allowed, eg. this is ok:

foreach (var (item, index) in things.Indexed())
{
    // do something with item and index
}

Neither linq or morelinq have the "Indexed()" method though. An straightforward but inefficient implementation would be:

public static IEnumerable<(TItem item, int index)> Indexed<TItem>(this IEnumerable<TItem> items)
    => items.Select((item, index) => (item, index));

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Language Design Discussion Feature Request Resolution-Won't Fix A real bug, but Triage feels that the issue is not impactful enough to spend time on.
Projects
None yet
Development

No branches or pull requests