Let me prefix this by saying that I know what foreach is, does and how to use it. This question concerns how it works under the bonnet, and I don't want any answers along the lines of "this is how you loop an array with foreach".

For a long time I assumed that foreach worked with the array itself. Then I found many references to the fact that it works with a copy of the array, and I have since assumed this to be the end of the story. But I recently got into a discussion on the matter, and after a little experimentation found that this was not in fact 100% true.

Let me show what I mean. For the following test cases, we will be working with the following array:

    $array = array(1, 2, 3, 4, 5);

Test case 1:

    foreach ($array as $item) {
      echo "$item\n";
      $array[] = $item;
    }
    print_r($array);

    /* Output in loop:    1 2 3 4 5
       $array after loop: 1 2 3 4 5 1 2 3 4 5 */

This clearly shows that we are not working directly with the source array - otherwise the loop would continue forever, since we are constantly pushing items onto the array during the loop. But just to be sure this is the case:

Test case 2:

    foreach ($array as $key => $item) {
      $array[$key + 1] = $item + 2;
      echo "$item\n";
    }
    print_r($array);

    /* Output in loop:    1 2 3 4 5
       $array after loop: 1 3 4 5 6 7 */

This backs up our initial conclusion: we are working with a copy of the source array during the loop, otherwise we would see the modified values during the loop. But...

If we look in the manual, we find this statement:

    When foreach first starts executing, the internal array pointer is automatically reset to the first element of the array.

Right... this seems to suggest that foreach relies on the array pointer of the source array. But we've just proved that we're not working with the source array, right? Well, not entirely.

Test case 3:

    // Move the array pointer on one to make sure it doesn't affect the loop
    var_dump(each($array));

    foreach ($array as $item) {
      echo "$item\n";
    }

    var_dump(each($array));

    /* Output
      array(4) {
        [1]=>
        int(1)
        ["value"]=>
        int(1)
        [0]=>
        int(0)
        ["key"]=>
        int(0)
      }
      1
      2
      3
      4
      5
      bool(false)
    */

So, despite the fact that we are not working directly with the source array, we are working directly with the source array pointer - the fact that the pointer is at the end of the array at the end of the loop shows this. Except this can't be true - if it was, then test case 1 would loop forever.

The PHP manual also states:

    As foreach relies on the internal array pointer changing it within the loop may lead to unexpected behavior.

Well, let's find out what that "unexpected behavior" is (technically, any behavior is unexpected since I no longer know what to expect).

Test case 4:

    foreach ($array as $key => $item) {
      echo "$item\n";
      each($array);
    }

    /* Output: 1 2 3 4 5 */

Test case 5:

    foreach ($array as $key => $item) {
      echo "$item\n";
      reset($array);
    }

    /* Output: 1 2 3 4 5 */

...nothing that unexpected there, in fact it seems to support the "copy of source" theory.

The Question

What is going on here? My C-fu is not good enough for me to be able to extract a proper conclusion simply by looking at the PHP source code, so I would appreciate it if someone could translate it into English for me.

It seems to me that foreach works with a copy of the array, but sets the array pointer of the source array to the end of the array after the loop.

- Is this correct and the whole story?
- If not, what is it really doing?
- Is there any situation where using functions that adjust the array pointer (each(), reset() et al.) during a foreach could affect the outcome of the loop?


@DaveRandom There's a php-internals tag this should probably go with, but I'll leave it to you to decide which, if any, of the other 5 tags to replace.

May 24, 2019, 56:19

Looks like copy-on-write (COW), without delete handle.

At first I thought »gosh, another newbie question. Read the docs… hm, clearly undefined behavior«. Then I read the complete question, and I must say: I like it. You've put quite some effort into it and into writing all the test cases. P.S. Are test cases 4 and 5 the same?

Just a thought about why it does make sense that the array pointer gets touched: PHP needs to reset and move the internal array pointer of the original array along with the copy, because the user may ask for a reference to the current value (foreach ($array as &$value)) - PHP needs to know the current position in the original array even though it's actually iterating over a copy.
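
To make the by-reference case mentioned above concrete, here is a minimal sketch. It only shows the observable effect (writes through &$value land in the original array); it does not show how the engine tracks the position internally. The doubling is an arbitrary illustration, not something taken from the question.

```php
<?php
// Illustrative only: the array contents match the question, the doubling is made up.
$array = array(1, 2, 3, 4, 5);

foreach ($array as &$value) {
    // $value is a reference to the current element of $array itself,
    // so this write modifies the original array, not a copy.
    $value *= 2;
}
unset($value); // drop the reference that would otherwise linger on the last element

print_r($array); // elements are now 2, 4, 6, 8, 10
```

The unset() after the loop is the usual precaution: without it, a later assignment to $value would silently overwrite the last element.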

Sean: IMHO, the PHP documentation is really quite bad at describing the nuances of core language features. But that is, perhaps, because so many ad-hoc special cases are baked into the language...

@Baba It does. Passing it to a function is the same as doing $foo = $array before the loop ;)

For those of you who don't know what a zval is, please refer to Sara Golemon's post: blog.golemon.com/2007/01/youre-being-lied-to.html

Minor correction: what you call Bucket isn't what is normally called Bucket in a hashtable. Normally Bucket is a set of entries with the same hash%size. You seem to use it for what is normally called an entry. Linked list isn't on buckets, but on entries.

@unbeli I'm using the terminology used internally by PHP. The Buckets are part of a doubly linked list for hash collisions and also part of a doubly linked list for order ;)

Great answer. I think you meant iterate($outerArr); and not iterate($arr); somewhere.

Seems you're right. I made an example which demonstrates that: codepad.org/OCjtvu8r. One difference from your example: it does not copy if you change a value, only if you change keys.

This does indeed explain all the behavior shown above, and it can be nicely illustrated by calling each() at the end of the first test case, where we see that the array pointer of the original array points to the second element, since the array was modified during the first iteration. This also seems to demonstrate that foreach moves the array pointer on before executing the code block of the loop, which I was not expecting - I would have thought it would do this at the end. Many thanks, this clears it up for me nicely.

Your answer is not quite correct. foreach operates on a potential copy of the array, but it does not make the actual copy unless it is needed.
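
One rough way to observe the "potential copy" idea is to watch memory usage around an assignment and the first write. This is only a sketch under the assumption that PHP arrays are copied lazily (copy-on-write); the variable names and the array size are made up for illustration, and the exact byte counts depend on the PHP version and build, so none are shown here.

```php
<?php
// Hypothetical demo: names and sizes are illustrative, not from the thread.
$original = range(1, 100000);

$base = memory_get_usage();

$alias = $original;               // assignment: no elements should be duplicated yet
$afterAssign = memory_get_usage();

$alias[] = 0;                     // first write: the arrays have to be separated now
$afterWrite = memory_get_usage();

printf("cost of assignment:  %d bytes\n", $afterAssign - $base);
printf("cost of first write: %d bytes\n", $afterWrite - $afterAssign);
```

If the assumption holds, the second figure should be far larger than the first, which is the same mechanism a by-value foreach can rely on when the loop body never writes to the array.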

Would you like to demonstrate, through code, how and when that potential copy is created? My code demonstrates that foreach is copying the array 100% of the time. I am eager to know. Thanks for your comments.

Copying an array costs a lot. Try counting the time it takes to iterate an array with 100000 elements using either for or foreach. You will not see any significant difference between the two of them, because an actual copy does not take place.
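
The comparison suggested above could be sketched roughly like this. It is a quick-and-dirty timing, not a proper benchmark; the variable names are made up, and no timings are quoted because they depend entirely on the machine and PHP version.

```php
<?php
// Illustrative timing sketch: 100000 elements as suggested in the comment above.
$data = range(1, 100000);

// Index-based for loop over the array.
$start = microtime(true);
$sumFor = 0;
for ($i = 0, $n = count($data); $i < $n; $i++) {
    $sumFor += $data[$i];
}
$forSeconds = microtime(true) - $start;

// By-value foreach over the same array, without modifying it.
$start = microtime(true);
$sumForeach = 0;
foreach ($data as $value) {
    $sumForeach += $value;
}
$foreachSeconds = microtime(true) - $start;

printf("for:     %.6f s\n", $forSeconds);
printf("foreach: %.6f s\n", $foreachSeconds);
```

If foreach really duplicated all 100000 elements up front, one would expect it to be noticeably slower than the plain for loop; similar timings are consistent with no eager copy taking place.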

May 25, 2019, 56:19

Then I would assume that the data storage is SHARED until copy-on-write kicks in, but (from my code snippet) it's evident that there will always be TWO sets of SENTINEL variables: one for the original array and the other for foreach. Thanks, that makes sense.

Yes, that is a "prospected" copy, i.e. a "potential" copy. It's not protected as you suggested.

Well did you read the rest of the answer? It makes perfect sense that foreach decides if it will loop another time before it even runs the code in it.

No, the array is modified, but "too late", since foreach already "thinks" that it is at the last element (which it is at the start of the iteration) and will not loop anymore. Whereas in the second example, it is not at the last element at the start of the iteration, and it evaluates again at the start of the next iteration. I am trying to prepare a test case.

@AlmaDo Look at lxr.php.net/xref/PHP_TRUNK/Zend/zend_vm_def.h#4509. It's always set to the next pointer when it iterates. So, when it reaches the last iteration, it'll be marked as finished (via a NULL pointer). When you then add a key in the last iteration, foreach won't notice it.

@DKasipovic No. There is no complete and clear explanation there (at least for now - maybe I'm wrong).

Actually it seems that AlmaDo has a flaw in understanding his own logic… Your answer is fine.

"Object public variables" is wrong or at best misleading. You cannot use an object in an array without the correct interface (e.g., Traversable), and when you do foreach((array)$obj ... you are in fact working with a simple array, not an object anymore.
