The original code is interesting - I think it's a very good piece of code for a first venture into F# functional coding.
The question you ask is also very interesting. Avoiding stack overflows when you consume the tree looks like it might require the use of continuations. Some of this is covered in Expert F# chapter 8.
However, it won't be so common for trees to get enomously deep if you keep them balanced.
One fun and helpful reference point in this domain is Wadler's classic on adding laziness to a strict language: http://citeseer.ist.psu.edu/102172.html
Kind regards
don