def getAdjacentTextElement(current, topMostNode, forward=None, traversingUp=False):
    """Return the nearest text node reachable from `current`, or None.

    The search is depth first: children of `current` first, then its
    siblings, then up to the parent. It moves forward through the tree
    if `forward` is true, backward otherwise, and does not climb above
    `topMostNode`.

    May also be called as getAdjacentTextElement(current, forward), in
    which case no top-most node is set. `traversingUp` is set by the
    recursive call on the parent so that children are not processed again.
    """
    # Two-argument form: the second positional argument is the direction
    if forward is None:
        forward = topMostNode
        topMostNode = None

    res = None

    #print "getAdjacentTextElement", current, topMostNode, forward, traversingUp

    # If traversingUp, then the children have already been processed
    if not traversingUp:
        if DOM.getChildCount(current) > 0:
            if forward:
                node = DOM.getFirstChild(current)
            else:
                node = DOM.getLastChild(current)

            if DOM.getNodeType(node) == DOM.TEXT_NODE:
                res = node
            else:
                # Depth first traversal, the recursive call deals with
                # siblings
                res = getAdjacentTextElement(node, topMostNode,
                                             forward, False)

    if res is None:
        if forward:
            node = current.nextSibling
        else:
            node = current.previousSibling

        # Traverse siblings
        if node is not None:
            if DOM.getNodeType(node) == DOM.TEXT_NODE:
                res = node
            else:
                #print node, DOM.getNodeType(node), node.innerHTML
                # Depth first traversal, the recursive call deals with
                # siblings
                res = getAdjacentTextElement(node, topMostNode,
                                             forward, False)

    # Go up and over if still not found
    if (res is None) and (not DOM.compare(current, topMostNode)):
        node = current.parentNode
        # Stop at document (technically could stop at "html" tag)
        if (node is not None) and (
                DOM.getNodeType(node) != DOM.DOCUMENT_NODE):
            res = getAdjacentTextElement(node, topMostNode,
                                         forward, True)

    return res