Python - Get html table element with lxml.html regex

By : Hamid Kazem Nadi
Date : October 17 2020, 03:08 PM
I am trying to get the following element from this website: https://www.investing.com/economic-calendar/. I guess this is what you want:
code :

When using LXML why is the body element not the parent of the table element in this snippet of html

By : Catherine Griswold
Date : March 29 2020, 07:55 AM
I am trying to process some files that are named .xls and can be opened in Excel, but they are actually web archive files. There are some nested tables, and I want to work first with only the non-nested tables. I thought I could catch the non-nested tables by looking only for tables whose parent element is the body tag, but table.getparent().tag == 'body' is not true for any of my tables. Even for the table snippet below, the parent of that particular table is a div tag. XPath to the rescue: select every table, then subtract the tables that are nested inside another table. The same can be done with CSS selectors.
code :
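from lxml import html
tree = html.fromstring(someString)
# XPath: all tables minus the tables nested inside another table
table_tops = set(tree.xpath('//table')) - set(tree.xpath('//table//table'))
# Equivalent with CSS selectors (needs the cssselect package)
table_tops = set(tree.cssselect('table')) - set(tree.cssselect('table table'))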
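To see the set difference at work, here is a minimal self-contained sketch. The markup and the id values are made up for the example and are not from the asker's files. It also shows an alternative predicate, not(ancestor::table), which returns a list in document order instead of a set.

```python
from lxml import html

# Hypothetical markup: one top-level table wrapped in a div (with a nested
# table inside it) and one top-level table directly under body.
page = """
<html><body>
  <div>
    <table id="outer">
      <tr><td>
        <table id="inner"><tr><td>nested</td></tr></table>
      </td></tr>
    </table>
  </div>
  <table id="plain"><tr><td>flat</td></tr></table>
</body></html>
"""

tree = html.fromstring(page)

# Set difference: every table minus the tables that sit inside another table.
top_level = set(tree.xpath('//table')) - set(tree.xpath('//table//table'))

# Alternative: keep only tables that have no table ancestor.
top_level_ordered = tree.xpath('//table[not(ancestor::table)]')

print(sorted(t.get('id') for t in top_level))       # ['outer', 'plain']
print([t.get('id') for t in top_level_ordered])     # ['outer', 'plain']
print(top_level_ordered[0].getparent().tag)         # div
```

The last line shows why a parent check against 'body' can miss tables: a top-level table may still be wrapped in a div or another container.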

How to surround an html element with another tag using lxml in Python

By : Micheal Priestas
Date : March 29 2020, 07:55 AM
I believe you are thinking about it the wrong way: you cannot wrap one element around another directly. What you need to do is copy the contents of the original element into a variable, delete that element, create the new element where the original used to be, and then add the saved contents to the new element.
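One way this could look with lxml is sketched below. The helper name wrap_element, the sample markup and the blockquote wrapper tag are assumptions for illustration only. Instead of copying the contents, the sketch moves the existing element into a newly created parent, which gives the same result.

```python
from lxml import html


def wrap_element(element, wrapper_tag):
    """Hypothetical helper: put `element` inside a new `wrapper_tag` element."""
    wrapper = html.Element(wrapper_tag)
    # Text that follows the element (its tail) should stay after the wrapper,
    # not travel inside it.
    wrapper.tail = element.tail
    element.tail = None
    # Put the wrapper where the element was, then move the element into it.
    element.getparent().replace(element, wrapper)
    wrapper.append(element)
    return wrapper


# Made-up fragment: a paragraph followed by some trailing text.
doc = html.fromstring('<div><p>hello</p> world</div>')
wrapper = wrap_element(doc.find('p'), 'blockquote')

print(html.tostring(doc, encoding='unicode'))
```

The tail handling matters because lxml stores the text after an element on the element itself, so moving the element would otherwise drag that text into the wrapper.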

Python Print element from lxml html

By : Alexey Kleyms
Date : March 29 2020, 07:55 AM
I am trying to print out the entire element retrieved from lxml. There is a function tostring() in lxml.html:
code :
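import lxml.html
from lxml import html

# `element` is any element taken from a parsed document
print(lxml.html.tostring(element))

# `quote` is the list of elements returned by your query
print('Quote', html.tostring(quote[0]))
for x in quote:
    print('Quote', html.tostring(x))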

Extract HTML comments in Python, using regex or lxml?

By : Gia Sơn
Date : March 29 2020, 07:55 AM
How do I extract all HTML-style comments from a document, using Python? This seems to print the comment for me:
code :
from lxml import etree

# Bytes literal: lxml rejects str input that carries an encoding declaration
txt = b"""<?xml version="1.0" encoding="UTF-8"?>
<clinical_study rank="220398">
    <!-- CAUTION:  The following MeSH terms are assigned with an imperfect algorithm  -->
    <mesh_term>Freund's Adjuvant</mesh_term>
    <mesh_term>Keyhole-limpet hemocyanin</mesh_term>
  <!-- Results have not yet been posted for this study                                -->
</clinical_study>"""
root = etree.XML(txt)
print(root[0])  # first child of the root, here the CAUTION comment
# Keep only the children that are comment nodes
comments = [itm for itm in root if itm.tag is etree.Comment]
if comments:
    print(comments[-1])
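The list comprehension above only looks at direct children of the root. If comments can appear deeper in the tree, an XPath query for comment() nodes may be a better fit. The sketch below uses a made-up document (the element names are assumptions) to show the difference.

```python
from lxml import etree

# Hypothetical document. Bytes are used because lxml rejects str input
# that carries an XML encoding declaration.
xml = b"""<?xml version="1.0" encoding="UTF-8"?>
<study>
  <!-- top-level note -->
  <terms>
    <!-- nested note -->
    <term>example</term>
  </terms>
</study>"""

root = etree.XML(xml)

# Direct children of the root only.
direct = [node for node in root if node.tag is etree.Comment]

# Every comment in the document, at any depth.
all_comments = root.xpath('//comment()')

print([c.text.strip() for c in direct])        # ['top-level note']
print([c.text.strip() for c in all_comments])  # ['top-level note', 'nested note']
```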

Python lxml html xpath regex parsing

By : DisplayName=
Date : March 29 2020, 07:55 AM
str.strip returns a stripped copy of the text, but does not change text itself. Assign the result back to the name if you want to keep it.
code :
>>> text = '    a    '
>>> text.strip()   # returns a new string
'a'
>>> text  # `text` is not changed
'    a    '
>>> text = text.strip()  # rebind the name to keep the stripped value
>>> text
'a'
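The same pitfall shows up when cleaning text nodes returned by an XPath query. Here is a minimal sketch, assuming a made-up table fragment: calling strip() in a loop without using the result changes nothing, so the stripped values are collected into a new list.

```python
from lxml import html

# Hypothetical fragment with padded cell text.
table = html.fromstring(
    '<table><tr><td>  alpha </td><td>\n beta\n</td></tr></table>'
)

cells = table.xpath('//td/text()')

# This loop has no effect: the return value of strip() is thrown away.
for cell in cells:
    cell.strip()

# Build a new list from the return values instead.
cleaned = [cell.strip() for cell in cells]
print(cleaned)  # ['alpha', 'beta']
```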
