validate() — your HTML
For a couple of months now I have been telling people that their Page Objects should have, at minimum, three methods: open, wait_until_loaded and validate. The first two are easy: open navigates to the page it represents, and wait_until_loaded synchronizes. But validate was there because of a conversation on IRC … that I can’t remember anything about other than that I think it happened.
That was the state of things until yesterday, when I still didn’t remember the conversation but came up with a decent idea for what to put in there: non-functional static checks, like whether the page is ‘valid’ HTML and whether the elements that should have accessibility attributes actually have them. A couple of hours later I had a bit of an implementation.
At first I tried a couple of existing modules that interact with the W3C validator, but they either didn’t work, did too much for my needs (like parsing XML when JSON is an output format option) and/or used urllib2 as the means of talking to the server. (I now consider not using requests a code smell.)
Again, this is only a couple hours of thinking and is not perfect. It will need some tweaks, for instance…
- The W3C provides the validator free of charge, but if you are hitting it 10 000 times a day, that’s kinda a jerk move. It’s just an Apache/CGI/Perl application that is Open Source, so run a copy on your local network. It might even speed things up since you are not in the queue with everyone else.
- If you chain your Page Object creation methods as in the test below, you will validate the HTML every single time. That could indeed find problems if there are a lot of things being injected or removed, especially in a CMS context where users are also adding markup to their content. But if you don’t want to do that, maybe keep some sort of global ‘did I validate this page’ counter? That itself might get tricky if you are running in parallel.
- The doctype is ‘supposed’ to be discoverable by the validator, but I’d rather be specific about which one I am trying to validate against. You can see the entire list of types available to you by inspecting the type select element on the web interface to the validator, but I think the important ones are:
- HTML5
- XHTML 1.0 Strict
- XHTML 1.0 Transitional
- HTML 4.01 Strict
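The first and last tweaks above (pointing at a local validator and being explicit about the doctype) could be pulled out into a little configuration, roughly like this. It is only a sketch: the `W3C_VALIDATOR_URL` environment variable, the short doctype keys and both function names are made up for illustration, and the doctype labels are copied as-is from the list above, so check them against the select element on your validator before relying on them.

```python
import os

# Public W3C endpoint used in this post.
DEFAULT_VALIDATOR_URL = 'http://validator.w3.org/check'

# Doctype labels from the list above, keyed by short hypothetical names.
DOCTYPES = {
    'html5': 'HTML5',
    'xhtml-strict': 'XHTML 1.0 Strict',
    'xhtml-transitional': 'XHTML 1.0 Transitional',
    'html4-strict': 'HTML 4.01 Strict',
}


def validator_url(environ=None):
    """Return the validator endpoint, preferring a local override.

    W3C_VALIDATOR_URL is a hypothetical variable name, not something the
    validator or requests defines.
    """
    environ = os.environ if environ is None else environ
    return environ.get('W3C_VALIDATOR_URL', DEFAULT_VALIDATOR_URL)


def build_post_data(page_source, doctype_key='xhtml-transitional'):
    """Build the form fields sent to the validator for one page."""
    return {
        'fragment': page_source,
        'output': 'json',
        'doctype': DOCTYPES[doctype_key],
    }
```

With something like this in place, switching from the public service to a copy on your own network is an environment change rather than a code change.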
<pre lang="python">from selenium.webdriver import Firefox
from po import Element34

class TestValidation(object):
    def setup_method(self, method):
        self.f = Firefox()

    def teardown_method(self, method):
        self.f.quit()

    def test_validation(self):
        e = Element34(self.f).open().wait_until_loaded().validate()
</pre>
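For the ‘did I validate this page’ idea from the list above, one possible shape is a small registry that remembers which pages have already been checked. This is a sketch under assumptions: `ValidationRegistry`, `registry` and `validate_once` are hypothetical names, and the lock only protects threads inside one process. With pytest-xdist each worker is a separate process and would keep its own registry, so a page could still be validated once per worker.

```python
import threading


class ValidationRegistry(object):
    """Remembers which pages were already validated in this process."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()

    def should_validate(self, key):
        """Return True the first time a key is seen, False afterwards."""
        with self._lock:
            if key in self._seen:
                return False
            self._seen.add(key)
            return True


# Hypothetical module-level instance shared by all Page Objects.
registry = ValidationRegistry()


def validate_once(page_key, validate_fn):
    """Call validate_fn only the first time page_key is seen.

    Returns True if validation ran and False if it was skipped.
    """
    if registry.should_validate(page_key):
        validate_fn()
        return True
    return False
```

A Page Object's validate() could then call something like `validate_once(self.driver.current_url, self._validate_html)`, where keying on the URL is itself a choice that may not suit pages whose markup changes with state.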
<pre lang="python">import json

import requests


class ValidationException(Exception):
    pass


class Page(object):
    def _validate_html(self):
        # Send the rendered source to the W3C validator and ask for JSON back.
        post_data = {
            "fragment": self.driver.page_source,
            "output": "json",
            "doctype": "XHTML 1.0 Transitional"
        }
        r = requests.post('http://validator.w3.org/check', data=post_data)
        j = json.loads(r.text)

        # Header values are strings, so convert them before comparing to 0.
        validation_errors = []
        if int(r.headers['x-w3c-validator-errors']) != 0:
            for m in j['messages']:
                if m['type'] == 'error':
                    validation_errors.append(m)

        # Messages of type 'info' are counted as warnings.
        validation_warnings = []
        if int(r.headers['x-w3c-validator-warnings']) != 0:
            for m in j['messages']:
                if m['type'] == 'info':
                    validation_warnings.append(m)

        if len(validation_errors) != 0 or len(validation_warnings) != 0:
            raise ValidationException(
                'There were %d validation errors and %d validation warnings'
                % (len(validation_errors), len(validation_warnings)))


class Element34(Page):
    def __init__(self, driver):
        self.driver = driver

    def open(self):
        self.driver.get('http://element34.ca')
        return self

    def wait_until_loaded(self):
        # Placeholder: nothing to synchronize on in this example.
        return self

    def validate(self):
        self._validate_html()
        return self
</pre>
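If you want to test the counting without hitting the validator at all, the message handling can be split away from the HTTP call. The sketch below assumes the same JSON shape the Page Object relies on (a list of messages, each with a `type` of `error` or `info`); `count_messages` and `raise_if_invalid` are hypothetical helper names, not part of any library.

```python
class ValidationException(Exception):
    pass


def count_messages(messages):
    """Return (errors, warnings) for a list of validator messages.

    Follows the convention used in this post: type 'error' is an error
    and type 'info' is counted as a warning.
    """
    errors = sum(1 for m in messages if m.get('type') == 'error')
    warnings = sum(1 for m in messages if m.get('type') == 'info')
    return errors, warnings


def raise_if_invalid(messages):
    """Raise ValidationException if any errors or warnings are present."""
    errors, warnings = count_messages(messages)
    if errors or warnings:
        raise ValidationException(
            'There were %d validation errors and %d validation warnings'
            % (errors, warnings))
```

Counting from the messages alone also sidesteps the response headers, at the cost of trusting that the JSON body is complete.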
<pre lang="shell">puppet:unicorn adam$ py.test validation_test.py -s
======================================== test session starts ========================================
platform darwin -- Python 2.7.2 -- pytest-2.3.2
plugins: marks, xdist
collected 1 items

validation_test.py F

============================================= FAILURES ==============================================
__________________________________ TestValidation.test_validation ___________________________________

self = &lt;validation_test.TestValidation object at 0x...&gt;

    def test_validation(self):
>       e = Element34(self.f).open().wait_until_loaded().validate()

validation_test.py:12:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = &lt;po.Element34 object at 0x...&gt;

    def validate(self):
>       self._validate_html()

po.py:45:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = &lt;po.Element34 object at 0x...&gt;

    def _validate_html(self):
        post_data = {
            "fragment": self.driver.page_source,
            "output": "json",
            "doctype": "XHTML 1.0 Transitional"
        }
        r = requests.post('http://validator.w3.org/check', data=post_data)
        j = json.loads(r.text)
        validation_errors = []
        if r.headers['x-w3c-validator-errors'] != 0:
            for m in j['messages']:
                if m['type'] == 'error':
                    validation_errors.append(m)
        validation_warnings = []
        if r.headers['x-w3c-validator-warnings'] != 0:
            for m in j['messages']:
                if m['type'] == 'info':
                    validation_warnings.append(m)
        if len(validation_errors) != 0 or len(validation_warnings) != 0:
>           raise ValidationException('There were %d validation errors and %d validation warnings' % (len(validation_errors), len(validation_warnings)))
E           ValidationException: There were 4 validation errors and 1 validation warnings

po.py:31: ValidationException
===================================== 1 failed in 20.47 seconds =====================================
</pre>
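The other static check mentioned at the top, accessibility attributes, could hang off validate() in the same way. As a rough sketch only, here is one example of such a check: finding img tags with no alt attribute at all. Picking alt on images is my own choice of example, the names `MissingAltFinder` and `images_missing_alt` are hypothetical, and a real check would cover far more than this.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2


class MissingAltFinder(HTMLParser):
    """Collects the positions of <img> tags that have no alt attribute."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; alt="" still counts as present.
        if tag == 'img' and 'alt' not in dict(attrs):
            # getpos() returns a (line, column) tuple for the current tag.
            self.missing.append(self.getpos())


def images_missing_alt(page_source):
    """Return the (line, column) positions of <img> tags lacking alt."""
    finder = MissingAltFinder()
    finder.feed(page_source)
    finder.close()
    return finder.missing
```

A Page Object could call `images_missing_alt(self.driver.page_source)` from validate() and raise if the list is not empty, much like the HTML check does.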