Overview of the BITV-Test

BITV-Test is a comprehensive and reliable accessibility evaluation procedure. 60 checkpoints lead evaluators to a detailed assessment of the accessibility of information-oriented websites and web applications.

The legal basis: BITV 2.0

The Barrierefreie-Informationstechnik-Verordnung (BITV) 2.0 aims to improve the accessibility of web content by mandating design principles that afford access for blind, visually impaired and deaf users as well as for users with motor impairments. All web content should be accessible via different browsers and assistive technologies.

In May 2019, BITV was updated and now matches the requirements of the Web Content Accessibility Guidelines (WCAG) 2.1 at conformance level AA (for some reason, it is still called BITV 2.0 and not 2.1). Since the update follows EU Directive 2016/2102 on the accessibility of websites and mobile apps, the BITV now also requires public mobile apps to be accessible.

On top of the European legal minimum requirements, the BITV goes a step further and also mandates a version in simple language and one in sign language for introductory content (information on the site's aim and content as well as its Accessibility Statement). The BITV-Test focuses on the requirements contained in the 50 WCAG Success Criteria required for conformance at level AA. It currently does not cover a test of the formal and substantive suitability of simple language versions and sign language versions.

While Germany's federal institutions are directly affected by the BITV directive, its impact goes far beyond. Web designers and CMS developers that serve federal institutions have to prove their competence in implementing websites according to the directive. Other web service providers also increasingly use the directive as guidance. It sets a new quality standard.

Context and history of BITV-Test

BITV-Test is a web-based application for evaluating the accessibility, and the conformance to BITV 2.0, of information-oriented websites and web applications. It captures the requirements and success criteria of BITV 2.0 / WCAG 2.1 in a set of 60 concise and practical checkpoints.

BITV-Test grew out of a series of federally funded projects. It has been developed by DIAS GmbH in close co-operation with associations of and for disabled people, web designers, and accessibility experts. The test was first published in 2004 and has been continually updated since.

Types of BITV-Test

There are three different types of BITV-Test:

  • BITV self-assessment: Free for everyone after registration, the self-assessment is a web-based online accessibility checking procedure for one evaluator allowing checkpoint ratings and comments. The self-assessment has no page sample: it applies the checkpoints to the entire site. Alternatively, users may limit self-assessment tests to a single page and aggregate results on their own.
  • BITV design support test: This test is used for evaluating websites during development. The results help address the accessibility problems detected. The design support test is often conducted prior to a final BITV-Test. Evaluators can tailor the page sample to clients' needs. The result is for internal use by the client, i.e., it cannot be made public and cannot be used as a statement of conformance.
  • BITV final test: The final BITV-Test checks conformance to BITV 2.0 / WCAG 2.1. It involves two evaluators conducting the test independently based on the same page sample. These tandem tests are followed by an arbitration phase. Here, both evaluators run through all the checkpoints they have rated differently and agree on the final consensual rating. The arbitration phase helps detect oversights and corrects ratings that are too lenient or too strict. When all checkpoints are rated as 'pass' or 'near-pass', the site is considered accessible. The site can then carry the 'BITV konform' seal that links to the detailed time-stamped test report.

The results of BITV design support tests and BITV final tests are subject to independent quality assurance by an accessibility expert in the BIK project team.

Summary of the test procedure (BITV final test)


Before a BITV-Test is carried out, evaluators check the suitability of the site. Websites largely depending on inaccessible web technologies are not tested. 

Selecting the page sample

Evaluators then explore the website and define an appropriate page sample that reflects the complexity of the site. This procedure reflects the process laid out in WCAG-EM. The sample also includes dynamic page states as well as views at other viewport widths. The size of the page sample depends on the complexity of the website under test. Dynamic behaviour and processes are captured by defining additional states to be tested as part of the page or on additional pages in the sample.

Checkpoint instructions and rating

For each of the 60 checkpoints, in-depth explanations and instructions inform testers what is to be checked, why the check is important, and how the check is carried out.

When testing against a particular checkpoint that is applicable, evaluators rate the overall degree of conformance on a graded Likert-type scale with five rating levels: pass, near-pass, partly-pass, near-fail and fail. In terms of BITV/WCAG conformance, the two highest ratings ('pass' and 'near-pass') count as conformant. If several checkpoints contribute to a Success Criterion, all of its checkpoints need to be rated at least 'near-pass' for the content to be considered conformant to the respective Success Criterion.

The level of rating reflects both the quantity and the criticality of flaws identified. When rating alternative texts of images in checkpoint 1.1.1a, for instance, a page with a crucial image-based navigation element with missing alt text would be rated as 'fail', whereas a page where just one of several teaser images has a less-than-ideal alternative text (e.g., overly verbose) might still be rated as 'near-pass'.
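The conformance rule described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the BITV-Test implementation: the `Rating` enum, the function names, the use of `None` for checkpoints that are not applicable, and the assumption that a checkpoint ID is a Success Criterion number followed by an optional letter (as in 1.1.1a) are all hypothetical.

```python
from enum import IntEnum


class Rating(IntEnum):
    """Five-level rating scale, ordered from worst to best (names are illustrative)."""
    FAIL = 1
    NEAR_FAIL = 2
    PARTLY_PASS = 3
    NEAR_PASS = 4
    PASS = 5


def criterion_of(checkpoint_id):
    """Assumption: '1.1.1a' belongs to Success Criterion '1.1.1'."""
    return checkpoint_id.rstrip("abcdefghijklmnopqrstuvwxyz")


def criterion_conformance(ratings):
    """Map each Success Criterion to True if all of its rated checkpoints
    are at least 'near-pass'. Checkpoints rated None (not applicable) are skipped."""
    result = {}
    for checkpoint_id, rating in ratings.items():
        if rating is None:
            continue
        criterion = criterion_of(checkpoint_id)
        conformant = rating >= Rating.NEAR_PASS
        result[criterion] = result.get(criterion, True) and conformant
    return result


# Hypothetical ratings for a small subset of checkpoints.
example = {
    "1.1.1a": Rating.PASS,
    "1.1.1b": Rating.PARTLY_PASS,
    "1.3.1a": Rating.NEAR_PASS,
    "1.2.1a": None,
}
by_criterion = criterion_conformance(example)
site_conformant = all(by_criterion.values())
```

In this sketch, a single checkpoint below 'near-pass' is enough to make its Success Criterion, and therefore the site as a whole, non-conformant.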

The arbitration phase

In tandem tests, the conclusion of both independent tests is followed by an arbitration phase. Here, both evaluators run through all the checkpoints they have rated or commented on differently and agree on the final consensual rating. The arbitration phase helps detect oversights and corrects ratings that are too lenient or too strict.
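As a rough sketch of how the checkpoints for arbitration could be identified, the following Python function compares two evaluators' results. The data layout (a dictionary from checkpoint ID to a rating/comment pair) and the function name are assumptions made for illustration. The actual test tool may work differently.

```python
def checkpoints_to_arbitrate(first, second):
    """Return the IDs of checkpoints where the two evaluators differ
    in rating or comment, or where only one of them gave a result.

    first, second: dict mapping checkpoint ID -> (rating, comment).
    """
    all_ids = sorted(set(first) | set(second))
    return [cp for cp in all_ids if first.get(cp) != second.get(cp)]


# Hypothetical results from two independent tests of the same page sample.
evaluator_a = {
    "1.1.1a": ("pass", ""),
    "1.3.1a": ("near-pass", "One heading level skipped."),
    "2.4.7a": ("pass", ""),
}
evaluator_b = {
    "1.1.1a": ("pass", ""),
    "1.3.1a": ("partly-pass", "One heading level skipped."),
    "2.4.7a": ("pass", "Focus indicator is faint on the footer links."),
}
open_items = checkpoints_to_arbitrate(evaluator_a, evaluator_b)
```

Only the checkpoints in `open_items` would need to be discussed. Identical results carry over unchanged.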

Conformance statement

Websites where all checkpoints / Success Criteria are rated as 'pass' or 'near-pass' are considered conformant. Sites that have been tested successfully can then place the 'BITV konform' seal on their website to demonstrate conformance.

Public documentation of results

The entire test procedure is publicly documented. For those tests that have been published, the individual checkpoint assessments are also publicly available. Site owners wanting to use the 'BITV konform' seal on their website must include a link from the seal directly to the test report or to their Accessibility Statement, which must then link to the test report.

The reliability of BITV-Test

The result of a BITV-Test is based both on objective measurement and human assessment. This is not a deficiency of method. The actual accessibility of websites depends not only on the adherence to formal rules. Any reliable test has to be grounded in human judgement. However, the reliability of testing must be safeguarded. This means that test results should be replicable within a margin of error. Different testers should arrive at comparable results.

  • Test conditions and procedures. To ensure the reliability of test results, testers must meet certain criteria regarding qualifications, knowledge and experience. The test environment is clearly defined. It prescribes the choice of operating system, browsers, or special measurement tools. In addition, the scope of testing is clearly defined at the outset.
  • Coordination of decisions. Testers are part of a team. Here, all cases that cannot be readily decided based on the documented test procedure are discussed. These discussions also serve as the basis for the continued refinement and adaptation of the test procedure.
  • Comparability of results. The test suite gives testers access to prior assessments made in other tests for comparison. These can serve as guidance. In addition, a statistics function for tandem tests shows the offset of evaluators' individual test results per checkpoint from the final arbitrated and quality-assured result. The offset value offers a measure of the degree of qualification of individual testers and indicates individual training needs.
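The source does not say how the offset is calculated. One plausible reading, sketched below in Python, is to treat the five rating levels as positions on a scale and take the signed difference between an evaluator's rating and the final result per checkpoint. The scale encoding, the function names, and the mean absolute offset as a summary figure are assumptions for illustration only.

```python
# Assumed ordering of the five rating levels, from worst to best.
SCALE = ["fail", "near-fail", "partly-pass", "near-pass", "pass"]


def offsets(individual, final):
    """Signed offset per checkpoint: positive means the evaluator was more
    lenient than the final result, negative means stricter."""
    return {
        cp: SCALE.index(individual[cp]) - SCALE.index(final[cp])
        for cp in final
        if cp in individual
    }


def mean_absolute_offset(individual, final):
    """Average distance from the final result across shared checkpoints."""
    diffs = offsets(individual, final)
    if not diffs:
        return 0.0
    return sum(abs(d) for d in diffs.values()) / len(diffs)


# Hypothetical ratings by one evaluator and the arbitrated, quality-assured result.
evaluator = {"1.1.1a": "pass", "1.3.1a": "partly-pass", "2.4.7a": "pass"}
final_result = {"1.1.1a": "near-pass", "1.3.1a": "near-pass", "2.4.7a": "pass"}
per_checkpoint = offsets(evaluator, final_result)
summary = mean_absolute_offset(evaluator, final_result)
```

A signed offset keeps the direction of the deviation visible, which matches the stated goal of avoiding assessments that are either too lenient or too strict.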

The team-based approach enables testers to learn to apply the latitude in their assessment in a similar way. The websites tested should neither be assessed too leniently nor too strictly.