Overview of the BITV-Test

BITV-Test is a comprehensive and reliable accessibility evaluation procedure. 60 checkpoints lead evaluators to a detailed assessment of the accessibility of information-oriented websites and web applications.

Important note (March 15, 2019): The test procedure has been updated. This content is currently being revised.

The legal basis: BITV 2.0

On September 22, 2011, the German directive Barrierefreie-Informationstechnik-Verordnung (BITV) 2.0 came into effect for German federal websites. BITV 2.0 is a complete overhaul of the old BITV that had been modelled on WCAG 1.0. The new directive is based on the Web Content Accessibility Guidelines (WCAG) 2.0 published in December 2008 by the Web Accessibility Initiative (WAI).

The aim of the directive is to improve the accessibility of web content by mandating design principles that afford access for blind, visually impaired and deaf users as well as for users with motor deficiencies. All web content should be accessible via different browsers and assistive technologies.

While Germany's federal institutions are directly affected by the BITV directive, its impact goes far beyond. Web designers and CMS developers who serve federal institutions have to prove their competence in implementing websites according to the directive. Other web service providers also increasingly use the directive as guidance. It sets a new quality standard.

Context and history of BITV-Test

BITV-Test is a web-based application for evaluating the accessibility of information-oriented websites and web applications and their conformance to BITV 2.0. It captures the requirements and success criteria of BITV 2.0 in a set of 49 concise and practical checkpoints. The test covers Priority 1 of the BITV 2.0 directive. Priority 1 includes all WCAG 2.0 Level AA Success Criteria plus SC 2.4.8 Location, which is on Level AAA of WCAG 2.0.

BITV-Test has been developed by the BIK project in close co-operation with associations of and for disabled people, web designers, and accessibility experts. The test was first published in 2004 and has been continually updated since. A new version of BITV-Test was released when the new BITV 2.0 came into force in September 2011. The last update was published in 2017.

Types of BITV-Test

There are three different types of BITV-Test:

  • BITV self-assessment: Free for everyone after registration, the self-assessment is a web-based online accessibility checking procedure for one evaluator allowing checkpoint ratings and comments. The self-assessment has no page sample: it applies the checkpoints to the entire site. Alternatively, users may limit self-assessment tests to a single page and aggregate results on their own.
  • BITV design support test: This test is used for evaluating websites during development. The results help address the accessibility problems detected. The design support test is often conducted prior to a final BITV-Test. Evaluators can tailor the page sample to the client's needs. The result is for internal use by the client, i.e., it cannot be made public and cannot be used as a statement of conformance.
  • BITV final test: The final BITV-Test checks conformance to BITV 2.0. It involves two evaluators conducting the test independently based on the same page sample. These tandem tests are followed by an arbitration phase. Here, both evaluators run through all the checkpoints they have rated differently and agree on the final consensual rating. The arbitration phase helps detect oversights and corrects both too lenient and too strict ratings. When a site achieves a score of 90 or more points (out of 100), it is considered accessible. The site can then carry a 90plus seal that links to the detailed time-stamped test report.

The results of BITV design support tests and BITV final tests are subject to independent quality assurance by an accessibility expert in the BIK project team.

Summary of the test procedure (BITV final test)

Before a BITV-Test is carried out, evaluators check the suitability of the site. Websites largely dependent on inaccessible web technologies are not tested.

Selecting the page sample

Evaluators then explore the website and define an appropriate page sample that reflects the complexity of the site. The size of the page sample depends on the complexity of the website under test. Dynamic behaviour and processes are captured by defining additional states to be tested on individual pages in the sample.

Checkpoint instructions and rating

For each of the 49 checkpoints, in-depth explanations and instructions inform testers what is to be checked, why the check is important, and how the check is carried out.

When testing against a particular checkpoint, evaluators rate the overall degree of conformance on a graded Likert-type scale with five rating levels, from 100% for full conformance to 0% for a clear failure.

The level of rating reflects both the quantity and the criticality of flaws identified. When rating alt texts, for instance, a page with a crucial image-based navigation element with missing alt text would be rated as completely unacceptable (0%), whereas a page where just one of several teaser images has inadequate alt text would be rated as marginally acceptable (75%). In the latter case, the checkpoint would still contribute ¾ of its individual value to the overall score.

Weighting of checkpoints

The checkpoints are weighted according to their impact on accessibility, each contributing between 1 and 3 points to the overall maximum score of 100.
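
The weighted-sum arithmetic described above can be sketched as follows. This is an illustration, not the official implementation: the five rating levels are assumed to be 100/75/50/25/0 % (the text names only the endpoints and the 75% example), and the sample checkpoints and weights are hypothetical.

```python
# Sketch of the BITV-Test scoring arithmetic: each checkpoint has a
# weight of 1-3 points, the weights of all checkpoints sum to 100, and
# each checkpoint contributes weight * rating to the overall score.
# The intermediate rating levels (0.75, 0.5, 0.25) are an assumption.

def overall_score(checkpoints):
    """checkpoints: list of (weight, rating) pairs,
    weight in {1, 2, 3}, rating in {1.0, 0.75, 0.5, 0.25, 0.0}."""
    return sum(weight * rating for weight, rating in checkpoints)

# Hypothetical mini-test with three checkpoints (a real test has 49,
# with weights summing to 100 points):
sample = [(3, 1.0), (2, 0.75), (1, 0.5)]
print(overall_score(sample))  # 3.0 + 1.5 + 0.5 = 5.0
```

A checkpoint rated 75% thus still contributes three quarters of its weight to the total, matching the alt-text example above.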

The mark-down option

If serious accessibility problems are discovered during a test (for example, a keyboard trap, or a missing alt text on a crucial graphical navigation element), the overall result for a website can be marked down to "not accessible". The website will then fail conformance to BITV, even if its overall score would be above 90 points.

The arbitration phase

In tandem tests, the conclusion of both independent tests is followed by an arbitration phase. Here, both evaluators run through all the checkpoints they have rated differently and agree on the final consensual rating. The arbitration phase helps detect oversights and corrects both too lenient and too strict ratings.

Conformance statement

Websites that reach 90 or more points are classed as ‘accessible’; sites that reach 95 or more points are classed as ‘very accessible’. Successfully tested sites can place the 90plus or 95plus seal on their website to demonstrate conformance to BITV.
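
Combining these thresholds with the mark-down option described earlier gives the following decision logic. The function name is hypothetical; the rule that a mark-down overrides the score is taken from the text.

```python
# Hypothetical classification of a BITV final test result: a serious
# barrier (mark-down) fails the site regardless of its point score.

def classify(score, marked_down=False):
    if marked_down:
        return "not accessible"    # e.g. a keyboard trap was found
    if score >= 95:
        return "very accessible"   # eligible for the 95plus seal
    if score >= 90:
        return "accessible"        # eligible for the 90plus seal
    return "not accessible"

print(classify(96))                    # very accessible
print(classify(92))                    # accessible
print(classify(92, marked_down=True))  # not accessible
```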

Public documentation of results

The entire test procedure is publicly documented. For those tests that have been published, the individual checkpoint assessments are also publicly available. Site owners using the 90plus accessibility seal on their website must include a link from the seal back to the test report to back up their conformance claim.

The reliability of BITV-Test

The result of a BITV-Test is based on both objective measurement and human assessment. This is not a deficiency of the method: the actual accessibility of a website depends not only on adherence to formal rules, so any reliable test has to be grounded in human judgement. However, the reliability of testing must be safeguarded. This means that test results should be replicable within a margin of error, and different testers should arrive at comparable results.

  • Test conditions and procedures. To ensure the reliability of test results, testers must meet certain criteria regarding qualifications, knowledge and experience. The test environment is clearly defined, prescribing the choice of operating system, browsers, and special measurement tools. In addition, the scope of testing is clearly defined at the outset.
  • Coordination of decisions. Testers are part of a team. Here, all cases that cannot be readily decided based on the documented test procedure are discussed. These discussions also serve as the basis for the continued refinement and adaptation of the test procedure.
  • Comparability of results. The test suite enables access to all prior assessments made for each checkpoint. These can serve as guidance for testers. In addition, a statistics function for tandem tests shows the offset of evaluators' individual test results per checkpoint from the final arbitrated and quality-assured result. The offset value offers a measure of the degree of qualification of individual testers and indicates individual training needs.
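
The offset statistic described above can be sketched as the mean absolute deviation between an evaluator's individual checkpoint ratings and the final arbitrated ratings. This is an assumption for illustration: the exact formula used by the test suite, the function name, and the sample checkpoint identifiers are all hypothetical.

```python
# Sketch of a per-evaluator offset measure: the average absolute
# difference between individual and arbitrated checkpoint ratings.
# Mean absolute deviation is an assumed formula, not the documented one.

def mean_offset(individual, arbitrated):
    """Both arguments: dicts mapping checkpoint id -> rating (0.0-1.0)."""
    diffs = [abs(individual[cp] - arbitrated[cp]) for cp in arbitrated]
    return sum(diffs) / len(diffs)

# Hypothetical ratings for three checkpoints:
individual = {"1.1.1": 0.75, "2.4.8": 1.0, "1.4.3": 0.5}
arbitrated = {"1.1.1": 1.0, "2.4.8": 1.0, "1.4.3": 0.25}
print(round(mean_offset(individual, arbitrated), 3))  # 0.167
```

A consistently low offset suggests a well-calibrated evaluator; a high offset indicates individual training needs.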

The team-based approach enables testers to learn to exercise the latitude in their assessments in a consistent way, so that the websites tested are neither assessed too leniently nor too strictly.