
# Web Strategy for Everyone

## How to create and manage a website, usable by anyone on any device, with great information architecture and high performance.

© Marcus Österberg 2016  
**ISBN:** 978-91-983422-0-8  
**Title:** Web Strategy for Everyone  
**Author:** Marcus Österberg  
**Edition:** Ebook-edition, 1.0.2 (2018-04-10)  
**Publisher:** Intranätverk

## Table of Contents

  * Web Strategy for Everyone
  * Before we begin
    * Why you should read this book
    * About me
  * The Web's history and future
    * Web 1.0 - a network of documents
      * Characteristics of Web 1.0
      * Web design 1.0
    * Web 2.0 - the engaging web
      * Characteristics of Web 2.0
      * Web design 2.0
    * Web 3.0 - a network of data (also known as the semantic web)
      * Characteristics of Web 3.0
      * Web design 3.0
  * Information architecture
    * Content choreography
      * Examples of poor content choreography
      * Master Data Management prevents unnecessary duplication
      * The importance of marking up information with metadata
      * Metadata specification makes your data more standardized and interchangeable
      * Controlled vocabulary
      * Folksonomy
    * Architecture using APIs and open data
      * Public APIs, open data and the PSI Act
      * Background to the European Union's PSI Act
      * Some take issue with the PSI Act - cumbersome access to data
      * What then is open data?
      * The benefits of an API for a startup business or when building anew
      * Design a public API with the developers' experience in mind
      * Friendly terms and a free license
      * No surprising the developers with unforeseen breaking changes
      * Provide data in the expected format and in suitable bundles
      * Error handling and dimensioning of the service
      * Provide code samples and showcase success stories
      * Promote via data markets and API directories
      * What is the quality of data needed?
    * Microdata - semantically defined content
      * So, what is the problem?
      * The potential of semantic information
      * Microdata standards such as Schema.org and Microformats
    * Digital Asset Management (and Adaptive Content)
      * Adaptive Content
      * Image and media banks in your publishing system
      * Personalization of information
    * URL strategy for dummies
      * Common excuses for breaking established URLs
      * Ok, how to then?
  * Web design
    * Gov.uk design principles
      * 1. Start with needs
      * 2. Do less
      * 3. Design with data
      * 4. Do the hard work to make it simple
      * 5. Iterate. Then iterate again.
      * 6. Build for inclusion
      * 7. Understand context
      * 8. Build digital services, not websites
      * 9. Be consistent, not uniform
      * 10. Make things open: it makes things better
    * Keep it simple, stupid - KISS
      * Do not break the web
    * Persuasive web designs (PWD) - design that convinces
      * 1. Be clear in everything
      * 2. Be very careful of what is the default setting
      * 3. Visual hierarchy is important
      * 4. Focus on the common goal you and your visitor have
      * 5. Try not to overexert your users' attention
    * Responsive web design
      * The mobile moment
      * The elements of responsive web design
      * Arguments for responsive web design
      * Notes on responsive construction
      * Responsive typography
      * RESS - Responsive Server Side
    * Adaptive web design
    * Design with data - a data first-approach
      * Get started with design with data
      * What you know about your visitors
      * Continuous A/B testing
      * Examples of A/B tests for monitoring the website, and other communications
    * Mobile first
      * Mobile first vs. responsive web
      * The mobile opportunity
      * Mobile restrictions
      * The mobile moment - when mobile users are in the majority
    * SPA - Single Page Application
      * Design of SPA websites
      * Challenges of SPA
    * Web standards, and usability
      * Progressive enhancement and graceful degradation
      * Usability vs. accessibility
      * Gamified design
      * Design and plan for errors that will occur
      * Your website is a magazine, not a book!
  * Web performance
    * Planning for the unplanned
    * Performance optimization of databases, web servers and content management systems
      * General troubleshooting
      * Planning for high load - use cache!
      * Content Networks (CDN - Content Delivery Network)
      * Databases
      * Web servers, content management, own source code and external dependencies
    * Measuring and improving interface performance from the user's perspective
      * Helpful tools
      * Editorial performance impact
      * Technical settings for performance
    * Recoup an investment in web performance - is it possible?
  * Test your own website
    * How to document your test
    * 1. SEO
      * 1.1 Indexable for search engines
      * 1.2 Duplicate content
      * 1.3 Page title's length is under 60 characters
      * 1.4 Page title is readable and understandable in the search engine results page
      * 1.5 Page title contains relevant keywords that describe the page
      * 1.6 Correct headings are used
      * 1.7 Search engine friendly URLs
      * 1.8 Descriptive text on all important pages
      * 1.9 Reasonable number of links
      * 1.10 Pictures have alternative texts
      * 1.11 Structured description of the information
    * 2. Web analytics
      * 2.1 Current visitor tracking scripts
      * 2.2 Tracks the use of website search
    * 3. Performance
      * 3.1 Reasonable time for loading the page
      * 3.2 Compression of text files
      * 3.3 Usage of the browser cache
      * 3.4 Scripts and style sheets are sent in a compact format
      * 3.5 Images are optimized for fast transfer
      * 3.6 Reasonable number of background images, scripts and stylesheets
      * 3.7 Requesting files and pages that do not exist
      * 3.8 Minimal amount of scripts and CSS in page code
      * 3.9 Images are not scaled down using CSS or HTML
      * 3.10 Identical files are not referenced
      * 3.11 Reasonable amount of scripts in the page head
      * 3.12 Content networks are used when necessary
    * 4. Accessibility and Usability
      * 4.1 Website validates the chosen code standard
      * 4.2 Using correct header structure
      * 4.3 Anchor-texts are descriptive
      * 4.4 Link titles not used for non-essential information
      * 4.5 Favorite icon is present
      * 4.6 Possible to navigate with keyboard
      * 4.7 Texts are written to be read by a human - not with exaggerated SEO
      * 4.8 Language set in the source code
      * 4.9 Not depending on browser features
      * 4.10 Specifies image sizes in HTML
      * 4.11 Works with and without the www prefix
      * 4.12 Only one domain is used for the website
      * 4.13 RSS subscriptions can be detected
      * 4.14 Useful error pages
      * 4.15 No surprises when scrolling
      * 4.16 Enough distance between links, buttons, etc.
      * 4.17 Acceptable text size
      * 4.18 Zoomable, also on mobile
      * 4.19 Icons for the website
      * 4.20 Useable printouts
    * 5. Others
      * 5.1 Forms and other sensitive information is sent through a secure channel
  * Tips on in-depth reading
  * Sources & references
  * Thanks goes out to...
  * Notes

# Before we begin

## Why you should read this book

This book introduces what you need to know when working with the Web, no matter your role. My ambition is to make web strategy less mysterious, without relying on difficult words or complex reasoning.

I tackle how to arrange topics and lay out content to make your website more useful, usable, and user-focused, on any device.

With hands-on guidance, you'll be able to assess and address performance issues and prioritize the work needed to improve your website.

While this is a practical handbook, it does not focus on code or technical matters, but rather concentrates on helping you come to a deep understanding of user needs and how your website should satisfy visitors. Working through the chapters, you'll devise your own web strategy to guide your tech, design, and content efforts.

## About me

Hi! I'm Marcus Österberg, and since 1998 I've worked in all kinds of roles with _web_ as the prefix: web designer, web application developer, web editor-in-chief, web analyst, and web strategist, to name the most memorable. I've worked as a consultant, entrepreneur, teacher, and at times, been the client in both the private and public sectors. The Web has powered every phase of my career, and I think it's integral to many roles.

My colleagues and managers have often joked(?) that I should write a _gospel_ about the Web to record my ideas and guidance. This book is that gospel.

* * *

To my family

* * *

# The Web's history and future

A brief introduction to what happened during the Web's early years and where it's heading in the immediate future. Web professionals still have much to learn from history, since most new creations are variations on very old solutions with a modernized surface.

We begin with a summary of the Web phenomenon. We will talk about what is special about the Web, unlike what the Internet offered before, and the Web generations we've moved through (Web 1.0, 2.0 etc.). The Internet and the Web should not be lumped together, as they so often are. The Internet is the infrastructure, network, and all connected equipment that can talk with each other. The Web is strictly a service among many others that use the Internet as a communication network. To explore the Web, we use web browsers which send and retrieve information over the Internet.

Because of the Web's inclusive and unguided nature, most of us stumble upon sites that are at different stages of development, or have yet to embrace newfound design conventions or trends. Some first generation websites still exist and perform well enough if they have a simple purpose. Now we will go through the evolution of the Web since its creation atop the Internet.

Here goes...

## Web 1.0 - a network of documents

While the Internet has been around in some form since the '60s or '80s (depending on the definition), the concept of the Web came into being in 1990 when Tim Berners-Lee and Robert Cailliau wrote a proposal about the _WorldWideWeb_ for their employer, the research organization CERN.

They wanted to use hypertext:

> "...to link and access information of various kinds as a web of nodes in which the user can browse at will..."
> 
> www.w3.org/Proposal.html

_Hypertext_ is more than plain text. Hypertext is a building block of HTML (HyperText Markup Language), which Berners-Lee designed to lay out text content on a _web page_ – and which now forms the basis of every website. Over time, HTML has developed into a robust way to present rich media on the Web.

In 1990, Berners-Lee released the _WorldWideWeb_ browser - the only way to view the young Web.

In 1993, the NCSA released a more capable browser, Mosaic, which popularized the Web in many fields. In 1994, Netscape Navigator, created largely by Mosaic's original developers, was released and became the dominant browser even outside academic circles.

There were plenty of services available via the Internet before the Web was born, but it was harder to find and use such services owing to the specialized software required, steep learning curves, and the many different standards, formats, and protocols.

Some example Internet services:

  * **Gopher** – let you find and retrieve documents and information, though a search could take minutes or even hours.
  * **Name/Finger** – reported the contact details about colleagues and showed if they were online in the workplace.
  * **FTP** (File Transfer Protocol) – enabled you to upload and download documents to a server – allowing you to publish information on the Internet, and collaborate with colleagues. Imagine _Dropbox_ but with no syncing, no back-up, and a clunky interface.
  * **Email** – colleagues were emailing each other in the 60s, but emailing 'anyone in the world' only became possible in the following decades, depending on your email provider.

Web 1.0 builds on all these earlier services. The big difference is that the Web is further developed, offers standardized approaches to interacting with information, and is more accessible to the public.

### Characteristics of Web 1.0

Through the linking of documents (pages), the Web appeared as an electronic library to anyone with access to a computer and a phone line. Instead of librarians, link directories (Yahoo!) and early search engines (WebCrawler, Lycos, AltaVista etc.) helped you get around. In the late 90s, Web-based services provided an easier interface for email (Yahoo!, Hotmail, etc.) and so email became more popular outside work and academic contexts.

### Web design 1.0

In the introduction, I mentioned the word generations rather than versions. This is because the websites of different generations are living side by side, as not all embrace newfound design conventions, or even update old pages published long ago.

Things rarely seen on later generations of the Web include:

  * **Poor typography** – Centered body text is hard to read, and the reader can get lost at each line break. Many sites also had serif typefaces (fonts where the letters have 'heels'), which looked quite smudged on the low-resolution screens of the time.
  * **Visitor counters (publicly visible)** – Some argued that they wanted to show the visitors that someone, preferably many, had been there before.
  * **Outgoing links** – Webmasters loved to link to other websites, but links were often irrelevant to the subject matter, and more about making friendly suggestions. For example, it was common for websites to link to AltaVista, which was the giant of search. Various link exchanges were made between sites to drive traffic to each other; these 'web rings' helped websites get traffic before Google existed.
  * **Background imagery** – Often a tasteless choice of wallpaper, a photograph, or an illustration, which only made the website more cluttered and hard to read.

It was also common that sites were made for specific screen sizes, designed with tables, and functioned differently in different browsers. Not to forget all the background sounds, which began anew at each page view, and the completely pointless animated icons.

    Figure 1: Screenshot from Tim Berners-Lee's desktop when viewing CERN's website using his WorldWideWeb browser.

In the late 90s, sites built entirely with animation technology appeared. Some had an extraordinary focus on animation and innovative design, but the usability and content was not that much of a priority.

Web 1.0 was amateurish since too few were qualified to design user-friendly websites; it was an immature industry. The sites were usually not versatile enough to adapt to the visitor's needs or technical preconditions.

Personally, I am grateful that the website I built in 1998 about Egyptian mythology is gone from the web service _Tripod_. It contained a lot of design choices that would leave me equally embarrassed and amused if anyone saw them: centered gray text on a messy black background, and the devil with a fire iron as an animated GIF in the footer - for no reason.

The next generation of the Web raised standards and expectations. Designers and publishers began to focus on the user. The business community made an effort to understand the Web, but began with an economic bubble.

## Web 2.0 - the engaging web

The concept of Web 2.0 was coined at the turn of the century and describes the participatory Web, where it's easier for people to publish, share, comment, and socialize online, not only through their own blog, but using other people's websites. Web standards were developed to shape the Web, including RSS (Really Simple Syndication) that allowed people to subscribe to content without disclosing their email addresses.

So-called _mashups_ emerged: the reuse of one or more external services to create something new. Embedding information from elsewhere became common, such as including content from Google Maps or Youtube on your own website. The Web was growing so much that there was an economic incentive to focus on _user experience_. More and more often, we would find websites that were thoughtfully designed.

Fact:  
At the turn of the century, every third person in the developed world had access to the Internet. Most were able to take advantage of the Web's content. Globally, 7 % had access to the Internet.  
Source: ITU (International Telecommunication Union - an agency of the UN)

Because of the newfound opportunities to make money on the Web, publishers of well-designed websites began to measure and analyze how visitors used their sites. These measurements were used to optimize the navigation structure, simplify payment systems, monitor the popularity of news / content, and much more.

Along with the social possibilities, Web 2.0 also became more functional, which is noted by the ever-decreasing need to install software on a computer. Many office applications have been replaced or augmented with online services accessed through your browser.

During this period, the Web reached beyond the desktop. Mobile 'broadband' launched, and an increased use of laptops, mobiles, and tablets changed the perception of when, where, and how to use the Web. From being tethered to a connection at home, work, school, or at an Internet cafe - to being connected in any environment. Even when running for the bus with a cheese sandwich in your hand.

Fact:  
Between 2007 and 2013, half the Web's traffic became consolidated around just 35 sites. Some people now consider Facebook to be the Web, never going anywhere else. This is especially the case when considering mobile access.  
Source: The Connectivist

### Characteristics of Web 2.0

With the advent of what is sometimes called the social web, people needed fewer websites to connect to one another in greater numbers. Examples include Myspace, Wikipedia, Youtube, Spotify, Twitter, Facebook, and Wordpress. With this centralization of users, it was possible to allow people to log on to different sites using the same credentials, such as when you log on to a music streaming service using one of your social network identities.

Large, successful sites, often fueled by user-generated content, could offer huge amounts of diverse content, allowing people to spend more time without leaving. Think how easy it is to fall into the _wiki-hole_ when you visit Wikipedia for one specific item, and end up reading unrelated and possibly bizarre articles. Similarly, Youtube offers tempting related and 'up next' videos to keep people watching.

Before sites offered so many suggestions and related links, people navigated in a more linear fashion to achieve a goal.

Something rarely noted about the second generation of the Web is that users actually make fewer mistakes and rarely need to start over (perhaps by returning to the home page as in Web 1.0). This improvement in user experience was made possible by the refinement of web technology. In the past, search engines would not offer keyword suggestions, leaving you to start afresh or refine your own terms. Nor were ever-updating news feeds or notifications available, as you now have on Facebook and Twitter. Back then, you had to reload the page to see if anything had changed.

### Web design 2.0

Websites, and web designers, brought many elements that we now associate with Web 2.0, such as:

  * **Floating advertisements.** Ad servers were often overworked and slow to deliver ads, making page load times painfully slow. Some people installed ad blocking browser extensions.
  * **Flash.** Often used for intros before a visitor was allowed entry to the home page, and for adverts. Some people removed (or never installed) the Flash plug-in to avoid these bandwidth-heavy animations.
  * **Search suggestions.** Search engines offered related terms to your query, often derived from what other people were searching for.
  * **Search fields placed at the top of the page.** Not newfound knowledge, but usability testing confirmed that many people prefer to search for what they want rather than browse. It became the convention to place the search field at the top of the page so that it was more visible, and easy to find when people gave up on the navigation menu. 
  * **Navigation based on the visitor's history.** By tracking the individual visitor's activity, some sites provided shortcuts and suggested links. For example, Amazon's homepage showcased books that the individual had looked at during previous visits.
  * **Streaming video.** While quality and bandwidth had to be considered, Youtube had proved that rich media was popular. Nobody expected to download a video anymore; everything was instantly streamed.
  * **Maps.** It became common for contact information to be complemented with a functional map, most often from Google Maps.
  * **Social feeds.** Embedding content from other websites (video, maps, etc.) included showing the website's own Facebook or Twitter feed. Buttons to follow, retweet, like, and add to your browser's favorites or to a read-later service abounded. The buttons just would not stop!

    Figure 2: Amazon.com and their search suggestions.

## Web 3.0 - a network of data (also known as the semantic web)

The _semantic web_ refers to how the **meaning** of information is recorded and presented in ways that computers can understand. Computers, including your laptop and smartphone, can understand that a series of numbers is actually a date. The semantic web adds a layer of metadata to tag information with contextual meaning. The name of a city can be tagged with its geographical location, and the travel routes to it from your personal location.

Fact:  
In 2008, 61 % of the population in the developed world were connected to the Internet and 23 % globally.  
Source: ITU

It is still highly debatable when the third generation of the Web began. If we include the use of Microformats, a markup framework to describe information within web pages and other web services, then it began around 2008.

The definition of Web 3.0 is not agreed upon. Some believe that we are already there, that what we experience online every day is Web 3, while others think it is far down the road. Some people say that no one will notice it, that nothing has changed - the same old web just got a little better. Whatever the state of Web 3, it is about knowledge, meaning, and relevance based on the user's personal perspective and individual needs. It includes personalization, whereby the service adapts to the user based on explicit preferences and past activity, so as to be more relevant.

Web 3 techniques are already well-used by many sites and online services. Standards from Microformats.org and Schema.org help describe what specific information is about in a way that browsers, search engines, and other web services can understand. The semantic web offers digital publishers a way to mark information's geographical origin, contact information, reviews, and much more. Even more interesting is what is called _linked data_. It boils down to the Web being a collection of open databases that follow standards to such an extent that different databases (owned by different organizations) can be accessed and the data combined to create new knowledge, or at least insight. When information is well structured according to known standards, it can be reused by other services - creating new value and benefits for end-users.
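
To make this concrete, here is a minimal sketch of Schema.org markup using microdata attributes in HTML. The organization's name, URL, and phone number are invented for illustration:

    <!-- A sketch of Schema.org microdata describing an organization. -->
    <!-- All values below are invented for illustration. -->
    <div itemscope itemtype="https://schema.org/Organization">
      <span itemprop="name">Example Org</span>
      <a itemprop="url" href="https://www.example.org">example.org</a>
      <span itemprop="telephone">+46 31 000 00 00</span>
    </div>

A search engine reading this page no longer has to guess which string is the name and which is the phone number; the meaning travels with the markup.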

**Glossary - augmented reality**  
Enhanced or altered reality; the technology adds information from the Internet as a layer on top of the physical reality through visual displays, tactile technology, bracelets etc. For example, a smartphone app uses the phone's location to show the direction to local amenities as a person uses their camera to look around.

Future services should be able to present information in ways beyond the creator's control and expectations. The mash-up of the original information with the presentation method for the individual's contextual needs creates something unique. It can be a _HUD_ (head-up display) - screens embedded in a pair of glasses overlaying what you see with information in a context-aware manner.

Classic search technology becomes a problem on a small screen - worse when there's _no_ screen. The third generation web makes technology and information more adaptable and usable in reality – away from a keyboard. In a short while, you will no longer automatically reach for your mobile or sit down at your desktop computer for specific everyday information as you will already be well informed via your wearable tech and smart-objects around you. You will search for information less often, because services will notify you when there's relevant information, amenities, and friends / colleagues around you.

### Characteristics of Web 3.0

Many are no doubt familiar with the _geo-social_ perspective of data, that information is geographically marked so a user can fetch information created nearby. This is what the geo-service Foursquare is all about; everything is based around location. For example, online shops know so much about you and your shopping habits, that you will receive targeted offers based on who you are, where you are, and everything else that can make the content more appealing to you. Some shops use personalization features that suggest the purchase of sandals, instead of winter jackets, in December if you happen to find yourself temporarily in Australia.

Perhaps the most important feature of Web 3.0 is that services can now be more precise in guiding their users to sought-after or relevant knowledge. Wolfram Alpha is an example of a 'search' engine, often called an _answer engine_, which tries to generate correct answers to questions rather than a list of information resources. You don't search the web with Wolfram Alpha, you challenge it to work out problems or provide statistics. It does this by sourcing results from structured knowledge that is machine-readable – semantic. It also goes further, by combining facts and computing new results. Many do not know that Wikipedia also organizes information in a similarly structured way.

### Web design 3.0

It may still be a bit early to provide examples of truly third generation websites. Sometimes, you will see search results that have a creepy relevance to you as an individual. Google Search's Knowledge Graph (sometimes shown on the right of search results), shows supplemental information scraped from various sources.

    Figure 3: Google complements their search result with verified facts in the right-hand column.

When Wolfram Alpha answers a question about the place once called Danzig, it refers the user to the new name Gdansk. It assumes the user meant a place but states that it knows other facts, that Danzig is also the name of an artist, and a music album. There is no conventional search but rather it explores a well-structured body of knowledge. Compare this scenario with the millions of hits you get on Google search with the same question (and note that Google only hints at the name change).

Third generation web design will probably be characterized by websites with relevant (to the visitor) content placed right in the spotlight. Over the short and medium-term, more sites and online services will access the vast amounts of structured data, enabling cross-linking like search engines and Wikipedia, and contextual, personalized notifications. Web 3.0 is where information can finally live its own life, freed from its containers.

    Figure 4: Wolfram Alpha's answer on the search for 'danzig'.

The Web is an information rich platform, not just a social place to comment on videos of cats. To develop a modern website, you need a deliberate information architecture, which is our next topic.

# Information architecture

How do you rate the information presented on your website? Is it truly for the visitor? Do your pages provide up-to-date information that's relevant to the visitor, regardless of where the information originated? It should!

All too often, the Web is seen as a static publishing platform, a place for whatever marketing messages managers want to communicate. Content and design are often based on the publisher's perspective, rather than focused on end-users' needs. With good information architecture, your website can present content in a structured manner and provide a decent, even delightful, user experience. By marking up information semantically, you'll be able to integrate different information sources to create a website that is always up-to-date and relevant.

Information architecture is about how information is organized and made accessible so as to be useful. Information is only useful when someone needs it; it's valuable at the _point of need_. So how do you design your own sources of information, and use the data that others share, to achieve a goal? As modern and wondrous as the Web is, it's a bit behind other information management systems. In my opinion, this is because the Web (and especially intranets) has, until recently, been regarded as a showcase rather than a natural part of business. Websites have often behaved like brochures, and have not even been as functional as a mail-order catalog or a self-checkout in a supermarket. Now though, C-level executives realize the Web can add significant value. For a website to make a difference to business, to customer behavior, and to the bottom line, it requires good information architecture.

**Glossary – metadata, or, descriptive metadata**  
Metadata is _data on data_ or _information describing other information_. It may be, for example, categorization or keywords that describe the content. The word _meta_ comes from Greek and means _after_ or _above_ and is often used in conversation to describe something that is self-referential. Metadata is usually a well-defined label on a higher abstraction level than the data it describes.

## Content choreography

Content choreography is about how to reuse content and control the information flow based on data and metadata. If information is not free, in all interpretations of the word, it is not that easy to reuse. Technical aspects may play into how free and reusable information is. However, I would bet that the biggest problem is governance: not knowing your information, failing to set proper requirements for metadata, and lacking guidance or instructions on how content should be used by others later, in other contexts. We often fail to think about the long-term life cycle of content. There's great pressure to create and publish content, but long-term content strategy is vital.

    Figure 5: Video subtitles are not supposed to be formatted with HTML.

The mobile context is a common challenge these days. How do we provide the right content for the user's device without unnecessary duplication? There are technical barriers when using HTML to mark up and describe web content. Mobile websites and mobile apps require different approaches. Therefore, it is not quite as simple as letting the web content management system act, in an identical way, as the source of all the content for a mobile app. The content often needs to be a little different on a smaller screen to be of real use, and design conventions are not identical. For example, blue underlined text is not as obvious a link in an app as it is on the Web.

**Glossary - tags, keywords, labels**  
One or more individual words that describe or highlight content.

As you will have noticed, many websites supplement the usual menu navigation with other means to find similar content. It is particularly common for websites built using the Wordpress CMS (Content Management System) to have categorizations and tags, which not only partly describe content but also provide a link to a list of similar content within the same site. Tags provide a complementary navigation system and surface thematically similar content. But just as importantly, such labels also create categorization and structure.

    Figure 6: BBC America using tags to enhance navigation (and also embedding content from Instagram).

* * *

**Glossary - taxonomy**  
Classification for systematically grouping things according to similarity or origin. It is the exact same thing that Carl Linnaeus did in the 18th century to describe a plant's place in nature based on its properties.

When shopping online, you often see that the menu seems to be a mix of a regular static menu and something that is driven by tags. You can often find a product in multiple listings based on manufacturer, type of product, color, size, and more. Tags can also be hidden from visitors to a website and have more of an operational, internal use. I have been using hidden tags to suggest a priority level for the annual update of texts, such as _'priority2'_ to indicate that something is of secondary priority. It has worked as a sort of internal memorandum to those entrusted to review and maintain internal information.

The need for a _taxonomy_ may not be obvious, but at least tagging is self-explanatory, since it is common on the public Web. The main point of tags is to use content dynamically and to make it easier to find and reuse later on. A precise and shared understanding of what each tag means makes tags easier to select and reuse. If the exact meaning of your tags is not explicit or commonly understood, you should publish a taxonomy. It is never too late to start tagging existing content (assuming you have the rights and ability), as even archived material can be tagged without altering the original content. In some cases, you can automate the tagging based on other information that already exists - the presence of keywords in the body text, for example.

Information that cannot be reused will likely be copied to, or recreated in, a new system to be made available. Then you have at least two versions that might end up being referenced, and that ideally need to be updated when changes occur. Content choreography tries to address this problem, to make sure that valuable information is agile, versatile, and useful in all the necessary contexts and information systems. Sometimes it is easier to pinpoint the challenges if you look at examples where something has gone wrong. That is exactly what we will do now.

### Examples of poor content choreography

The classic example, I think, is an information system designed by an unrepresentative minority of users, or worse, system engineers who will never use the system. Let us say that the HR department needs a new HR system. The requirements are listed, and several systems are reviewed. The winner is a system that has a feature the supplier named 'self-service'. According to the supplier, it is a convenient entrance into the HR system through which all employees can report their worked hours, apply for leave, choose benefits, and more. The problem here is that the system is primarily designed for people who are experienced in HR matters and terminology. It has been designed to appeal to stakeholders and budget holders from within HR. It has _not_ been designed for the workforce of field workers, store workers, factory floor workers, mobile-only sales people, and (presumably) digital savvy knowledge workers.

This focus on budget holders rather than end-users is often unavoidable, especially with enterprise software, and results in frustrated employees who waste time on an arcane system they would have avoided if they had the choice. Instead of all this, the HR department should have developed the system based on requirements from user research, and also defined how the system had to interact with _other_ information systems. Also, considering the (usual) clunky interface, it would have been worth developing specialized interfaces in the system, allowing people to perform specific tasks via the intranet or an app without having to worry about 'the HR system' itself.

I faced a poor and irritating interface when I tried to edit the dates for my leave of absence. There was no ability to edit existing entries, so I was forced to delete the existing entry and start the absence request process all over again. The morning after, an angry HR advisor called me to ask why I had deleted the work schedule she had created for me. I could not see the work schedule while I was attempting to edit my original request - the system did not do anything to help. My choice would have been to initiate the request via an online form on the intranet - and amend it similarly. What happens once the form is completed should not be the individual's concern; the system and the workflows should take care of the date, approvals, and impact.

The intranet offers HR-related material that I don't have permission to view, and links to the HR system that I can't access without having previously logged on.

Instead of an activity-driven intranet with an underlying supporting information model, a specialized system is offered for each administrative task. You have a massive HR system, a clunky old room booking system, a claim expenses system, and a third-party benefits system. Every system requires a separate log-on, often using a different username.

Of course, this type of problem is not limited to the places I've worked in; it's fairly common across organizations of every kind. The evolving digital workplace offers a multitude of systems for each task. This in itself is a bit of a problem.

For bureaucrats such as myself, it is not unusual to have several document management systems for different projects. Whenever anyone needs a document, they can never be certain where to start looking. Enterprise-wide search can help, but results can be overwhelming. Further, it's highly likely that one or more of the older document systems will be phased out one day, or that multiple document systems will be consolidated into a new one. This all creates work and taxonomy conflicts.

When systems are rolled-out without concern for integration of workflows, people have to switch between several different systems to complete relatively simple tasks. Even the media report on administrative burdens, exclaiming that doctors don't have time to see their patients as they are contending with poorly designed IT.

Without integration, each separate system remains ignorant of previous steps the user has taken, encumbering the employee with the need to re-enter information time and time again. Different systems often follow different input standards. For example, one system might cope well with spaces or dashes within a social security number, another will accept only spaces, another provides room for only numerals. This is just one of many things the user has to remember, when alternating between systems. The cognitive load, the expertise needed just to type into a form, is excessive. Nor can we expect to log in once, as the systems do not share permissions or user credentials.

The icing on the cake is that the systems often have different rules for password complexity too, so people not only have to have different passwords for different systems, but passwords that are constructed in idiosyncratic ways. Undoubtedly this poor user experience causes stress and even poor security behavior - as when people write down their passwords in a list on their desktop. The solution is often simple and obvious - we should focus on the user's experience first. Again, we need to think long-term when investigating user experience. We need to consider all the contexts that the system will be needed within, now and in the future. More than this, we have to consider the life cycle of the information that the system processes. Frankly, the information will probably outlive the system, and so portability is crucial. Structured content, and import / export capabilities, are a 'must have'. **All this common sense is not yet common practice...**

Good content choreography can be seen when:

  1. All content is described using well defined metadata.
  2. The system adapts to the user's process and needs - not the other way around.
  3. It _feels_ like you only have a single system.
  4. It is never necessary to enter data more than once.
  5. The information is relevant based on the recipient's past activity, preferences, location, and other personal factors. The right information is available at the right time to satisfy a particular need.
  6. Information follows a given format; dates, for example, would preferably follow the international standard ISO 8601 (e.g. 2016-04-10).
  7. Related information is suggested, or easy to find based on context.

Now we will explore how to control our most precious information.

### Master Data Management prevents unnecessary duplication

**Glossary - Master Data Management (MDM)**  
The systematic work to keep track of an organization's reference information. Consider a large directory of ever-changing supplier details, customers' order histories, or the company's financial records.

Public websites often display information that does not originate from their own content management system. The information could be customer data, product information, calendared events, etc. This information may be collected from internal enterprise systems, like a customer relationship management system or an accounting system. For products, real-time data (including supply levels) can be fetched from a supplier's system. For an intranet, it is common to have information about stocks, the consumer price index (adjusting for inflation in society), payroll dates, and the like. Such data is called _reference data_, and _Master Data Management_ (MDM, or an MDM system) is responsible for handling it in accordance with applicable standards, regulations, and internal policies.

When you create or access a new source of reference data, ask yourself if it is acceptable to have to copy content manually into your website, or if the data source can be integrated with your existing systems.

All information has a life cycle. A fact that seems hard to remember when you want to make a quick fix - you are unlikely to foresee the systemic problems and extra work created by tactical fixes and undocumented changes.

There are advantages to manually copying information from the source into your website's CMS. You get to control just how and when the content gets published, and you can lay out the text, images, and other media exactly how you fancy, using common publishing software and well-known code like HTML and CSS. The downside is of course that you made a _copy_, and that the original information will no doubt change or require revising, making the copy on your website out-of-date until you edit it in line with the fresh information. I think we have all stumbled across hysterically outdated information on the Web. As a new subscriber to Macworld magazine, I wondered when the first issue would be delivered to me; I googled it, chose the first search result, and got the publication plan for the editions two years prior. I did not find a link to more up-to-date information.

Information becomes outdated and misleading relatively quickly. Moreover, some sort of responsibility vacuum often opens up between the original owner of the information and the person who published it on the website.

The advantage of full integration between the website and data sources is that you can design the information to always be up-to-date - without ongoing effort. Something akin to 'create once, publish everywhere' (COPE), as defined by the US media organization National Public Radio (NPR). The downside is that it takes more effort in the short-term. In some cases it is prohibitively expensive, but over the long-term it can sometimes be the only sensible choice.

The online movie database, IMDb, is an example of thoughtful integration. For me, as a Swede, in the middle of Ed O'Neill's biography, I find the Swedish title for the series, 'Våra värsta år' (lit. 'Our Worst Years'), instead of the English title 'Married with Children'. In other words, IMDb has created a link between their textual content (often only in English) and their master data. Embedded in their content, they bridge a small cultural barrier by assuming that different nationalities prefer localized titles.

    Figure 7: IMDb's reference data adapts to the user's preferences and displays the Swedish title in the English text.

The intranet I most frequently used was the one at Region Västra Götaland. It serves 50,000 employees with information about the multifaceted organization and tries to support everyday work. Before its deployment, when the Accounting and Human Resources departments were still decentralized, there were lots of uncoordinated local intranet pages. These local intranet sites rarely provided unique, local level information, but rather, duplicated corporate material. Payroll dates, for example; there were many and various pages stating what day of the month your salary would be paid on.

How many of these pages were updated with new dates each year, do you think? Not many, unfortunately! Intranet editors are unlikely to feel enthusiastic about manually updating multiple pages with information that's already up-to-date in another system. When intranets fill up with duplicated and conflicting information that is clearly out of date, trust in the intranets diminishes. Without being able to trust published information, employees work harder to get and validate information from accountable people. This can be great for those who like to rely on their network of colleagues, but is inefficient and downright wasteful when considered across the whole organization.

Have you ever come across an intranet that was so well managed it always had the up-to-date information you needed? Most intranets are not getting the care they deserve. The equivalent for a public website might be to advertise a product that is no longer available anywhere in the supply-chain. If you order something and get the message that it is no longer available, your trust in the company probably diminishes.

    Figure 8: B&Q (diy.com) sometimes cannot sell you items that appear on its website. You can check stocks at local stores though!

If your website exists to make money, it is crucial to keep customers focused on buying, and make it easy for them to pay, considering how easy it is to seek out a competitor. Providing information on stock and expected delivery times encourages the customer to feel confident in their choice. You need to manage your customers' expectations.

To return to the intranet example about payroll dates; we would preferably have gathered this reference data from a single data source, and enabled any publisher to display it without manual duplication. We would probably do this with a so-called _widget_, a small box that pulls data directly from elsewhere around the intranet, the Internet, or from integrated systems, and displays it on the page. The news on the intranet's home page might be displayed by a widget, as might the social stream, the 'latest discussions' list, and the up-coming events calendar.
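
To make the widget idea concrete, here is a minimal sketch, assuming a hypothetical internal endpoint (`/api/payroll-dates`) that returns the dates as a JSON array; the endpoint name and response shape are invented for illustration:

    <div id="payroll-widget">Loading payroll dates...</div>
    <script>
      // Fetch the reference data from a single (hypothetical) source,
      // so no page carries its own manually maintained copy of the dates.
      fetch('/api/payroll-dates')
        .then(function (response) { return response.json(); })
        .then(function (dates) {
          document.getElementById('payroll-widget').textContent =
            'Payroll dates: ' + dates.join(', ');
        });
    </script>

Every page embedding this widget shows the same dates from the same source; when the payroll system changes, every page updates at once.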

    Figure 9: Metadata is all around us, for instance, currency on prices.

Web and intranet editors probably have better things to do than to keep track of the timelines of all the copies of information they've published over the years. If you are talking to an IT consultant, or your own IT department, they will certainly offer many ideas for robust master data management. They will surely mention terms like _Enterprise Service Bus_ (ESB) to shuffle the information around between all involved systems. If the organization does not have other reasons to use an ESB, then it is probably smarter to use APIs and open data. First, we need to address the concept of metadata. Without metadata, information is less useful, usable, and valuable.

### The importance of marking up information with metadata

Metadata is used in almost every conceivable context. It can categorize a document's content to let people know if something is worth reading, it can be the basis for website navigation, and it can supply keywords that make a web page easy to find when searching.

Some people seem to associate the term metadata with keywords, as in the words you use in queries on search engines. That is not necessarily wrong, but metadata is really all the information that summarizes, describes, or categorizes the main information. Metadata can classify the substance of a text, but can just as easily be the geographical coordinates of where a photo was taken. Metadata is a kind of descriptive labeling attached to the information it pertains to. Look at, for example, ordinary price tags. They usually show the currency, a figure for the price, and the product name or description.

Metadata also tends to act as the table of contents for information. Without metadata, and the effective use of it, we cannot make the most of an information system's potential. With good metadata, it is easy to find our way even within enormous amounts of information. If you do not curate the information with well thought-out metadata, you face the risk that the information will not be used, or reused, or contribute to anything of value. Making use of information beyond its original purpose wrings more value from it, which makes practical sense when you consider the costs involved in its creation. Failing to reuse existing information mostly comes down to how difficult it can be to find.

Metadata can store synonyms in readable content, or more abstract concepts than what is mentioned in the user-facing content, to help computers understand the meaning of the content and provide users with navigation routes. Metadata can assure you that you have found the item you were looking for. The labels, the author's credibility, the date and the origin all contribute to your confidence in the main content.

If you were to create a new entry in an MDM (Master Data Management) system, how would you label the payroll dates? Assume integration with the payroll system cannot be achieved. An MDM system should be a role model for information management, its use of descriptive metadata and its flexibility in integration with other systems.

Metadata suggestions for you to reflect on:

  * Title: Payroll dates
  * Information type: Master data
  * Information series: Common reference info
  * Update: Ongoing
  * Metadata manager: John Doe
  * Target audience: All employees
  * Validity: 2015-01-01 onward
  * Keywords: salary, wage, payroll, pay day, payday, 2016, dates

Is anything missing? Is it possible to misinterpret the described content? It is a good idea to involve some colleagues and talk about the terminology. People tend to see things differently and associate things with different words. This is where keywords are useful. Other, authorized systems should now be able to subscribe to and fetch the actual payroll dates along with the metadata.
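
If the entry were exposed on a web page, the same fields could travel with the content as ordinary HTML metadata. A minimal sketch; the mapping of the MDM fields onto meta tag names is invented for illustration:

    <!-- Sketch only: these meta names are an invented mapping of the -->
    <!-- MDM fields above, not an established standard. -->
    <meta name="title" content="Payroll dates" />
    <meta name="audience" content="All employees" />
    <meta name="valid-from" content="2015-01-01" />
    <meta name="keywords" content="salary, wage, payroll, pay day, payday, 2016, dates" />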

At the moment of content creation, it is not easy to foresee the many ways people will try to find it. Considering the overwhelming amount of information choices, we must make use of structured metadata so that the metadata itself can function as exclusion filters. Also, unstructured metadata, such as a couple of descriptive keywords, is great for those using a search function.

Are you going to let users add metadata to information they themselves did not create? To add keywords, related items, or suggest other titles, for instance? It may be a good idea as long as the contributions are separated from the creator's content.

Advantages of allowing anyone to contribute metadata:

  * The content creator may not use the same terminology as those searching or using it. Interview any expert on what they do and you will hear unfamiliar words. A greater variety of (relevant) synonyms makes the information easier to find with a search or via keyword-based navigation.
  * Those who contribute to content (even in small ways) are more likely to use and value it.
  * People's skills may surface. The organization may discover hidden talents, and that some employees are more versatile than previously thought.

When content is easier to find, there's less chance that competing near-copies will be produced. When content is well used, there's more reason to keep it up-to-date.

One of the goals when adding collaborative elements to information systems is decentralizing the care or curation of information. When more people are able to collaborate, there's the potential that those who truly need and use the information will organize it to suit them.

If you are lucky, there is already an established metadata specification for you to embrace. One that tells you exactly which fields must exist, which ones are required and which are optional. Think about it! A standard that makes your metadata compatible with other data sources. Did I hear an amen?

### Metadata specification makes your data more standardized and interchangeable

You encounter metadata standards more frequently than you are aware of; a standard isn't always a formal document explaining everything. Think of all the structured data you see every day that give you details or short facts. You often come across labels, perhaps with a colon and then the content itself, like the contents page of a book, or ingredient lists for meals or recipes.

**Glossary - metadata specification or structural metadata**  
For metadata to be compatible with other data sources, you need to use a metadata specification, a standard. One such standard is Dublin Core. It specifies the metadata structure and which data to enter.

Long before the invention of printing, written work needed organizing. The library of Alexandria was one of the first places to centralize knowledge. Egyptian royalty wanted to own copies of all written works that crossed their borders. All knowledge was considered important, and the librarians had a mandate to secure and copy scrolls and books from around the world on almost every topic.

They needed a system to organize their large and growing collection; they needed a way to describe any scroll, and logically arrange these on shelves. New works, perhaps academic texts, needed to refer to individual scrolls. This is the essence of standardized metadata: simple, clear descriptors that aid locating and understanding specific matters. Look at any book - the back cover and inside cover describe the contents of the book. The contents page lists the chapters, and the index (if available) at the back records the location of important terms or topics. Often you will find:

  * The title of the book.
  * Whoever wrote it _and_ who contributed.
  * Which edition you hold in your hand.
  * When it was created.
  * ISBN identification so that you can identify the book and order another copy.
  * What entity published the book.
  * Who holds the copyright.

This is the standardized metadata about any book, which makes it easy to identify a specific book with certainty. The advantage of a metadata standard is that everyone can understand how to use the metadata descriptors and exactly what is being described.

The simplified version of the Dublin Core standard contains fifteen values to describe a work, namely:

  1. Title
  2. Creator
  3. Subject
  4. Description
  5. Publisher
  6. Contributor
  7. Date
  8. Type
  9. Format
  10. Identifier
  11. Source
  12. Language
  13. Relation
  14. Coverage
  15. Rights

Have you ever read the underlying HTML code of a web page? Sometimes you will see metadata, as below, where Dublin Core (DC) is used to both classify and describe the page:

    <meta name="DC.Publisher" content="Marcus Österberg" />

The point of embracing a metadata standard is that the information becomes compatible, and comparable, with other information that follows the same standard. Beyond this subject is the challenge of selecting the 'right' standard. You will want to adopt the same standard that other systems use, if you mean to compare your information with theirs in the future. It may seem simple in theory, but in practice there are local variations on how to describe things even when using the same standard. If the world chooses to follow a specific standard, it is good for your own sake to use the same one.
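
As an illustration, here is how several of the fifteen elements could be expressed in the same HTML convention as above, using this book's own front matter as the values (a sketch, not a complete record):

    <!-- A sketch using this book's front matter as example values. -->
    <meta name="DC.Title" content="Web Strategy for Everyone" />
    <meta name="DC.Creator" content="Marcus Österberg" />
    <meta name="DC.Date" content="2016" />
    <meta name="DC.Language" content="en" />
    <meta name="DC.Identifier" content="978-91-983422-0-8" />

Any system that understands Dublin Core can now read the title, author, date, language, and ISBN of this page without guessing.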

People entering information into your system must be aware of the appropriate standard, and be offered some support so that their contributions can be consistent and compliant.

When talking standards with IT vendors, they all tend to claim that they comply with all of them. It is important to see through the sales pitch and figure out whether they mean that the system follows a _de facto_ standard, its own system standard (which ties you to their product), or an open standard.

Choose standards with care by trying to figure out which one is the most established for your situation. For the Web, there are thankfully great open standards for metadata, in the form of common metadata tags in HTML that work with the Dublin Core standard, and even microdata, which we will talk about later. Now a look at two different choices regarding how much freedom to give users when entering keywords: first the orderly way, with a controlled vocabulary, then _folksonomy_, filled with whatever users choose to write.

### Controlled vocabulary

**Glossary - controlled vocabulary**  
List of carefully selected words regarding a given topic. Often used as metadata to categorize other information. A word's synonyms are frequently included, and sometimes the words are arranged in a tree structure with internal relations. People often refer to a controlled vocabulary when talking about code systems, classification, and terminology.

**Glossary - ontology**  
Set of knowledge in a particular field. Lays out relationships between multiple vocabularies and taxonomies.

A controlled vocabulary is a pre-defined list of carefully selected words approved for use within an industry or organization. Words might be predetermined if it is imperative to be compatible with a certain metadata standard. Controlled vocabularies are used to classify information in a common and consistent way that stands the test of time and bridges the boundaries of organizations. These vocabularies are developed and maintained to encourage the use of a common language where precision in the word's meaning is of paramount importance to avoid misunderstandings and ambiguities. In other words, we need cooperation and broad support for a vocabulary to be useful. Few words have perfectly set meanings; context is everything. Take the word cancer as an example. If you find it as a categorical keyword on a website, is it obvious what is referred to? What can seem clear, owing to assumptions and our personal knowledge, can actually be quite ambiguous.

Off the cuff, I imagine that the word _cancer_ can be:

  * The name of a group of diseases.
  * A star constellation.
  * A zodiac sign, which suggests approximate time of year of birth according to the pseudoscience of astrology.
  * A Japanese Transformer character.

Perhaps you can offer even more suggestions. It is probably more common to mean the disease rather than the Transformers character - but would your search engine figure that out?

If you get a list of all the information marked with the keyword 'cancer' - how do you know that everything relates to the same topic? You would not! You would be more confident if you knew that the contributors and information managers actively work with a vocabulary and strive to reduce ambiguity.
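
To make the idea concrete, here is a minimal sketch in Python of looking up a free-text keyword against a controlled vocabulary with synonyms. The terms are invented for illustration and not taken from any real code system.

    # Invented example terms - a real vocabulary would be far larger.
    VOCABULARY = {
        "cancer (disease)": {"cancer", "malignant tumour", "malignancy"},
        "cancer (constellation)": {"cancer", "the crab"},
    }

    def lookup(keyword):
        """Return every vocabulary term a free-text keyword could refer to."""
        keyword = keyword.strip().lower()
        return [term for term, synonyms in VOCABULARY.items()
                if keyword in synonyms]

    print(lookup("Cancer"))
    # ['cancer (disease)', 'cancer (constellation)'] - ambiguous, so the
    # user interface should ask the contributor to choose one meaning.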

YouTube nicely separates the controlled vocabulary from the contributors' own additional keywords when tagging video content. When you enter a tag (keyword) that happens to be in YouTube's vocabulary, you are notified in the user interface: parentheses surround the type of data YouTube believes the keyword belongs to, such as '_(City / Town / Village)_' for cities. A flexible solution that makes users aware they just entered information that can be structured and made unambiguous. This is great if the user meant that exact thing, but they have the choice to disregard the suggestion and add their plain keyword.

    Figure 10: Tags that match terms in a controlled vocabulary have their type shown within parentheses.

An example of a vocabulary is _ICD-10_, an international method of classifying diseases and health problems. The collaboration forum for this vocabulary is the _World Health Organization_ (WHO), which reports to the UN. ICD-10 contains over ten thousand ways to describe medical diagnoses at several levels of detail. Since the terms and their meanings have been translated into many languages, it overcomes local, idiosyncratic names and lexicons that could place barriers between healthcare professionals across the world. To build on the healthcare example, there is a plethora of other, specialist vocabularies, for instance, _MeSH_, which categorizes healthcare information, and _HL7_, which describes related health information, such as laboratory results.

As you can imagine, the field of healthcare needs several controlled vocabularies since no vocabulary is all-inclusive, and each has different strengths and limitations.

### Folksonomy

**Glossary - folksonomy**  
A vocabulary without centralized control or standardization. Sometimes called democratic, social, or collaborative tagging.

A folksonomy is a list of words and/or concepts that are not standardized or centrally controlled by any stakeholder or organization. It is characterized by the freedom contributors have to add new words as each individual chooses, and it can stand in marked contrast to a controlled vocabulary. A folksonomy is a highly social and democratic approach to the words that may be used as keywords. New words, slang, and everyday expressions are gathered in a creative chaos that is updated in real-time as contributors add more tags to content. A controlled vocabulary, on the other hand, might be slow to accept new words, and might only be updated once a year and published as a new version.

New terms are created more quickly than ever in this Internet age. Few people outside of South Korea would have been familiar with the word 'Gangnam' in 2011, but the following year, 'Gangnam Style' went viral around the world as people re-interpreted Psy's K-pop dance music video. All the many content creators and curators at the time could not wait for the term 'Gangnam Style' to be added to a controlled vocabulary after some ruling body's committee meeting; the benefit of, and the need for, a folksonomy is clear.

Is your information management work enhanced by the limitations imposed by a controlled vocabulary? Or are there frustrating occasions when your metadata simply cannot cope with new concepts? Would a middle ground with both a folksonomy and a controlled vocabulary work better? When designing a system, you must carefully consider the future ramifications of your decisions. What seems ideal in theory may be unworkable in practice. Think about the benefits and drawbacks of creative chaos and rigid structure; what balance will be right for your work, and the work of others?

Seth Earley, one of the authorities in the field of information architecture, suggested in a master class I attended that the need for a controlled vocabulary is clear if the content:

  * Will be widely used and re-used in other contexts.
  * Is already included in a controlled process.
  * Answers questions anyone may have, such as documented and approved methods, guidelines, or comparison information.
  * Has a significant cost to produce.

Examples of typical systems that probably require a controlled vocabulary would include records management, document management, and digital asset management.

A system that would fare well with a folksonomy would have the following characteristics:

  * Is part of a creative or ad-hoc process.
  * Tends to solve problems in a practical way.
  * Involves a high degree of cooperation between individuals.
  * Generates information whether it is wanted or not, as in a discussion forum or on micro-blogs such as Twitter.

Blog platforms work well with a folksonomy, as do instant messaging systems, chat rooms, wikis, and other collaborative and communication platforms.

People often have an innate sense of which of these variants, a vocabulary or a folksonomy, they should devote themselves to. One of my metadata-interested colleagues, who works with document management in healthcare, asked me a fascinating question:

> "By the way, who decided what hashtags are allowed for use on Twitter?"
> 
> \- A colleague

My colleague clearly found more value in controlled vocabularies!

On microblogs, such as Twitter, each of us can make up our own keywords to use as hashtags. The benefit of an unguided medium, such as Twitter, is that we can have a hashtag for the evening to talk about just about anything without the prior approval of any third party.

    Figure 11: Greenpeace using a hashtag of their own choice.

You might wonder where on the control scale a web content management system sits. Are you supposed to go for a folksonomy or a vocabulary? I think most people need a combination of both. But the most important thing is to give contributors support in which wordings to use, regardless of whether a word comes from the vocabulary or is already in your folksonomy. That promotes the reuse of words within the folksonomy, helps with spelling, and reduces the number of near-duplicate keywords.
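
A minimal sketch of that middle ground, in Python: check the controlled vocabulary first, then suggest close matches from the existing folksonomy using the standard library's difflib. All tags here are invented examples.

    import difflib

    CONTROLLED = {"records management", "document management"}   # invented
    FOLKSONOMY = {"gangnam style", "fridayfeeling", "jobhunt"}   # invented

    def suggest(tag):
        """Support the contributor: reuse known wordings where possible."""
        tag = tag.strip().lower()
        if tag in CONTROLLED:
            return "'%s' (controlled vocabulary)" % tag
        close = difflib.get_close_matches(tag, FOLKSONOMY, n=1, cutoff=0.8)
        if close:
            return "Did you mean '%s'?" % close[0]
        return "'%s' added as a new folksonomy tag" % tag

    print(suggest("Document Management"))  # found in the vocabulary
    print(suggest("gangnam styel"))        # typo, suggests the existing tag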

Hashtags on Twitter tend to be used for two purposes:

  * **Indexing** , allowing topics to be contextually entitled and collated, and for people outside of the conversation to find them, e.g. #job #jobhunt #intranet #FF
  * **Commentary** , as when people end a tweet with something like #justsayin #stayinyourlane #notmybestmoment or #FridayFeeling for example.

True to folksonomy form, many commentary hashtags are unique (e.g. #VeganKyloRen), but while they provide context, they are of no use for indexing.

## Architecture using APIs and open data

**Glossary - API (Application Programming Interface)**  
An access point for transferring information from one system to another. Sometimes an API offers some of the system's functions to other systems. APIs provide the rich data and information needed by mobile apps, websites, and the like that use dynamic content or require interactivity.

**Glossary - public API**  
An API that encourages third parties to use it. A public API should be documented, list support information, and have some form of commitment from the issuer that they intend to keep it alive for external users.

**Glossary - PSI Act (Public Sector Information)**  
Law in most countries in the EU, based on an EU directive that regulates the way in which governments must share their collected information.

**Glossary - open data**  
Philosophy that strives to share information publicly. At the very least to offer data free for reuse, preferably in a structured way for others to design services around it.

Valuable information may remain blocked within a system that cannot communicate with other systems. To realize that value, and perhaps create something new, integration and information exchange between systems is needed.

APIs, like the lubricant in a gearbox, make the online world work so well, enabling Internet services to interoperate and provide rich experiences across devices.

When planning to use APIs, you will want to consider how to integrate with very old systems, or at least systems that have had their architecture and vocabularies established, and also how your objectives might influence your choice of new systems – built to order or purchased off the shelf.

### Public APIs, open data and the PSI Act

Failing to consider open data, or at least making APIs publicly available, when building an information system should be deemed a dereliction of duty. For a digitally transformed company, openness with data should be a natural part of everyday business. Most tax-funded organizations have many data sources that would make a greater difference to society just by being open, and some would most certainly also generate new revenue. Once public sector data is open, citizens themselves could improve the content, so-called _crowdsourcing_, adding new information or time-sensitive local exceptions they observe while actually using the data in the physical world. It is on the verge of unethical for a tax-funded organization not to share its data sources if open data could be of value to society. The challenge is deciding what is interesting enough to open up, and how. Perhaps an 'open by default' policy is needed.

Developers can avoid costly and unnecessary data collection by using someone else's data sources. Commonly used data would include exchange rates, weather forecasts, public transport timetables, and product reviews. It is natural to seek open data and existing data sources when building new digital services. Governments should offer data through APIs and enable others to solve public information problems instead of building a half-assed service themselves.

A great part of any organization's operations is to manage information. Allowing multiple actors to collaborate on the same information makes the data source more credible and well-known, to the credit of someone who could have chosen to keep everything internal to the organization. Companies can make money on this, and tax-financed organizations can save money - no more excessive double-recording of information; collect it once and for all. For instance, imagine how many businesses on the Web would make use of governmental accessibility information. Suddenly, everyone who would like to guide a person to a physical location would have information on the location's accessibility for every specific variation of needs: accessibility for the blind, the language skills of the staff, whether the staff are LGBT certified, and so on. Governments have much useful data in all possible areas - but not always in a great system, unfortunately.

Other examples often held up to sell the idea of sharing information involve getting help from outsiders by staging some form of competition. One I have heard retold with great atmosphere on many occasions concerns the mining company Goldcorp, which announced a competition to find gold its own staff had difficulty finding. Goldcorp shared its geological data, and the competition was aimed at raising the profitability of an existing mine. In the pot was 575,000 Canadian dollars in prizes for those who participated in this digital gold rush. The initiative was made possible by Goldcorp CEO Rob McEwen, who, reportedly, had been inspired by the culture of cooperation in the early days of the Internet in the 1990s.

    Figure 12: The beer app Untappd uses Foursquare's API to help users enter their location.

Those who won this competition were not geologists, they performed the job without the need to travel, and they solved the problem from, wait for it, Australia. The result of awarding 575,000 Canadian dollars was that production at the mine started about two years earlier for Goldcorp, and the value of gold mined from these findings could be reckoned in multiple billions of Canadian dollars.

The Goldcorp example is of course unusually spectacular in its success, but it shows that others can help you with your problems if they have access to your data. In some cases, they will do it for free if they and your business share a common goal your data can help realize.

The fashionable word of 2006, 'mashup', means mixing multiple data sources to solve a problem or create a service. Take two or more data sources, and often the whole is greater than the sum of its parts. Nowadays, people do not talk that much about mashups, since mixing data sources has become a natural part of conceiving new services. But the need for access to data sources keeps increasing. Most common is probably combining our own data source with someone else's.

If you take pictures with the Instagram app on your phone, you can choose a named geographic location to associate the image with. Instagram uses the API for the geographical data service called Foursquare and therefore avoids every complexity associated with that type of data. These two organizations have signed a cooperation agreement even though Foursquare offers their API to anyone.

The next part deals with transparency, which matters when building services and is not only of interest to the public sector. Everyone can exercise their rights to receive data from the government.

### Background to the European Union's PSI Act

The act aims to improve the ability of citizens to play a part and participate in governmental affairs. A supposed positive side effect is that innovation and economic growth will occur when releasing information that the government has collected.

The PSI Act represents a significant improvement for many European countries that did not already have substantial governmental transparency. In some countries though, such as Sweden, many failed to see the potential: they leaned back, resting assured that they had met the legal requirements decades ago. The main difference is that Swedes have long been able to request paper copies of most of the government's administrative information. Since the PSI Act initially did not expressly require digital copies of information, very little happened there, while several other countries took the digital leap - from not needing to disclose any documents at all to offering them all digitally.

At the very least, data sources of public interest should be made open, which may seem obvious, since their curation is funded by the taxpayers. Despite this, there are exemptions that can be applied for a couple of years at a time if you, as a governmental organization, provide a service that is of public interest. Many developers would probably say that it is all in the state's self-interest, and several national organizations have stated such missions - to offer national services of public interest. For instance, governmental land and geological surveyors, company registries, and statistical organizations. These organizations also happen to own some of the most interesting data sources, which undoubtedly hold a general interest.

A friend I am not going to name worked at Sweden's national weather forecasting institute during the transition from safeguarding their data to sharing it freely. This friend happens to be very versatile in most aspects of technology and got upset one day. Some of his colleagues had put a lot of effort into what nevertheless turned out to be a poor visualization of the data which the institute, back then, did not share with third parties. He mumbled something about high school students being able to do a better job in less time if the institute had just released its data sources.

Opening your data sources will most certainly initiate a discussion on what your core operations really should be. Of course, the government should not stop providing society with information services just because it has opened its data sources. But spending time on vanity work with information is certainly no longer a great idea, as my friend noted, since many other actors are better at it.

### Some take issue with the PSI Act - cumbersome access to data

Entrepreneurs who have tried to use this law for their business ideas get a laconic expression on their faces when you speak to them. They often talk in terms of "the government is a paper-API", and that "we need more lawyers compared to developers in our company".

The PSI Act needed, at minimum, something that regulated the means of disclosing information. Paper should only be accepted if the information exists in no other form than paper within the organization. Developers usually make do with structured text files, database exports, Excel, and almost anything exported digitally from existing systems.

At the time of writing, there is ongoing work in the EU, made visible by EU Commissioner Neelie Kroes, which aims to clarify the requirement to disclose information in a structured digital format. How it will work out remains to be seen, but my guess, and hope, is that there will be more demands for structured data in each amendment of existing legislation.

### What then is open data?

Open data is digital information with no restrictions regarding reuse, unlike PSI data, which permits limitations on reuse. Open data should therefore be free from copyright, patents and other obstacles. When the government is the publisher of such information, it is usually called open public data. The Open Government Working Group has attempted to standardize what is to be considered open data.

These requirements for open data, adapted from Wikipedia, hold true, at least in my opinion:

  1. **Complete.** Information that does not contain personal data or depends on confidentiality is made available as widely as possible. This is particularly aimed at databases with materials that could be processed and improved.
  2. **Primary.** Information shall be provided, as far as possible, as an original. Image and video materials will be provided in the highest possible resolution to allow for further processing.
  3. **Timely.** Information should be made available as quickly as possible so that the value of it is not lost. There should be mechanisms to receive information about updates automatically.
  4. **Accessible.** Information made available to as many users as possible for as many purposes as possible.
  5. **Machine processable.** The information is structured in a way that allows for machine processing and interconnection with other registers.
  6. **Non-discriminatory.** The information is available to all without requiring payment, or restrictions in the form of licensing and registration procedures.
  7. **In an open format.** The format the information is provided in adopts an open standard, or that the documentation for the format is freely available and free from copyright licensing terms.
  8. **License-free.** The data itself should be free of any limitations or costs of use. For instance, if the data is released under Creative Commons CC0 or in the public domain, it is considered license-free.

Some points are more open to interpretation than others. Perhaps principally point seven, which can be anything from a simple text-file to an advanced API for distribution and synchronization of information. If you mix open data with an API, which many believe is an obvious combination, the possibility arises for others to build services that depend on the information gathered and upon the API. It's not really a requirement of open data that it is offered through a public API, but if you want to encourage its use, it is worth checking what developers want. The exchange of data needs to be reliable, flexible and give developers the confidence to use it. Developers may not always give priority to making money on the things they build but they quickly learn to avoid the frustration that occurs with seemingly unnecessary obstacles, terms and other hassles.

### The benefits of an API for a startup business or when building anew

It is perhaps not obvious at first glance, but the business model for APIs is that not everyone should have to reinvent the wheel. Today's information systems are so complex to develop that any help from others is gratefully accepted. For a startup, it is essential to avoid doing the things you are not good at. That is a prerequisite for not failing, and in many cases you will depend on public APIs for services someone else can provide better than you can. The list of information and services a startup depends on to be effective can be very long. Currency exchange rates for international companies, geographical data as in the Instagram / Foursquare example, the current weather for a location, or support functions that reduce the amount of spam comments on your blog - all are examples of things someone else probably does better than you, and perhaps they already offer an API.

    Figure 13: The spam-fighting WordPress plugin Akismet showing how many spam comments it caught on a website.

A list of arguments seems fitting here - why a startup, or a new web service, can benefit from making its own public API a natural part of its business.

#### 1. It is normal business

In today's connected society, you do not know beforehand where your information will finally end up, so it is difficult to do without APIs. For your own business needs, it should not come as a surprise that APIs help when your web, intranet, mobile app and all other systems need access to the same data sources. When discussing a partnership with another company, you are at least partly ready for integration across organizational boundaries.

Is there a problem many face that you can solve with the help of a computer? With an API, you can offer your services to the world directly from your parents' basement, or your own garage :)

APIs are today's information desks, automated secretaries and staff all rolled into one. Note that an API is not automatically private just because it is not public. When you have an API for your mobile app, others may also have access to it and use it, though a public API is a better start to a relationship with other developers than having them use your hidden API.

#### 2. Builds relationships around your services

Think of it as an ecosystem where your services are the centerpiece from day one. There may, at first, be just a few who are interested in your services, but with every new user, you have someone whose success partly relates to your success. That makes for natural communication and exchanges of offers in the future. External developers contribute new perspectives, innovation and expertise towards something that all participants will benefit from.

What if someone who uses your API makes it big time? It will partly spill over to your services and it could lead to unexpected business opportunities.

#### 3. Release the data and contribute to transparency

Probably many wish that their employer's data sources were more accessible to employees. I am definitely one of those with a need for more transparency in what data my employer has. Think about the term _data discovery_ for a moment. How easy is it to explore the organization's digital information resources? Often, internal inventiveness around existing data sources is unfortunately thwarted since creative people do not have access to the data they actually need to do their jobs better or in a more efficient way.

Even with an API, the data remains a silo of information. The difference is that you know which silos there are and which you can make use of, and you can use the content as necessary. Someone has put effort into gathering the information, so it is a good idea to take advantage of it and encourage reuse.

#### 4. Investors take this almost for granted

How big do you think Twitter would have been if it had never offered users' data to third parties? Because of its API, there was a broad range of applications for all possible platforms, which contributed to Twitter's popularity. Other developers generated a lot of experience, which Twitter could use and capitalize on later.

A big part of succeeding online is to enlist the help of others and make the most of things as quickly as is humanly possible. An API is often a part of that strategy.

#### 5. Makes for good mashups

Google Maps is a gigantic example of a widely used mashup. The free use of Google Maps on other companies' websites and apps helped Google establish itself in the map business, but if the users' services become popular, they will have to start paying.

The possibilities of mashups are bigger than you might think with all of the niche interests that thrive on the Web, along with all the services that help with features like video and more that are difficult to create yourself.

#### 6. Self-sustaining content marketing

The spirits brand Absolut did an interesting marketing stunt in 2013 when it hosted an innovation competition centered on its new API. The API contains many cocktail recipes in which Absolut's own vodka happens to be an ingredient.

If the API becomes popular, or the drink recipes are used on someone else's website, Absolut gains extremely valuable credibility from a seemingly independent party. In addition to all the text in the recipes, professionally taken photographs and video clips are offered free of charge via their API. Users and developers did not have to worry long about the quality of the content.

### Design a public API with the developers' experience in mind

It is a good idea to start every IT project by planning what data to handle. Then you already have a plan for building an API for your own needs. It is advisable to use this API yourself - eat your own dog food - in the same way as if you were an external party. If you do not dare to use your own API, one may ask why someone else should be more reckless than you are.

Just as we have for a long time distinguished between content and design in web development by using CSS and HTML, in the same way we should distinguish between data and presentation by using an API to feed the data a webpage needs. If you are starting a new web project, begin by looking at what data you already have, what data you need, how you will collect the needed data and which parts are meaningful to offer to third parties.

When releasing a public API, you really should commit to some basic things. Even though you may only indirectly make money on the API, the relationship between you as the publisher of an API and your users is similar to a business relationship, and you should regard it as such.

### Friendly terms and a free license

If you want someone to use your API, it requires good communication and mutual interest between all involved. Avoid being too bureaucratic or legal in the conditions of use. Instead, try to be encouraging and inspiring concerning what they can build using the API.

I think it is worth emphasizing that you should avoid burdensome terms and offer as free a license as possible. This is especially true for the public sector, but the same reasoning is useful in the corporate context. Be mindful of anything in the terms that is unnecessarily harsh, or that perhaps indicates that transparency is given reluctantly.

I have encountered an example of such reluctance, though not concerning open data, in the form of a former monopolist in the event-ticket market. My company had Northern Europe's largest festival website, and I asked them for permission to use their APIs to channel my visitors to the correct part of their website. For my visitors to buy tickets more easily, for me to avoid manual linking, and for the other company to make money. It was a bit confusing when they wanted money from me for handing prospective customers over to them. There I was, with hundreds of thousands of visitors reading about the concerts and festivals, but I could not easily guide them to where they could buy the tickets. It would have been more reasonable if they had asked how much I wanted to be paid for handing over customers, or if they had replied that they could not help me for some technical reason. Need I mention that nowadays there are many companies in this market space? And there are, of course, a number of APIs to use.

Where possible, you should use standardized terms of use and licenses, which are plentiful - among others, the _Open Database License_ (ODbL) and _Creative Commons_ (CC). Try to find a standardized license that businesses similar to yours use; it makes things easier for developers, who can then combine your data with similar data without needing to engage a lawyer to look for legal conflicts between the licenses. It is worth remembering that most people turning to an API are better versed in technology than in law.

Things to answer in the terms of use include:

  * Is there any pay model? How many requests to the service are free of charge and when exactly does the service start to cost money?
  * The basic license for information purposes. If there are limitations in the general license, which license applies when, and what are these limitations?
  * If the information is not completely free - how can it be temporarily stored? It is an advantage if the terms, in plain language, can tell how long a re-user, for performance reasons or otherwise, can keep data downloaded in their own system.
  * Are there one or more usage quotas? Usually only a limited number of requests may be made to an API, and developers need to find this out early on so that they can conserve resources.

### No surprising the developers with unforeseen breaking changes

It is easy to think that you thought of everything, but almost every successful technical project requires future adjustments. When it comes to APIs, it is important to plan for this right from the start by designing for future additions. This means that you will version-manage APIs regardless of whether you envisage future versions.

Many developers put the version number in the addresses used for sending requests to the API, such as _/v1/_ or _/version-1/_. This makes it easy to see which API version the code is using. In addition, you do not have to worry about clashes between different versions, as each always has a unique version number. Having the version number in the address is standard practice among major APIs, perhaps because the alternative, putting the version number in the HTTP header, is more obscure; many developers do not even know about it.
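
As a sketch of versioned addresses, assuming the Flask microframework (my choice for brevity, not something prescribed here), two versions can live side by side like this:

    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/v1/buses/<line>")
    def buses_v1(line):
        # The original response format, kept alive for existing users.
        return jsonify({"line": line, "on_schedule": True})

    @app.route("/v2/buses/<line>")
    def buses_v2(line):
        # The newer version is free to restructure its response.
        return jsonify({"line": line, "status": "on_schedule",
                        "delay_minutes": 0})

    if __name__ == "__main__":
        app.run()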

The documentation must also be versioned so documentation is still there for those who are not using the latest version of the API. It is advisable to provide information on what differentiates the various versions from a technical perspective, but also the direction in which the API is developing conceptually.

In practice, it must be possible to use an older version of an API for a transitional period if others are using it. Good practice is to contact the users with information about what the changes are and how long they will be able to stay on the older version. You may not always be able to turn off access to older versions, but it is good to be open about your plans for continued support of versions other than the most recent. An absolute minimum is to give users of an API at least three months in which to manage their migration to a newer version. Expect to upset some users if you set too tight a deadline.

Because of future needs for change, and to get to know the API users, it may be a good idea to require, or offer, registration. At least for those who are continuous users or use the API for business purposes. The benefit to them is to be able to receive information on improvements, advance warning of new versions, and also a way to get in contact for support questions.

Since API usage may vary heavily from day to day, it is a nice gesture to offer a _soft quota_ with prior notice before the _hard quota_ strikes and causes a lockout. An example would be sending out an e-mail when 75 % of the quota is used up. Offering your API users the choice of when this warning occurs would be a great plus. Should you buy an off-the-shelf system to offer APIs on a larger scale, definitely ask about customizable quotas.
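
A minimal sketch of such a quota check in Python, with invented numbers and a configurable warning threshold:

    def check_quota(requests_used, hard_quota, warn_at=0.75):
        """Soft quota: warn before the hard quota causes a lockout."""
        if requests_used >= hard_quota:
            return "locked out: hard quota reached"
        if requests_used >= hard_quota * warn_at:
            return "warning: time to e-mail the user a heads-up"
        return "ok"

    print(check_quota(7500, 10000))       # warning at 75 %
    print(check_quota(7500, 10000, 0.9))  # this user chose a later warning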

Try as far as you can to put yourself in the user's shoes and it will probably turn out fine. Eat your own dog food, that is.

### Provide data in the expected format and in suitable bundles

There are two opposing approaches to data exchange that you need to reflect upon: providing data refined enough for direct use in other applications, or providing data in its original format. To exemplify the refined version, an API responds with true or false depending on whether a certain bus is on schedule. The more original format would be to expose all data on all buses, like a database copy.

Processed data is of course handy, but it simultaneously limits what others can do with the information. Getting a database copy is great for those who need to do just about anything other than the most obvious, but at the same time, the data set's timeliness quickly begins to fade. Should you yourself use the API, it is a bit easier to know what the first version might look like. But if others are to gain access, you should talk to them about what information to deliver for them to have a great experience. The risk is that you, with the best of intentions, compile data in a way that makes it impossible for others to take advantage of your API. Examples I have seen are services converting information into readable formats such as Word documents, which really only creates more work for developers, who must convert everything back to plain text.

It is a good idea to offer the original format in bite-sized packages, with ways to keep track of data that is updated regularly. To continue with the bus example, the packaging could allow retrieval of all the information about a certain bus line's planned schedule. Then add a simple API service that specifies how the bus line's current situation relates to the schedule.

Which format is best? Well, that depends on whom you ask. If you ask a non-developer, they probably think about how to access the information on a familiar type of device. The answer is certainly something that is familiar to most of us, possibly something from the Microsoft Office suite, Adobe Photoshop, PDF and the like. If you ask a developer on the other hand, they often think of the versatility of the format and you will get answers that are abbreviations, like JSONP or XML, or that they want everything you have in whatever the native format is.

The point is you have to know who your users are and whether you already know what they need. Otherwise, it is time to get hold of some representative users.

As a rule of thumb, think about which format is useful for quickly and easily exploring what the data source contains, then complement it with the formats used in other applications. It is common practice to offer several formats, leaving it up to users to choose for themselves. For example, it is easier for most people to get an Excel file when they need to look through the content on a single occasion, while you would rather have a CSV file, i.e. a comma-delimited text file, or similar if the content is to be processed by an application.

Also keep in mind what situation your users are in - do they need to download the content as a file, or is it to be used in a mobile app? It can be both, but your choice of path controls which one users get. Offering the very popular format JSONP is a good start for attracting web developers, who will recognize it and be able to use your API alongside their own data source without necessarily having to incur the inconvenience of making a local copy first.

The type of data can sometimes dictate the format. For financial information, a spreadsheet supplemented with a CSV file is most logical. For news or chronological information, _RSS_ or _ATOM_ is probably most suitable. For geographic information or map data, perhaps _GeoRSS_, _Shape files_ or _KML files_ are what you are looking for.

Often, the address of the API request states which format the response comes in. The reasoning is the same as with expressing the version number in addresses - it is easier to understand your code if it is clearly indicated what the response contains. It might look something like the address below, an API request to find out who won the _Nobel Peace Prize_ in 2015 - in this case, JSON is the format:

_http://api.nobelprize.org/v1/prize.json?year=2015&category=peace_
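
As a sketch, calling that exact address with Python's standard library might look like the snippet below. The field names in the response are assumptions based on the v1 API as it appeared when this was written; check the live documentation before relying on them.

    import json
    from urllib.request import urlopen

    url = "http://api.nobelprize.org/v1/prize.json?year=2015&category=peace"
    with urlopen(url) as response:
        data = json.load(response)

    # Assumed response structure: {"prizes": [{"laureates": [...]}]}
    for prize in data.get("prizes", []):
        for laureate in prize.get("laureates", []):
            print(laureate.get("firstname"), laureate.get("surname"))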

### Error handling and dimensioning of the service

A good public API needs to be predictable. You need to know in advance how an error will appear - technically. In some APIs I have seen, you can even instruct the API to provide an erroneous response to test-drive your application for this eventuality. An API is, as you probably have figured out by now, infrastructure others use and it should not be treated lightly regarding reliability compared to other major web services you offer.

It can have enormous consequences for a company's reputation if the API does not work. An example is Facebook's bug at the beginning of 2013, which affected almost all the sites that used Facebook's Like button. Large parts of the Web were not accessible at all. Well-known sites such as CNN, Huffington Post, ESPN, The Washington Post and many more went offline for not having fault tolerance regarding Facebook.

For API users to stand a chance of carrying out good error handling themselves, the API publisher must offer proper error handling. This includes everything from obvious things like using the correct HTTP status codes. You have probably seen 404 error pages when surfing the web; that kind of error message is helpful and developer-friendly for finding out what happened.
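
On the consuming side, the lesson from the Facebook outage above is defensive coding. A minimal sketch in Python, with an invented example URL: set a timeout, catch errors, and fall back gracefully so your own page survives someone else's downtime.

    import json
    from urllib.request import urlopen

    def fetch_with_fallback(url, fallback):
        """Never let a third-party API take your own service down."""
        try:
            with urlopen(url, timeout=2) as response:
                return json.load(response)
        except (OSError, ValueError):
            # Network trouble, a slow API, or malformed JSON:
            # degrade gracefully instead of crashing the page.
            return fallback

    likes = fetch_with_fallback("http://api.example.com/likes",
                                {"count": None})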

The biggest failure is probably when the entire API goes down because of an overload. It can be caused by many things, most of which are familiar to all developers. An interesting anecdote is about the company that manages the mass-transit system in Stockholm, Sweden. When they re-launched their website, they thought they were the target of a massive denial of service attack. In fact, it turned out that the earlier website had an internal, they thought, API to access traffic information. The API was not documented or advertised as a service for others to use. But that did not stop many popular services and mobile apps from using a direct integration to the API, which had now disappeared.

The reason that the new website went down was probably due to all the erroneous requests to the website where the API once resided, which can be more resource-consuming than when everything worked as intended. More on performance planning later in this book. The solution was to roll back to the old website, work together with all those who needed a public API and then publish the new website again.

An ingenious solution, which many probably overlook, is to design the API so you can prioritize traffic in difficult situations. In the sticky situation of having to throttle usage, perhaps the API should only serve the API issuer and paying customers.

Other problems that can arise are that you have not optimized the API for low use of related resources, such as databases, or that capacity simply runs out because of overwhelming popularity beyond your imagination.

Nowadays, it is so cheap to rent space at a major hosting provider that you should abandon the tiny hosts, at least for APIs, and keep a healthy margin on the resources the API depends on. The same may well apply to services hosted by the in-house IT department, even in a larger organization - a hosting provider can probably offer a comparable service.

Best practice is to have a subdomain like _api.mywebsite.com_ or _data.mywebsite.com_ , which allows you to put the API somewhere that prioritizes performance, and scalability, without automatically affecting your website's costs or settings.

### Provide code samples and showcase success stories

Those who want to use your API are not necessarily experienced system developers that have plenty of time on their hands. Therefore, what others can reuse should be included in the API documentation. For example, the API documentation should give tips on how to get started, code samples, or even more or less complete sample applications to download and use as a template. Remember, the goal of having a public API is for others to use it; otherwise, it is smarter to keep it entirely private.

Be open to linking to resources that may be useful, give tips on tools for crunching the data, and encourage those who use the API. Even if they do not pay for their use, such engagement at least justifies why the API was made public.

An easy approach some have embraced is to have a designated contact person who supports API users, mainly through a developer blog for news and a wiki for documentation. There you can publish information, respond to comments, and enhance the documentation. This is where you build relationships with API users - a task that requires technical skills, social tact, and a touch of market-based thinking.

### Promote via data markets and API directories

To reach out with your API, make use of various API directories and get listed. Internationally, programmableweb.com is by far the largest, with tens of thousands of listed APIs, but do not miss national or local directories.

    Figure 14: At ProgrammableWeb, you can search for APIs on just about anything.

Some directories require open license terms for inclusion while others are more of a data market where you can make money by selling your data through the service. On these services, you can see what competition there is in the field your API covers, which can be really good inspiration for what to offer in the next version or if collaboration with another organization is meaningful.

### What is the quality of data needed?

**Glossary - URI (Uniform Resource Identifier)**  
An address to an Internet resource; it looks like an ordinary URL. In the context of linked data, URIs are intended to give an address, via the Internet, to a thing, which can even be something in the physical world. You can view a URI as a name for something - a name that happens to be an address to the description of the thing.

**Glossary - linked data**  
Data that is compatible with other data and usually contains relational links, URIs that is, to these resources. Readable, processable and understandable to machines through an information model that is self-descriptive.

_Linked data_ is refined data so well standardized that it can be combined with other data sources across the Web. The challenge linked data tries to solve is bringing order, structure and context to the increasing amount of information we are trying to make use of - popularly known as _big data_. Linked data is a way to know whether the information we stumble upon is related to something else, and whether we can research further. Just as the Web is a network of documents, linked data enhances the Web with an ever-growing network of data. Data combined into facts and knowledge is the natural continuation of the Web.

The Web's creator, Sir Tim Berners-Lee, has set up four principles for how to link data, namely that:

  1. **Using URIs to name and identify things.** The 'thing' may be a document on the Web, a dataset or a physical location such as a tram stop. This provides unique naming and a common way to refer to things, whether they are online or in the physical world.
  2. **Using HTTP web-protocol so that URIs can be looked up, read by people and processed by machines.** In other words, there should be a page on the Web where you can read information about a URI, whether you are a human or machine, enter via a desktop computer or other type of device.
  3. **Provide useful information when the URI is looked up, by use of standardized formats such as RDF, XML and SPARQL.** Information such as the status of the thing, who is responsible for it, metadata such as keywords, etc. If a machine makes the request, the answer comes in a language machines can process, probably RDF. If it is a human using a web browser, you can expect readable information meant to address humans.
  4. **Refer to other resources using their URIs and thereby reference related URIs on the Web.** A thing's relatedness to other information sources or datasets is of high importance. For instance, a URI can declare that an older version of itself is found at a referenced URI's endpoint. All known and useful references are given; the list can be quite long, since both the organization's own internal URIs and external ones can be plentiful. A sketch of the principles applied follows below.
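
As a sketch of the four principles applied to the tram-stop example, a JSON-LD-style record can be expressed in Python like below. Every URI here is an invented example, not a real identifier.

    # 1. A URI names the thing; 4. relations point to other things' URIs.
    tram_stop = {
        "@id": "http://data.example.org/tram-stop/central-station",
        "@type": "http://schema.org/Place",
        "name": "Central Station",
        "partOf": {"@id": "http://data.example.org/network/city-trams"},
        "replaces": {"@id": "http://data.example.org/tram-stop/old-central"},
    }
    # 2-3. Looking up the @id over HTTP should return this description:
    # RDF for machines, or a readable page for humans.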

    Figure 15: The BBC's iPlayer website uses linked data to display information about what is playing.

The BBC built its iPlayer Radio service with linked data as a data source. It enabled a lovely Wikipedia touch with cross-linking within the service. Information related to what is currently playing is automatically pulled from external linked data sources. This integration of external sources is perhaps less tightly controlled than traditional use of external APIs. Linked data uses many techniques that allow a loosely coupled combination of several data sources' content. The vision is that the Web will become like one big database accessible to all.

For it to be worthwhile to contribute your own linked data, it should at least be able to relate to some other information silo. Or perhaps you are the natural issuer of URIs for something unique - for instance, a municipality naming its properties, such as schools.

We can benefit from the principles of linked data without intending to release the information outside an organization. For example, introducing enterprise-wide naming of important things is something enterprise architects can only dream about. Just as social security numbers are understandable points of reference for interoperability between systems, URIs can offer the same standardization benefits - instead of having things named differently in every database.

## Microdata - semantically defined content

**Glossary - semantic web**  
Refers to websites where the content and type of information is understandable and processable by machines. This enables the Web to become more of an enormous database. A network of data and not just a network of documents.

Mention the word _semantically_ and your colleagues instantly get a glassy look in their eyes - believe me, I have tried many times. It is not as complicated or boring as the word seems to suggest. The semantic web, or Web 3.0 as some call it, is the generation of the Web we see today: a more intelligent and relevant web. It weighs your location into the relevance model for which gas station might suit you - the ones that are close and have many positive reviews rank higher than the completely unknown ones without public contact information, located pretty damn far away from your current location.

For this to work, information needs to be clearly defined since no one has access to all information in a structured format. The complete picture is spread across many services that need to interoperate. Among other things, search engines are at the mercy of how well information is described on websites (and other sources of data) they themselves have no control over. Here microdata introduces itself as a savior for how your website can be a part of the semantic web, how your data can be self-explanatory to machines.

When using a search engine in recent years, you have likely already taken advantage of semantic features. Search engines and other technical systems try to understand the structure of the information contained in the content. This is easier for humans than for machines; we can understand what a text is about just by reading it. The possibilities for search engines to improve their understanding of unstructured information have their limits, not to mention the difficulty of understanding the content of a video or other media types.

At the most basic level, probably everyone has realized at this point the need to distinguish between headers and other content in text. It is nowadays, fortunately, not that common to see sub-headers which are just bolded text or images for headers instead of text. Just as headers make themselves noticeable for us visually when we skim through a text, they are also there to give structure to a document, a structure that improves its readability to machines. The very same thing that makes a machine understand that a particular text is a header is what lets the blind skim a text, by listening to the headers before choosing to load any portion of it to be read aloud.

Another context where headers are used is to measure the relevance of a text. If a search term is contained in a header on the page, it is probably more relevant than another page in which the word is only found in the unstructured body. This is used in most of today's search engines.

Identifying the types of information your website contains, and marking them up properly, can no longer be considered optional work. List things like contacts, calendar events, geographical locations, etc., and find a suitable standard that describes the information. We will go through some of these standards shortly.

### So, what is the problem?

Reflect upon how many varieties of dates you have encountered. Most likely, you can find a version in your calendar, another in your e-mail and a plethora on the products at the supermarket. I get grouchy to say the least whenever the date is given in the style of _06/11/10_ since I do not know which standard it follows.

Figuring out what _29_ stands for on the August page of your calendar is easy for us humans. On a website, if we click on a calendar and are presented with the same information, most of us will probably understand it in that context too.

However, it is not obvious to a machine to understand the context. Add to this that there are many different national standards and industry standards that specify how dates should be formatted. A week number in a date context is another problem; only a couple of European countries and a few in Asia have a grasp of the concept that weeks may have numbers.

Of course, this problem is not limited to data describing dates or times. Among many examples worth mentioning are distance, geographic location, and units for measuring weight. In fact, the problem exists in all information.
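
A minimal sketch of the date problem in Python: the same string parses to three different dates depending on which national convention you assume, while the ISO 8601 form is unambiguous.

    from datetime import datetime

    raw = "06/11/10"
    conventions = [("US, month first", "%m/%d/%y"),
                   ("European, day first", "%d/%m/%y"),
                   ("Year first", "%y/%m/%d")]
    for label, fmt in conventions:
        print(label, "->", datetime.strptime(raw, fmt).date().isoformat())
    # US, month first -> 2010-06-11
    # European, day first -> 2010-11-06
    # Year first -> 2006-11-10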

### The potential of semantic information

One of several dreams that remain to be realized on the Internet is that the Web should act as a giant structured database so that we can get precise answers to almost any question. Right now, the Web is a half-structured database where it is difficult to know what is what, at least for machines.

    Figure 16: Displaying related dates in Google's results-page.

    Figure 17: A burger-joint's rating directly with Google Search.

Those who are interested in search engine optimization have probably encountered SEO best practices with structured data by now, or read tips about working with something called RDFa. Structured data consists of enriched snippets of information that are self-descriptive to a machine. The goal is to be more precise about the nature of your content so that Google will reuse the data, which gives your website a competitive edge.

Besides the possibility of gaining more space for your website on the search engine's results page, it also demonstrates to the user that there is more related or ancillary information on the website. So it is not just for the search engines' sake that you mark up your content. Other uses include what you, as a visitor to a website, can do with the information: click on a phone number to make a phone call, import contacts to your address book, or add an event to your calendar directly from the website - because your browser understood the content.

Users taking advantage of semantically marked-up information themselves has not exactly taken the world by storm. There is now a chance that this will change as the Web is increasingly used from other types of devices, where the lack of a keyboard and mouse can be alleviated by these opportunities, and information can be used in a more intuitive way adapted to each type of device.

To regard the Web as a distributed database, or a global document management system, is perhaps not so strange. What semantic technology adds is to put the document's type on each document, or subset within a page. If a web page contains calendar information and a geographic location, it helps other systems to give the user a choice of filtering within a larger amount of information. The content can declare what it is.

### Microdata standards such as Schema.org and Microformats

Microformats and Schema.org are the two most common ways to extend the semantics of web pages today. Both consist of a number of specifications for how HTML code should look for marked-up content to stand out from the body text and other HTML elements. Microformats was released early as a standard under continuous development, with the idea of offering simple solutions that extend the HTML standard with semantic meaning - for instance, for marking up contact information, among other things.

It involves adjusting the HTML code to follow certain patterns (see the examples below) so that the code conveys information in a specific, machine-readable format while the page keeps a reader-friendly presentation. A related technique is _RDFa_ (Resource Description Framework in Attributes), yet another method of enriching code with content types.

Example code in Microformats describing a person's contact information:

    <ul class="vcard">
      <li class="fn">Jane Doe</li>
      <li class="org">Acme Inc</li>
      <li class="tel">555-12 34 56</li>
      <li><a class="url" href="http://example.com/">http://example.com/</a></li>
    </ul>

A geographic location in northern Sweden marked up with Schema.org:

    <div itemscope itemtype="http://schema.org/Place">
      <span itemprop="name">Bräntberget Ski lifts, Umeå</span>
      <div itemprop="geo" itemscope itemtype="http://schema.org/GeoCoordinates">
        <meta itemprop="latitude" content="63.841066" />
        <meta itemprop="longitude" content="20.311139" />
      </div>
    </div>

Schema.org is an industry standard from Google, Yahoo and Bing that began in the summer of 2011. It is a joint effort to remedy the fact that Microformats development has slowed down significantly since 2005. Schema.org is your primary choice if you have not yet begun adding microdata to your information.

There are many circumstances where microdata can be used to enrich information. Here is a short list to exemplify the scope:

  * Contact information and authorship.
  * Geographical locations.
  * Events.
  * People and organizations.
  * Health data and medical procedures.
  * Products, offers, reviews.
  * Books, movies, recipes, paintings.

Most of these can be combined to give a geographic position in a company's contact information, for instance. The full list of entities is quite long.

What this structured data is then used for varies from service to service. How it looks on Google is something we all notice fast, and it is quickly included in the best practices of search engine optimization - not only to lure visitors from the search engine but also to increase the page's value in the search engine's algorithm. Better structure is a qualitative measure many of us can improve. There is nothing to prevent other actors from taking advantage of the same microdata in their own services - it is just as public for anyone else as it is for Google, such as an organization's own enterprise search engine.

## Digital Asset Management (and Adaptive Content)

**Glossary - Digital Asset Management**  
The structured work to collect, describe, keep and use digital resources in a usable archive. Sometimes known as an image bank, but usually contains more than just pictures. Often called DAM (Digital Asset Management) or MAM (Media Asset Management).

**Glossary - Adaptive Content Management**  
In essence the same thing as DAM, but with a focus on multi-channel challenges, such as serving different versions of material depending on the type of device it is to be consumed on: mobile phones, desktop computers, wearables, televisions in shop windows and so on.

The benefits of _Digital Asset Management_ (hereafter called _DAM_, and including _Adaptive Content_) are primarily two-fold. The first is having a central location for storing and managing media files for repeated use - a place to look into, or to integrate other systems with. The second is distribution: the DAM system is often responsible for delivering material to other information systems. In an enterprise scenario, it is easier to adopt a single system that is skilled at optimizing images for the Web, or at streaming video suited to the receiver's available bandwidth, than to have every web system do this itself. In large organizations, it is common to have many systems that are accessible through the Web, but many of them have great problems living up to the Web's fast-changing needs.

DAM described on Wikipedia:

> "Digital Asset Management (DAM) consists of management tasks and decisions surrounding the ingestion, annotation, cataloging, storage, retrieval and distribution of digital assets. Digital photographs, animations, videos and music exemplify the target areas of media asset management (a subcategory of DAM)."

A DAM system may not be a perfect fit for you if you run a personal Wordpress website. Nevertheless, to regard all media files, such as images, audio, and others, as digital resources may be worthwhile anyway. The day may come when we ask why we did not have more foresight regarding file structure.

The similarities with document management for large organizations are striking. It is about preserving the original files, standardizing metadata that is attached to each file, controlling who has access to what files and offering accessible copies via websites. Exemplified by a photograph, the original is a digital raw file taken directly from a camera; the metadata describes the photo's properties such as the aperture and its shutter-speed, access rights, license for the image and so on. The viewable copy via the Web is a JPG image optimized for viewing on the Web.

Examples of files that can be part of a DAM system:

  * Photographs.
  * Illustrations, information graphics and images.
  * Brochures, graphic productions and originals of other printed matter.
  * Videos and movies.
  * Sound clips, podcasts and audio effects.
  * 3D-printer files and other digital drawings.

Usually you leave out the most obvious office documents, like word processing and spreadsheets. Exactly where to draw the line can be quite difficult and I have seen examples of DAM systems that also stored Excel files. Perhaps because the DAM systems did a better job than the document management systems.

The most basic editing capabilities are often included in a DAM system, for example, to crop an image. This edited version of the material is stored as a copy of the original so it is possible to monitor the use of an image.

Examples of factors that suggest the need for a DAM system - instead of using the upload directory on your website - can be:

  * Access control for serving logged-in users a certain image while others see a very low-resolution version, thus encouraging registration on the website.
  * Special access to give freelance photographers and print shops access to project files, so you do not have to send physical stuff such as USB sticks or DVDs.
  * Management of multiple channels where a DAM system makes it easier to have an overview of communication across the ever-growing number of devices and channels.
  * Personalization such as where visitors get content which is popular in their geographic vicinity, or videos are automatically subtitled in a language they understand.
  * Device customization by sending materials in high DPI to devices that support this, or to send the best suited format for the device.
  * Connectivity customization to send, for example, streamed video tailored not only to the recipient's device but also in the best possible resolution for the bandwidth available.
  * Context factors where, if a user's device is low on battery, you limit the amount of network traffic, or to adapt the content's contrast to the lighting conditions around the user.
  * Target audience customization so that information sent is comprehensible to the recipient based on their level of knowledge; this is solved by relating different editions of content classified by target audience.
  * Legal requirements such as when using images from an external stock photo provider and the need to keep track of which possibly time-limited publication licenses these images have.
  * Findability since it is easier finding content if it is stored in a single location, probably with more consistent metadata than if managed in lots of systems.
  * Support marketing to carry out A / B testing and ensuring that the system, after a completed A / B test, sends the best performing material to end-users.
  * Analysis of content usage is something more people should pay attention to. Do you know how many times the files in your upload directory were downloaded? Perhaps it is your most popular content. Is there deep linking to it from other websites, to content you cannot find in your analytics tool? If it is all there in one place, it is easier to track.

Holy shit, you might think. All of that, yet nothing that concerns you and your little website. The snag is that the need for this type of orderly handling sneaks up on you over time. It is never an attractive idea to stop producing content and reflect on inconvenient shortcomings that need massive work to overcome.

Imagine that you have a super profile photo of your CEO, such a photo may end up in/on these and other places:

  * Your own website for contact info.
  * On your intranet.
  * In the human resources' system and the CEO's corporate ID card.
  * On the mobile website, in the mobile app and tablet app.
  * In printed matter, such as the company's annual report.
  * In e-mail newsletters.
  * On internal and external blogs.
  * On Micro-blogs, like a Twitter profile picture.
  * In social media on official accounts.

Losing the photo's original makes life tricky in just a few of the above scenarios - for instance, when all you really needed was to crop the photo for some new hyped online service. All official material in use needs to be in a common location, searchable by anyone who needs to communicate. Having a DAM system available to everyone in the organization, and not under the total control of an old-fashioned print-oriented communication department, is therefore a goal to aim for.

The challenge with a DAM system is that it never gets better than its content, or its structure. Who should be allowed to contribute content? Anyone, or only a qualified media editor? Personally, I would probably push you to go for a mix of professional editors and just about anybody - maybe in two different zones. The 'demilitarized zone' is where the organization keeps enough control to dare to invite media and partners; probably press photos and other resources for public relations. Then there is the place where various levels of chaos thrive. Like Dante's circles of hell, you might plan for distinct levels of structure in user-generated content. This is managed through thresholds set by the organization, and those who contribute content choose the level of structure and findability for their content. If the user intends, or allows, others to reuse the content, the DAM system guides the user to enter the right amount of metadata and to set permissions correctly - with an encouraging interface, perhaps using game mechanisms, guiding the user to give enough information that someone else has a chance of finding it later on. Another choice is to require a certain minimum number of keywords and categories before the user can add content to the DAM.

    Figure 18: The stock photo website Freeimages requires at least five keywords to save an image. Also, you can have up to three categories.

Keywords may need to be given a graphical representation rather than assuming that users understand the need to type a comma between each keyword. A keyword suggestion service is a good idea to create some kind of uniformity in keywords.

    Figure 19: Users need support adding and removing keywords, using the same wordings, etc.

The information source that provides keyword suggestions may be a centralized metadata service that keeps the vocabulary in use and curates your own folksonomy. Such a metadata service would, of course, also be used by all other information systems. It is important that the metadata service does not only suggest words but also assigns a unique identifier to every word. Words may appear in several places in a vocabulary with a tree structure, or across a number of similar glossaries, and you need to keep them apart.

A media item labeled with a keyword needs to keep track of both the word as plain text and the keyword's unique ID as a reference to the metadata service, as in the sketch below. It will then be easier in the future to move a referenced word from the folksonomy into a more orderly, customized vocabulary, since you do not have to change the metadata on all content previously tagged with that word. It also helps if you decide to go through the folksonomy in search of synonyms you want to associate with each other.
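
To make this concrete, here is a minimal sketch in Python. The metadata service, its keyword IDs and the asset structure are all invented for illustration; the point is only that each asset stores a stable reference next to the plain-text label, so a rename happens once in the vocabulary and never touches the tagged content.

    # Hypothetical suggestions from a central metadata service: each keyword
    # has a unique ID in addition to its human-readable label.
    suggested_keywords = [
        {"id": "kw-1042", "label": "ski lift"},
        {"id": "kw-2077", "label": "Umeå"},
    ]

    # Tag a media asset with references, not just free-text strings.
    asset = {
        "file": "braentberget-lift.jpg",
        "keywords": [{"ref": kw["id"], "label": kw["label"]} for kw in suggested_keywords],
    }

    # The vocabulary maps IDs to current labels; renaming happens here only.
    vocabulary = {kw["id"]: kw["label"] for kw in suggested_keywords}
    vocabulary["kw-1042"] = "chairlift"

    # Display labels are resolved through the vocabulary at read time.
    print([vocabulary[kw["ref"]] for kw in asset["keywords"]])  # ['chairlift', 'Umeå']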

    Figure 20: On Youtube, you can link to a specific position in a video.

Also keep in mind that there is other metadata that needs to be entered for the information to be useful - in all honesty, the magical day when we have time to fix it all afterwards is unlikely to come. Video and audio clips may need time codes for when various sections occur inside the clip. They are the clips' equivalent of the obvious headings in a text; the chapters help us skim through the content. On Youtube, for example, you can right-click on a video and get a link to the exact position in the clip, and if you write a time reference in the format _minutes:seconds_ in a comment, a link is created that refers to that position in the video. This is the equivalent of a chapter in an audiobook or movie. If this information is entered in the central DAM system, it can offer an automatic table of contents for podcasts, for instance.
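
A small illustration: a sketch of how a DAM system could turn stored time codes into a table of contents with deep links. The chapter data is invented, while the t-parameter with a seconds offset is the form short Youtube links accept.

    # Chapters stored in the DAM as (seconds, title) pairs - invented data.
    chapters = [
        (0, "Introduction"),
        (95, "Interview"),
        (430, "Summary"),
    ]

    def table_of_contents(video_url, chapters):
        """Build a minutes:seconds listing with deep links into the clip."""
        lines = []
        for seconds, title in chapters:
            minutes, secs = divmod(seconds, 60)
            lines.append(f"{minutes}:{secs:02d} {title} - {video_url}?t={seconds}")
        return "\n".join(lines)

    print(table_of_contents("https://youtu.be/VIDEO_ID", chapters))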

Those who frequently use podcast apps may have noticed that the cover image in a podcast sometimes changes with each chapter, which suits, among others, DJs who want each chapter's image to show the original artist's cover.

    Figure 21: Podcast app Overcast showing chapter-info for the podcast of Avicii.

There are metadata standards for many specific types of files. A photograph contains EXIF data with information about the shutter speed and other camera settings, which can be useful. ID3 is a tagging system for audio and music that can be worth using, mainly because it is embedded in the media file and therefore follows the file wherever it might end up.
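
Reading such embedded metadata is straightforward. Here is a minimal sketch using the Python imaging library Pillow; the file name is invented, and which tags are present depends on the camera:

    # Read a photo's embedded EXIF metadata with Pillow.
    from PIL import Image
    from PIL.ExifTags import TAGS

    with Image.open("ceo-portrait.jpg") as img:  # invented file name
        exif = img.getexif()

    # Translate numeric tag IDs to readable names, e.g. Make, Model, DateTime.
    for tag_id, value in exif.items():
        print(TAGS.get(tag_id, tag_id), value)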

### Adaptive Content

Adaptive content, as in versatile content, appeared as a concept around 2012 and is a subset of what many probably would expect from a fully featured DAM strategy. It addresses the challenges of keeping control of content regardless of which channel it is communicated through, though most seem to be concerned with mobiles and wearables. The goal is, like the idea of responsive web design, that the content is adapted to the actual context.

> "Think of your core content as a fluid that gets poured into a huge number of containers. Get your content ready to go anywhere because it's going to go everywhere."
> 
> \- Brad Frost

If you make a video, it should be sent in an optimized format suited to each channel's ability to deliver results. Sometimes you stumble across examples of when things are not as intended. Among other things, I noticed that a local tech-shop franchise ran a campaign in their storefront where the commercial was made for landscape mode even though the screen was mounted in portrait. At first, I did not understand what made it look weird, but as soon as I saw the odd shape of a wheel, I got it. The proportions in the film suffered heavily because the missing black bars above and below the picture meant it was stretched to fill the screen.

In your own storefront, you will probably not have problems with bandwidth and can show the best possible resolution without undue compression. The same videos, when presented online, need lots of compression - especially if they are to be streamed, since you risk lag if the bandwidth is not good enough. It might be easier to think of this in terms of still images in an ad context, as in Google Adwords and the like, where there are a number of standardized image sizes for everyone to use. We should apply the same concept when storing files in a DAM system. It is not just about content needs and opportunities but also about the receiving devices. The difficult balance is to avoid creating lots of optimized content for each recipient and instead try to cover as many scenarios as possible.

    Figure 22: Proportions not cutting the mustard when screens are rotated 90 degrees.

> "Fragmenting our content across different 'device-optimized' experiences is a losing proposition, or at least an unsustainable one."
> 
> \- Ethan Marcotte, author of the book Responsive Web Design

It is not only the proportions, or the size of pictures, that are challenging. The list of challenges can be quite long, depending on what you consider to be included in the DAM system's responsibilities for multi-channel communication. Personally, I would add that this system, regardless of what your other content systems are capable of, should be able to send context-aware content so that the consuming devices' needs are met.

Also list all the necessary features you need, for instance whether:

  * Pictures are sent in normal and, maybe, high resolution too (what Apple users call _retina_).
  * Video content is to be streamed and / or downloaded.
  * Resolution on the material is to be varied depending on system-external circumstances such as what the consuming devices can handle in each case.
  * Compression ratio should automatically be adjusted and also selected manually. Do you need compression to be optional for files created with professional tools where the uploader has already sorted out the compression issue?
  * Format may depend on the receiver. Text may come as Word, HTML, PDF, Markdown, and others.

What will you send an **Iphone 6S** with its **retina display** locked in **landscape mode** when **on a medium fast cellular connection** and **located outdoors at noon**?
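
As a thought experiment, here is a minimal sketch in Python of how such a choice could be automated. The rendition names, signals and thresholds are all invented; a real DAM system would weigh many more factors.

    # Pre-produced renditions stored in the DAM - invented names.
    renditions = {
        ("landscape", "2x"): "hero-landscape@2x.jpg",
        ("landscape", "1x"): "hero-landscape.jpg",
        ("portrait", "2x"): "hero-portrait@2x.jpg",
        ("portrait", "1x"): "hero-portrait.jpg",
    }

    def pick_rendition(orientation, pixel_ratio, mbit_per_s):
        # On a slow connection, do not pay the bandwidth price of retina images.
        density = "2x" if pixel_ratio >= 2 and mbit_per_s >= 5 else "1x"
        return renditions[(orientation, density)]

    # The scenario above: retina screen, landscape mode, medium cellular speed.
    print(pick_rendition("landscape", pixel_ratio=2, mbit_per_s=3))  # hero-landscape.jpg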

It is similar to matchmaking. Would you prefer to reuse an old advertisement in HTML5 format? Can the device requesting the content even display HTML5, or should a list of other options be consulted, such as sending a picture instead, or nothing at all? Sometimes it is pointless to try to send anything, and you should instead ask the recipient for an e-mail address to send the content to. The point of adaptive content is measuring how well your choice, and content, performs for actual users. This is done with _A / B tests_, a form of competition between two versions (version A and version B) which are randomly distributed to visitors during a limited time. The one that continues to be used after the test is the one that performed best for users. This way, you know what works in the respective situations users find themselves in. What works might differ between desktop computers and mobile phones, or other segmentations.
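
The random-but-stable assignment an A / B test needs can be as simple as hashing a visitor ID, so the same visitor keeps seeing the same version for the duration of the test. A minimal sketch, with an invented experiment name:

    import hashlib

    def ab_variant(visitor_id, experiment="storefront-video"):
        """Deterministic 50/50 split: one visitor, one stable variant."""
        digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
        return "A" if int(digest, 16) % 2 == 0 else "B"

    print(ab_variant("visitor-1234"))  # always the same letter for this visitor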

### Image and media banks in your publishing system

For smaller organizations, or those with lesser needs, one of the image and file managers integrated into our web content management systems is certainly good enough. These systems have a slightly different focus. The ones I have used are Imagevault, which has a bit of DAM functionality and an okay API but whose main strength is that it is a common combination with Episerver CMS, and Fotoweb, a more complete suite for those who need advanced search capabilities and integration with the software professionals in the graphic design industry use, such as Adobe Indesign.

    Figure 23: Screenshot from the Media Asset Management product Fotoweb.

Before choosing an off-the-shelf system, we should be clear about what feature set we expect. Perhaps the following are important to you too:

  * Using a suitable resolution, regardless of the size of the image in the system.
  * Images should be sent optimized for the Web. Will it be possible to have exceptions from system optimization in the case that this gives a poor result? Manually override when necessary?
  * Does your DAM need to be able to manage video? Streaming and / or download?
  * Access control, or are all files for everyone's use?
  * Should high-resolution content be sent to retina screens?
  * Is it easy to add external images and keep track of the licenses we accepted?

For those who want manual control over the optimization of a picture, I can recommend, besides Adobe Photoshop, the app _Imageoptim_ 16 for Macs. Just drag and drop images or folders in the window and it will fix it all.

_Smush.it_ 17 is the service to use if you want automated optimization of images. It is also virtually lossless to the human eye.

Personally, I think manual image editing is good enough for most of us if we have good structure in the image or file management in the web content management system. If you want to take your image management for the Web to the next level, go for a DAM system, even though it is a big project to manage.

### Personalization of information

With the help of meaningful metadata, services have more freedom in what to show to website users. Personalization is about matching content with an interested receiver - being proactive in our communication. What is shown is controlled by metadata: metadata about the content and metadata about the intended users. Personalization is not about making the content personal; it is supposed to be individual and contextual, which works to some extent even when we do not know who the individual is. I bet you have used a service that managed to profile you and serve you personalized content. A personalization I often run into is what Google presents in its Knowledge Graph, where things I search for turn up with geographic vicinity as a crucial personalization factor. The Medical History Museum in Gothenburg was chosen in the right-hand column instead of the corresponding museum in Uppsala - probably because I was in, or am associated with, Gothenburg.

    Figure 24: Knowledge Graph showing up on right-hand side of desktop.

There are two types of data about a user, namely:

  * **Explicit data**, which the user has actively given up by logging in as a customer, through customer records, completed forms and other deliberate ways of sharing information.
  * **Implicit data**, which is a user's behavior revealing information such as special interests, gender and other demographic data.

    Figure 25: The bookstore Adlibris wants users to disclose if they are representing themselves, or a library, business or government. Business or pleasure. Also which language I speak.

Examples of what you may know about each user are:

  * **Where are they located?** There are many techniques to decide, more or less accurately, where a person is probably located.
  * **What equipment is used?** Whether the user is on a computer, tablet, mobile phone or another device might affect their behavior and what they are looking for during a visit. For example, you will probably not buy a house via a mobile browser, but browsing pictures or looking up facts is common.
  * **Is it their first visit, or a return visit?** If there were earlier visits, cookies may be used; if the visitor is logged in, you can store usage data without the user's active consent. Did the user look at the same content during an earlier visit? Did they abandon a shopping cart with contents that may be worth reminding them about? To some extent, we can use so-called _remarketing_ through ad systems to bring returning visitors to landing pages optimized for them and highlight products to remind them about.
  * **Where was the user before they ended up on your page?** Did they search for something specific in a search engine, follow a link from your own newsletter, or come from a page with a context that should influence the content of your page - for example, a campaign on social media?
  * **Is there available information about the user's preferences?** If there is a logged-in customer, you can collect your own data. If your service is, for example, an online video service, there are certainly conclusions to be drawn from which types of video a user tends to watch. Consider whether there is any data you can take advantage of. The chosen language on a website certainly reveals more than just the mother tongue of a user; bestselling books on Amazon most certainly vary with the language a user is fluent in.
  * **Does their navigational behavior reveal something?** If a user stays within one category of content, or constantly jumps between different categories, can that indicate a special interest, or possibly a wish to be surprised? Do search analytics, since users of the search function enter words that explain what they are after.

In practice, you build categories of prospective users. One category probably needs to be '_others_', which is where the default, non-personalized users end up - the ones you do not have enough data on just yet, the equivalent of a non-personalized website's standard mode, as in the sketch below. If enough information about a user points to a specific category, the threshold is reached and the user starts to receive targeted content. These categories are to be seen as indicative of what is shown on each page; some things may still be shown regardless of personalization. For example, travel agencies might prefer to suggest your _nearest_ major airport of departure regardless of the personalization category you are in.
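
A minimal sketch of such threshold logic, where signals, categories and the threshold value are all invented:

    THRESHOLD = 3  # how much evidence we require before personalizing

    def categorize(signals):
        """signals is a list of (category, weight) pairs gathered from behavior."""
        scores = {}
        for category, weight in signals:
            scores[category] = scores.get(category, 0) + weight
        best = max(scores, key=scores.get, default=None)
        if best is None or scores[best] < THRESHOLD:
            return "others"  # not enough evidence: the non-personalized default
        return best

    # Two page views in a ski category plus a search for ski wax.
    print(categorize([("winter-sports", 1), ("winter-sports", 1), ("winter-sports", 2)]))
    print(categorize([("winter-sports", 1)]))  # 'others'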

#### Examples of personalized content

If there are enough good reasons to affect a page's content, it is sent in an altered form to the user, following pre-defined variations. It may, for example, determine which products you suggest or which office's contact information is shown. This is done using thresholds, which can be placed at different levels of the website depending on what type of website it is. For an international company, you can regard it as a competition between different contact details, where the user's geographical vicinity and language are the determining factors. Thresholds determine whether a user has met the criteria of a personalization category and, secondarily, which category wins if there are several.

Take the scenario of a global sporting goods store, for example. Whether the user lives in the northern or southern hemisphere can be quite crucial to the type of goods that are appropriate at a given time (because the seasons are opposite, of course). Compare that with a global music store, where there are also regional differences, but probably not as dramatic - there it is more about providing what can be delivered on the user's market.

To return briefly to the example of the online bookstore: where do you think they should preferably highlight textbooks for college students - in the '_Private_' or the '_Companies and Government library_' category? Most divisions are there to position the more relevant content for the different categories of users; however, the design is not always so obvious that you, as a user, must choose between a blue and a red pill yourself.

A lingerie boutique would probably be glad to know its users' gender - on a normal day to display men's underwear to men, but just before some holiday or special day to suggest gifts for a statistically probable girlfriend. Is there any pattern in the user's browsing behavior that reveals gender? Is it possible to find out the user's gender through an external advertising platform?

The car manufacturer Volvo differentiates between those who intend to buy a Volvo and those who have already bought one; their needs are slightly different. A potential customer might need suggestions on financing through Volvo's financial services, while those who already own a Volvo might need to be reminded of the benefits of original spare parts and authorized workshops. Not only that, they try to find out how close to buying a person is. My translation into English:

> "Volvo maps potential customers based on how close they are to buying. Those who seem to be far from making a purchase are going to look at the products, pictures and movies. Those who are closer to buying want to book a test drive, make detailed choices and get price quotes. Therefore, we present different content based on the user's past behavior."
> 
> \- Mikael Karlsson, mobile manager at Volvo Car Corporation

Not everyone is as obvious about it as some tech outlets. When visiting some of them, you are asked whether you would like to enter as a company or as an individual consumer. If you choose to enter as a company, campaigns for servers and network equipment are shown on the homepage. If you instead choose to enter as an individual, prices include VAT and the enterprise offers are gone; instead, they make room for wearables, televisions and gaming video cards.

Perhaps the simplest variant is all those websites trying to decide in which state or region a user lives and forwarding them to a regionalized landing page. Most often also visible in the address bar, regionalization affects what news and other things a visitor sees. In cases where it is not possible to determine the visitor's location, a neutral variant - a default mode, if you wish - appears. In essence, there are two categories of experiences within the very same website: those '_positioned within a regional setting_' and '_everyone else_'.

In a content management system supporting personalization, it is important that there is a feedback loop for web editors so no content is created which is unlikely to reach a user of the personalized website. In what ways can you categorize your users or customers? Without annoying them?

Something that is guaranteed to annoy your users is when they try to follow a link that turns out to be dead. Time to talk a little about the delicate subject of URL strategy.

## URL strategy for dummies

**Glossary - URL (Uniform Resource Locator)**  
The full address of a web page or resource on the Internet, for example _http://mywebsite.com/contact/_, following the scheme _protocol://domain.topleveldomain/subfolder-or-file.extension_. Often called web address, or address for short.

URLs are addresses on the Web and are used as a common reference to some kind of online resource. This something can be the homepage of a website, a sub-page, an image or any other type of resource. URLs are important for humans as well as machines to be able to address web pages, uploaded files or data. Just as a physical street address is expected to persist over time, it is preferable that an established address lives long and has roughly the same content at your next visit.

It is easy to forget what happens when you create a new page or put up new material. If the content attracted any attention at all, the risk with a broken URL is that the following occurs:

  1. **Links from other sites to yours will break.** It may be that the URL is used on someone else's intranet, which you cannot figure out until you eventually see a certain pattern in the statistics for your 404 error page.
  2. **People who bookmarked the page end up on an error page.** Nowadays few people bookmark in the browser to the same extent as before, but it is not yet an irrelevant point.
  3. **The website loses value in search engines.** It can be risky, especially for those who depend on search engines for their traffic, to scrap many addresses the search engines are already familiar with - in part because an older URL is always worth more than a new one, but also because there is a built-in suspicion in search engine algorithms; they are, after all, battling search engine spam on a daily basis.

Search engine optimizers think it is worth working hard to get natural links to a website. Seen through their eyes, it is obviously an incredible waste not to take an interest in established addresses that are already known to search engines.

Quality indicators of a URL are that it should:

  1. Be designed to persist over a very long time.
  2. Specify who the sender, or owner, is.
  3. Describe what is to be found at the address.
  4. Be as brief as possible, not contain non-essentials, and be easy to memorize or read out over the telephone.
  5. Follow the naming standard, i.e. not contain special characters, no capital letters, no underscores, etc.
  6. Have been around for a while, which is a sign of seriousness for a search engine.
  7. Refer to something unique, in other words, there should only be one way, a single URL, to reach the unique content.
  8. Be functional. If the address is hierarchical, it should be possible to hack it: erase parts of the address to reach a higher level in the structure.
  9. Send the correct status code according to HTTP. A missing page is 404; a URL that has moved should answer with status 301 pointing to the new address, and so on.
  10. Have some sort of spell-checking feature so it can cope with mishaps such as the (unnecessary) www prefix, or trailing slashes present or missing (see the sketch after this list).
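
As promised, a minimal sketch of the forgiving behavior in points 9 and 10: normalize the unnecessary www prefix, capital letters and trailing slashes, and answer with a 301 redirect to the canonical form rather than a 404. The domain is just the glossary's example:

    def canonicalize(host, path):
        """Lower-case everything, drop the www prefix and trailing slashes."""
        host = host.lower().removeprefix("www.")
        path = path.lower().rstrip("/") or "/"
        return host, path

    def respond(requested_host, requested_path):
        host, path = canonicalize(requested_host, requested_path)
        if (host, path) != (requested_host, requested_path):
            return 301, f"https://{host}{path}"  # permanent redirect to the canonical URL
        return 200, None

    print(respond("www.mywebsite.com", "/Contact/"))
    # (301, 'https://mywebsite.com/contact')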

This, of course, requires inclusion in the design of a web system, or inclusion as a requirement in procurement. All those who can influence the choice of addresses need to be informed about the intended URL standard. Web editors should really spend more effort on the address than on the headline, since the address is not something they can change without penalties later on.

Please document your view of a good URL strategy and try to follow it. If you have editors, it is a good idea to inform them, especially for time-bound information such as calendar events and news. Not infrequently, uploaded files in particular are published without much reflection on the file name, which usually affects their URL.

### Common excuses for breaking established URLs

As the header suggests, I believe that on most occasions when broken addresses occur, it is because something other than the users' best interests was prioritized. But it is hard to blame anyone, as most people do not seem to have spent much time thinking about this subject. I have never encountered a URL strategy during my 18 years in the web industry - I may well have to write one myself someday.

Now some examples of what I have been told is the cause of broken URLs, and suggestions on how to work.

#### "But we have closed that website, now the same info is located over there..."

That a website is shut down is nothing strange or uncommon. Sometimes, however, it is replaced by another website on a different domain. Then you have the chance to retain some of the advantages of the established addresses. Here are some common variations on how not to handle old addresses:

  1. No matter what the requested address is, whether it has a new counterpart or not, the user is sent to the new website's home page.
  2. Only when requesting the old home page, visitors are redirected to the new website. All addresses except the old home page are broken.
  3. The redirection of the old URLs is temporary and stops working after a couple of years, since no one believes them to be used anymore.

If the first point affects you as a visitor, you will be surprised or annoyed, especially if you are in a hurry - it was not quite what you had hoped for. If the visitor is forwarded without warning, the new page has to meet the needs of the originally requested page; otherwise, you are supposed to say what has happened to the old website. Based on which pages are popular, or important for other reasons, the user should be referred to matching or corresponding content. More on that later.

#### "We have replaced our content management system and the new one could not handle..."

The requirement for any new web-related system should be both that it takes care of established URLs, which is not as complicated as you might think, and that new URLs are not limited by any form of system standard. Many governmental actors in my vicinity chose Episerver CMS in the early 2000s and thereby got a lousy system standard in the form of an unnecessary subfolder with design templates. As icing on the cake, the template's name and the page's ID number also appeared in the address.

Addresses such as _www.municipality.se/municipality-templates/OrdinaryPage____67241.aspx_ were common and are still seen sometimes. Imagine the usefulness of such an address if you have to read it aloud to anyone. How many underscores are there? Will you remember the address tomorrow?

When upgrading to a newer version of Episerver CMS, which had more sensible URLs, or when replacing the CMS, there were already established, ugly addresses to be sweating over for a long period.

#### "The old addresses were so incomprehensible - the new ones are user-friendly"

Excellent. But if you are interested in address quality, you should be interested in dealing with all the old addresses, even though they were ugly. Right? The old, ugly addresses often contained identification of how information is retrieved from a database, probably a series of figures. You can catch the user's intent and serve the right content, even after a system change.

    Figure 26: Reference URLs found in a book on toxicology. Try guessing the content of the last two URLs...

Taking care of old, ugly URLs, or at least providing web editors with a manual tool for it, is something serious web agencies have offered for years. As usual, almost anything is possible, and such solutions are luckily on the cheap side. In the majority of cases it makes sense to do something - and to make sure to add it to the requirements.

#### "We archive old pages ongoing"

Continuously archiving published pages is honorable in a way, but what is the reason? Sometimes you hear that news should be removed because it is out-of-date and that calendar events should be removed shortly after the event is over and done.

What is forgotten then is that the Web is an excellent archive, and information can still have value even though its timeline stretches further back. Is there perhaps a problem with the information structure that makes you annoyed by older content? Does it pollute the content found in your search engine?

A counter-argument would be that, using metadata, you can instruct the search engine that the content is not very important, or perhaps ask the search engine not to index the page at all. Another option you may not have thought of concerns, for example, the design of the news list or the calendar function. News lists can be developed to ignore news with an expiration date in the past, which gives the editor an option other than putting news in the trash to get it out of sight - see the sketch below.
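
A minimal sketch of such a news list, with an invented expiry field:

    from datetime import date

    news = [
        {"title": "Midsummer opening hours", "expires": date(2016, 6, 30)},
        {"title": "New CEO appointed", "expires": None},  # no expiry: always listed
    ]

    def current_news(items, today=None):
        """Hide expired items from the list without deleting their pages."""
        today = today or date.today()
        return [n for n in items if n["expires"] is None or n["expires"] >= today]

    for item in current_news(news, today=date(2016, 8, 1)):
        print(item["title"])  # only 'New CEO appointed'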

Calendar event page templates perhaps should be redesigned with a before-, during- and after-perspective. **Before** the event, an event page focuses on providing information, factual content and encouraging registration. **During** the event, the page can automatically switch to primarily guide those who cannot find their way, but also to add supplements giving event changes and report on hashtags used on Twitter, etc. **After** the event, perhaps a compilation of documents, captured images and the best of what the participants posted publicly on the Web would be suitable post-event content.

See the above as suggestions for what you can do instead of throwing a page in the trash - and therefore killing an established URL. There is certainly a solution that will make your website even better.

It is also okay to declare that a page is outdated and archived. Prefer a warning text on the page, and making it a little harder to find, over deleting it so that it cannot be found at all when needed.

### Ok, how to then?

Just as it is now obvious that an address should work regardless of whether the user connects with a mobile, a desktop computer or something else, it should be equally obvious that an address is valid everywhere. What is possible to show should be displayed.

In an increasingly digital world, in which we cooperate across borders, it is reasonable that all URLs should work whether I am navigating on an employer's private network or a public one. If I find myself on the wrong network, or the URL points to a protected network, I should be served content at a level of access that is appropriate for me. That level can certainly be set to zero most of the time, but imagine a URL to a news item on the intranet of a government organization. If everyone is entitled to access it, why not design the technology accordingly?

Some things should have extraordinarily good reasons to exist in a URL, for instance:

  1. **The author's name.** It is not certain that the author will be the one who administrates this address throughout its entire lifetime.
  2. **Subjects and other forms of categorization.** This is probably the trickiest variant, as classifying content often feels future-proof. It is good to keep in mind that the actual word used for a classification tends to have a shorter life than we first expect. Here, too, belong the extremely common hierarchical addresses: what happens to the pages' URLs if a parent page later gets a new URL?
  3. **Status of the content.** Information status is supposed to change; therefore, you should omit it from the URL.
  4. **File extensions like .html, .php and the like for webpages.** For uploaded files, however, file extensions are okay. The problem with .php and the like is that they show system information, if you replace the system you might be forced to break all established addresses.
  5. **Forced folder name or traces of system standards.** In the past, we saw folder names such as _/cgi/_ but nowadays we more often see _/cms-templates/_ or anything that does not directly contribute anything more than length to URLs.
  6. **Access levels.** It is usually not that smart to have access group names in URLs when the name of these groups can be changed within the time a URL can be expected to live. It can cause problems if you come across an address created for an access level other than the one to which you belong.
  7. **The date the content was created.** For meeting notes it is okay - there the date actually belongs - but otherwise dates in URLs eventually give the impression that the information is old and not updated. That is not good. The content behind a URL is supposed to be maintained and kept up to date, and a creation date in the URL undermines that impression.

#### URLs in print and digital distribution

Is the URL to be printed, or otherwise distributed beyond your control, like in the contents of a newsletter or similar? A reason to use a URL service is when you do not have control over where the address ends up and therefore cannot make adjustments afterwards. Such services are available online, or you can set up your own. Several of the most popular services let you use your own, preferably short, domain name or pick whatever domain the service offers. The difference lies in how much control you want over time.

    Figure 27: Yourls is a simple URL-shortening service you can host yourself.

You create an address for a specific intended use, connect it to where the address should refer, and then the address is ready for use. The target can be changed in the administration interface of the service you are using while the address you published remains the same. Not only that, you also get statistics on the use of the address, which can be anything from a simple visitor count to more advanced analytics.

#### If you are willing to scrap addresses

Now if you have to break many addresses on your website, make sure to do a proper web analysis beforehand so you know what you are messing with. It might not be worth it.

A well thought-out archiving solution is your choice in this regrettable situation. My suggestion is that all purged addresses lead to a so-called _error 404-page_ that informs users that the page no longer exists. You will find later on in this book how such a page can be designed.

The error 404 page itself needs to survive a long time to take care of dead addresses, which are sometimes entire domains that have been shut down. In addition to taking care of stray visitors, it should collect data to give insight into which of the previously used addresses are used the most. If there are enough people who visit a purged address and a new equivalent is present, it is a good idea to send the users on their way to the new content - it should not be the user's problem that you broke the links.

Besides manually linking popular defunct addresses to new, functional ones, you can use search engine technology to make educated guesses about corresponding pages. For instance, suppose the government scraps all established addresses on one of its websites, and a user follows a link to learn more about a specific federal agency.

The following URL is used, but no longer works in this hypothetical example:  
 _http://www.usa.gov/directory/federal/administration-for-native-americans.shtml_

The most obvious solution is to look at the address and realize that it contains things that describe the page, namely _federal administration native americans_ - words that can be used as keywords for an automated search, supporting an error 404 page that gives suggestions on where to go.
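
A minimal sketch of that keyword extraction; the stop-word list is invented and would need tuning:

    import re

    def keywords_from_url(path):
        """Turn a dead URL into search words for the error 404 page."""
        path = re.sub(r"\.\w+$", "", path)  # drop the file extension
        words = re.split(r"[/\-_.]+", path.lower())
        stopwords = {"", "www", "directory", "for", "of", "and"}
        return [w for w in words if w not in stopwords]

    print(keywords_from_url("/directory/federal/administration-for-native-americans.shtml"))
    # ['federal', 'administration', 'native', 'americans']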

It is also quite common to find numbers in URLs. A number might identify old content in a structured database and can be used to connect to a new address. Whether that is the case, only those who manage the website would know, but it is worth checking where the number came from and whether it can be used for mass redirection.

In case you want to save a lot of old addresses and know exactly how they map, create a redirect rule that applies to all addresses in _/directory/federal/_ and sends visitors on to the corresponding new addresses, as in the sketch below. Technically, this redirection should occur before an error 404 message is sent from the web server, but bring that discussion to your developer if necessary.
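
A minimal sketch of such a prefix rule; the new address pattern is invented for the example:

    # Map old address prefixes to their new locations - invented target.
    REDIRECT_PREFIXES = {
        "/directory/federal/": "https://www.usa.gov/federal-agencies/",
    }

    def redirect_for(old_path):
        """Answer 301 with a new address when a prefix rule matches, else 404."""
        for prefix, new_base in REDIRECT_PREFIXES.items():
            if old_path.startswith(prefix):
                slug = old_path[len(prefix):].removesuffix(".shtml")
                return 301, new_base + slug
        return 404, None

    print(redirect_for("/directory/federal/administration-for-native-americans.shtml"))
    # (301, 'https://www.usa.gov/federal-agencies/administration-for-native-americans')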

If no suitable redirection rule exists, it is worth picking up some of the best search results and telling the user that the information they were looking for could be among them. If you are on a governmental website, a search query containing all the words from the federal agency page's address can actually get help from the search engine. These kinds of searches can be done by your error 404 page and presented instead of leaving the visitor in the lurch.

I hope that not everything about information architecture was overwhelming for you. Now to some more light-hearted things, as the topic of web design is coming up.

# Web design

In recent years, it has become clear that we have no control over how our visitors choose to connect to our websites. They can enter through anything from a mobile phone while sitting on a bumpy bus to a giant TV screen while surfing reclined in the living room sofa.

It is not just the size of the screen that matters but also how your visitors control their device, the connection speed and the maximum definition the screen can display.

Besides satisfying visitors whatever their surfing situation, we also need to design with business goals in mind, so that design proposals are not assessed using only subjective measures. Do not look at your website as a static brochure - neither its content nor the technology you use. Technology and content need to be kept up to date at increasingly short intervals for your website to stay relevant.

The Web was created between 1991 and 1993. In 1993, Mosaic showed up - the first widely used graphical web browser. It was a very basic experience, with text that could run across pages. Not until 1996 did Microsoft's _Internet Explorer_ (IE) enter the competition as a browser supporting the then-obscure technology _CSS_ (Cascading Style Sheets) to separate content from its appearance.

Mosaic's creators soon launched _Netscape_, and a competition with IE began over who had the most feature-rich browser. The ambition to respect, or develop, the Web's initial standards was quickly forgotten in favor of techniques such as Javascript, dynamic HTML, CSS and Adobe Flash. This gave the publishers of websites an enormous amount of extra work, as they frequently had to build several separate sites depending on whether the visitor was using Netscape or IE. Sometimes the main website was in Flash, with minor versions for Netscape and IE.

    Figure 28: A computer from back when the Web was born.

Today, however, things are a little more orderly regarding standards - partly because the industry has matured, but perhaps mostly because there are so many browsers and browser versions, with varying technology support, running on different types of devices. Many web designers' solution to this was to return to standards-based page construction, which meant that what was built worked for most visitors. Not developing according to established web standards, or not requiring it when ordering a website, is a cardinal error in the current situation - especially since it is the only way to have any hope that everything will work as intended on future devices.

There are many design strategies to adopt when building websites. I will discuss the most basic ones, worth being inspired by or even following to the letter. Some are universal for project success; others go into detail on how buttons should be designed in order to be understood by visitors.

## Gov.uk design principles

Around the world, the public sector has started to standardize how to do things in order not to waste time and energy on methods already tried and failed. The British Government is quite prominent in the areas of open data and service design. The British have developed something that can be likened to commandments for designing websites and digital services. Even though this is a proposal for the public sector, it absolutely has bearing on any project that will result in a digital product. The principles fit perfectly into this chapter, and serve as a reminder of things that sometimes do not seem to be so obvious.

### 1. Start with needs

The real user's needs and not the organization's should be the focus. What needs should the service as a whole and in parts support? You need deep insight into needs before any other form of work on the design of a service is meaningful. Though it is worth remembering that what users ask for is not always what they actually need.

Use the documented needs as categories to organize your efforts around, users after all come to the service with a specific purpose. There is no reason to hide away the solution to a need in some small body text when you should make this information stand out.

### 2. Do less

Do less and sometimes nothing at all. If someone else has already fulfilled a need, it is better to link to them instead of putting effort into becoming a late competitor. If you can offer APIs, it may be enough for someone else to design a service and then your organization can focus on its core business while saving money at the same time.

The principle also applies to how much to present in one go. A website is better if there are a few obvious choices for the user to decide what to do. Content needs to be placed thematically.

### 3. Design with data

Most often, a service is not designed without prior knowledge. It probably already has a counterpart in the physical world that can act as inspiration, to guide you when designing the first version of a digital service.

The first mail-order company that built a web service certainly took some inspiration from what worked in the past, offline. This knowledge, together with A / B testing allows us to learn methodically from what works on actual users. A / B testing is about playing out version A against version B, and the version that performs best is the one used up until the time you want to test new contenders.

### 4. Do the hard work to make it simple

When designing a service, it is a good idea to take a fresh look at what needs to be done, and not to choose the most obvious solution just to get the decision out of the way. How difficult can it be? A rushed job risks causing complexity for your users - if you cannot make it as easy as possible, your users will suffer, paying with their time and energy.

There are many examples, not only within the public sector, where one needs to have worked for many years within the organization to navigate to the right information. Would you bet a month's salary on whether your employer has placed the request forms for leave under human resources or finance on the intranet? Probably not. Being unaware of our own sender perspective is normal, and it is remedied by calling in an external third party who will not fall so easily into this trap.

### 5. Iterate. Then iterate again.

The best way to build a service is to start small and continue with constant updates and improvements of the service, based on responses from real users.

A service or website is never finished. If someone thinks otherwise, ask that person to explain themselves, since it is probably an idea in need of rejection. Shortly after celebrating the launch, it is already time to release the first update. These updates should also apply to web design, graphic design and identity in general, and not just to error correction. Major redesign projects tend to confuse visitors, as does replacing user names and passwords. A sudden, comprehensive project after a couple of years should not be necessary, since the work could have been done little by little, one step at a time.

### 6. Build for inclusion

Including everyone in your service is a good idea. It is all about prioritizing readability and going beyond subjective design thinking, such as not allowing text with contrast so low that it is difficult to read. Make sure clickable areas are large enough for visitors to use standard equipment, and, yes, you are supposed to follow design conventions. Technically, there is a lot to do, but thanks to the accessibility guidelines _WCAG_ (Web Content Accessibility Guidelines), everything is easy to understand, and suitable levels can subsequently be chosen.

Those who have disabilities, and unfocused users in mass-transit, should be able to use the service.

### 7. Understand context

It is not about designing for a screen or building a service, but about designing so that people can use the service in their normal everyday context. That context can be mobile without a mouse, at an information kiosk or at a library, and users may not have the assumed Facebook account needed to access services put behind social networking walls.

The service should be usable in all imaginable, and even some less probable contexts.

### 8. Build digital services, not websites

Not everything revolves around your website. The solution to a need may well start on the website but finish at the nearest post office to the user - this needs to be considered in the design even though some factors are beyond our control.

It may be that a website is not the best option. Perhaps there are other digital opportunities that would fare better?

### 9. Be consistent, not uniform

As far as possible, use the same language and design layouts since this facilitates recognition for a user. When it is not possible or suitable, then you should make sure that you are at least consistent in your design.

For natural reasons, we cannot have a mobile service look identical to one on a large computer screen, but users should effortlessly be able to see that it is the same organization, and it should work in a similar way without them having to learn new things.

### 10. Make things open: it makes things better

If possible, have an open policy, and share both the experience and the results of the work. Share code, data, design, ideas, failures and goals. In this way, you can get more people to examine what is created and you can get insights from others who also have an interest in achieving the best possible outcome.

Even if the project is not released as open source, we have probably all learned from others' experiences and we should share our work. It's an integral part of the Web to share with others and even if there is not tax-payers' money for the project you are working on, you too can benefit by sharing.

One common problem on the Web is that things get overly complicated - overdesign, or other features that do not put the user's needs first. The next design principle, _KISS_, covers how to design for simplicity.

## Keep it simple, stupid - KISS

KISS is a design principle created by the US Navy in the 1960s, after they found that most systems work best if they are kept simple rather than allowed to become complex. Therefore, simplicity should be the primary design goal, and unnecessary complexity avoided at all costs.

According to this principle, a user can hardly be at fault; if they do not succeed, those who designed the service screwed up. Often KISS means not making a change or adding a feature - not allowing things to become complicated. It should be applied to everything. What good are decorative pictures doing? That cool effect, is it necessary? Why have your own design for buttons and forms when users risk not understanding what they are?

If you make things complicated for users, chances are that you are also, technically, preventing search engines from indexing your website. If it becomes difficult for the search engines, your website will get fewer visitors, and then it becomes almost pointless to have a website - unless you are actively contributing to what is called _the deep web_, i.e., the part of the Web users do not reach via search engines.

### Do not break the web

There should be no surprises for users, and no violations of the conventions they have learned. A common example is whether users can rely on the back-button in the browser when they are banking online, or halfway through checking out a shopping cart. Uncertainty arises over whether it is necessary to use a custom back-button on the website, or whether the one in the browser will work as it is supposed to.

Of course, the browser's back and forward buttons should work as intended in all web applications. Unfortunately, only a very few developers seem to treat this as a matter of honor, and for some reason there often is no written specification from the client's side.

Take, for instance, the software giant Microsoft's product Outlook, an e-mail system used by millions of users around the world - do you think you can go back from reading an e-mail and end up in your inbox? Nope, not if you use the browser's back-button while browsing on an Iphone. A button with a left-pointing arrow and no explanatory text, in the field above the browser's own navigation buttons, does work great - but for the love of God, do not use the back-button in the browser just because it works on almost every other website! :)

    Figure 29: In Outlook Web Access, the browser's back-button does not work. However, the custom, white circled left-arrow above does.

It is pathetic, to say the least, that instead of ending up in your inbox, you are sent right back to the login window. Microsoft is unfortunately not alone in this nonchalant behavior. If you are prepared to put up with a lot of hassle, you can from now on try reloading pages after submitting a form, or while in the middle of a web service process. You will surely notice that the bank complains loudly, shopping carts get ordered twice, and in some cases you will be asked whether you really want to re-submit a form you have never even seen.

Everything essential on a website should work in all browsers, and with whatever is used to interact with the page. Many years have passed since it was reasonable to assume that all users had a mouse pointer to hover over things. Despite this, many websites' menus are still optimized for mouse pointers, and sometimes they do not work at all with a touch interface.

Nor should we require users to have plugins in their browser, certainly not to view the most basic content. _Java_ in the browser (note, I did not write Javascript, which is something else entirely) has for many years been a huge security risk, and the Adobe Flash plugin has brought many computers to their knees just to show complex ads. Not to mention Microsoft, which released its ill-conceived Silverlight plugin far too late to compete with the badly designed Flash, after web producers had agreed for years that Flash was dead.

That a web page should work without what may be considered unnecessary features is a discussion with a lot of nuances. On an intranet, it is easier to get control over which browsers employees use and which extensions are automatically installed. But of course, the KISS principle also applies on intranets, so employees who are blind, or have other disabilities, are not discriminated against because of someone's incompetence or indifference. However, we cannot be complacent; needs change over time. Those who have worked somewhere the organization standardized its web systems around Internet Explorer 6 know what I am talking about: modernizing all the systems simultaneously later on became extremely painful.

Not only does a website need to be easy to understand, it must also persuade and build confidence. The next design principle is about how design guides the visitor through the interactions needed to fulfill the website's purpose.

## Persuasive web designs (PWD) - design that convinces

Design to convince users to do what the publisher of the website wants them to do. An example would be showcasing the products that are about to run out of stock - which makes visitors believe they are getting a better deal than they really are - or showing a page with further offers before they check out their shopping cart.

If we leave the often simplistic e-commerce examples, PWD may be used to display a layman's version of a legal agreement rather than a link to the version filled with legalese. Right here the ethical dilemma emerges: simplification, i.e. removing obstacles in front of the user, must be in the user's best interest and must not risk causing unpleasant surprises. It may sound trivial, but believe me, not everyone thinks the way you do. What you consider a fantastic offer can, with too much compelling design, make the viewer feel completely fooled.

It is all about lowering the threshold for decisions and guiding a series of micro-decisions towards the goal you have. Here the concept of the _dark pattern_ introduces itself: that, by design, you control what happens in a way that is not in the user's best interest, or intention. It could be moving buttons around so the user accidentally gives an app a five-star rating in an app store, adding additional products to the shopping cart without warning, or services sending e-mails to your contacts claiming to be you.

A somewhat milder example of a dark pattern is that whatever is preselected in a form actually affects the outcome - for example, how likely a person is to donate their organs. There have been studies of several countries' populations, and the differences in the results are enormous. The answer is controlled by how the question is asked.

    Figure 30: How opting in, or out, changed the outcome of Europeans' interest in donating their organs. Study made by Dan Ariely 20.

What differentiates Danes from Swedes, for instance, is that the question was presented to Danes in the form of '_Check this box if you want to donate your organs_', while Swedes had to tick the box if they did _not_ want to donate their organs. The explanation is that people do not put much consideration or time into changing a default setting, and in this case, neglecting to check the box meant two diametrically different things. Trust me, Swedes are not that different from our fellow Vikings on the other side of the Strait of Öresund :)

In order not to confuse visitors, or be too eager to convert them into customers, the following checklist is worth going through at each design decision, as well as when writing texts.

### 1. Be clear in everything

Be clear about who the sender is, the purpose of the service, what to do, and so on. Put what is most important first in the text and give it graphical priority. Writing headers rather than titles is important: headers carry information about the content, while titles merely classify it. Your headers need to be true, active and descriptive, like '_We work according to the methodology Scrum_' instead of '_Applied methodology at our company_'. What in the text needs to be lifted into the header to describe the content?

### 2. Be very careful of what is the default setting

Through the default setting, you control people's behavior, and if it is not in the users' best interest, it can create unnecessary irritation that may not pay off even in the short term. When you design a button, you can lower the threshold with linguistic tricks, such as using the less daunting text '_Add to cart_' instead of '_Buy this item_'. If you measure these two options, it is likely that more people dare to put something in the cart compared to a click that directly means a purchase. Does it lead to more conversions? Go ahead and measure, and you will know if it works in your case.

### 3. Visual hierarchy is important

Make the expected operations easy to find and carry out. It may be that a form's submit-button stands out more graphically than the button that clears the form. With color, size and placement, you reduce the friction of submitting the form.

Does the user have to scroll first to find what the purpose of the page is, or is it obvious at a quick glance? Even on small screens? If a user does not instantly see the so-called call-to-action, a button for example, it reduces the chance of what you want to happen actually happening.

    Figure 31: Dropbox clearly thinks signing up is more important than signing in, probably because existing users might be more committed to fulfilling their task, compared to newcomers.

### 4. Focus on the common goal you and your visitor have

You have probably seen websites where all the navigational features disappear when you are in the middle of checking out a shopping cart? That is so you will not be distracted and cancel the purchase. The purpose of the checkout is that the user, with as little effort as possible, should be able to make a purchase; this is a clear mission statement for any e-commerce website.

### 5. Try not to overexert your users' attention

Design your website so it is as self-explanatory as possible. An example of something to avoid is placing useful links after a very long text, as the chance that the links will be found decreases the longer the text is. Such links may be of greater benefit before the text.

You have probably seen the all too common cliché pictures from stock photo sites where people from all over the world smile at the beholder. Those pictures are not meaningful and should not compete for space with something that is important!

Well-designed PWD will be experienced by the user as a clear and intuitive service, which should be the goal of every website. At the same time, the sender's purpose with the website needs to be achieved. Another obvious goal for every website is to present itself as well as possible on as many conceivable types of devices as possible. That is what responsive web design is all about.

## Responsive web design

A website should be able to display itself and be useful on any device the user selects, and make the best of it. The idea is to invest effort in achieving a good compromise for all types of devices your users might have, instead of building a mobile website, a desktop website, a games-console website, a projector website, and whatever else may come in the future in the form of equipment with a web browser.

    Figure 32: How a non-responsive website looks on an Iphone 5S.

We have already seen the situation with special solutions online, in the era when we had separate websites. One in HTML, one in Flash, and sometimes you had to choose between Netscape and Internet Explorer after selecting the HTML version of a homepage, just to get the correct style sheet. Some sites clearly stated what kind of equipment was recommended, as if a user would switch browsers, or change screen resolution, for every website they visited. That was not exactly smooth, and today we have more versions of browsers and types of screens than ever before. Today's websites have to adapt to users, and _responsive web design_ (hereafter RWD) is the concept where a single website aims to satisfy all types of users without a plethora of special solutions.

Before RWD broke through, it was for many years assumed that the user could see at least 960 pixels horizontally on their screen. These 960 pixels were then divided into the number of columns needed for the design. When the _Iphone_ was released in 2007, however, the average person in the street started surfing the Web with a browser, held in portrait mode, that was only 320 pixels wide and 480 pixels high on a 3.5-inch display. Websites back then assumed that visitors had at least a 13-inch screen with at least 1024 pixels in width and 768 pixels in height - a large screen in landscape mode, as opposed to a smaller screen with a lower resolution, typically used in portrait mode. On a smartphone, websites were viewed zoomed-out, without the sharpness needed for even a person with extremely good eyesight to read anything but, at best, the big headers. It was not a great experience surfing with a mobile phone before RWD made its entrance.

Exactly as if the web were print, many designers approached their design like an artist choosing a canvas of a suitable size to paint on - the Web's canvas was 960 pixels wide and of infinite length for scrolling. The realization that it is not possible to choose a suitable canvas for a website is really the most important insight of responsive web design.

### The mobile moment

Responsive web design is not really a technology aimed at mobile customization of a website, even though it competes indirectly with the idea of having a separate mobile website; RWD does, however, keep mobile visitors on the main website. If you build a mobile website, it still needs to be responsive, for the simple reason that there is too much variation even among mobile devices to make assumptions about screen size, among other things. The same applies, as always, to common desktop computers. Some of us use only a laptop with a 13-15 inch screen, others use a desktop monitor of up to 30 inches, or a TV.

The reason RWD hit it big around 2013 probably correlates with the fact that many saw the percentage of mobile users increasing steadily in their web statistics; in some instances, mobile users accounted for half the user base. A responsive website is supposed to be device-agnostic, making it a useful solution in a mobile or small-tablet scenario. At the same point in time, a sufficient number of web users had browser support for the modern web standards required for responsiveness.

What the web genius Luke Wroblewski named _The Mobile Moment_, i.e. when the proportion of mobile visitors becomes greater than that of non-mobile devices, occurs at different speeds - if ever, for some sites. To take examples from my professional life: the national healthcare website in Sweden, 1177.se, had its mobile moment in early 2013, depending on whether you count tablets as mobile devices, while our biggest hospital, sahlgrenska.se, at the same time had around 90 percent of its visitors on desktop computers.

### The elements of responsive web design

Responsive web design is actually three different techniques in combination. The techniques themselves were not new, but their combination became a bit of a revelation when Ethan Marcotte wrote his article in 2010 on the website A List Apart.

#### 1. Fluid grid - let the design fill the screen

Grid systems are hardly a new thing; the biggest difference is that with responsive web design, we specify widths in relative terms rather than in exact pixels - relative to the space available on each screen, that is. On mobiles and other small devices, it is often difficult to accommodate more than a single column of content, which means that the different page components need to be stacked on top of each other in one long column. On a small screen, it is particularly important to use the entire width so that no valuable space is lost in unnecessary margins. On a bigger screen, we can afford more margins without being wasteful.
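
A minimal sketch of the idea in CSS (the class names and proportions are made up for illustration): widths are expressed as percentages of the available space rather than as fixed pixel values.

    /* Fluid grid sketch: relative widths instead of exact pixels */
    .main-column { width: 66%; float: left; }
    .side-column { width: 34%; float: left; }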

    Figure 33: Fewer columns when the browser-window becomes smaller.

#### 2. Media queries - to adapt the design to the available space

With media queries, you identify _breakpoints_ that allow for a customized design based on different screen-width needs. At these magical points, the breakpoints, the page can display itself in a different way - for example, changing the number of columns depending on the content's needs. It is common for things of lower priority, decorative images for example, not to show up at all if the space is too limited.

Often, at least the start page is one big compromise, where everyone with influence gets something important to them on display. On a mobile, most of what you are used to from a desktop computer is not visible at first sight. Therefore, ranking or prioritizing the content is one of the more difficult tasks a web project faces.

Media queries are like grouping the screen sizes and specifying how the design should behave within each range. It is kind of like planning several editions of the website.

Possible examples of a set of media queries in a CSS file:

  1. _@media (max-width: 350px)_ - Instructions for small mobile screens.
  2. _@media (min-width: 351px) and (max-width: 767px)_ - Larger mobile screens and many small tablets.
  3. _@media (min-width: 768px) and (max-width: 979px)_ - Tablets and some computers where the browser is not set to full screen.
  4. _@media (min-width: 980px)_ - Tablets and computers where the browser width is at least 980 pixels.

In the above example, there are four alternative designs of the website. You end up with four size-customized editions of the website, each supporting optimal content presentation.
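
Expressed as style-sheet rules, such breakpoints might look like the sketch below (the class name and column counts are illustrative assumptions, not recommendations):

    /* Small screens: every component stacked in one full-width column */
    .column { width: 100%; }

    /* From 768 pixels and up: two columns side by side */
    @media (min-width: 768px) {
      .column { width: 50%; float: left; }
    }

    /* From 980 pixels and up: three columns */
    @media (min-width: 980px) {
      .column { width: 33.33%; }
    }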

    Figure 34: Layout for a large screen. Note the ordinary top-menu.

    Figure 35: Layout for a medium-sized screen. Now with a hamburger-icon for the menu, and no arrows on the featured image itself.

    Figure 36: Same design for a small screen.

We should not heedlessly choose three breakpoints - one for desktop computers, one for Ipads and another for the most recent Iphone. It is not necessarily those devices your visitors are using. But, perhaps mainly, there really is no way to show exactly how a responsive website will look. If you take anything away from this part of the book, let it be the idea that it is no longer meaningful to select a canvas. The point of responsive design is to surrender to the fact that screen variation is too large, given all the kinds of devices the content will end up on. Instead, we should set the breakpoints at the points where the content needs to be broken up to present itself well, in order to achieve the desired effect with users.

#### 3. Flexible images

Not only is the grid fluid; image widths are also set relatively, so that images scale up or down in line with the other design elements. This is an alternative to specifying exact pixel dimensions. Often the image width is set to one hundred percent of the available space, and the height is set to adjust automatically to keep the proportions. This works fine on your own website, in a context you control, but just like any other content, images get reused in other contexts. Images are in fact often reused on other websites, along with a link to the page the image is displayed on - a context you have as little control over as the size of screen your own users happen to have.
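
In CSS, the flexible-image technique often boils down to a couple of lines; a common sketch:

    /* Let images shrink with their container, never grow beyond their
       native size, and keep their proportions */
    img {
      max-width: 100%;
      height: auto;
    }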

    Figure 37: Photo dominating on large screen.

    Figure 38: The same photo shrunk to fit a mobile device.

Many have designed their newly built responsive sites with decorative, inspiring and awesome pictures. Many of the images are huge, not only in area but also in file size, because the same image is to be displayed across most of the screen for some visitors and scaled down for a small screen. A large picture becomes very heavy if it is to retain any image quality. It is especially difficult if there are people in the picture, since we are sensitive to the side effects of image optimization on areas with human skin.

I can only guess why designers went for huge images, but I think that with responsive technology, it was easy to get the design to work well with leading images and image carousels. That, and the fact that too many designers completely forgot that not everyone has the high-speed Internet connection we have at the office. Even community actors, such as municipalities, which are expected to serve the public through their websites, were blinded and used such massive images that homepages could sometimes weigh 10 MB. For some mobile users, that can amount to two percent of a monthly data plan, consumed by mostly meaningless image carousels.

Depending on who you are talking to, you will hear about lots of other things that, according to them, are or are not included in responsive web design. The points above highlight a number of challenges that we need to take care of - among other things, that images cannot be scaled up without looking bad. That pixel density differs between mobile screens and many desktop screens, and that mobile connectivity cannot be compared to a high-end office connection, is perhaps more what belongs in the concept of adaptive web design (AWD). A modern website will of course take care of these challenges, and many believe that the concept of AWD is not even necessary (more on AWD later in the book).

### Arguments for responsive web design

If you still need convincing about why RWD is something to consider, here are some common arguments of varying weight depending on the type of business you have on your website.

#### 1. Google believes that RWD is the industry standard

> "Responsive Design: serves the same HTML for one URL and uses CSS media queries to determine how the content is rendered on the client side. This removes the possible glitches of user-agent detection and frees users from redirects. This is Google's recommended configuration."
> 
> - Google, on their developer pages discussing various techniques for smartphones (2014)

**Glossary - user-agent**  
Information the web browser volunteers to each website during a visit. It tells the website which browser is being used, which version, on what type of device, the operating system and some other information. Websites use this to know what type of equipment the user has - for example, to decide whether the user has a touch screen.

Besides recommending RWD, Google brought up two other examples: serving different pages depending on the _user-agent_ of the browser, and the variant of sending the visitor on to a separate mobile website. In their reasoning as to why you should choose RWD, they mention that it is the easiest solution among the alternatives. Since Google accounts for the majority of the visitors you can get through search engines, their statement should not be regarded as just a suggestion on how to design your website (we can discuss over a beer why they make this recommendation, whenever you are in the right conspiratorial mindset).

#### 2. A single website to set up (for all types of devices)

Put all your energy into building one single good experience for all kinds of devices and screen sizes. Building a mobile website is no longer an option, since mobiles themselves have very differently sized screens. We would have the same type of challenge - except with at least two different websites to administer.

Bear in mind that one person can use several kinds of devices to visit your website in a single day. Say the person starts at work on a desktop computer and continues on a smartphone at the bus stop. If the website is not responsive, the user either has to start afresh on a mobile website and try to find the corresponding page, or the regular website shows its worst side - which causes many to give up.

One of the CMS vendors, Epi, did a study in 2013 which found that 70 % of users never returned to a mobile website that was difficult to use, and it is probably at least the same for non-responsive websites visited on a mobile. Every other user asked was irritated by the fact that many websites are not designed for smartphones.

#### 3. Logical URL strategy

A responsive website has just one address. Each sub-page, too, has only one address, no matter whether it is visited by a mobile user or someone using a different type of device. You have probably received an e-mail from a friend containing a link, clicked the link on a desktop computer and landed on a mobile website - or, vice versa, got a desktop website on your mobile. That is of course not ideal, and rarely optimal for what the publisher wants you to do during your visit.

With a responsive website, there is no need to be shuffling users back and forth between different versions of websites depending on what type of device they have.

#### 4. The fewer sites you have, the easier to manage and maintain

Content management is easier, and to follow up on how content is performing, you do not need to jump between different web-statistics accounts, since the breakdown by type of device is just a simple segmentation in your analytics tool. All campaigns become easier when you direct visitors to the same address. You get a better overview of how your landing pages are performing, and you can easily see on which types of devices you could do a better job.

#### 5. Responsive is more future proof

A well-executed responsive website works better on devices that are not yet common. At worst, you will have to make some minor changes later on. Say, for instance, that your visitors suddenly start using a browser on a games console connected to a TV. That scenario is well covered by RWD, but in a non-responsive setting it would mean yet another special website (re-using the arguments for having a separate mobile website).

### Notes on responsive construction

The content itself does not automatically adapt to a small screen just because you made the website's design responsive to more kinds of screens. Common complaints, among other things, are that the main headers fill half the screen height, or that decorative pictures get in the way of the actual content. In other words, you have a great opportunity to revise your content from a mobile user's point of view. I would recommend the content genius Karen McGrane's book _Content Strategy for Mobile_ 23, and try googling her lectures on the topic for a lot of good advice.

Some may certainly ask themselves whether to upgrade their existing website design or start from scratch. It depends on whether you believe that the content's presentation and structure will function in a mobile scenario. If not, I think you should consider starting from scratch.

A large part of a responsive project is about how to present pictures in the design, and whether you can even afford to show pictures, which often take more space on a small screen where they compete with the other content. In my first responsive project, we looked through the most visited web pages to see if we could do without images. Most often, the images were just decoration, which made it an easy choice to prioritize the header, preamble and body text instead.

There are many creative solutions for avoiding sending unnecessarily heavy pictures, or embellishing imagery, to small devices that do not really have space for them on their tiny screens. The variant we chose at Region Västra Götaland was that unless the web editor expressly chooses to display a picture for mobile visitors too, it is replaced with a transparent one-pixel image - which consumes almost no bandwidth at all. The caveat with such decisions is that we cannot assume that a user on a small-screen device is on a mobile connection; they may be using a hyper-fast Wi-Fi network. At the same time, I am one of those who use mobile broadband at home on a desktop computer. Screen size simply does not tell you what kind of connection is being used.

    Figure 39: Photo 1024 pixels wide, 263 Kb file-size.

    Figure 40: Same photo, resized to 320 pixel width. 33Kb file-size. Not as easy to identify the tall man, even for those of us who know who he is, Sweden's former foreign minister, Carl Bildt.

    Figure 41: Cropping the photo makes a big difference to recognize the content on a small device. 320 pixels wide and 34 Kb file-size.

The challenge in selecting images for a responsive scenario is to get just enough detail into any image portraying something, or someone, a visitor should recognize. A high-resolution screen the size of a tablet or bigger can in many cases display images a bit more zoomed out, for example 1024 pixels wide. At the same time, a small low-resolution screen will not always manage with exactly the same image scaled down, to 320 pixels for instance. Important details might disappear.

The thing to consider is that the image resolution, i.e. the granularity that provides clarity and detail in the image, of a mobile-optimized image is only about a third of that of a larger tablet: 320 pixels compared to 1024 pixels in standard definition (SD - sometimes called the native resolution).

Using the same picture on a small, low-resolution screen as on a bigger screen is difficult, since there are fewer pixels to give clarity to the smaller picture. You then need to crop the picture, removing some of the surroundings in favor of what the image is really portraying. If you send the original image to a small but high-resolution screen, you send an image as heavy as for the larger screens, except that it appears smaller and super sharp - which may be overkill, and will definitely affect download times negatively.

Therefore, we need to choose images with great care, or replace them, so they fit the smaller or less-than-optimal screens. At least it is worth checking how many users would suffer from a lazy photo-editing job; there may not be a problem on your particular website. In other words, for pictures to remain understandable and meaningful, priority must be given to the clarity of what they depict. Even more challenging are information graphics, full of text, charts and symbols. These challenges are what you look for when you go through a website to see whether it has the prerequisites to be upgraded to RWD, or whether it is best to start from scratch.

As you can see, responsive imaging is about level-headedness. The only advice I can give is to have healthy safety margins and see for yourself what your images will look like on different types of devices.

And, at the time of writing at least, the future standard is to specify multiple images of different sizes, where only the one matching the breakpoint in the style sheet is displayed. Say you have a media query with a breakpoint where all screens narrower than 512 pixels get a smaller, cropped, more obvious picture, while all screens of 512 pixels or more get the larger, sharper image. Then those with small, low-resolution screens can get their image almost ten times as fast - not to mention that you can customize the version of an image for the screen sizes that specifically need it, making it easier for the visitor to understand the content.
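
In HTML, the picture element is one way to express this; a sketch with hypothetical file names, re-using the 512-pixel breakpoint from the example above:

    <picture>
      <!-- Screens narrower than 512 pixels get the smaller, cropped version -->
      <source media="(max-width: 511px)" srcset="photo-cropped-320.jpg">
      <!-- Everyone else, including browsers without picture support, gets the original -->
      <img src="photo-1024.jpg" alt="Description of the photo">
    </picture>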

> "We're not designing pages, we're designing systems of components."
> 
> - Stephen Hay, @stephenhay on Twitter

Web designer Brad Frost has described a system for thinking of websites as a collection of small pieces - _Atomic Design_ 24 - which is one of several ideas for moving away from the heritage of classic desktop publishing and its static page templates. I can personally recommend it, along with exercises where you cut up printouts of an existing website and rearrange the pieces into one long column, to get non-designers to think more responsively about content they are already familiar with.

A website's smallest design element is, according to Frost, the atom. Atoms can be combined into molecules, which can become an organism displayed in a page template, which together with the navigation organism and other standard components forms a web page. In this way, the search field is an atom, the search button another atom, and together they form a molecule: the website's search interface. Together with the header's other molecules, the site search forms an organism. The same goes for the page content (template) and the footer, all the parts constituting the web page. By breaking each web page's ingredients down to atoms and piecing them together, you get a method where you do not get stuck on what should be removed for the page to work on a mobile - instead, you design your interfaces atom-up.

Speaking of mobiles: the most common argument I have heard against RWD is that it is not needed. What these opponents of RWD have in common is that they own more expensive Android phones, with larger screens than the average person's. Those phones do not suffer quite as much from old-fashioned design. It can then be desirable for users to be able to jump between the different breakpoints, if they want to find something they know exactly where to locate on the desktop version of the website. Some mobile browsers have a function for this, where the user can choose to fetch the desktop version of a website instead. Another choice that may emerge is offering the user a choice of website version, perhaps among other accessibility settings such as text size, contrast and more. If you have an advanced or tech-savvy user base, they might expect to be able to alter these settings.

### Responsive typography

Since column width on small screens is quite narrow, long compound words become risky: they might stand alone on a line or, in the worst case, cause unnecessary horizontal scrolling. In editorial and static text, you can prevent this by using a soft hyphen; the word is then hyphenated only if necessary. A soft hyphen is created using the code below:

    <h2>Subdermato&shy;glyphic</h2>

It is the _&shy;_ that hyphenates if necessary. It is worth bearing in mind that this snippet of code needs to be sent unchanged to the visitor's browser to work. That might force you to switch to code view in your WYSIWYG editor, and it might not work in all text fields. Something you may have to ask the developers to solve is the problem of overly long menu names, maybe by using Javascript to replace each item with a shorter version of its name, as in the sketch below. For instance, if the page width is below a certain level, you replace text: 'and' becomes '&', or whole words are replaced with slightly shorter ones so the text does not break the design.
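
A sketch of that menu idea, assuming the editor supplies a shorter label in a hypothetical data-short attribute on each menu link:

    <script>
    // Swap menu labels for shorter versions when the viewport is narrow.
    // The 512-pixel threshold and the data-short attribute are assumptions.
    function shortenMenuLabels() {
      var narrow = window.innerWidth < 512;
      document.querySelectorAll('nav a[data-short]').forEach(function (link) {
        if (!link.dataset.full) { link.dataset.full = link.textContent; }
        link.textContent = narrow ? link.dataset.short : link.dataset.full;
      });
    }
    window.addEventListener('resize', shortenMenuLabels);
    shortenMenuLabels();
    </script>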

Subjective taste alone does not determine design choices such as column width. When the content type is text, there are typographic rules to adhere to. Ideally, you have somewhere between 45-75 characters per line for good readability, which limits how wide a column can be if the text is to align more or less with the column's right margin. The content demands a breakpoint if the lines get too long - or you set a maximum width for text paragraphs solely for the sake of readability, while still allowing other elements, such as images on the page, to be wider than the text's line length.
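
One way to enforce this in the style sheet is to cap the paragraph width in em units, which scale with the font size. The value below is an assumption that lands at roughly 65 characters per line with many typefaces:

    /* Readable line length for body text; other elements may span wider */
    p {
      max-width: 33em;
    }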

Web editors need to use &shy; to hyphenate longer words, so that narrow columns do not display only one or a few words per line - readability suffers otherwise. If you are not the only one feeding the website with content, it is surely a good idea to put the soft hyphen on the cheat sheet of editorial tricks all web editors should know about.

### RESS - Responsive Server Side

A logical sequel to responsive web design is _RESS_ (Responsive Server Side). It differs from orthodox RWD by advocating that it is ok for the web server to find out which equipment the visitor is using and send a customized version of the website. It is a bit like the old days, when we sent different style sheets depending on whether the visitor used Internet Explorer or Netscape as a browser. The difference is that today it is much more complicated than just two different web browsers; nowadays, some browsers even have APIs for cameras and light sensors.

A simple example of where RESS can differ from a regular responsive website: if the web server discovers that the user's connection is slow, it does not send several megabytes of decorative pictures, just the bare minimum. If it appears to be a desktop version of a browser, you might choose to send heavier media content and prioritize a visual experience. Most first-generation responsive websites send far too much data to mobile devices - that is the most common criticism. Broadly, with RESS, the same content is sent to all users; however, unlike with orthodox RWD, you customize the experience where it is justified.

Now you may wonder how the web server knows so much about its visitors' equipment. There are several tricks, but the most basic one is to check the user-agent, i.e. information about the software you use to connect to the Web. This is what my user-agent looks like when using my computer:  
 _Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11) AppleWebKit/601.1.56 (KHTML, like Gecko) Version/9.0 Safari/601.1.56_

Any web server I connect to can decipher my user-agent and see that I use the Macintosh operating system OS X 10.11, the Safari browser, and that the rendering engine WebKit controls the appearance of web pages. Check your own user-agent online.

On an Iphone, a user-agent can look like this:  
 _Mozilla/5.0 (iPhone; CPU iPhone OS 8_4 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12H143 Safari/600.1.4_

For a Google Nexus 5 phone, it may look as follows:  
 _Mozilla/5.0 (Linux; Android 4.4.2; Nexus 5 Build/KOT49H) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.99 Mobile Safari/537.36_

As you probably noticed in the examples above, a user-agent is great for determining what type of device is being used: a computer or a touch-based device, running Windows, Linux or Mac, Android, Windows Phone, Apple iOS, etc. Many conclusions can be drawn from the user-agent alone, in order to send customized HTML code and content optimized for each platform. Among other things, you can use it to work out what type of optimized content you can send, such as the so far not so common image format WebP, which is about 39 % smaller in file size than conventional JPG files for photographs.
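
Browsers that understand WebP also advertise it in the Accept request header, which is an even simpler signal than parsing the user-agent. A minimal Node.js sketch (the file names are hypothetical):

    var http = require('http');
    var fs = require('fs');

    http.createServer(function (req, res) {
      // Serve WebP to browsers that say they accept it, JPG to the rest
      var wantsWebP = (req.headers.accept || '').indexOf('image/webp') !== -1;
      res.writeHead(200, { 'Content-Type': wantsWebP ? 'image/webp' : 'image/jpeg' });
      fs.createReadStream(wantsWebP ? 'photo.webp' : 'photo.jpg').pipe(res);
    }).listen(8080);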

Besides looking at the user-agent, a web server may need to know other factors that affect how you want to present your website, such as how fast the user's connection is. Exactly what browsers support changes constantly. The next time you have reason to do some development, check out what a web server can discover.

Sorry, this section was a bit lengthy, but there is so much to say about responsive web design, as it is becoming the default for all web projects. Now we are going to talk about what some consider the successor to responsive web design.

## Adaptive web design

The idea behind _adaptive web design_ (AWD) is to tailor each visitor's experience in detail. It is about using all the opportunities available through so-called _progressive enhancement_, which provides good basic function to all visitors, plus lots of improvements depending on what features the visitor's browser supports. One of the first adaptive design techniques in common use was letting modern browsers take advantage of CSS3 early on, such as applying rounded corners on boxes, while those with an old version of Internet Explorer got an uglier design.

I might as well mention directly that discussing this term's necessity makes many of my friends at web agencies somewhat glassy-eyed; they point out that we do not need the concept, since '_all that is included, of course *sigh*, in a good responsive website_'. I am inclined to agree with them, but the reason I still bring up AWD in this book is to explain the variety of things available to customize. Depending on who you talk to, you risk missing that part of whatever a good responsive web happens to include at any given time.

    Figure 42: A website is asking if it can know my location.

Think of AWD as a responsive website, with the difference that with AWD it is ok to give unique experiences depending on the context a visitor is in, and to follow the design conventions applicable to each device type, or even a particular device. An AWD website may send different HTML code to different kinds of devices, which is the biggest difference compared to RWD. When the user's conditions are known, and worth adapting to, an optimized version is served. Often, Javascript frameworks such as _Modernizr_ are used to get in black and white what each user's device can handle. Modernizr checks how much support the user's browser has for CSS3 and HTML5 capabilities.

Examples of conditions that can vary, and that are useful in the design of a website utilizing AWD (a plain feature-detection sketch follows the list):

  * **Does the device have a camera?** Useful if the user should be able to take a profile picture, or maybe shoot a picture of an object to sell online.
  * **How is the device controlled?** It matters whether there is a mouse pointer, a touch interface, etc. Otherwise, it can be difficult to interact with certain design elements. It may be neither a mouse nor a touch screen, but perhaps the user's entire body acting as some sort of remote control.
  * **Is it possible to pinpoint the user's geographic location?** Crucial for location services, but it may also act as input for search criteria, such as geographical closeness to the user, which can be used in a relevance model when the user does a search.
  * **What are the possibilities for sound, video and vectorized images?** Among other things, it is good to know what formats the user's device accepts, so you can send the most suitable format, one that can be transferred quickly.
  * **Does the browser support local databases?** In some web applications, much of the work is carried out locally in the browser, and what mainly makes the Web feel sluggish is sending material back and forth to the server. A prerequisite for using the browser as client software is in many cases that it can manage a local database.
  * **Is WebSocket available, to allow real-time updates in the browser?** Used to communicate between different users, and between the user and the web server, in real-time games for instance - messages such as the one informing you that the other user has started typing a response.
  * **Is a web worker available to build applications?** A web worker is a solution that uses Javascript, the programming language of the Web. The user's browser can then host applications, which is used in Single Page Applications (SPA, more on this design principle later in the book).
  * **Can the device make a phone call?** Used to convert phone numbers into links that initiate a phone call. If not present, you might link to a function where the user can ask to be called instead.
  * **What is the resolution of the screen?** There are so-called high-DPI screens (sometimes called retina among Apple users), on which normal-resolution images are perceived as blurry. This is especially true for text in images, and even logos.
  * **What is the speed of the Internet connection?** If video is to be streamed, for example, a certain bandwidth is required for it to be worthwhile; otherwise, you may offer a discreet suggestion to download instead. On an acceptable-to-good connection, the server can automatically select a suitable bitrate so there are no buffering interruptions in the playback.
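
Most of the conditions above can be checked with plain feature detection in the browser, with or without a library like Modernizr. A sketch (the follow-up actions are assumptions):

    // Plain Javascript feature detection for some of the conditions above
    var support = {
      geolocation:   'geolocation' in navigator,  // pinpoint the user?
      webSocket:     'WebSocket' in window,       // real-time updates?
      webWorker:     'Worker' in window,          // applications in the browser?
      localDatabase: 'indexedDB' in window,       // work locally in the browser?
      touch:         'ontouchstart' in window     // touch interface? (imperfect)
    };

    if (support.geolocation) {
      navigator.geolocation.getCurrentPosition(function (position) {
        // For example, feed the coordinates into a location-aware search
        console.log(position.coords.latitude, position.coords.longitude);
      });
    }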

AWD does not necessarily mean that the content, such as texts, changes depending on the visitor's context, their geographic location for example. It operates more at the overall website-design level. Granular control over what content shows up belongs more to the concept of personalization - with the exception that an AWD website needs to prioritize what content is suitable to display on smaller screens or odd devices.

However, we can, and perhaps should, make sure - at least as community actors - that an incredibly condensed version is offered to those with clearly substandard Internet connections. If the transfer is particularly slow, the visitor gets a lite website with just the essentials: no images or material heavier than text and basic design. There, just like on a mobile website, users can choose to go on to the full website if they have the patience, or switch to a better connection. This solution might also serve as a kind of emergency version if the server or the network becomes overloaded.

**Example 1: Fast Internet access**  
A fast 3G connection at nearly 16 Mbit/second downstream and 0.048 seconds response time.

**Example 2: Slow speed**  
A worse 3G connection at 0.035 Mbit/second and 1.2 seconds response time.

You might think that this does not sound so bad, but then you have probably never been on the outskirts of the 3G network and visited a responsive website from there. For those who find themselves in the countryside, this is the normal situation. To exemplify the two situations, I measured my own connection speed. Downtown in Gothenburg, I had 16 Mbit/second downstream and 3 Mbit/second upstream, with a delay of 0.048 seconds. In a cabin out in the woods in Fengersfors, there was a rather more modest 0.035 Mbit/second downstream and 0.4 Mbit/second upstream, with a delay of a full 1.2 seconds!

To put these numbers in perspective: in my cabin, it was slower than surfing with the dial-up modems of the 90s, those 56 Kbit/second modems, with over a second of lag for each file before it even started transferring. To make matters worse, today's websites are immensely more burdened with large images, Javascript and style sheets, each of these files adding a second to the wait time even before its transfer starts.

Let us calculate a little how this might affect a visitor on a somewhat too _obese_ responsive website of 5 MB divided into 30 different files - a weight many Swedish municipalities deemed suitable for their fine responsive homepages. With the fast connection example from earlier, such a website takes about **1.4 seconds of delay (30 files × 0.048 seconds) plus 2.5 seconds to download (5 MB = 40 Mbit, divided by 16 Mbit/second).** For an abnormally heavy webpage, loading in under 5 seconds is quite ok. The same material sent through the slow connection example instead takes **36 seconds of delay and over 1,100 seconds to download (40 Mbit divided by 0.035 Mbit/second).** That is almost 20 minutes!

Where AWD shines is in making it more obvious that you should not send files the visitor has no benefit in receiving. Many responsive websites send photos in a larger format than the visitor's screen can display, not to mention images that are sent and never even presented to the visitor. There is of course no point in being so orthodox about responsive web design that you do not apply this kind of optimization, and personally, I think optimization is what caused the discussion on AWD vs. RWD in the first place. AWD supporters claim that their version is better optimized and loads much faster, while RWD supporters believe that AWD people build multiple sites in one.

The question is how many devices the website should be optimized for. The five most popular? Which devices are popular can change quite quickly, and the number of unique kinds of devices your visitors might use is amazing. At our mobile moment, in 2013, 355 different kinds of mobile devices were identified on the relatively well-visited website 1177.se, which points to the need to think responsively even when building AWD, since screen sizes vary too greatly to provide unique customizations for all of them. Begin with your web statistics. If you are selling something on the website, look at which devices are less profitable than average; if there are enough of them, there may be reason to find a solution. For non-commercial websites, you have, I hope, other metrics indicating whether a user's visit was worthwhile for both of you.

    Figure 43: Of the mobile users at 1177.se during 2015, 66 % used an Iphone or Ipad, 33 % Android, and the remaining percent Blackberry, Windows Phone, Nokia, etc.

The design of most AWD websites is like any modern website that follows RWD, but unlike RWD, the experience can differ vastly between different types of devices.

Personally, I think AWD is most interesting as a counterforce to the urge so many seem to have to build simple information apps for mobile phones, instead of taking care of their existing website. With a good dose of Javascript and a little redesign, any website can look and behave like a mobile app; then it is just a question of putting a shortcut on the mobile's home screen. It is also on mobiles that the interesting elements of AWD pop up, since design conventions differ there: the design should feel more like a mobile app in its navigation and in the esthetics of its design elements.

HTML5 has an API for battery status on mobile devices; an AWD website should obviously use it when it makes sense - among other things, to reduce the number of network requests and conserve the battery's remaining capacity.
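
A sketch using the Battery Status API (browser support varies, and the 20 % threshold and the follow-up actions are assumptions):

    if ('getBattery' in navigator) {
      navigator.getBattery().then(function (battery) {
        if (battery.level < 0.2 && !battery.charging) {
          // For example, poll the server less often and skip animations
        }
      });
    }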

    Figure 44: Gray background and black text is perfect during the day.

    Figure 45: A dark theme is great at night or in dark surroundings.

Ever felt that the light from your mobile is too bright at night? Most probably! The web standards organization W3C has developed a recommendation called _Ambient Light Events_, which lets us take advantage of the light sensor present in some computers and almost all mobiles. What to make of this information when designing a website is up to us, but personally, I would probably make one design for daytime and one for nighttime. Many of today's applications have this feature, but you have to toggle it manually.
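
A sketch of the recommendation's devicelight event (implemented in few browsers at the time of writing; the lux threshold and the dark-theme class are assumptions):

    window.addEventListener('devicelight', function (event) {
      // event.value is the ambient light level in lux
      document.body.classList.toggle('dark-theme', event.value < 50);
    });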

How do you know that your website performs as well as possible? By continually evaluating data on users' behavior of course. This will be our next topic.

## Design with data - a data first-approach

When you let data guide the design of a service, you work from what evidently works, what does not, and what could perform better. It is a way to gain experience of how a service is actually used, so that the next update is even more in line with what users expect. A rather unpretentious approach to design.

If you are going to use data to guide design, you first need to specify the website's purpose: what it should help achieve and what role it plays in the bigger picture. In some cases, you might not have a clue why the website exists; then a look at the organization's business plan is a good idea. There, the overall goals should be found - goals to break down into measurable goals for how the website brings value to the organization. If it is more of an information-heavy website, the definition of what well-performing means for the website might instead be found in a communication plan or a project directive. The goals the website strives for should be documented and reconciled with all concerned.

Being able to break down an entire website into multiple parts, graded according to how valuable they are, is probably not an approach everyone is used to. If you cannot see or explain what benefit the website contributes to the bigger picture, this approach is a great way to find out, and to discover what can be improved.

Developing _metrics_, often called _KPIs_ (Key Performance Indicators), is a tough job and slightly off-topic for this book, but it is worth mentioning that you will be working with both quantitative and qualitative means of measurement.

The difference between them is that:

  * **Quantitative metrics** depend on automatically collected data, such as signals indicating behavior in the actual use of a website. These figures answer questions such as _who_, _what_, _when_ and _where_ about the use and the user. This information is found, among other places, in your web analytics tool.
  * **Qualitative metrics** are based on interviews, surveys, and other things reflecting people's own subjective beliefs or actions. The data consists of people's answers to questions of _why_ or _how_ about their reasoning or behavior - anecdotes, in other words - but can also be based on observing users as they are given a task to solve.

An example of a quantitative measurement is the number of visitors from Europe during the month of January. A qualitative measurement may be that most visitors in the mobile category of a web survey indicated that the service generally works much better on mobiles since a redesign. Quantitative values suggest the extent; qualitative values give perspective.

### Get started with design with data

If you are starting completely from scratch with a service, you may not have any data of your own to work with. But if you are creating a digital equivalent of an offline business, there are certainly conclusions to draw from the offline side, to make qualified assumptions for the initial design of the online service. Whether you have an existing business or not, digital or offline, it is a good idea to do some research: check the digital counterparts if there are any, and especially potential competitors. Being guided by data means being prepared for continuous change, which in turn means you need to be careful with your data collection. Quantitative data is often useless if compiled at too high a level of abstraction, but with an overly narrow segmentation, the statistical base becomes so small that it is hazardous to draw any meaningful conclusions.

As you have probably realized by now, choosing the questions your data can answer is a difficult balancing act. Look critically at your data, and I hope you will not get any unpleasant surprises later on (when someone else reviews it).

Your metrics should be specific and isolated enough to answer a question about use. The number of visitors to a website is interesting and all, but to make a website better, it is more important to know, say, how many people abandon their shopping cart before payment. Instead of being satisfied that the website has 10 % more visitors than last month, the person working guided by data will look at more meaningful quantitative data (When is the shopping cart abandoned? At checkout, perhaps?). Moreover, they will use qualitative data, such as asking visitors about their experience of the buying process, to optimize even further.

By now, you have probably figured out that the website needs clearly stated, and preferably documented, goals for many of its individual sub-pages - frequently there are several goals per page. These goals form the long list that almost every website modification should align with and strive to improve.

Some pages will have goals that conflict with the site's primary goals, or with what many consider a criterion of good communication with the visitor. For example, it is common to want a low _bounce rate_ (the share of users who visit only one page and then leave) - in other words, that a sub-page does not result in visitors leaving the website. But there are pages whose content refers visitors to other sites. In such instances, a high bounce rate is a sign that visitors understand the purpose of the page, accept it and use the link to the other website. This is an example where the overall goal of keeping visitors on the website is opposed by some sub-pages. Reports on how the website is performing must be adjusted accordingly, so that an increased proportion of visitors to certain sub-pages does not make it look as if the website is underperforming.

### What you know about your visitors

Knowing what is worth adjusting on a website requires a certain knowledge of who your visitors are. The different types of visitors are your segments in this type of design. One segment might be recurring visitors; another, logged-in customers who have placed a certain kind of product in their shopping cart. What you need to think about, and collect data on, is what it takes to make each segment convert - then change what needs changing on the website. A conversion can be recruiting a new customer to place an order, but it can also be a less grandiose goal, such as getting a visitor to click through to the organization's profile on Instagram. Which converting activities a visitor can perform depends, of course, on what part of the website the visitor is on. That each page should contribute to converting visitors will certainly shock some web editors, especially in the public sector (which has historically been a bit lax about what needs its websites should aim to fulfill). Nevertheless, where an online shop is trying to make money, the public sector is more about getting visitors to know their rights, use digital services or click through to other relevant sites.

It is crucial to know why a page exists, what it is supposed to bring to the bigger picture and how it meets the overall objectives of the website. If you cannot answer the question ' _why_ ', the page should probably be deleted.

Optimization needs to be done within a smaller segment of users, and the greatest benefit comes from finding out which ones have the most difficulty converting - i.e. what type of visitors you cannot manage to convince to do a desirable action on the website. It may concern segments such as ' _those with the greatest problems getting through the checkout process with their shopping cart_ ', but it may also involve converting ' _visitors entering on the landing page for a trip to Greece_ '. Finding these segments requires business knowledge, and it is quite difficult to hire consultants to remedy this. The answer to these challenges is not always to redesign the website. It could very well be to reach out with a newly created landing page for destinations in Greece, or to make it easier to see what users are supposed to do on a certain type of page.

In some cases, we know quite a lot about visitors, sometimes because they are logged in and have provided useful information about themselves, such as the sector in which they work. Sometimes you do not know so much and have to lump them together according to how they ended up on the website. Did the visitor come through a campaign link, a newsletter or spontaneously via googling? It can make all the difference in whether your website is capable of converting the visitor into a customer, a fan or whatever you are looking for with the website. It is in these details you look for new segments to improve how your website delivers, based on the quantitative data on how users behave. If you find a large segment of visitors who do not do what you want them to on the website, you have great untapped potential to work with. If you have information about logged-in users belonging to the public or private sector, or some other grouping that is underperforming, just start by thinking about what could be better. All data about your visitors is good data when you, like a detective, are looking for the underachieving group you should try to improve first.

Something you have probably already seen, but possibly not reflected on, are the checkout deals available in many web shops just before you finally pay. This is the digital counterpart of the candy, batteries, magazines, and other well-chosen items placed close to the checkout where you queue up to pay in a physical store. Of course, these checkout deals should not be static and the same for all customers in an online shop. What is offered first should be something that complements the purchase and offers a good profit margin. The risk is that you lose customers through the introduction of this intermediate step at the checkout.

How do you know that the step at the checkout is worth the risk of your customers abandoning their shopping cart? Alternatively, what extra products should you offer before the purchase is completed? By testing various other designs! You can use the data to see if it is worth having an extra step before checkout for your customers.

### Continuous A / B testing

By continuously working with so-called _A / B tests_, you test your hypotheses about which design is most successful at converting a visitor to a customer. An A / B test is a competition where you prepare two different design versions of something you want to measure, and see which design gives the best effect on real users. It may be small details, such as which text works best on the button that adds an item to the shopping cart, or major design decisions, such as how visitors react to the number of content columns on a landing page. Or why not find out which extra accessories customers choose for each product?

Example of A / B test to perform on a design:  
Version A – Current page template with two columns for the page's unique content.  
Version B – Alternative page template with only one column to give more area to display photos, maps, videos etc.

    Figure 46: Version A with two columns of content.

    Figure 47: Version B with a single column, an attempt to make the map better.

These versions are randomly distributed to visitors during a test period, and it is important that you split up the visitors instead of switching versions every other day. Half will receive version A and the other half version B. It may be that the test is designed to run on a particular segment, a subset of visitors, to be able to isolate the hypothesis. For example, we may want to test different design ideas only on new visitors and see if we can get them to convert to a greater extent. It is not a good idea to test major design changes on returning visitors, as they already have an idea of how the website should work; if you do, try to avoid surprising them.
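
As a minimal sketch of how such a split can be implemented, assuming each visitor carries a stable identifier (a first-party cookie, for instance), the variant can be derived deterministically so that the same visitor always sees the same version throughout the test:

```typescript
// A minimal sketch: derive the variant from a stable visitor id
// (e.g. a first-party cookie), so each visitor consistently sees
// the same version for the whole test period.
function assignVariant(visitorId: string): "A" | "B" {
  let hash = 0;
  for (const ch of visitorId) {
    hash = (hash * 31 + ch.charCodeAt(0)) | 0; // simple string hash
  }
  return Math.abs(hash) % 2 === 0 ? "A" : "B"; // roughly a 50/50 split
}
```

Hashing, rather than assigning randomly on every page view, also keeps the two groups stable when the test runs over several visits or is restricted to a particular segment.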

When testing matters of design, you may want to evaluate how new design elements are received by users, or how many columns of information work best for those with small screens. Or whatever you want. Once the testing phase is over, a winner is selected, probably automatically based on the success criteria you set up in advance, and that version continues to be used until a new contender comes forth. Sometimes the test will not be conclusive, as there is no significant difference between the versions, but regard that as a valuable lesson for future tests.
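
Judging whether a difference is conclusive is ordinary statistics. As one hedged example, a two-proportion z-test (a common choice among several) on the conversion counts of the two versions could look like this:

```typescript
// Sketch: two-proportion z-test on conversion counts.
// convA/nA are conversions and visitors for version A, likewise for B.
function zScore(convA: number, nA: number, convB: number, nB: number): number {
  const pA = convA / nA;
  const pB = convB / nB;
  const pooled = (convA + convB) / (nA + nB);            // pooled conversion rate
  const stdErr = Math.sqrt(pooled * (1 - pooled) * (1 / nA + 1 / nB));
  return (pB - pA) / stdErr;
}

// |z| above roughly 1.96 corresponds to significance at the 95 % level.
console.log(zScore(120, 2400, 151, 2350).toFixed(2));
```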

    Figure 48: From Foxbusiness.com, notice the "More on this". The placement of these suggestions in articles is probably not coincidental on news websites.

Be careful and avoid designing your goals in a way that conflicts with your visitors' interests. For example, it is not a great idea to reduce the emphasis of links to your partners' websites to achieve your website's overall goal to keep visitors on your own website. Statistics are not an end goal in themselves.

### Examples of A / B tests for monitoring the website, and other communications

You know best how your A / B tests should be designed, but I want to introduce some examples of tests that may be worth looking at.

  * **Design of buttons and links, and their text, color, size and placement.** In addition to obvious usability problems, such as making sure that something is easy enough to discover, it could also be that the color, the distance to other design elements, etc. play a major role. How text is styled can make all the difference in the world; for example, whether a link should say ' _Subscribe_ ' or ' _RSS Feed_ ' as link text depends on your visitors' interest in technology - this needs to be tested and measured.
  * **Design two alternative forms for sign-up.** A person who backs away or encounters usability problems with the form is a lost customer.
  * **Evaluate photo selection on landing pages.** Which type of imagery works best to inspire the visitor to add something to the shopping cart?
  * **Length, and wordings, of texts.** It is worth testing how product texts and headers are composed. Do texts need to be shortened for mobile visitors? Try.
  * **See how customers react to being reminded by e-mail about products in abandoned shopping carts.** This can be a series of tests to figure out which customers react positively to a reminder; it may also work to varying degrees depending on the type of product.
  * **Send two different newsletters, and measure which one has the lower percentage of unsubscriptions.** Just as data about which parts of a website are popular, you can measure what content is appreciated when sent to people. It is not necessarily the same content they are browsing on the website that is suitable when sent to their mailbox.
  * **Test different offers when mailing.** To send offers very few are interested in is a waste of everyone's valuable time and attention. If you have not yet tested enough how to make the most of each mailing, test two different versions of mailings and then compare their efficacy against the business goals.
  * **Placement of a secondary call-to-action.** Since many important pages can have multiple goals, it is a good idea to test the optimal placement of call-to-actions other than the primary one. It is probably not a coincidence that many of the established newspaper websites have tips on related news embedded into their news texts. I guess the exact position in the text is not haphazard either; rather, it has been evaluated to see how far into the text such a link has the most effect in keeping the user on the website. It all comes down to generating those page views you can sell to advertisers.

Someone who is well versed in testing and optimization when selling online is the Internet giant Amazon.com. In late 2013, they were granted a patent on sending goods even before the customer has ordered anything. To summarize Amazon's solution, from the patent's abstract:

> "A method and system for anticipatory shipping are disclosed. According to one embodiment, a method may include packaging one or more items as a package for eventual shipment to a delivery address, selection of a destination geographical area to which to ship the package, shipping the package to the destination geographical area without completely specifying the delivery address at the time of shipment, and while the package is in transit, completely specifying the delivery address for the package."
> 
> \- US Patent and Trademark Office's registry

How would Amazon be able to do this? Perhaps by keeping a close eye on visitors' behavior and relating it to sales. They track your behavior through massive data collection; for example, the likelihood that you will be ordering granola muesli increases if people in your geographical vicinity suddenly start buying it, or if those you follow on Twitter write a lot about it. However, the trend does not need to follow a geographical or social graph; it can correlate with other factors. A rather peculiar example is the American store Target, which, through a teenager's buying patterns, realized that she probably was pregnant and offered products based on it. The teen's father was dissatisfied with the store suggesting his daughter would be having a child so young, while his daughter was in fact pregnant.

## Mobile first

To build a website for mobile users is not just about making it usable for those with small touch screens. _Mobile First_ is the title of Luke Wroblewski's book on the subject, an approach that puts the mobile user's needs first, both in terms of advantages and disadvantages.

When going mobile first, you do not expect the user to have a stable Internet connection at a high speed. Nor that the environment around the user is ergonomic. Rather, you are inclined to think the user might be on a bumpy bus ride in the wilderness with a blinding sun, or soon to enter a tunnel and lose their connection to the Internet.

Studies show that those who surf on mobiles are more impatient than others, probably because they have learned that it is not worth the wait. This calls for extra prioritization of speed and usability. Mobile visitors also expect different design conventions compared to the desktop web, including how interaction is managed. For example, it is natural to swipe images sideways if you are primarily a mobile user, and many services we are used to in apps adapt content to the person's geographical location. There are simply different expectations from users, expectations we need to consider.

A not entirely implausible consequence of making your website mobile first is that returning desktop users encounter some unavoidable confusion about where content has gone, but also that, on a desktop computer, the design is simplified to such an extent that the user may be inclined to believe that an error occurred. At the same time, we cannot wait until most visitors use a mobile phone. Therefore, we need to design something simple which, without compromising mobility, can scale up to meet the expectations of those on a desktop computer.

### Mobile first vs. responsive web

The difference between a responsive website and a mobile first website lies in what is considered normal, what equipment we are optimizing for. Most responsive projects I have heard of assumed that one website would be able to adapt to different sized screens to display the content. Responsive websites tend to be about two or three editions, for a small, a medium and a larger screen. Since the big screen is easy to default to, the ordinary way of thinking is that the mobile screen size is an exception: a desktop design that is scaled down to a mobile experience. In this classical way of thinking about web design, one does not necessarily consider the mobile as a constant to relate to.

    Figure 49: Adlibris was the first 'mobile first' website I encountered from my desktop; it looked a bit sparse compared to its predecessor. Only two links in my account, one for previous orders and one for my digital library of ordered books.

In many cases, a website following mobile first even becomes better on desktop computers compared to earlier websites since we have to prioritize what to put on the limited space available. Often the appearance is more consistent between different types of devices compared to responsive websites, which makes it easier for all of us who use several types of devices regularly.

    Figure 50: Website clearly making the most of the available space.

    Figure 51: Enhancing some details of the design for bigger screens.

With respect to size, mobile first means designing for the mobile screen, and then, if you have the opportunity and know what to add on a larger screen, it is added. Similar to RWD, but starting with the small screen and working your way upwards. Strictly speaking, responsive web design is not really a competitor to mobile first; a good mobile first website needs to be responsive to function as intended on the plethora of mobile devices.

### The mobile opportunity

Many things favor a mobile device when using the web. The most important point is of course that a mobile device is in fact mobile; you can continue with something you were doing once you have got on the bus, paid the fare and sat down. Mobile phones also have a lot of sensors and equipment we would not normally find in desktop computers. One of the most used features is locating the mobile phone geographically, which is fantastically useful for location-based services on a website.
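
As a rough sketch of how a website can use this, with findNearby as a hypothetical stand-in for whatever location-based feature your site offers:

```typescript
// Hypothetical service call; stands in for whatever your site offers.
declare function findNearby(latitude: number, longitude: number): void;

// Ask the browser for a position, but degrade gracefully: if the user
// declines, or no fix is available, the manual search form remains.
if ("geolocation" in navigator) {
  navigator.geolocation.getCurrentPosition(
    (position) => findNearby(position.coords.latitude, position.coords.longitude),
    () => { /* user declined or no fix - keep the manual search form */ }
  );
}
```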

Almost all mobile phones have sensors for the device's orientation and an accelerometer that allow the mobile to be aware of its surroundings. It remains to be seen whether a website can make any sense of this, but offhand, I can imagine that if the accelerometer indicates much motion, there is a need for simplified forms, increased text size and other usability adjustments.
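
Purely as speculation in code form, that idea could look something like the sketch below; the smoothing factor, the threshold and the CSS class name are all assumptions picked for illustration:

```typescript
// Speculative sketch: smooth the accelerometer readings and, during
// sustained motion, add a CSS class that enlarges text and simplifies forms.
let motionLevel = 0;

window.addEventListener("devicemotion", (event: DeviceMotionEvent) => {
  const a = event.acceleration;
  if (!a) return;
  const magnitude = Math.abs(a.x ?? 0) + Math.abs(a.y ?? 0) + Math.abs(a.z ?? 0);
  motionLevel = 0.9 * motionLevel + 0.1 * magnitude; // smooth out single bumps
  // The threshold 3 and the class name are assumptions, not standards.
  document.body.classList.toggle("bumpy-ride", motionLevel > 3);
});
```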

It is easy to forget nowadays that one of the major innovations of mobile phones is that they have touch screens. This allows for a much more intuitive interaction with your fingers, instead of the user having to accustom themselves to various external pointing devices. What touch screens will be used for at the end of the day remains to be seen. So far, it has mainly been about making the interaction between users and services more natural, including examples such as the user defining an area on a map to look for a diner at lunchtime.

Imagination is the only limit to our use of cameras, gyroscopes, microphones, light sensors, proximity sensors and all other kinds of data.

### Mobile restrictions

Fact:  
The fast 4G network (LTE), faster than 3G, covered about half of Sweden's sparsely populated areas in 2014, while the even faster 30 Mbit/second version covered only 2 %. Vodafone Germany plans to cover 90 % of German households by the second half of 2016.  
 _Source: PTS (The telecommunication authority in Sweden) and ITU_ 28

The limitations of mobiles can be summarized as follows:

  * The screen is often small. Compared to the earlier idea of a normal screen, the first handsets had only one-fifth as many pixels. Now there are devices with high-resolution screens, but in many markets they are still rare compared to desktop computers.
  * Mobile - mobile in the sense that the user has it in their hand, watching out for traffic, and continues to use it even when going into a bomb shelter underground, where it loses contact with the cellular network.
  * Context - think about the ergonomics during the use of a mobile, as light conditions affect the contrast needed to read text, and more. We cannot expect the user's undivided attention, even for a short while at home in their living room, since mobiles are usually used while other things demand attention.
  * Often a sub-optimal connection - if a fast Wi-Fi network or 4G is not present, a mobile device has a hard time dealing with many or large files for download.
  * Monthly data plan - it is very common that users have a monthly limit on how much traffic they may use. This means that users need to limit their traffic and are not very willing to receive unnecessarily large content.

The concept of _offline first_ is something that we should probably include in the concept of mobile first. The idea is that the user should be able to disconnect from the cellular network and, as far as possible, continue using the website. Mobile users are accustomed to apps functioning without connectivity, probably using local data. In these cases, users are not interested in whether, from a technical point of view, they started a _native mobile app_ or clicked a bookmark on the phone's home screen - it is simply supposed to work. In a mobile or offline scenario, we should not treat a dropped connection as an error and tell the user that they are offline; instead, they should, as far as is economically feasible, be assured that the service works, even though there are periods without reception.
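
One way to approach offline first on today's Web is a service worker that serves cached copies when the network is gone. A minimal cache-first sketch, where the cache name and file list are placeholders:

```typescript
// sw.ts - minimal cache-first service worker sketch. In a real project the
// events are typed via the "webworker" lib; `any` keeps the sketch short.
const CACHE_NAME = "offline-v1";

self.addEventListener("install", (event: any) => {
  event.waitUntil(
    caches.open(CACHE_NAME).then((cache) =>
      cache.addAll(["/", "/styles.css", "/app.js"]) // placeholder file list
    )
  );
});

self.addEventListener("fetch", (event: any) => {
  // Serve from the cache when possible; fall back to the network.
  event.respondWith(
    caches.match(event.request).then((cached) => cached ?? fetch(event.request))
  );
});
```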

There are many occasions when you have a shaky connection to the Internet. Imagine traveling by train or subway, or a car ride in hilly countryside, and all the other situations that will not magically resolve themselves, connection-wise, across the globe in the near future. When you do have a connection, it is often of dubious quality, which means that it may take several minutes to download a less than optimized website. We will go through this in depth in the chapter about web performance later on.

### The mobile moment - when mobile users are in the majority

Fact:  
In 2013, 80 % of Japanese people used the mobile web, while 25 % of mobile web browsing Americans almost exclusively used a mobile device. Only in exceptional cases did they use something else, such as a desktop computer.  
 _Source: Mobiforge_ 29

Exactly when the mobile moment occurs, when most of your visitors use a mobile, differs depending on which users you have. In some cases, this has already happened and for a few, perhaps, it will never occur.

In 2014, Facebook released statistics showing that about three quarters of their users visit the service via their mobile phone. More amazing is that almost one quarter use Facebook only from a mobile. For those of us who work in offices, it can be easy to think that people still have a "real" computer they use for several hours each day, including surfing the Web for both job-related things and leisure. Even though many probably are in that situation, you should not assume that it applies to most users you want to communicate with.

When is your mobile moment and are you ready for it? If it tends to take a few years to improve your website's design, maybe it is time to start with mobile first today.

## SPA - Single Page Application

The idea behind _SPA_ is for a website to give feedback at once, instead of the usual behavior of requesting a new page and then waiting for it to load. There is ongoing communication, out of sight of the user, between the browser and the web server to load the content constituting the web application. SPA is common on websites with content that is continuously updated, for example where there is a flow of new posts to be loaded without having to reload the page to see them. Frequently, these websites are designed to resemble applications on a desktop computer rather than a regular website.

Another common SPA feature is that the entire website is downloaded locally on the device, which makes for extreme performance while becoming independent of a connection to the Internet. The challenge is that these applications occasionally need to be updated or synced with a central server. There is good support for these techniques thanks to HTML5, specifically Web Storage.
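
A minimal sketch of the Web Storage part, caching the latest content so that the application has something to show without a connection; the feed name and Post shape are illustrative assumptions:

```typescript
// Sketch: keep the latest feed in localStorage so the app can start offline.
interface Post { id: number; text: string; }

function saveFeed(posts: Post[]): void {
  localStorage.setItem("feed", JSON.stringify(posts));
}

function loadFeed(): Post[] {
  const raw = localStorage.getItem("feed");
  return raw ? (JSON.parse(raw) as Post[]) : [];
}
```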

The concept of SPA is probably best known in tech circles, but many recent great mobile sites, many of which display an app-like behavior, exemplify what this design strategy is all about.

SPA is heavily dependent on Javascript frameworks and modern features in your browser. A technique you have probably heard about is _AJAX_, which manages the exchange of data between the server and browser. This is based on the use of APIs, which makes it more of an architectural concern, because the website is tantamount to an app - a window to the API. There are two variations on how the information is sent to the browser. Either it is sent as raw data, usually in the JSON format, from the API, or it is sent as pre-formatted HTML code to update the content of a particular part of the page. For example, presenting new e-mails in the inbox would in the first case only involve structured raw data, with some Javascript code in the browser adding the design, bold, wrapping, etc. In the second case, new e-mails are sent from the server to the browser as HTML code, ready to be presented instantly in the browser, pushed onto the top of the list of e-mails.

If you are thinking of also having a mobile app, or external users of the API, there is reason to have the API send raw data, since HTML controls in detail how the appearance should be.
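
A sketch of the raw-data variant, where the browser fetches JSON and owns the presentation; the /api/inbox endpoint and the Mail shape are assumptions for illustration:

```typescript
// Sketch: the API sends raw JSON; the client decides how it looks.
interface Mail { from: string; subject: string; }

async function refreshInbox(): Promise<void> {
  const response = await fetch("/api/inbox"); // hypothetical endpoint
  const mails: Mail[] = await response.json();
  const list = document.querySelector("#inbox")!;
  list.textContent = ""; // clear old entries
  for (const mail of mails) {
    const item = document.createElement("li");
    item.textContent = `${mail.from}: ${mail.subject}`; // client-side formatting
    list.appendChild(item);
  }
}
```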

Many websites have components similar to what constitutes a SPA website. The most common example is probably those that offer messaging between visitors who find themselves on the same website, like the message window you have on Facebook to talk in real-time with other users. Another common SPA-like feature is listings of content that are automatically updated with new content, like a stream of information. To give content updates and direct communication between users on a website, web standards such as _WebRTC_ (Web Real Time Communication) and _WebSocket_ are used to deliver features that were previously impossible on the Web. Having a long-lived connection for real-time communication directly between clients is a bit of a novelty on the Web.
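
A sketch of what the WebSocket flavor can look like, pushing new posts into a stream; the endpoint and the message shape are assumed:

```typescript
// Sketch: a long-lived WebSocket connection pushing new posts into a list.
const socket = new WebSocket("wss://example.com/stream"); // assumed endpoint

socket.onmessage = (event: MessageEvent) => {
  const post = JSON.parse(event.data as string); // assumed shape: { text: string }
  const item = document.createElement("li");
  item.textContent = post.text;
  document.querySelector("#stream")?.prepend(item); // newest post on top
};
```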

Tech-savvy people and some web developers around you will take great interest if you mention SPA, but there is something essentially non-technical to remember, namely that SPA may be excellent for small web projects that emphasize speed of use and perceived smoothness. SPA is also good if you are not quite sure if you will build a mobile app later, as a lot of the investment already made in the API will help.

    Figure 52: Many early SPA websites mimicked the design conventions of Iphone apps, when they were still skeuomorphic.

### Design of SPA websites

Although SPA is primarily a question of technical architecture, there are some common characteristics when it comes to appearance. The early app-like websites were often built as mobile websites, since the web code could also be compiled into a native mobile app. Therefore, some looked like the first-generation Iphone interface.

As the Iphone's market dominance diminished, the design language has become more neutral, and a common denominator for SPA sites is that they are very similar to apps in appearance and function. They often ask you to add them to your home-screen on a mobile. If you do that, the difference between a native app and a website is not necessarily noticeable.

In addition to the app-similarity, many SPA sites also focus on giving an overview of many things, such as with so-called dashboards. This application behavior means that users really only have one view of the website, rather than the plethora of unique templates for different types of content they are used to, which can be both a good and a bad thing.

Recently, many variations on the SPA theme have evolved where you have one long web page and link to different sections using internal links, often with endless scrolling where new content is loaded when approaching the bottom of the page. Plenty of websites have a single page that is a SPA among a group of ordinary pages: the application itself is a SPA and the rest is an ordinary website.

### Challenges of SPA

The heavy use of Javascript and dynamic behavior is not sensible when it comes to search engine optimization; you cannot take Google rankings for granted when going SPA. A common behavior on a SPA website is that there is only one single address, regardless of where you are as a user within the application. So there are no different addresses to index in a search engine, really just a start page - this makes the website's content largely invisible to search engines and their users. There are ways to solve this, which is often necessary anyway to get the web app to work without Javascript enabled in the user's browser.

    Figure 53: Hultafors' website behaves as an app if used on a mobile.

The same problem with addressing also affects users who expect to be able to copy everything in the address bar and send it to anyone, for that person to see the same thing as they do.

Something to keep in mind is to abide by the navigational conventions visitors are accustomed to. For example, it is, unfortunately, common that the browser's back-button does not behave as expected on SPA websites. Sometimes you are logged out, but it may well be that nothing happens at all.
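
Both the shareable-address problem and the back-button problem are commonly tackled with the browser's History API. A minimal sketch, where renderView stands in for the SPA's own rendering code:

```typescript
// Hypothetical stand-in for the SPA's own rendering logic.
declare function renderView(path: string): void;

// Give every view its own address, so it can be bookmarked and shared.
function showView(path: string, pushToHistory = true): void {
  renderView(path);
  if (pushToHistory) history.pushState({ path }, "", path);
}

// Make the back-button behave as visitors expect.
window.addEventListener("popstate", (event: PopStateEvent) => {
  showView(event.state?.path ?? location.pathname, false);
});
```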

If you strive to make your SPA site functional offline, it requires some planning to have everything downloaded in time. There is probably a priority order for what content you want offline if you cannot have it all. Much will probably be downloaded and never used; therefore, you could consider having only text locally on the device and downloading heavier material, such as pictures, later on if the conditions are right (check out the design pattern called lazy loading). When on a slow mobile connection, one needs to adjust what is downloaded, and on a public website, keep in mind that some visitors may have a monthly data plan, a quota to be considerate of.
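
A sketch of the lazy loading pattern mentioned above, assuming each image keeps its real URL in a data attribute until it is about to scroll into view:

```typescript
// Sketch: download an image only when it is about to scroll into view.
const observer = new IntersectionObserver((entries, obs) => {
  for (const entry of entries) {
    if (!entry.isIntersecting) continue;
    const img = entry.target as HTMLImageElement;
    img.src = img.dataset.src!; // the real URL kept in data-src
    obs.unobserve(img);         // each image only needs loading once
  }
});

document.querySelectorAll<HTMLImageElement>("img[data-src]")
  .forEach((img) => observer.observe(img));
```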

SPA is all about offering the functionality that the user's device is capable of, and it has certain challenges regarding fault-tolerance, a subject we will soon talk about.

## Web standards, and usability

**Glossary - usability**  
A measure of how easy it is to do specific tasks with a product, a website or other features. In the web context, how easy it is to understand how to interact with a website, a measure of whether it is intuitive or not.

**Glossary - accessibility**  
How well it works for people with some form of disability, such as visual impairment, motor problems, learning disabilities and similar.

**Glossary - web standard**  
To follow established, open and inclusive standards and practices when designing a website.

To round off the topic about web design, I think we need to go through what a good website is and if the website works as it is supposed to. What is published has to be useful and follow established standards as far as possible.

_Web standards_ are based on the underlying principle that all users, regardless of their physical or technical capabilities, should be able to use a website. The original idea was after all an open and inclusive Web, and the use of common standards supports this idea. Opposites of web standards include commercial or proprietary formats such as Adobe Flash or Microsoft Silverlight. Instead, we should use open formats such as HTML5 and others developed as recommendations by the W3C and other credible groups without any obvious self-interest. We should choose standards that place as few requirements as possible on the user's equipment.

_Usability_ is not only about ease of use but also about bringing something meaningful, having an objective in common with the user. A first level to aim for regarding usability is to make sure that what is built works on the devices your visitors use, designing everything fault-tolerant since there is a vast variation among visitors' equipment, but still sticking to web standards.

### Progressive enhancement and graceful degradation

**Glossary - progressive enhancement (PE)**  
To assume that the visitor does not have the latest technology, or most recent browser, or all possible plugins, but instead offer a basic version that is good enough for anyone.

**Glossary - graceful degradation (GD)**  
Design of system to fail gracefully. When errors occur, or needs are not met, the system will not crash if avoidable. Stylish error handling where it is planned for, in advance, that errors may occur.

_Progressive enhancement_ (PE) is a design approach that above all prioritizes _accessibility_; the code validates against established and widely used standards rather than aiming for an awesome or experimental experience. It is about offering a stable and reliable website without extravagances we could just as well have done without.

However, the concept of PE also includes the idea that once you have delivered on the basic needs, it is of course acceptable to add features that will only benefit the few who meet the technical requirements - under controlled circumstances. In other words, you do not create a page where those who lack a certain obscure plugin are told to download it; rather, what you are missing out on should be invisible. A typical example is the transition period when not all browsers had full support for CSS version 3: you ensured that the design worked well without CSS3, but those who had such support in their browser could see small improvements, such as rounded corners. What is important here is that no design choice causes other users to have a less pleasurable experience.

    Figure 54: Javascript and CSS make a form more usable for those with modern browsers.

If you read the section about adaptive web design, you have already seen the idea behind PE, namely to add something to the user's experience if the device or context supports it. PE is usually about web design in terms of graphical appearance, but there is no reason to neglect visitors' connection speed - if a fast connection is used, we can just as well stream video in Ultra HD format for optimal quality. In other words, we can argue that mobile first is PE, because it starts out with a small screen and adds content if there is more space available.

_Graceful degradation_ (GD) is in some regards PE's opposite since GD means designing for the most modern and optimal technical equipment a user may have in the form of browser support for new technologies. To be usable in older equipment, you need to build in lots of fault-tolerance.

A way to tackle the problem that some browsers (yes, Internet Explorer has historically been the biggest scapegoat) do not support the latest web standards is to use a so-called polyfill. _Polyfills_ are solutions for features you have good reason to believe most browsers will support in future releases; for them to work in older versions, the polyfill provides supporting fallback code. An example of this is that support for _SVG_ images (Scalable Vector Graphics) varies widely, so instead of developing special code yourself for each browser and version, you use a polyfill. In this case, the Javascript framework _Modernizr_ can help. Say that the user, in the case of an SVG image, is using Internet Explorer 8: the polyfill will then see if the user at least has Adobe Flash installed, and if not, it gives up, since IE8 by itself is not capable of displaying SVG and Flash did not work as a fallback. Instead of giving up instantly when needs are not met, it tries in a structured way to resolve the issues with older equipment.
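
To show the principle rather than Modernizr's exact internals, here is a sketch of feature detection with a PNG fallback; the naming convention (same file name, different suffix) is an assumption:

```typescript
// Sketch: detect SVG support; otherwise swap every .svg image for a .png
// twin. The naming convention is assumed, not a standard.
function supportsSvg(): boolean {
  return !!document.createElementNS &&
    !!(document.createElementNS("http://www.w3.org/2000/svg", "svg") as any)
      .createSVGRect;
}

if (!supportsSvg()) {
  document.querySelectorAll<HTMLImageElement>('img[src$=".svg"]')
    .forEach((img) => { img.src = img.src.replace(/\.svg$/, ".png"); });
}
```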

Whether we choose GD or PE is revealing as to how we look at our websites and the requirements placed upon the users. Need I mention that PE is a further development of GD? You have probably read between the lines by now that I prefer PE :)

### Usability vs. accessibility

Usability and accessibility, two concepts which in many ways can be seen as the same thing, often seem to create headaches for people.

    Figure 55: I'm guessing that the person who wrote that caption is able to take the stairs.

The most popular way to distinguish between them is that usability is about the website's ergonomics for a common user - ergonomic enough that they are able, for example, to get through a payment process in an online shop. Accessibility puts more focus on allowing as many people as possible to use a website, including those with temporary or permanent disabilities.

When speaking of disabilities, the blind and their needs are most often used as an example. That is deceptively simplistic, since accessibility is something most of the population can benefit from. We all benefit when we are tired, in bright sunshine with a mobile, forced to use a gaming mouse with too-high sensitivity, or receiving the text version of video clips because we have forgotten our earphones and are in a quiet environment.

There are of course helpful guides on how to develop a website in an accessible way. Two common standards to follow are WAI (Web Accessibility Initiative) and WCAG (Web Content Accessibility Guidelines). WCAG is actually a subset of WAI and deals specifically with the exploration of web content on a website. WCAG is based upon four principles, namely that a website should be:

  1. Perceivable
  2. Operable
  3. Understandable
  4. Robust

Whether or not you are aware that your audience includes people with special needs, it is important to have a good basic level of accessibility and usability. It is encouraged by search engines, and I guess you want traffic on your website? Even if you do not care at all about the needs of the blind when designing a website, it is worth bearing in mind that Google also cannot see and, for the moment, Google cannot hear. If you do not care about getting visitors through search engines, or about all employees being able to use the intranet, just carry on ignoring it.

_Accessibility_ is about taking into account people's varying needs. People with the widest possible variety of characteristics and abilities, including those with various disabilities, should be able to use a fully accessible website. Among common accessibility challenges are, for example, that the text actually is real text for a person with a screen reader and not an image, as well as that headers in the code are marked up as headers instead of bold and enlarged text, and that the page has a page title. Simple stuff, most of it.

 _Usability_ has more to do with efficiency and the perceived experience. A website may be perfectly accessible, but still not very usable, if, for example, it is very time consuming and perceived as annoying. Your website might actually be usable to person A but not to person B at the same time, if they have different goals with their visit, or different preferences.

One way to get your users to agree that the website is useful is to design it with game mechanisms, so-called gamification, as it often gives a sense of purpose to what you are doing.

### Gamified design

Gamification is an approach to designing useful services where the user not only is able to do things, but also understands what needs to be done and why. Gamification is the popular term for this: how to use game mechanisms to give meaning and structure to a service.

It is not just about chasing trophies, medals or other digital awards, but usually a way to guide a user through a process or motivate them to do certain tasks. It is excellent for introducing new and existing users, but it does not have to look like a game - perhaps the competitive instinct or wish to please is more natural than we at first think.

When you launch a new website, how are you able to help or introduce it to your users? In the system, of course! The basics that need to be mastered are presented in a digital guide that helps the user get started. Provide tips, motivate continued exploration and make the user feel safe in that any setbacks will not be disastrous. In intranet contexts, this is called onboarding.

Gamification can be anything from a full-fledged game to a few cherry-picked game-components such as clearly stating to users what extra personal information they need to fill in to make the most of the service.

Things that distinguish a service that takes advantage of gamification include:

  * **Relationships** - a social context. How do _you_ relate to _us_? What are others doing?
  * **Movement** - incentives for the individual that feel neither like a carrot nor a stick. Why should I give away my e-mail, and what is in it for me? What happens if I do not comply?
  * **Feedback** - continually knowing whether you are doing the right thing and heading in the right direction. In other words, information broken down into activities so small that they are hard to fail: micro-interactions.
  * **Learn** - how do you do better next time; is there something to improve?
  * **Higher purpose** - what is the service all about, and why should a user get involved?
  * **Reward** - can come in the form of digital goods, such as medals or badges, but could just as well be something in the physical world. Maybe it is not the result but rather the effort put in that should be rewarded. You should put some thought into this so that the game does not attract saboteurs. A good trick is to set an achievable goal everyone can relate to; when the goal is reached, you are finished. Sometimes there are multiple levels of reward, like lotteries to see what you get, which adds extra excitement.

    Figure 56: Linkedin.com clearly stating what tasks are left to get "12x more career opportunities" (2012).

    Figure 57: Linkedin.com's next iteration looked a whole lot different, with numbers instead of notes (2013).

    Figure 58: The circle illustrating the profile strength contributed to 20 % more complete profiles (2014).

Some examples of successful gamification ideas:

  * The social job-related network _Linkedin.com_ got 20 % more of their users to complete their profile with important data. The change they made was to visualize how complete a user's profile was with a circle.
  * The mobile operator _giffgaff_ allows customers to offer support to each other. Customers get call minutes as compensation for their efforts. The company has around 16 employees who take on the more difficult questions and miscellaneous administration. This way, they can offer lower prices to their customers who are more active than competitors' customers, which certainly gives some additional benefits.
  * _Runkeeper, Nike, Jawbone, Fitbit,_ et al., who give regular exercise a social dimension where even daily (in)activity and targets are compared with friends. Social competition with the objective to exercise more, sleep better, eat more healthily and encourage each other.
  * _EBay_ , where the seller and buyer rate each other, for future buyers' or sellers' interests.
  * _Slashdot_ has a karma system for comments on their articles. A user has to earn karma points; only when you have points can you vote others' comments up or down. Many upvotes on your comments mean you deserve better karma. Only comments voted up are shown by default to visitors, making it a self-moderating system.

    Figure 59: Runkeeper encourages almost every single activity, with lots of sub-goals to celebrate.

One problem you may face when trying to take advantage of gamification is focusing on what you yourself want the user to do. Instead, you should focus on what the user wants to do. What you do, as the creator of a game system, is create game mechanisms so that the player wants to move along; you cannot suddenly switch to a commanding tone telling the player what they must do. We are no longer an obedience society; rather, we need to admit that our society is driven by motivation. Being told what to do can have the opposite effect. It is better, of course, to focus on why everyone wants to do something for his or her own sake.

### Design and plan for errors that will occur

**Glossary – error 404-page**  
The page that is shown to the visitor when requesting an invalid URL address. This happens when a page has been removed, the user has mistyped an address, an error in an address sneaks into print, etc. The number 404 comes from the status code of the Web protocol, HTTP, which signals that the page does not exist. Other common codes are 301, meaning that a page has moved to a different address, and 200, meaning that everything went according to plan.

Nowadays, you are not supposed to end up on a _404-page_. That we so often do is partly due to a lack of respect for users' best interests among us administrators who run websites. If a page on a website is no longer published, or if an address link is broken, I would claim that in nine cases out of ten it is because of one of the following:

  1. **The page is about information fixed in time, such as news or a calendar event.** An often-heard argument is to throw these pages in the trash since "no one should be interested anymore", possibly because "it occurred last week" or because "it takes up unnecessary space".
  2. **They switched content management system and did not have the foresight to keep their established addresses.** It seems to be easy to forget that the addresses on your website have value, for instance: 
    * In the visitor's bookmarks.
    * Search engines will not give you a golden star for your fine new page, as it is, precisely, new and unheard of.
    * Via the links to the page from intranets, websites, documents, or in people's mailboxes, links that suddenly will stop working.
  3. **The content management system does not handle addresses in a good way.** Most of the publishing systems I have seen have an address format reflecting the internal tree structure, a structure which mainly makes sense to web editors rather than to visitors and rarely has much to do with the structure of the website as experienced. Frequently, when a main page gets a new name, all sub-pages' addresses change with it.

For example: _www.examplepage.com/customer-service/our-services/list-a-z/c/_

If you decide to rename ' _Customer Service_ ' to ' _Customer Center_ ', all pages beneath it probably get new addresses. A more sensible handling of URLs is one of many things that have to be on the shortlist of requirements when choosing a content management system.

Clearly, there are occasions when you cannot cater for some visitors and they have to end up on a 404-page. In such an event, there are some easy tips to remember on how to design the 404-page, namely:

  1. Do not let it escape the visitor that the requested page cannot be found and that an error occurred.
  2. Make sure that the visitor gets a design similar to the rest of the website. It should include a logo, navigation, color and design.
  3. Give suggestions on ways to move forward. Maybe to list the ten most visited landing pages, or if the website's structure is in the address that gave the 404-page, you can link to the parent page. Sometimes a search box is useful.
  4. Send the correct HTTP status code, a 404, to the visitor. If not, you can be punished by search engines and other robots.

If you design a good, perhaps even entertaining, 404-page, it can mitigate the initial disappointment the visitor felt. When possible, refer to a new page that replaces the old one! The referring should be done with a permanent redirect, technically an _HTTP 301_, because all the alternatives are worse.
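
A hedged sketch of both tips in practice, here using Express as just one server option among many; the paths reuse the renaming example above:

```typescript
// Sketch using Express: keep old addresses alive with permanent redirects,
// and serve the friendly error page together with a real 404 status code.
import express from "express";

const app = express();

// The renamed section lives on: /customer-service/* answers with a 301.
app.get("/customer-service/*", (req, res) => {
  res.redirect(301, req.path.replace("/customer-service", "/customer-center"));
});

// Everything unmatched gets the branded 404 page AND the 404 status code,
// so browsers, search engines and other robots all get the right signal.
app.use((req, res) => {
  res.status(404).sendFile("404.html", { root: "./public" });
});

app.listen(8080);
```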

    Figure 60: A misconfigured web server might present unfriendly 404-pages.

    Figure 61: Spotify's cute mascot apologizes and suggests singing for money.

    Figure 62: Flickr acknowledges the problem and lets the user know that they are fixing it.

### Web, the mobile web, apps, or a little bit of everything?

For the near future, we will constantly be faced with the question of what packaging our information or services should have. Today, the question is most often whether there should be an app for specific devices or whether we should set higher standards for the existing website.

Just as many moved their content to social platforms, because that is where their target audience spends their time, our own mobile apps can become less relevant if the phone becomes a platform for starting up just a few services. That is the direction we are moving in, as it is becoming increasingly difficult and expensive to get someone to download an app, regardless of the fact that it might be both free and awesome. Instead, maybe it is an app inside a larger application we should look at. For example, Spotify invites third parties to put "apps" in their app.

Usability guru Jakob Nielsen says, in his book _Mobile Usability_ , that you should have a separate website for mobiles, perhaps even an app for the more frequent users and the unique type of usage patterns they have. That might be the most reasonable conclusion if you only prioritize usability and utility. At the same time, he speaks out against building websites for simple mobile phones, so-called _feature phones,_ since the usability achieved is still so terribly bad.

    Figure 63: Nice updates for those speaking Polish or Italian; I do not, but I am still downloading this update somehow.

Where apps are brilliant is where the web, in many cases, is not even trying to compete, such as with gaming performance on a mobile device. To some extent, it has historically been easier for those with disabilities to use apps instead of accessing the Web in a mobile browser. The advantage apps previously had was being downloaded already and coping without contact with a cellular network, something that websites are becoming better at.

Apps, by today's standards, also have some drawbacks compared to a website. On a website, the publisher has full control and does not need to worry about customers using different versions, where some even cease to work. If there is an error on a website, fixing it solves it for everyone, and it is not necessary to notify all users that you have now improved support for the vast minority who have Icelandic as their mother tongue. Apps, unfortunately, need to be installed before you can make use of the content, which is a bit like putting barriers around the entrance to your store.

In my opinion, it is very difficult to argue for lowering the demands on a website, or omitting one entirely, to focus solely on an app. Apps have, thus far, had poor transparency through search engines on the Web, which means that you are hiding your content in a container that is hard to find compared to a regular, highly public website. Apps are something that can complement a website and carry out some specific tasks more usefully than is feasible on a website. In most cases, you still have to build a website; it is foolish to create one that repels visitors because of a half-hearted job.

#### Do not get blinded by the mobile market

Your mobile moment may never come. Instead, it may be the percentage of visitors with a PlayStation 4 that needs your attention, and what makes you lose visitors is that what you offer is not usable on a large television set. To rely entirely on web statistics to leave out certain groups of visitors is like planning a bridge for cars based on how many people swim the exact same route across a river.

Have you tried to navigate your website with a PlayStation controller?

Take the healthcare website 1177.se as an example: the number of mobile visitors increased each month during 2011 and 2012. The share varied a little, since traffic from desktop computers depends quite a lot on how successful the various campaigns are. At the turn of the year 2012–2013, almost half the visitors used a mobile device or tablet to visit the website.

    Figure 64: The increase of mobile/tablet users (gray) at 1177.se during 2011-2014. At the end of 2014, two-thirds were using mobiles while there was a very modest increase of desktop users.

When do you think 1177.se became responsive or usable on mobiles? The responsive version was released in the spring of 2013, after months of preparation, when a majority of visitors were already using a device with a touch screen: about 42 % mobile and 10 % tablet. Given how quickly user behavior can change, it is extremely important to keep up with the change and make small adjustments on an ongoing basis to meet visitors' expectations. The time of large web projects is over; the time has come to release improvements in a steady stream.

### Your website is a magazine, not a book!

This book talks about the design of a website and the underlying information systems; it certainly scratches below the surface, but it was only when writing these pages that I realized how much there is to think about. However, I will still pick a single subject about designing websites that is more important than anything else, namely that you, from the very beginning, are prepared to adjust the website frequently. It is not about refreshing the design every five years and spending the time in between focusing on producing great content.

A website is not static printed matter; printed matter disappears into oblivion, which is much better than a hopelessly outdated website that is all too easy to stumble upon. Every website should be more like a magazine that brings out a new edition each month.

This includes, of course, continuous web design. You should not be forced to implement any redesign project just because the design feels outdated, or because after two years of waiting with bated breath, you finally have a majority of mobile visitors, and then hurriedly have to produce a responsive design. All this is something you can do regularly and if you work according to a flexible project methodology, such as Scrum, there should not even be a particularly big change since each sprint actually releases new code on the website.

Between these releases, you design and conduct tests to decide if future improvements really are improvements according to actual users, following up on earlier activities and making sure that the plan for the next iteration is solid.

Continuous web design has the advantage that returning visitors are familiar with the design, as each change is quite small. You need a proven master design with the most common graphical elements; whatever is missing when building something new is invented when necessary and tested on real users with A / B tests. That is what you include in your own pattern library, or what you would perhaps call your living style guide.

**Glossary - deep links**  
Links from other sites to pages located far away from the home page of your website. Not your homepage, or the pages linked from your homepage or the main menu. These links are valuable signals for search engines as they suggest quality content that other people appreciate on your website.

The parable of a magazine also calls for ongoing work with content: curate what you have and produce new. This should hopefully lead to the insight that giving up your content and making a fresh start is a bad idea - no matter whether you change the content management system or not. Editors' tools for producing content seem to have a stranglehold on public websites as seen by their visitors. We should of course not prioritize editors' tooling needs over users' needs. Therefore, it is important to be prepared to change content management systems, so as not to be caught up in a particular system's proprietary standard, which in the future will make users' experience suffer.

A typical example, which today finally seems to be resolved, is that content management systems tended to want to restrict how the addresses would look on the public website - and then, when you switch systems, you break all the priceless _deep links_ procured over many years.

When a website is in constant change, you are better prepared for incidents, and everyone involved has more of the intricate details in their short-term memory. Furthermore, it is not a big obstacle to put technical updates onto production servers continuously. Sometimes it pays off at once when a security vulnerability affects the platforms and frameworks you are using.

Your website is always under construction, or at least it should be. Have a developer, web designer and frontend programmer readily available. Just as we do not wave goodbye to the marketing department between each marketing campaign, organizations should keep those skilled in web-tech nearby.

# Web performance

User experience is what is important and it is not easy to put objective figures on how fast a page is perceived to be. However, by observing several technical details you can identify things that affect user experience – such as how long it takes to load a webpage. With these metrics, you can eliminate, or at least mitigate potential problems and prevent bottlenecks that can occur, for example when a very successful marketing campaign goes viral.

You should make sure that relatively small proactive activities get done, or put them into a performance budget for the next major release.

A successful marketing campaign is a planned event for which you have time to prepare yourself and the website. Still, we do not always get the chance to prepare. The website can get an unmanageable flood of traffic landing on content pages, taxing most of the server resources; we can also meet congestion of various kinds. This chapter suggests some preventive emergency planning to make life easier for the web-related part of the organization.

Working with performance optimization has many more benefits than mitigating complete failures, especially now that more and more users are connected to the Web via cellular connections. Since you should not let visitors wait longer than necessary before they can interact with a website, it is important not to be wasteful in the transmission of files. Send as small amounts of data as possible to visitors since they sometimes have dubious connections to the Internet, for instance when in a sparsely populated area or the countryside.

Since 2009, Google has been working hard to help improve the performance of websites, both by providing tools to analyze performance, and also by providing motivation. The following statement in a blog post from Google more than suggests that they will penalize you if your website has poor performance:

> "Like us, our users place a lot of value on speed - that's why we've decided to take site speed into account in our search rankings."
> 
> \- Google Webmaster Central Blog

Google itself has stated one second in its performance budget: one second as the time limit for a page to load completely for any of their services, according to statements from their optimization professional, Ilya Grigorik. Studies show that a few tenths of a second of lag are enough for a user to lose the feeling of instant response when trying to use a website, or any type of digital product, according to the Nielsen Norman Group.
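
To know where you stand against such a budget, measure real visitors. A sketch using the (older) Navigation Timing API, reporting to an assumed /metrics collection endpoint:

```typescript
// Sketch: measure load time for real visitors and beacon it home.
// performance.timing is the older Navigation Timing API; newer code
// would use performance.getEntriesByType("navigation") instead.
window.addEventListener("load", () => {
  const t = performance.timing;
  const timeToFirstByte = t.responseStart - t.navigationStart;
  const fullLoad = t.loadEventStart - t.navigationStart;
  navigator.sendBeacon(
    "/metrics", // assumed collection endpoint
    JSON.stringify({ timeToFirstByte, fullLoad })
  );
});
```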

### Planning for the unplanned

There are plenty of examples of when it is a good thing to be prepared or proactive when it comes to web performance, even though it becomes more obvious when something goes awry. Some examples of websites that ran into problems with web performance follow.

#### The major Swedish newspaper, Aftonbladet.se, during the September 11 attack

On September 11, 2001, the most observable terrorist attack in modern times occurred. The event was filmed and broadcast on television, in real-time, and those Swedes who at the time had access to the Web (probably mostly at work and school) knew of Aftonbladet as the most established Swedish media outlet on the Web. It did not take long before Aftonbladet's normal website went down, which I suspect happened to other big media organizations all around the world. Instead of the usual website with its flexible content management (Vignette Story Server if I recall correctly), they had to switch to a manually updated sparse version of the website to give visitors the current news they were looking for.

    Figure 65: A stripped down version of the newspaper Aftonbladet, stating thousands of dead at the World Trade Center in New York.

    Figure 66: Meanwhile, at latimes.com, a simple design, probably under heavy load too.

Until this extraordinary event occurred, I at least was not aware of the importance of the media's online role, or that websites could go down in a way that made it extremely difficult to get them back online. That websites might be slow was an everyday experience, but a website being unreachable for a long time was unexpected.

A corresponding disaster scenario on a small, local scale might be a train accident with spillage of hazardous chemicals. It can easily have an impact on the websites of the surrounding municipalities and the freight company concerned. The longer you can keep your website up and running, the fewer people need to think about whether they have a traditional FM-radio receiver lying around.

#### Search Engine Optimization strikes against slow-loading pages

The governmental agency of Western Sweden, Region Västra Götaland, had a website with regional healthcare information, a website that I inherited from a developer who left the company where I worked. In 2009, the website consisted of a mix of national content, fed by an API, and self-produced content. Like most sites, we had been doing search engine optimization work to make the website's content easy to find through search engines. Then, suddenly, one day in the fall there was, according to the news channels, an outbreak of swine-flu, and immediately a high number of requests for such content, since the media had started writing hysterical headlines about death, pandemics and the end of humanity.

    Figure 67: A sudden increase of visitors because of swine-flu. Stats from Region Västra Götaland's healthcare-portal.

What was the problem? Well, the pages that were now popular fetched texts from a national resource, retrieved from an external web service for every single page view. Previously, most of the visits to the website were to pages that did not have external dependencies. The external API had already shown minor performance issues with response times which, before the swine-flu epidemic, were not considered a priority. The API was not designed to support an increased load, no one had ever expressed such a need, and back then, what is today popularly called 'the cloud' was not yet established or offered as a bail-out as it is today.

After some initial confusion as to why the website went down, the editors combined editorial work with search engine optimization to remedy the problem. In 2009, individual keywords clearly weighed heavier in Google's rankings than they do today, so the editors created new sub-pages that better matched the keywords people searched for, and that could direct traffic past the pages that were difficult to load. Web analytics and editorial work saved the day.

#### The energy company E.ON's website goes down during a minor storm

In October 2013, a category 3 storm hit Southern Sweden. Tens of thousands of people were without power. Whether those without power, or others, visited E.ON's website via their mobiles I do not know, but on that same day I entered their website and it took an unprecedented 37 seconds for the server to respond with the first byte, and an additional 100 seconds to load the page completely.

    Figure 68: Power company E.ON's page took 37 seconds to respond and 100 seconds to load, as stated in the browser's status bar.

I took a quick look at their performance, as Google measures it in its Pagespeed Insights service. E.ON had a rating of 81 out of 100, which is in no way remarkably bad. As an external spectator of their stripped-down disaster version of the website, I noticed, however, a number of things which could have made their emergency website perform better:

  * Do not send unnecessary stuff, such as custom fonts for a disaster version of a website. Is typography really your major concern in these situations? The custom font file-size is the same as the HTML code, and the HTML is what carries the actual content to visitors.
  * Instruct visitors' browsers how long files are supposed to live - for example, the logo and favicon, will, most likely, remain unchanged for some days to come. When these unchanged files are not loaded from the servers, E.ON's servers are able to serve more visitors.
  * Optimized lossless images could reduce image traffic by 18 %. Images are significantly heavier than text and make for unnecessarily large files which, unfortunately, can make a difference in the wrong direction when it comes to crisis-communication.
  * Is it necessary to send 88 KB of style sheets to design text like ' _Excuse us - our website is not working right now_ '? It actually takes ten times as much data as the readable content to specify that headers are red and the body text is black.

Some organizations with foresight prepare and do test runs with emergency and contingency plans for dealing with too much pressure on their website. To buy themselves more time, and perhaps in some fortunate cases, avoid entering emergency mode, it may be worthwhile to think about web performance.

If you switch to crisis-mode on your website, you have to make sure that all established URLs pointing to your website still work when the crisis occurs. Do not expect users to start erasing characters in the address bar to reach a possible home page - you have already failed them at the first contact attempt. In addition, they may use a mobile device whose address bar shows only the domain, which has been the case with Apple's mobile Safari browser since iOS 7, so a user might not even be aware that they are on a sub-page.

Of course, we need other communication channels when communicating during a crisis. For example, you can add text messaging, or reach out via the media, but you still do not want to be known as the one who contributed to traffic congestion on the Internet just because you did not think of obvious optimizations.

## Performance optimization of databases, web servers and content management systems

Depending on whom you ask, you will get different perspectives on performance. There are several perspectives when asking different types of professionals what affects how quickly a website displays itself to a user. If you are talking to a database administrator, they will talk about database structure and probably express skepticism about how the handover of data to the content management system, or whatever the developers have built, is handled. A developer is often focused on efficient code, something that is certainly important, but that only in extreme cases has a significant impact on user experience. User interface developers often try to minimize the amount of code, reduce the number of interface elements on the page, cut the file-size and reduce the number of files sent to users.

All of this is, for sure, confusing for those who have not worked much with optimization. I think this part is what everyone involved in website management needs to know. Certain parts may certainly be familiar to some. This chapter is likely to be somewhat technical, but hang in there and google anything you are not familiar with.

### General troubleshooting

If you think your website is slow, and it is not obvious what is wrong, it is always a good idea to have a technically minded person look through the log files for error messages. Frequently, the database server, web server, content management system and the like report errors to a log file when experiencing problems. Where you find this log file varies, and it sometimes requires prior knowledge to understand whether an error is serious. If you do this yourself, it is a good idea to google the error you discover (if you did not know, your developers google the same errors before talking to you), since someone else has probably encountered the same problem before you did.

    Figure 69: In Windows' Event Viewer, you can view all kinds of trouble the server encountered.

If you are somewhat technically minded, there is usually a tool for each part of the system or one where you can check the server's health.

With today's complex websites, where many interactions are handled directly in the browser (without reloading the page, that is), delays sometimes occur locally on the user's computer. This indicates that you need to do something about your interface code: your HTML, Javascript and CSS. Maybe the page contains too many design elements, or CSS that is not efficient enough - such as when trying to toggle between hiding and displaying tens of thousands of design elements. If transitions and animations in the browser feel sluggish, this indicates that the interface code is too complicated. Googling for performance testing of CSS, user interfaces or the frontend will probably be of help, unless the source of the error is already obvious to you.

    Figure 70: Chrome's network view when accessing amazon.com reveals the timeline of and info on the files received.

A test you should do before you give up and call in the pros is to visit the sluggish page on your website with the browser's network view enabled. Most browsers offer some tools for developers, and the network view (sometimes called the timeline) shows what happens during the loading of the files. The discoveries you can make are that files are sent in the wrong order, that there are more files than you think you need, or that part of the download time is wasted waiting for the server to start sending anything at all. For example, when I visited Aftonbladet.se with Chrome as my browser, Chrome suggested an initial latency of 0.074 seconds before anything was sent. We can often identify, or rule out, a problem just by looking at patterns in what is slow. Remember that networking concerns and other perceived slowness may depend on your own equipment and connectivity; it is a good idea to test various devices and connections when troubleshooting.

### Planning for high load - use cache!

A _cache_ is a compiled version of something - a webpage, or any part of an information system - and reflects information that has not changed recently. The point of a cache is to offer frequently used content directly from memory, with no need for it to be computed, or compiled, again and again for every user. It is much faster to send content to a user if the content is already in a cache in the server's memory. Then the server does not need to check in databases, or talk with external APIs, for each page view.

There are many variations of how caching can be used, but the two most meaningful from a developer's perspective are the so-called accelerators, which create a cache at the web server end when some content has changed, and the cache you have in your own web application. An accelerator works a bit like a filter between the website and the users: if the content has not changed, what is sent is a cached version of the page from the accelerator, without even involving the web server. This way, we have two specialized pieces of software - the accelerator, to send files as quickly as possible if the content is unchanged, and a web server, to compile web pages and run the content management system (where editors update content).

    Figure 71: Even accelerators can fail when put under extreme load, in this case Varnish cache server.

Using an accelerator does not solve all of your performance related problems. However, they are incredibly useful for huge loads on semi-dynamic websites, where static information is supplemented by users' comments for example.
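In your own web application, a cache can be as simple as a dictionary with a time limit. Below is a minimal sketch in Python, where `render_page()` is a hypothetical stand-in for the expensive work of querying databases, calling APIs and running templates:

```python
# A minimal in-application cache with a time-to-live (TTL).
# render_page() is a hypothetical placeholder for the expensive work.
import time

CACHE_TTL_SECONDS = 60        # how long a cached page counts as fresh
_cache = {}                   # url -> (timestamp, rendered html)

def render_page(url):
    # Placeholder: in reality this would query databases and run templates.
    time.sleep(1)
    return f"<html><body>Content for {url}</body></html>"

def get_page(url):
    entry = _cache.get(url)
    if entry is not None:
        timestamp, html = entry
        if time.time() - timestamp < CACHE_TTL_SECONDS:
            return html       # served straight from memory
    html = render_page(url)   # the slow path
    _cache[url] = (time.time(), html)
    return html
```

The first call to `get_page()` takes a second; repeated calls within the TTL return instantly, which is the whole point of a cache.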

### Content Networks (CDN - Content Delivery Network)

**Glossary - static files**  
A file that has exactly the same content throughout its lifetime. A frequently used example is the Javascript library Jquery, which version-controls its editions; that is, the contents of the file for Jquery version 1.9 are the same regardless of when, and from where, it is accessed.

A _CDN_ is a network of servers located around the globe, used to serve the Web with content. Depending on your website's needs, these services can be free of charge or cost a lot of money based on usage. A common example that is free, and which many people use, is Google's CDN for running the Javascript framework Jquery on their websites. When retrieved, Jquery is not sent from the visited website but from a nearby CDN. Another common usage is to stream large amounts of video, something that draws a lot of traffic.

Sending such material from a data-center in the user's vicinity is important because it takes unnecessary time for the user to be connected to a server that happens to be far away. Internet traffic within a continent flows better than traffic that has to pass through cables under an ocean. This means that if you run a service that streams video, it makes a lot of difference if you have servers strategically placed, or rent servers on major networks, and spread your content across the globe.

The above example of retrieving Jquery, which is common on most websites, makes very good use of a CDN. If your visitors have recently been on a website that fetches Jquery 1.9 from Google's CDN, they will not be required to download the file yet again if your website uses the same file version on the same CDN. The browser then notes that it is the same file as previously downloaded, and fetches it from its lightning fast local cache, instead.

Jquery 1.9 is a static file: when it is updated, a new version number is used, which means a new filename. All addresses to a new version differ from those of older versions, and instruct browsers that there is new content, not yet downloaded. Many pictures should be handled the same way, placed in a more or less large CDN repository for fast transmission.

A variant that often pays off, even with small hosting companies, is to create a subdomain like media.mysite.com where you put pictures, sounds and so on. The point is to separate the complicated work of compiling web pages - connecting to databases, APIs, etc. - from the simpler task of sending static files. Web server software such as _Apache_ and _Internet Information Server_ is not optimized for sending large amounts of static files and can, with a CDN or a separate function, be relieved of this chore and instead focus on heavier jobs.

### Databases

When trying to identify problems with databases, there are tools available for all major database environments. Most commonly used are logging and profiling, to gain insight into the database's work. Logging is exactly what the name suggests: a log file that, in this context, collects the queries that take too long to execute, or errors that can affect performance.

In normal cases, a database query should not take more than a tenth of a second to run. If it does, either the system is not correctly designed, or your database environment needs to be reviewed. If you have enabled logging of slow queries, you might see a pattern after having amassed a few thousand logged queries, and can then start to look for a solution.

The difference between logging and profiling is that profiling means manually monitoring performance for a short while and then turning it off when you think you have found something worth improving. The risk with profiling is that it becomes a performance issue in itself, which can also be the case with logging, if you are unlucky or forget to turn it off.

**Glossary - database index**  
Just like an index, or table of contents, in a book: a pre-compiled structure over data frequently sought from a database table. Fields used for filtering, for example, need to be included in an index to give fast response times.

Can you quickly list products by category? If not, the category field in the database table may need to be indexed for quick retrieval. In a relational database (such as SQL Server or Mysql), it is common to have to work with database indexes so that they reflect how the database is actually used. An index is a bit like a table of contents and makes it possible to choose which parts of a database should be quick to look up.

I previously had a database with millions of records in its largest table. Just by adding an index, I lowered a typical database query's execution time from over one hundred seconds to less than a tenth of a second. Without this optimization, it would not have been viable for the website to query the database directly, since the webpage's response times would have been several minutes!
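If you want to see the effect for yourself, the following self-contained sketch (using Python's built-in sqlite3 module; the table and column names are made up) times the same query with and without an index:

```python
# Demonstrates how much an index can matter for a filtering query.
import random
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, category INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO products (category, name) VALUES (?, ?)",
    ((random.randint(1, 500), f"product-{i}") for i in range(1_000_000)),
)

def timed_query():
    start = time.perf_counter()
    conn.execute("SELECT * FROM products WHERE category = 42").fetchall()
    return time.perf_counter() - start

print(f"without index: {timed_query():.4f} s")  # full table scan
conn.execute("CREATE INDEX idx_products_category ON products (category)")
print(f"with index:    {timed_query():.4f} s")  # index lookup
```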

Database servers are fairly complicated to wrap your head around. Frequently, it is an erroneous, or unfavorable, configuration that prevents the database server from working properly. Sometimes it is as simple as having outgrown the cheap hosting company you use, or perhaps needing a dedicated database server. I have seen database servers that were generous in hardware, but had configurations that only accepted a few simultaneous connections. This means that when the website has many visitors, a queue of users forms, waiting for an available connection to the database server, even though the server is not necessarily overloaded. At these times, the kind of cache mentioned earlier is welcome, removing the need to connect to the database server at all.

Among the more drastic measures is redesigning the database based on its use. This may involve creating copies of the information - copies optimized only for reading, while the original content is used to keep the database's integrity intact. Others have chosen to partly, or fully, change the database architecture; for example, some data can be located in so-called NoSQL databases, or in memory caches, since these do not have to put any effort into maintaining relationships within the dataset as a relational database does.

In larger organizations, it is common to have an enterprise search solution that indexes the entire content of databases and other information systems. If your search engine already holds all data from your databases, you might as well let the search engine serve data to your website, something a search engine does very quickly. It is important to have database changes indexed quickly, in order for the search engine not to present old content.

### Web servers, content management, own source code and external dependencies

There are a thousand and one ways to optimize the performance of your technical environment; it is mainly a question of how much time and money you have. Even under normal circumstances, a website encounters growing pains on several occasions during its lifetime, not only because of occasional traffic peaks, but also as content grows in scale, or as your own code grows in complexity.

> "A typical CMS is like a digestive system with no capacity to poop."
> 
> \- Gerry McGovern, @gerrymcgovern on Twitter

Frequently used web solutions in order of magnitude, smallest first:

  1. **Web hosting account at 10 dollars a month** \- includes web server shared with many others and database in a separate database environment shared with many other customers. Inexpensive and an easy start for a small website.
  2. **Dedicated server / colocation** \- renting space or a physical machine in someone else's server room. These solutions usually include specific operational support, such as rebooting and replacement of faulty hard drives, but all the software on the server is your own problem (and opportunity).
  3. **Virtual Private Server (VPS)** – a rented virtual server where you yourself control and set up the behavior of the server. Often you can crank up performance when necessary; this is what many refer to as _the cloud_.
  4. **Load-balanced server farm** \- sometimes hired in someone else's server room or in your own hall if you are a large organization. The load is distributed between multiple cooperating servers to give visitors the information they need as fast as possible.

There is no guarantee that a website works better on a dedicated server compared to a budget web hosting server. To have your own server requires knowledge and plenty of time to make sure that all settings are optimal for the website's needs. The difference between low budget and large-scale solutions is that you can make adjustments manually, but then it is also your problem alone to manage the unique environment you created.

Often, there is a point in having a larger website distributed, architecturally, across multiple servers to achieve the best performance. For instance, one server that only serves static content like images (a bit like a private CDN), another that only takes care of the databases, and another that manages the presentation of the website and its content management. These various server tasks have different needs in terms of hardware, configuration and software. If you divide the chores into specialized roles, the servers perform better than generalist servers do.

Depending on what heavy content the individual website needs to serve, we can be forced to expand the environment. In an Episerver CMS-environment I am familiar with, we needed an additional server to serve the web editors, and all-in-all three servers for visitors. If I remember correctly, this was because there were many editors in the organization and the load they put on the server was tremendous. They would move around thousands of pages in the pre-production environment, and these changes were reflected every hour in the production environment. This allowed the production environment to stay online, but at the expense of editorial content lagging behind somewhat.

Speaking of being responsible for configuring your own physical or virtual servers: in addition to building a fault-tolerant system, it is a good idea to look at the external dependencies you have and how prepared they are for a significantly higher load. Most likely, there are settings in the application configuration files, or in the web server settings, that let you adjust how long to wait for a response from APIs, for instance. A web server can only serve a limited number of clients at the same time; therefore, you do not want it to waste time waiting for a sluggish external API before serving a page to a user.
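As an illustration, here is a minimal sketch in Python (standard library only; the URL is a placeholder) of calling an external dependency with an explicit timeout and a fallback, so that a sluggish API cannot hold a worker process hostage:

```python
# Call an external dependency with a hard timeout and fall back gracefully.
import urllib.request

def fetch_external_content(url, timeout_seconds=2.0):
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return response.read().decode("utf-8")
    except OSError:
        # Covers timeouts and connection errors; serve cached or static
        # content instead of keeping the visitor waiting indefinitely.
        return None
```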

If you talk to a developer, it is common that their experience in performance optimization is limited to what they can do with the source code that constitutes an application. In my experience, it is rarely worth trying to fix the source code, no matter what the programmer thinks. It is not cost-effective, since there is often more performance to win in databases, configurations, and making sure that you have the right hardware. Hardware is often inexpensive compared to having consultants look at the source code.

It will not hurt, of course, to take a quick look at the source code and see if there are glaring errors. Things I would fix in the source code are unnecessary concatenations of text (the joining of multiple text strings into one) and unnecessary use of database connections - for example, when it takes more than one database query to retrieve, or write, a piece of data. Unfortunately, however, it is difficult to assess precisely what impact a change will have before it is performed and tested.

For those who take coding seriously, there are tools for most environments, ReSharper for Visual Studio for instance, which keeps track of code while the developer is writing it. Some versions of Visual Studio also have built-in features to load-test a website. It may be interesting to see how the website behaves under stress. Many of these tools can also check and propose adjustments to the source code. In large development projects, there is often something called a build-server, which is another control instance to see if the code quality is high enough before the code reaches production. In build-servers, such as Jenkins, there are various reports to take note of, not just for performance.

## Measuring and improving interface performance from the user's perspective

A common way to get an idea of your website's performance is to measure how long a page takes to load. This hides detail, such as latency - the time before the website begins transferring information to the user. In some relatively rare cases, the wait can be longer than the transfer itself. Since the solutions differ, it is important to identify the problems in order of priority.

### Helpful tools

There are several ways to get figures on wait time versus transfer time. I prefer to have this information directly in my browser, since you otherwise might miss the moment a website is slow if you can only check the speed after a sluggish page view. An example of a tool that measures website response time is the Firefox extension _app.telemetry Page Speed Monitor_. It shows how long it takes the server to send the first binary one or zero to the user. This figure is valuable, since a high value suggests an overloaded server or a misconfiguration. It is important to highlight that these values can vary greatly between different sub-pages. If you are aware of specific page templates with particularly complex structure, or with dependencies on external information, they are definitely worth testing to see how they behave.

    Figure 72: App.telemetry Page Speed Monitor showing page load info in the upper right of the browser window.
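If you prefer a script over a browser extension, the following rough sketch (Python standard library only, example.com as a placeholder) separates the wait for the first byte from the total transfer time for a single page:

```python
# Roughly separate time to first byte (TTFB) from total load time.
import http.client
import time

def measure(host, path="/"):
    conn = http.client.HTTPSConnection(host, timeout=10)
    start = time.perf_counter()
    conn.request("GET", path)
    response = conn.getresponse()  # returns once the headers have arrived
    ttfb = time.perf_counter() - start
    response.read()                # drain the body
    total = time.perf_counter() - start
    conn.close()
    print(f"{host}{path}: first byte after {ttfb:.3f} s, "
          f"fully transferred after {total:.3f} s")

measure("www.example.com")
```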

You should not get hung up on exact numbers, but instead try to see if there are patterns in how a website behaves. Keep in mind that your own experience is just anecdotal, and that you may need to look at what the average user is experiencing, which is probably already recorded in your website statistics. In Google Analytics, you will find this under _Behavior -> Site Speed -> Page Timings_, and on the websites I have looked at, it is often easy to identify why pages are slow. Usually, it is because of an incredible number of lookups of information in databases or other sources. When you start out, it is good to focus on quantitative data, to see whether a performance fix resolved the problem or not.

If you are technically minded, and maybe found something very suspicious, Google Chrome has a great feature in the form of their toolbox for developers. With it, you can get detailed information on all files transferred between the website and your browser. The rule of thumb is that if the initial wait time is high, it is due to the technology or design of the website. If the transmission time is high, it might be due to editorial choices in the form of heavier material such as images, but probably the reason is in the design itself. For example, loading lots of Javascript not actually needed on every single page of the website.

    Figure 73: Google Pagespeed Insights is great for indicating possible improvements in both performance and usability.

Last but not least, Google Pagespeed Insights is an easy way to get a comprehensive view of how a website performs on both desktop computers and mobile phones. Pagespeed Insights can be used via the Web; it is also available as an API. In simple terms, what affects performance can be divided in two: what the editor contributed, and what is included in the website's design.
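Since Pagespeed Insights is available as an API, you can script recurring checks. A minimal sketch follows; it assumes the v5 endpoint and response layout, so check Google's documentation for the current version and any API key requirements:

```python
# Query the Pagespeed Insights API and print the performance score.
import json
import urllib.parse
import urllib.request

def pagespeed_score(page_url, strategy="mobile"):
    api = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
    query = urllib.parse.urlencode({"url": page_url, "strategy": strategy})
    with urllib.request.urlopen(f"{api}?{query}") as response:
        result = json.load(response)
    # Field names assume the v5 response format.
    return result["lighthouseResult"]["categories"]["performance"]["score"]

print(pagespeed_score("https://www.example.com"))
```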

### Editorial performance impact

What an editor contributes to a website is mostly confined to different kinds of content, and to their work in a content management system when publishing this content. It is extremely rare to have a problem with text, since text does not take any significant effort to transfer. However, issues may arise around where text is stored. In some content management systems, such as Episerver CMS, there are functions that retrieve content from other pages on the website. If you use such a feature, you introduce additional complexity in presenting a page to a user. It need not be a problem, but it can lead to serious concerns if, for example, the page to retrieve data from has been deleted, as the server will, at best, have to work hard not to crash the page view.

The dramatic occasions on which I was able to show that web editors affect the performance of a website were when someone had copied an entire node of pages and pasted it elsewhere. The node contained thousands of pages, and the web server had to figure out the pages' internal relationships. On these occasions, the database server reported that it had deadlocked. On other occasions, very popular pages had been thrown in the trash, or given an expiration date. All of this is to be regarded as faults in the web application, as such errors take a lot of extra computing, and if a frequently sought-after address lies behind them, you are in for a lot of headaches. For the sake of visitors, you should take care of scrapped addresses by referring to new relevant material, and preferably make sure that others using the old link update their links.

When it comes to media files, it is common for an editor to work with images. Video and audio are also present, but usually they do not affect the perceived performance of a page; the material is either streamed or is a separate download. Pictures, on the other hand, are frequently included on pages and contribute significantly to slower load times during typical web surfing.

The first thing to do to make pictures lean is to choose the correct format. On the Web, mainly JPG, PNG and GIF are used. If we neglect all philosophical ideas about image formats, we use JPG for photographs and images with many colors across the entire image area, PNG for illustrations and logos, and GIF mainly for animated images. In image-editing programs like Photoshop, there are functions to save images in a web format, making it easy to compare which format is most effective at bringing down the file-size while keeping high enough quality. By choosing the right format, images can be optimized for quick viewing on the web. This means that the image quality is reduced, and the image file-size too. You need to find the optimal balance where the picture still looks good but is as small as possible.
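To make the quality/size trade-off concrete, here is a small sketch (assuming the Pillow imaging library and a local file called photo.png) that saves the same photograph at a few JPEG quality settings so you can compare file size against visible quality:

```python
# Save one photo at several JPEG quality levels and compare file sizes.
import os
from PIL import Image  # pip install Pillow

image = Image.open("photo.png").convert("RGB")  # JPEG has no alpha channel
for quality in (90, 80, 60):
    filename = f"photo-q{quality}.jpg"
    image.save(filename, "JPEG", quality=quality, optimize=True)
    print(f"{filename}: {os.path.getsize(filename) / 1024:.0f} kB")
```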

To top it off, there are applications that can cut image file size in a non-destructive way, i.e. in a way that is not detectable to the human eye. A favorite of mine, for Macs, is _Imageoptim_ where you just drag and drop the image files into the application window and it overwrites the files with optimized versions.

    Figure 74: Imageoptim makes images smaller, without loss of quality.

The tool _Smush.it_ can optimize the images on a single web page. The same function is incorporated in the YSlow plug-in for Firefox and Chrome. There is a plethora of tools to fix your images - experiment to see what works for you. Perhaps you will even dare to add an optimization plug-in to your content management system.

The principle for images also applies to video and audio. You choose the format with care and choose parameters such as resolution and quality when saving the file for publication.

### Technical settings for performance

In 2010, I built a service to test websites' optimization on my website webbfunktion.com, and based on the tens of thousands of sites tested so far, a clear pattern has emerged. Some optimizations are often lacking but are relatively easy to fix.

#### 1. Forgot to set life expectancy of files

When the browser downloads a website, the website consists of several files. In addition to the HTML document, there are usually a number of images, at least one style sheet and a Javascript file. With each of these files come instructions to the browser about how long the file is expected to be up-to-date, the expected lifespan of the actual content, that is.

An all too common scenario is that the favicon, the logo and other files that are almost never updated have no stated expected lifespan. What happens then is that returning visitors download files yet again that have not changed since their previous visit. This is silly, as the files already exist in the user's browser cache, and downloading them again can delay the page from displaying. Failing to give files a lifespan forces the browser to download not only new files but also old ones, at every visit.
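How the lifespan is set depends on your web server or framework. As a minimal sketch (assuming the Flask framework and a local logo.png), here is one way to serve a rarely changed file with a one-year cache lifetime:

```python
# Serve a rarely updated file with an explicit cache lifetime.
from flask import Flask, send_file  # pip install flask

app = Flask(__name__)

@app.route("/static/logo.png")
def logo():
    response = send_file("logo.png")
    # Tell browsers this file may be considered fresh for a year.
    response.headers["Cache-Control"] = "public, max-age=31536000"
    return response

if __name__ == "__main__":
    app.run()
```

Returning visitors then reuse their local copy instead of asking your server again.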

#### 2. Static text files are sent uncompressed

Many files on a website consist of text, such as HTML documents, CSS and Javascript. These can be transferred as-is to users or you can compress them so they are quick to send. Compression means that an algorithm does the best it can to make a text file as small as possible. The algorithm searches for patterns and repetitions, for instance it is able to compress twenty consecutive whitespaces.

It is very common for those who develop websites to retain spaces and tabs for the code to be readable and easier to edit if necessary. Compression is a good idea so that this does not cause slower load-times for visitors.

Compression is much more than just a question of saving disk space. Frequently, we can strip away 75 % or more of a file's size in transmission, which contributes to a faster user experience.
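You can verify the effect yourself. The following standard-library sketch compresses a chunk of repetitive HTML-like text with gzip and prints the saving:

```python
# Show how well repetitive text such as HTML compresses with gzip.
import gzip

html = ("<div class='teaser'>\n    <h2>Heading</h2>\n"
        "    <p>Body text...</p>\n</div>\n") * 200
raw = html.encode("utf-8")
compressed = gzip.compress(raw)
print(f"uncompressed: {len(raw)} bytes")
print(f"compressed:   {len(compressed)} bytes "
      f"({100 * (1 - len(compressed) / len(raw)):.0f} % smaller)")
```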

#### 3. Images are not optimized

Almost all websites use images for decoration, logos and to illustrate concepts. Images are the heaviest material you post on a website in any abundance, and they often have high optimization potential. The images to focus on first are those that are part of the design, since they are loaded regardless of which page the visitor looks at - in other words, logos, icons and any background images.

Even if you have saved optimized images for the web with your image-editing software, there is usually potential left to go a step further. This is called lossless optimization and the rule of thumb is that it should not cause any image deterioration visible to the naked eye.

It is common to find websites that send pictures where a few hundred kilobytes could have been optimized away with a little effort. That is not much in the grand scheme of things - at least not until someone on a shaky mobile connection is ready to give up rather than wait a few more seconds.

Many organizations use image systems on their websites. Such systems are usually marketed with catchphrases such as ' _taking care of image reduction and optimization_ ', but in the cases I have seen, the system only performs a few quick tricks of the trade and leaves much to be desired. Be especially observant of pages with many thumbnails since they, in my experience, can be optimized much more than you might think.

#### 4. Too many files

The problem here is mainly that there are sometimes far too many files. Each file to be downloaded adds to the wait time even before the file is sent; if there are many files, the situation gets worse.

The same applies to icon libraries, which are sometimes used to load hundreds of very small images. Each small file contributes its own unnecessary wait time, and the images could have been combined with a technique called CSS Sprites. _CSS Sprites_ means that you have a single image file which in turn contains several images. You then use CSS to create a small peephole through which you look at the image, and therefore see a single image at a time.

Instead of each image being loaded individually, everything is loaded at once. A bit slower the first time, but the advantage is that there is only one file to download.
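The sprite sheet itself can be generated automatically. Below is a sketch (assuming the Pillow library and a set of equally sized 32x32 pixel icons; the filenames are placeholders) that stitches several icons into one file:

```python
# Combine many small icons into a single sprite-sheet image.
from PIL import Image  # pip install Pillow

icon_files = ["home.png", "search.png", "cart.png", "user.png"]  # placeholders
ICON_SIZE = 32
sprite = Image.new("RGBA", (ICON_SIZE * len(icon_files), ICON_SIZE))
for index, filename in enumerate(icon_files):
    sprite.paste(Image.open(filename), (index * ICON_SIZE, 0))
sprite.save("sprite.png")
# In the style sheet, each icon is then shown through a 32x32 "peephole"
# by shifting background-position in steps of -32px.
```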

Many savvy web developers also combine all the Javascript into a single Javascript file and all the CSS into a single CSS file to reduce the number of files to send to a visitor. If haphazard behavior is observed related to the style sheet or Javascript, check whether the files were combined in the correct order when they were merged. The order of the contents is important.

#### 5. Javascript blocks page load

Many modern websites use a lot of Javascript to give a rich user interface, to simplify interaction with forms, and more.

However, it is very common to load all Javascript code before the other files of the page are loaded. Moreover, many websites execute a lot of Javascript code before the page is presented to the user. So first you wait for the Javascript to download, then you wait for it to execute. This is mostly a problem on slow devices, especially those with a slow connection, and contributes to too long a wait.

Frequently, there are a few hundred kilobytes of Javascript to download and execute in the browser before continuing with anything a visitor can see on the screen. It is important to prioritize the quick display of visible items on the screen. Sometimes it might be smart to load heavier materials when necessary instead of loading everything directly, so-called lazy-loading.

It is not simple getting an existing website to load Javascript after presenting an interface to the user, but it may be worthwhile examining whether it is possible to move some parts to be loaded last. For example, many people give priority to the page showing up quickly, at least over Javascript used for the web statistics tool to track the visit. This is why such scripts are placed at the end of the code and loaded last.

Check your own website with Google's Pagespeed Insights service. It usually finds something you can improve.

## Recoup an investment in web performance - is it possible?

Before deciding to invest heavily in the best possible web performance, it is important to know your objective.

There are four main objectives of web performance optimization: cutting the cost of operating the website, earning more money, improving the user experience, and achieving better search engine optimization.

If you are primarily looking to cut operational costs, there are a limited number of things to focus on, while other things you should avoid completely, since they instead demand more operational capacity. In this case, it is possible to make a forecast of future operating expenses.

For example, preferably work with:

  * **More efficient source code.** Manual review is not always worth the effort, but many frameworks have tools that can find quality problems, and also check what your development environment and, possibly, build-servers can find.
  * **Common database-queries in the cache** , or create specific tables optimized for reading, taking into account usage patterns. Among other things, it is often more efficient to retrieve everything matching a given criterion from the database even though you only plan to showcase the first ten hits, at least when users occasionally scroll to the next ten hits. Then all hits are already in the database server cache.
  * **Database indices are very important** and may need to be continually updated as the content and usage of the website change.
  * **Life span of files in the browser cache** so files that have not been updated are not downloaded unnecessarily. If using a particular version of a Javascript library in the HTML code, you can safely assume that the contents will live forever since the next version will get a new address.
  * **Web accelerators** such as Varnish Cache Server are a solution to test for those experiencing extreme load from quite static web pages.
  * **Content Delivery Networks** (CDN) can be a solution if traffic from the server is the bottleneck. With a CDN, common files are sent from another server so it leaves more capacity for the tasks you keep on your own web server.

If you have a very high number of visitors to relatively few pages, you might consider selective caching of pages a preferable move.

Please talk about this subject with each type of specialist before you begin. If your website does not have at least a few hundred thousand page views per month and you are still experiencing troubles, the problem is probably something else. Maybe your website is on a slow web host, or perhaps the system is not designed for the website's basic needs?

As an example, I might mention that when I was working as a web consultant, I concluded that each of our customers would have benefited massively from 20 hours of specialist support for performance optimization, after which the value of further hours and effort fell rapidly. Some would have got their investment back within a week, while a few needed to start afresh with their website.

# Test your own website

There are lots of offers from specialized consulting firms to have your website evaluated from every possible perspective. Some tests are of course incredibly complicated and require a lot of knowledge or special tools, but others you can do yourself. What you are able to test yourself, you should - to level-up your website.

This part consists of a long list of simple tests for an initial health check of a website!

You can probably run the majority of the following tests on your own. We will go through what needs to be tested, why you do it and what to look for. I suspect that in some parts you may need help from colleagues, or to read a bit about the topics you do not yet know much about. Google is your friend, and I will try to use terminology that makes it easy to google for more information if you need it.

## How to document your test

The evaluation begins with the website's homepage. Please sample other pages when the respective assessment requires it or you know of any page that you think is suspicious. A spreadsheet to document your tests on is downloadable at tba.nu/wetools

Since the tools to check websites sometimes disappear, and better ones show up, you can always look at the above URL to see what tools and tricks I'm using myself at any given time.

Whether you use the template I suggest for documentation or your own, it makes sense to:

  * Enter dates for the test so you can go back to old tests for future reference.
  * Give each item a rating, such as _Point 1.3 failed_ or _Point 1.3 passed_ , to rate point 1.3, the length of the page title.
  * Add a comment to the check point if needed showing which page you checked, or other details worth mentioning.

Now to the evaluation points.

## 1. SEO

Good search engine optimization is all about a website being accessible to search engine robots, with the information on the website accurately described and nothing missing.

### 1.1 Indexable for search engines

    Figure 75: SEO Doctor gives an SEO score.

If a webpage is not accessible to search engines, it becomes difficult for users of search engines to find. This means that you do not get good returns on the time you spend working on website content since it is not found in searches.

Checks to run:

  * Browser extensions, such as SEO Doctor, will protest if the page has settings that do not allow a search engine to index.
  * The website's HTML code has metadata tags that say _noindex_ , or a _canonical_ tag that refers to another URL, which disqualifies the current page from being indexed.
  * Check if there is a file called _robots.txt_ in the root of the website. Such a file can contain instructions to search engines that stop the website, in whole or in part, from being indexed. Exclusions are written with the prefix _disallow_ , followed by an address that search engines should not crawl. (A scripted version of this check is sketched below.)
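Python's standard library can read robots.txt for you; a minimal sketch (with example.com as a placeholder) follows:

```python
# Check whether a given page may be crawled according to robots.txt.
import urllib.robotparser

parser = urllib.robotparser.RobotFileParser()
parser.set_url("https://www.example.com/robots.txt")
parser.read()
print(parser.can_fetch("Googlebot", "https://www.example.com/some-page/"))
```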

### 1.2 Duplicate content

Information and texts should be as unique as possible. For example, it is not good to have multiple pages with identical titles, headers or content. As far as possible, each page should be unique.

Checks to run:

  * Google Search Console has a view for pages with similar or duplicate content.

### 1.3 Page title's length is under 60 characters

The title of a page should be short so that it is easy to read. Space on a search engine's results page is limited; it cannot be taken for granted that the full title will be shown.

Checks to run:

  * Browser extensions, such as SEO Doctor, will protest if the page title is considered too long.
  * Count the characters in the title tag in the HTML code (or script the check, as sketched below).
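As a small automation, the sketch below (Python standard library; a regular expression is a blunt but serviceable instrument for this one tag) fetches a page and reports its title length:

```python
# Fetch a page and report the length of its <title> tag.
import re
import urllib.request

def title_length(url):
    with urllib.request.urlopen(url) as response:
        html = response.read().decode("utf-8", errors="replace")
    match = re.search(r"<title[^>]*>(.*?)</title>", html,
                      re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    title = match.group(1).strip()
    return title, len(title)

print(title_length("https://www.example.com/"))
```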

### 1.4 Page title is readable and understandable in the search engine results page

Early on in the page title, it is important to specify the uniqueness and what is important on the page, because search engines do not have infinite space for your page titles.

Checks to run:

  * Try searching for your pages with the major search engines, via both a desktop computer and smaller devices. Check if the titles manage to communicate the general content of each page.

### 1.5 Page title contains relevant keywords that describe the page

We cannot assume that an Internet user reads everything you write. Therefore, your title needs to be short, pithy and confirm the users' search word as early as possible in the page title.

Checks to run:

  * Try searching for your pages with the major search engines, via both desktop and a mobile device. Check that titles contain potential keywords that your target audience might use.

### 1.6 Correct headings are used

Preferably, a page has exactly one main header element and at least one sub-header. Sub-headers should form a proper document structure without any skipped levels. If the website uses HTML5, you should still try to have as few h1-headers as possible to make it clear what is most important.

Checks to run:

  * Browser extensions, such as SEO Doctor, will warn you if the page does not have exactly one main header and at least one sub-header.
  * View the HTML of your pages and see if both an _<h1>_ and an _<h2>_ tag are present.
  * Check that header levels are not skipped, going from an _<h1>_ straight to an _<h3>_, for instance (a scripted check is sketched below).
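The sketch below (Python standard library, example.com as a placeholder) lists a page's heading levels in document order, so skipped levels stand out immediately:

```python
# List heading levels in document order to spot skipped levels.
import urllib.request
from html.parser import HTMLParser

class HeadingCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
            self.levels.append(int(tag[1]))

with urllib.request.urlopen("https://www.example.com/") as response:
    html = response.read().decode("utf-8", errors="replace")
collector = HeadingCollector()
collector.feed(html)
print(collector.levels)  # e.g. [1, 2, 3, 3, 2] is fine; [1, 3] skips h2
```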

### 1.7 Search engine friendly URLs

A website address is important, as it can be used beyond the website owner's control. Among other things, it should not be too long to memorize or read. An address also sometimes becomes a link when it is posted online, and then the wording of the address should work as an understandable hypertext link.

    Figure 76: Temporary things, such as 'phpsession', are not suitable to be in a URL.

Checks to run:

  * Surf around on your website in search of addresses that look hard to memorize even for a short space of time, addresses that do not describe the contents, or cannot easily be read aloud. A website's internal system-structure should not be visible.

### 1.8 Descriptive text on all important pages

A text that describes the page, a so-called meta-description, is a short summary of a web page's content and purpose. Although the text is not always visible to visitors, it is an opportunity to associate the page with qualitative keywords and synonyms. Sometimes the text appears as body text below the link in the search engine's results pages.

Checks to run:

  * Browser extensions, such as SEO Doctor, usually look at the length, or the presence, of a description text. It is also possible to check the HTML code manually, for something similar to the below:  
_<meta name="description" content="Description of the page." />_

### 1.9 Reasonable number of links

Too many links on a page indicate several things: to a search engine, not only suspicious behavior but also a messy structure, and a visitor may find it difficult to get an overview of the page - especially on a small screen. Fewer than 100 links per page is preferable, regardless of whether the links point to pages within the website or to other sites.

Checks to run:

  * Look around your website for pages that have an excessive number of links. The browser extension Accessibility Evaluation Checker has a navigation report tool that counts links on a page.

### 1.10 Pictures have alternative texts

Images whose content complements the page text need to be described for visitors who cannot see them. Among those who cannot see images are search engines, the blind, and those who for some reason do not load images when surfing - perhaps because of a shaky internet connection. An image that conveys important content should have a descriptive alternative text, while an image that conveys nothing should have an empty alternative text. This is particularly important for linked images, since the image's alternative text will in some cases be the hypertext link.

Checks to run:

  * Browser extensions, such as SEO Doctor, will point out if an image has no alt text. Otherwise, you can manually browse around and check the HTML code to see if the <img> tags have the alt attribute set to something meaningful (a scripted check is sketched below). Decorative pictures should not have a descriptive alternative text, but should appear as follows:  
_<img src="image.jpg" alt="" />_
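The scripted check, sketched below with Python's standard library (example.com as a placeholder), flags images that lack the alt attribute entirely; an empty alt="" is left alone, since that is correct for decorative images:

```python
# Flag <img> tags that lack an alt attribute entirely.
import urllib.request
from html.parser import HTMLParser

class AltChecker(HTMLParser):
    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            if "alt" not in attributes:
                print("missing alt:", attributes.get("src", "(no src)"))

with urllib.request.urlopen("https://www.example.com/") as response:
    AltChecker().feed(response.read().decode("utf-8", errors="replace"))
```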

### 1.11 Structured description of the information

Is there content that, in part or in whole, describes itself? Search engines use such structured data to present a company's geographical location, its events and its contact information, among other things.

Checks to run:

  * Browser extensions, such as Operator, will show in your browser if a page has been found to offer structured data.
  * Check pages manually one by one with Google's Structured Data Testing Tool.
  * Verify in Google Search Console which pages on your website have been discovered to contain structured data.

## 2. Web analytics

To draw conclusions from an analysis of a website, you need to save information in advance so that there is something to analyze. It is therefore important that you keep up with what is possible in web analytics.

### 2.1 Current visitor tracking scripts

To do a good analysis afterwards of how a website is used, it is important to first collect relevant data. There is variation in how tracking code can be written and sometimes things that need to be tracked will require additions to the code you use. Therefore, it is important to keep your tracking tool up-to-date.

Checks to run:

  * Manually compare the proposed tracking code from the service provider with the code you are using and check if there is any interesting information to take advantage of.
  * Browser extensions, such as SEO Doctor, will tell you if tracking code does not seem to exist on a web page.

### 2.2 Tracks the use of website search

If you collect statistics on the use of the website's own search engine, you can draw conclusions from this and often improve user experience. If nothing else, it is useful to collect data on what visitors are looking for because you might want to do search analytics in the future.

Checks to run:

  * Check that your analytics tool, such as Google Analytics, is collecting data from the website's internal search engine. This is usually a relatively simple setting in the web analytics tool and does not require you to change the website architecture itself.

## 3. Performance

To optimize for performance is to give visitors as good an experience as possible. Visitors have different technical set ups and equipment in terms of connection speed, the type of connection and features in their browsers.

### 3.1 Reasonable time for loading the page

Assess how long it takes before the first byte is sent, as well as the balance between the necessary amount of data and the total page load-time - in other words, whether the experience can be made better or faster. What you think is a reasonable time is up to you, but a regular page should load in less than 2 seconds on a desktop computer in any case. On a mobile, you can use 5 seconds as your upper limit.

Checks to run:

  * Browser extensions, such as app.telemetry Page Speed Monitor, display this information regarding your own desktop computer performance.
  * Look for the appropriate view in your web analytics tool to see quantitative data from real user monitoring (RUM) at your website.

### 3.2 Compression of text files

Compression is about making sure that what is sent to the visitor is as small as possible. A small amount of information is faster to transfer.

Checks to run:

  * Browser extensions and web services, like Google Pagespeed and YSlow, will show if the website you are evaluating sends files in a compressed format.

### 3.3 Usage of the browser cache

You do not want to send unchanged files every time to visitors who have already downloaded them; it takes unnecessary time and degrades the experience. This is particularly important for files that are rarely or never updated. For example, logos and icons.

Checks to run:

  * Browser extensions and web services, like Google Pagespeed and YSlow, will tell you if the website you are checking instructs the browser cache on file lifespan.

### 3.4 Scripts and style sheets are sent in a compact format

For a web page not to be unnecessarily slow to display for the visitor, you want to remove as many spaces, line breaks, etc. as possible from the code.

Checks to run:

  * Browser extensions and web services, like Google Pagespeed and YSlow, will show in your browser if the website you are evaluating sends files in a minified format.

### 3.5 Images are optimized for fast transfer

A picture posted on the Web needs to have the right balance between image quality and reasonable file size. Often, image file sizes can be reduced even further without a visible deterioration of image quality.

Checks to run:

  * Browser extensions and web services, like Google Pagespeed and YSlow, will tell you if the images on the page can be further optimized.

### 3.6 Reasonable number of background images, scripts and stylesheets

If the website is dependent on many files to display properly, there is a prolonged waiting time for the visitor. Each file takes a while before it starts to download and that is why you might want a balance in the number of files required. In many cases, you can combine many smaller files into a single larger file, meaning fewer files in the queue for download.

Checks to run:

  * Browser extensions and web services, like Google Pagespeed and YSlow, tell you if they feel that there are too many files to download and which ones ought to be combined into fewer files, or placed on a CDN.

### 3.7 Requesting files and pages that do not exist

Neither visitors nor search engines like error pages, or missing files that they depend on for correct presentation. Although the page might look correct on the screen, requests to non-existent files might be made behind the scenes, contributing to a slower experience.

Checks to run:

  * Browser extensions and web services, like Google Pagespeed and YSlow, usually report if a file that is requested is not available. It can sometimes be difficult to detect missing images or custom fonts.
  * Google Search Console has a view for addresses that give 404 errors, or system failures in the 500 series.
  * Some content management systems, such as Episerver CMS, have reports and tools to check links.
  * Check that all important links from other websites to yours point to functional pages.

### 3.8 Minimal amount of scripts and CSS in page code

All frequently used functions and instructions on appearance need to be stored in a way that makes them reusable between pages. In other words, the individual page is supposed to have as little information as possible when this information is common with other pages. You put the common code / instructions in an external file for joint reuse. This does not contradict the performance practice to include critical, or structural, CSS on every page.

Checks to run:

  * Browser extensions and web services, like Google Pagespeed and YSlow, may complain when finding too much inline code.
  * You can manually check if there are a lot of _scripts_ or _style_ tags embedded in HTML code. In addition, frequent use of _style_ attributes on HTML elements is something to watch out for.

### 3.9 Images are not scaled down using CSS or HTML

A large image is generally slower than a smaller one to download for a visitor. Therefore, it is good practice that the images used on the website are shown in their actual size, in pixels. Think about whether you regard this as responsive image enhancement, or if you think it is a performance issue to send images with a higher resolution than needed for the available space.

Checks to run:

  * Browser extensions and web services, like Google Pagespeed and YSlow, will tell you if there is potential to reduce load-time.

### 3.10 Identical files are not referenced

Files that do, or contain, the same thing should not be downloaded multiple times. One example is using multiple versions of a script library on the same page, or retrieving it from different addresses. It also happens that you accidentally put, say, the logo file in several locations, so that multiple files with identical content are sent unnecessarily to users.

Checks to run:

  * Web services, like Pagespeed, can point out, somewhat cryptically, that consistent addressing is needed. This suggests that identical files are downloaded from different addresses.

### 3.11 Reasonable amount of scripts in the page head

In many cases, it is recommended to place tracking scripts and other external dependencies at the end of the page's source code. Because these scripts may be interdependent, this is something you should plan for in the design phase or during a major revision. Especially for the sake of mobile users, it is important to plan when scripts run, or whether they are loaded only when needed.
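
If a script must be referenced in the head, the _defer_ and _async_ attributes keep it from blocking rendering. A sketch with example file names:

```html
<!-- defer: downloads in parallel, runs after the document is parsed -->
<script src="/js/site.js" defer></script>

<!-- async: suits independent scripts, such as visitor statistics -->
<script src="/js/stats.js" async></script>
```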

Checks to run:

  * Browser extensions and web services, like Google Pagespeed and YSlow, will indicate if there is an unusually large amount of Javascript blocking the page's earliest opportunity to display, its rendering in other words.

### 3.12 Content networks are used when necessary

Many sites use common features, such as collecting statistics or providing a richer experience through, for example, image carousels. Instead of hosting your own copies of the required files, you can use a Content Delivery Network (CDN) to improve performance. This is particularly relevant for those who have not yet worked on minimizing the number of files. Do you have many self-created static files, such as images and other media? Then a dedicated subdomain such as media.mywebsite.com may be of help.
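
A sketch of both cases; the jQuery version and the subdomain are examples:

```html
<!-- A widely used library from a public CDN; many visitors
     may already have it cached -->
<script src="https://code.jquery.com/jquery-1.12.4.min.js"></script>

<!-- Self-created static files served from a dedicated subdomain -->
<img src="https://media.mywebsite.com/images/logo.png" alt="Logo" />
```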

Checks to run:

  * Browser extensions and web services, like Google Pagespeed and YSlow, tell you if you need a CDN. A CDN can be a good idea for files that many websites use, such as jQuery, Modernizr and others.

## 4. Accessibility and Usability

Accessibility and usability are about caring for visitors and meeting their needs. Since we have no control over the user's context, the website needs to be accessible to as many people as possible to achieve its purpose. It should be easy to understand; users should be able to do the things they came to do, such as buying something, instead of struggling to figure out what is on offer and how to proceed.

### 4.1 Website validates the chosen code standard

Website validation is important for many reasons. One of them is that it reduces the risk of technical barriers arising for your visitors, regardless of whether the user has a disability or not. Validation reports are commonly divided into errors and warnings.

Errors should not be tolerated and most warnings are important to look into.

    Figure 77: W3C offers a great markup validation service.[38]

Checks to run:

  * Browser extensions for validation or the W3C validation tests.

### 4.2 Using correct header structure

Pages with real headers (h1-h6 in the HTML code) support visitors who cannot actually see the page. Headers can serve as shortcuts and a table of contents for those using screen readers, and even for search engines.

All too common mistakes are using bold text, or otherwise making plain text look like a header, and having a large body of text that lacks sub-headers.
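
A sketch of a sound header structure; the content is made up:

```html
<h1>Annual report</h1>
<h2>Finances</h2>
<h3>Revenue</h3>
<h3>Costs</h3>
<h2>Staff</h2>
<!-- Avoid faking headers with something like <p><strong>Staff</strong></p> -->
```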

    Figure 78: Browser add-ons, such as Accessibility Evaluation Toolbar, find shortcomings.

Checks to run:

  * Browser extensions, such as Accessibility Evaluation Toolbar, offer a range of reports on a page. You can manually check by going through the HTML code and hunting for header tags like <h2>, <h3> and so on.

### 4.3 Anchor-texts are descriptive

When a user skims a page and finds a link, the link text should explain where the link points to. You cannot assume that a user reads the text around the link.

It is, unfortunately, all too common that link texts are phrased as ' _read more here_ ' or ' _go here_ ', which suggests neither the reason for the link nor what is behind it.
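
A before-and-after sketch; the address is an example:

```html
<!-- Weak: says nothing about the destination -->
<a href="/reports/annual.pdf">read more here</a>

<!-- Better: the link text stands on its own -->
<a href="/reports/annual.pdf">Download the annual report (PDF)</a>
```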

Checks to run:

  * Browser extensions, such as Accessibility Evaluation Toolbar, offer a navigation report summarizing a page's links. Otherwise, check them manually link by link.

### 4.4 Link titles not used for non-essential information

The title of a link is the small notification that may pop up if you hold the mouse pointer over a link for a while. The function is to give additional information about the link itself, for instance the size of a PDF file to download. If the link's text is repeated in the link title or if it does not add anything of value, it will just be an unnecessary annoyance to visitors. Keep in mind that link titles are rarely seen by those with a touchscreen.
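
A sketch of a link title that adds value instead of repeating the link text; the file and size are examples:

```html
<a href="/reports/annual.pdf" title="PDF, 2 MB">Download the annual report</a>
```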

Checks to run:

  * Browser extensions, like Accessibility Evaluation Toolbar, provide navigational reports that summarize a page's links. Otherwise, you can manually check every link by hovering the mouse pointer over it, or by looking at the _title_ attribute (and, for images, _alt_) in the HTML code.

### 4.5 Favorite icon is present

A favorite icon (favicon) is a small image used in the address bar, browser tabs and bookmarks, among other things. It helps users navigate between tabs and recognize bookmarks.
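
A minimal sketch of the markup in the page head; the paths and sizes are examples:

```html
<link rel="icon" href="/favicon.ico" />
<link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png" />
```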

Checks to run:

  * Look in the address bar or tab in a browser to see if there is a unique icon for the website.

### 4.6 Possible to navigate with keyboard

Using the keyboard to navigate a website is necessary for some, while for others it is about good ergonomics. Keyboard navigation is when, for example, you use the tab key to move between links and form fields, or use special hotkeys to jump to the next header or list.

Checks to run:

  * Browser extensions, like Accessibility Evaluation Toolbar, provide navigational reports that summarize a page's access keys. You can manually check how well you can use the Tab key to move between form fields, links etc.

### 4.7 Texts are written to be read by a human - not with exaggerated SEO

When text is written for the web, the first and foremost concern should be the human reader; all other aspects are secondary. Apart from adapting the style to suit the Web, such as putting the most important thing first in a sentence or paragraph, we should be wary of keyword optimization whenever it makes reading harder for a human.

Peculiarly formulated page titles, headers and links are all too common, often due to an eagerness to rank highest on search engines. Visitors see through this more easily than many seem to think, and are left with a less positive impression of the website.

Checks to run:

  * Go ahead and critically examine the texts. Are they written primarily for a human reader, or are they mainly bait for search engines? Headers and other text crammed with potential keywords are a warning sign.

### 4.8 Language set in the source code

For both search engines and screen readers, it is easier to know the language if it is declared. The language is set with a _lang_ attribute on the html tag, such as _lang="sv"_ for Swedish.
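
A minimal sketch:

```html
<!-- A Swedish page; use lang="en" for English, and so on -->
<html lang="sv">
```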

Checks to run:

  * Display the HTML code, look for the HTML element's start tag and check if it indicates the correct language.

### 4.9 Not depending on browser features

To make a website appealing to as many people as possible, it is important to guarantee users a good experience even when they do not have certain technologies installed on their browsers, or have difficulty using certain technologies. Flash, Javascript, and to some extent CSS risk causing bad user experiences.

We should not refrain from using a technology simply because not everyone can use it. Instead, work from the lowest common denominator and then refine the interface for those whose browsers support more. This practice is known as progressive enhancement.
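
A small sketch of progressive enhancement: the link works in every browser, and where scripting is available it is enhanced to show the photo in the page instead of navigating away. Names and paths are examples:

```html
<a href="/photos/crowd.jpg" id="photo-link">View the photo</a>
<script>
  // Only browsers that run scripts get the enhanced behavior
  document.getElementById('photo-link').addEventListener('click', function (e) {
    e.preventDefault();
    var img = document.createElement('img');
    img.src = this.href;
    img.alt = 'The photo';
    document.body.appendChild(img);
  });
</script>
```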

Checks to run:

  * Use browser extensions, or change your browser's settings manually, to turn off Javascript and CSS (style sheets). Is the content presented in a logical order? Can you still use the website?

### 4.10 Image sizes are specified in the HTML

If the visitor's browser receives instructions on image size, it does not need to redraw the page again when the image is fully downloaded, so reducing the perception of a flickering screen. We get a more robust impression of the page when content does not jump around while images are loaded.
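
A minimal sketch; the dimensions and file name are examples:

```html
<!-- The browser can reserve the space before the image arrives -->
<img src="/images/chart.png" width="640" height="360" alt="Visitor statistics" />
```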

Checks to run:

  * Browser extensions and web services, like Pagespeed and YSlow, report if images in the HTML code are not provided with their dimensions.
  * Browse around manually and pay careful attention to what happens around images. Remember that it may vary depending on your device's screen size; for example, responsive websites display fewer columns on mobile screens, where this can have a greater impact.

### 4.11 Works with and without the www prefix

The website should work regardless of whether the visitor has entered the www prefix or not. Make sure only one of the variants is used, by forwarding visitors to the address you have chosen. This ensures that things work no matter how one enters the address, while making clear which address is the correct one.
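
The forwarding itself is done on the server, but you can also mark which address variant you consider correct with a canonical link in the HTML head; the address is an example:

```html
<link rel="canonical" href="https://www.example.com/" />
```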

Checks to run:

  * Try to add or remove the _www._ prefix in the browser's address bar, not only for the home page. Do you end up on the correct page? Is only one variant used? Great.

### 4.12 Only one domain is used for the website

In order not to cause uncertainty among visitors about what your address is, it is preferable to use only a single domain for your entire web presence. Rather common examples are blogs placed on subdomains such as blog.example.com, or pages with login requirements placed on subdomains like customers.example.com.

Several domains are quite in order as long as the user does not see them or suffer from it. For example, it is alright for images to be retrieved from a separate domain to optimize performance.

When it comes to intranets, this can cause obvious problems if users are allowed to log in even though they are not on the organization's network. Then certain links on the intranet may point to internal domains that are not (yet?) reachable, even though they are perceived to be part of the intranet.

Checks to run:

  * Browse around manually on the website and check if subdomains show up or whether completely different domains appear.

### 4.13 RSS subscriptions can be detected

If a website offers news subscriptions through RSS technology, it should be marked up in code for so-called auto-discovery. This means that any subscriptions are listed among the page's other metadata and are easy to detect technically.

Checks to run:

  * View HTML code similar to the example below, or install a browser extension that detects RSS feeds:  
 _<link href="/rss/" type="application/rss+xml" rel="alternate" title="Subscribe" />_

### 4.14 Useful error pages

A so-called 404 page should tell users what happened and help them find what they requested. For your own sake, you should also log how people ended up on the error page, where they came from, and more.

Checks to run:

  * Type an incorrect address to the website and see what happens. Will it present a page that tells you that an error occurred? Does it show whose website you are on? Is there help to find what you were looking for? Are there obvious ways to reach the home page?

### 4.15 No surprises when scrolling

It is all too common that a page stops scrolling when the mouse pointer, or finger on a touch screen, ends up in a map or iframe. It is of course very frustrating and for most visitors very illogical.

Checks to run:

  * If you do have maps, place your mouse pointer over the map element and begin to scroll down. Did the map zoom when the pointer hovered over it?
  * Try to scroll on a device with a touch screen and assess whether the behavior is logical. This is possibly a bigger problem for maps or iframes that begin outside the visible area of the screen, since the user triggers them during fast down-scrolling.
  * Also, check if you have iframes (a peephole on the page where you download content from another location), as these, among many other things, degrade usability.

### 4.16 Enough distance between links, buttons, etc.

A common usability problem that we all can suffer from is that interactive elements on a website are placed too close to one another. Most often, this becomes a problem with lists of links where they are stacked vertically. Clicking on the wrong link causes irritation, and together with slow loading, this could be enough reason for a user to give up.

Checks to run:

  * Appears in Pagespeed Insights (called _tap targets_ ) but can also be checked manually on any touch screen, where you use your unpracticed thumb to see if you can hit the right link - every time. Is it good enough?

### 4.17 Acceptable text size

Text size has become more important than ever since many people use the Web outdoors in daylight. This brings challenges with size and with contrast between text and background, among other things.

Checks to run:

  * Appears in Pagespeed Insights but can also be checked manually by taking your mobile device outdoors. If you have a tablet, it is worth testing too, since tablets tend to have more problems with reflections because of the larger screen surface. Is it easy to use?

### 4.18 Zoomable, also on mobile

There are many websites whose creators seem to have thought that, since the site is now so well suited for mobile users, no zooming in and out is needed. Unfortunately, disabling the ability to zoom is a stupid thing to do. For example, those with motor disabilities may need to zoom in a little extra on links to be certain of hitting the one they want.
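
A minimal sketch of a viewport declaration that leaves zooming intact:

```html
<!-- Allows zooming; avoid user-scalable=no and maximum-scale=1 -->
<meta name="viewport" content="width=device-width, initial-scale=1" />
```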

Checks to run:

  * Enter the website with your mobile or tablet. Is it possible to zoom in the same way you normally zoom with the device?
  * Use your browser's feature to zoom and see if it works.

### 4.19 Icons for the website

A user can add a shortcut to a website on the home screen of many kinds of devices. You have probably seen invitations to do this while browsing the Web on a mobile. Just like an app icon, this icon needs to be highly recognizable, which is commonly achieved with logos or other familiar visual marks.

    Figure 79: When designing icons for home screens, recognition is important.

Make sure the image resolution is high enough that the icon does not appear blurred or ugly on high-resolution screens.

Checks to run:

  * In addition to adding the website to the home screen on your device, you can check the HTML code for something similar to the examples below.  
 **Example for iPhone / iPad, Android and others**  
 _<link href="/apple-touch-icon.png" rel="apple-touch-icon" /> <link href="/apple-touch-icon-precomposed.png" rel="apple-touch-icon-precomposed" />_  
 **Example for Windows 8**  
 _<meta content="#123456" name="msapplication-TileColor" /> <meta content="/win8-tile-bild.png" name="msapplication-TileImage" />_

### 4.20 Usable printouts

    Figure 80: A printout of my website discloses the URLs behind the links.

Not that you should ask people to print on paper, but there is still reason to keep an eye on how printing looks and functions. Partly because some people read PDF print versions on their e-readers (which sometimes lack Internet connectivity, which messes up the links), and partly for the archiving of your website, by you or others.
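
One way to disclose link targets on paper, as in Figure 80, is a print stylesheet; a minimal sketch:

```html
<style media="print">
  /* Print the address after every external link */
  a[href^="http"]::after { content: " (" attr(href) ")"; }
</style>
```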

Checks to run:

  * Print a few different types of pages, on paper or, preferably, as a PDF. Is all the information properly presented? Can you see where the links point?

## 5. Others

### 5.1 Forms and other sensitive information are sent through a secure channel

When a form on the Web is sent back to the web server, it is sometimes transmitted in a way that makes all the content readable, in plain text, to anyone monitoring what is sent over the network. If it is a login form, the password can be read by any network administrator who monitors the network's traffic. Your visitors may also be sitting on an unsecured wireless network in a hotel, or be vulnerable to other threats such as _man-in-the-middle attacks_ , which means that unwelcome people can monitor everything that is sent. This also applies to the content of web pages that users receive, not only the forms they send; sensitive information in a page's content can in principle be read in transit between the server and the user's computer.

What every website should do is protect sensitive traffic by passing it through the HTTPS protocol. This means that the information is transmitted in encrypted form via SSL/TLS.

It is important that visitors can verify that traffic is protected so that they feel safe to use the website. Therefore, it is a good practice to use HTTPS on the entire website even if the need exists only on parts of the website. To verify that information is sent securely through your organization's own APIs to user devices, you probably need to contact a developer.
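
A sketch of references that avoid mixed content on a page served over _https://_ ; the subdomain is an example:

```html
<!-- Absolute https:// reference -->
<script src="https://media.mywebsite.com/js/site.js"></script>

<!-- A protocol-relative reference inherits the page's https:// -->
<img src="//media.mywebsite.com/images/logo.png" alt="Logo" />
```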

Checks to run:

  * There are different ways to see if a website communicates via _HTTPS_ , often displayed as _https://_ or a padlock in the address bar. You can also look at the HTML code and verify that _https://_ is used, and that files referenced with just _//_ are safe on a page requested over _https://_

# Tips on in-depth reading

  * Content Strategy for the Web, Kristina Halvorson, ISBN 0321808304
  * Don't Make Me Think, Steve Krug, ISBN 0321965515
  * Mobile First, Luke Wroblewski, ISBN 1937557022 
  * Responsive Web Design, Ethan Marcotte, ISBN 098444257X 
  * Mobile Usability, Jakob Nielsen & Raluca Budiu, ISBN 0321884485
  * Content Strategy for Mobile, Karen McGrane, ISBN 1937557081

... And of course my blog at webstrategyforeveryone.com :)

# Sources & references

  * ITU - Statistics on Internet access from the International Telecommunication Union:  
www.itu.int/en/ITU-D/Statistics/Pages/stat/default.aspx
  * ITU – 4G population coverage in Germany:  
www.itu.int/ITU-D/ict/newslog/Vodafone+To+Boost+LTE+Coverage+To+90+By+Next+Summer+Germany.aspx
  * PTS - the telecommunication authority in Sweden on cellular networks:  
www.pts.se/upload/Rapporter/Radio/2014/rapport-uppdrag-samla-statistik-tillgang-mobila-komnat-pts-er-2014_11.pdf
  * The Connectivist - The web grows and shrinks:  
www.theconnectivist.com/2013/06/the-expanding-consolidation-of-the-consumer-internet/
  * Deepfield – 50 % of traffic comes from 35 websites (2013):  
conferences.infotoday.com/documents/172/2013CDNSummit-B102A.pdf
  * Study on how defaults affect the response:  
danariely.com/2008/05/05/3-main-lessons-of-psychology/
  * Dark patterns – dirty tricks designers use to make people do stuff:   
www.90percentofeverything.com/2010/07/08/dark-patterns-dirty-tricks-designers-use-to-make-people-do-stuff/
  * Statistics on Facebook 2013–2014: 
    * thenextweb.com/facebook/2014/01/29/facebook-passes-1-23-billion-monthly-active-users-945-million-mobile-users-757-million-daily-users/
    * www.theguardian.com/technology/2014/feb/03/facebook-mobile-desktop-pc-platforms
  * Target figuring out that a teenage girl was pregnant:  
www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/
  * Offline First: alistapart.com/article/offline-first
  * Mobiforge - about mobile usage statistics in 2013:  
mobiforge.com/research-analysis/global-mobile-statistics-2014-part-b-mobile-web-mobile-broadband-penetration-3g4g-subscribers-and-ne#topmobileweb
  * Google's statement about performance: googlewebmastercentral.blogspot.se/2010/04/using-site-speed-in-web-search-ranking.html

# Thanks goes out to...

**Agneta Grangård** for assigning me challenging digital topics to write about at work. **Linus Josefsson** for opposition about the benefits of responsive web design. **Patrik Malmquist** who had many valuable observations on the content. **Pär Lannerö** for input on usability and accessibility. **Filip Andersson** for clarifying some English grammar, and particularly pronouns, for me when in doubt (and for always being there when needed!).

**Robert Lundin** for the inspiration to stop being a smartass about RSS, JSONP and other tech knick-knacks and instead write a gospel – The gospel of Mark(us), which was the working title for a long time. **Kristian Norling** for all writing-assignments over the years, all influences on the use of technology and for being the publisher of this book. **Kalle Skogh** for answering my probably, to him, stupid questions on how Millennials regard the Web.

All of you who have encouraged me along the long road to completion.

However, of course, mainly thanks to **Anna Johansson** , who among other things designed the whole thing and had to put up with all my woefully confused moments during these years of sometimes writing, and sometimes wrestling with writer's angst.

A small selection of what Anna had to put up with:

"How universally understood is the term faux pas?! C'mon! ARE U AWAKE, or asleep?"  
"I want the book to be moderately thick, columns to feel uncluttered, neat, and colorful and a premium feel of the paper. Can you do it? "

As said. Thank you!

# Web Strategy for Everyone

This book focuses on the skills needed to develop and take care of websites: what you need to know to work strategically with your website. The introduction reflects on the Web's history and how it connects to the Web we see today, followed by what ought to be common knowledge on information architecture, such as tagging, metadata, digital asset management, URL strategy and the like.

Obviously, web design strategies are discussed at length, including responsive web design and how to design persuasively. The next-to-last topic covers how to optimize the performance of a website, and last but certainly not least comes the do-it-yourself section, where you can test a variety of quality factors of your site based on usability, search engine optimization, and more.

> "Marcus Österberg's book _Web Strategy for Everyone_ is just what the title advocates, a book for everyone. With simple and well-written text it guides the reader through the most important foundations of a successful website."
> 
> \- Fredric Ollerstam, web producer

> "Highly recommended for anyone working with strategic web matters for medium to large companies and businesses."
> 
> \- Anders Lövkvist, Brand Manager at RLVNT Distribution

# Notes

[←1]

National Center for Supercomputing Applications
[←2]

tba.nu/we2
[←3]

tba.nu/we3
[←4]

tba.nu/we2
[←5]

Hypertext Markup Language is generally used to display content, and Cascading Style Sheets inform the browser how to format and layout the content.
[←6]

tba.nu/we6
[←7]

Application Programming Interface, allows relatively easy systems integration.
[←8]

Dublin Core Metadata Initiative: tba.nu/we8
[←9]

International Standard Book Number
[←10]

International Classification of Diseases: tba.nu/we10
[←11]

Medical Subject Headings: tba.nu/we11
[←12]

Health Level 7: tba.nu/we12
[←13]

tba.nu/we13
[←14]

tba.nu/we14
[←15]

tba.nu/we15
[←16]

tba.nu/we16
[←17]

tba.nu/we17
[←18]

tba.nu/we18
[←19]

tba.nu/we19
[←20]

tba.nu/we20
[←21]

tba.nu/we21
[←22]

tba.nu/we22
[←23]

tba.nu/we23
[←24]

tba.nu/we24
[←25]

tba.nu/we25
[←26]

tba.nu/we26
[←27]

tba.nu/we27
[←28]

tba.nu/we28
[←29]

tba.nu/we29
[←30]

tba.nu/we30
[←31]

tba.nu/we31
[←32]

tba.nu/we32
[←33]

tba.nu/we33
[←34]

tba.nu/we34
[←35]

tba.nu/we35
[←36]

tba.nu/we36
[←37]

tba.nu/we37
[←38]

tba.nu/we38
