HTTP caching with Qt

April 29, 2011 by Peter Hartmann

This is an in-depth article about how HTTP caching works in general and how it works with Qt.

What is HTTP caching?

When a browser loads a Web page, the different resources (HTML pages, images, CSS files, scripts etc.) are stored locally, so that the next time a resource is requested, it can possibly be served from the local store instead of being loaded from the network again. This has several benefits:

speed up: Loading resources from the cache is a lot faster than loading them from the network.
offline usage: Pages can be displayed without being connected to the network.
reducing load: Loading resources from a cache or a proxy reduces the load on the originating server.

This article is mostly about finding out when a resource can be loaded from cache, and when it has to be loaded from the network.

How does caching work with the HTTP protocol?

The usual flow with HTTP caching goes like this: When the client (usually a browser) requests a resource via HTTP GET for the first time, it usually does not send any caching information with the request. The server responds with an HTTP 200 OK message and the data, and adds some headers to control caching on the client side, namely:

expiration information:

When the server responds to a client request, it also states whether the resource may be cached to disk and, if so, for how long it may be served from the cache on subsequent loads. In other words, it tells the client when the resource expires in the cache and has to be loaded from the network again. HTTP headers used by servers and proxies for sending expiration information are (the list is not complete):

Expires: The server tells the client the date of when the resource expires. Example: "Expires: Fri, 29 Apr 2011 09:22:59 GMT"
Cache-Control: max-age: The server tells the client the maximum age of a resource, i.e. how old the resource can get while still being considered fresh. Example: The server tells the client that the resource can be cached for one hour ( = 3600 seconds): "Cache-Control: max-age=3600".
Cache-Control: s-maxage: Same as the max-age case, but only used by shared caches (e.g. caching proxies) and ignored by private caches (e.g. browser caches). For a shared cache, s-maxage takes precedence over max-age; private caches keep using max-age. Example: The server tells intermediate proxies that the resource can be cached for one hour (= 3600 seconds): "Cache-Control: s-maxage=3600"
Cache-Control: must-revalidate: The server tells the client that a stale copy of this resource must not be used without checking with the server first. This matters when other expiration information is not enough: a client is otherwise allowed to serve a stale (over-aged) resource from the cache (see QNetworkRequest::PreferCache below), so specifying "Cache-Control: max-age=0" alone would not force a reload in that case. Adding "must-revalidate" makes sure the client goes back to the server every time instead of using its stale copy. Facebook and Twitter, for example, use this for their front pages (but usually not for the elements referenced from them).
Age: Denotes the age of a resource. This header specifies the time in seconds since the resource was generated by the originating server. At first glance this seems redundant, because a reply from the server should always implicitly have an age of zero. However, the reply often does not come from the originating server directly, but from an intermediate proxy (check e.g. qt.nokia.com). In that case, the "Age" header denotes the number of seconds since the proxy fetched the resource from the originating server. The "Age" value needs to be taken into account when checking a resource against its "max-age" directive.
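To make the interplay of "Age" and "max-age" more concrete, here is a minimal sketch of the age calculation described in RFC 2616, section 13.2.3. The struct and function names are made up for this example, and all times are plain second counts; this is not how Qt implements it.

```cpp
#include <algorithm>
#include <cstdint>

// All values are in seconds. Absolute times are seconds since a fixed epoch
// on the client's clock, except dateValue, which is taken from the server's
// "Date" header. The type is hypothetical and only serves this sketch.
struct ResponseTimes {
    std::int64_t requestTime;   // client clock when the request was sent
    std::int64_t responseTime;  // client clock when the response arrived
    std::int64_t dateValue;     // value of the "Date" header
    std::int64_t ageValue;      // value of the "Age" header, 0 if absent
};

// Current age of a cached response, following RFC 2616, section 13.2.3.
std::int64_t currentAge(const ResponseTimes &t, std::int64_t now)
{
    const std::int64_t apparentAge =
        std::max<std::int64_t>(0, t.responseTime - t.dateValue);
    const std::int64_t correctedReceivedAge = std::max(apparentAge, t.ageValue);
    const std::int64_t responseDelay = t.responseTime - t.requestTime;
    const std::int64_t correctedInitialAge = correctedReceivedAge + responseDelay;
    const std::int64_t residentTime = now - t.responseTime;
    return correctedInitialAge + residentTime;
}

// A response is fresh as long as its current age is below "max-age".
bool isFresh(const ResponseTimes &t, std::int64_t maxAge, std::int64_t now)
{
    return currentAge(t, now) < maxAge;
}
```

With "max-age=3600", a reply that arrived from a proxy with "Age: 600" stops being fresh roughly 3000 seconds after it was received, while a reply that came directly from the originating server stays fresh for almost the full hour.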

modification information:

When the client has a resource in its local cache, it can ask the server to send the resource only if it has changed. This always involves a round trip to the server, but might save data if the server tells the client that the resource has not changed since the client last fetched it. In that case, the server sends an HTTP message with an empty body instead of sending the data again. HTTP headers used for sending modification information are (the list is not complete):

From the server:

Last-Modified: The server tells the client the date of when the resource was last modified.
ETag: The server sends a version identifier of the transmitted resource. This can be thought of as a hash of the data body, which changes whenever the resource changes.
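As an illustration of the "hash of the data body" idea, the following sketch derives an ETag from the body with the 64-bit FNV-1a hash. The function name makeETag is hypothetical, and the choice of hash is an assumption made for this example: the identifier is opaque to clients, and real servers are free to build it in other ways (e.g. from a revision number or a file timestamp).

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: builds an ETag value from the response body using the
// 64-bit FNV-1a hash. Any function whose output changes when the body changes
// would serve the same purpose.
std::string makeETag(const std::string &body)
{
    std::uint64_t hash = 14695981039346656037ULL;  // FNV-1a offset basis
    for (unsigned char byte : body) {
        hash ^= byte;
        hash *= 1099511628211ULL;                  // FNV-1a prime
    }
    char buffer[17];  // 16 hex digits plus the terminating zero
    std::snprintf(buffer, sizeof buffer, "%016llx",
                  static_cast<unsigned long long>(hash));
    // ETag values are sent as quoted strings.
    return "\"" + std::string(buffer) + "\"";
}
```

Two identical bodies always yield the same identifier, so the client's stored value keeps matching until the resource changes.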

From the client:

If-Modified-Since: The client tells the server to send the data only if it has been modified since the given date, i.e. if the date in the Last-Modified header has changed. If the resource has not been modified, the server sends an HTTP 304 Not Modified message containing only HTTP headers, but no body. If it has been modified, the server sends an HTTP 200 OK message containing the body.
If-None-Match: The client tells the server to send the data only if it has a new version identifier, i.e. if the ETag header has changed. If the resource has not changed, the server sends an HTTP 304 Not Modified message containing only HTTP headers, but no body. If it has changed, the server sends an HTTP 200 OK message containing the body.
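The server side of such a conditional request can be sketched as a small decision function. The types and names below are hypothetical and dates are reduced to second counts. Checking the ETag first and falling back to the date only when no ETag was sent is a simplification chosen for this sketch.

```cpp
#include <string>

// Hypothetical model of the validators the server currently has for a resource.
struct Resource {
    std::string etag;        // current "ETag" value
    long long lastModified;  // current "Last-Modified" date, seconds since the epoch
};

// Hypothetical model of the conditional headers of a GET request.
struct ConditionalRequest {
    std::string ifNoneMatch;    // "If-None-Match" value, empty if not sent
    long long ifModifiedSince;  // "If-Modified-Since" date in seconds, -1 if not sent
};

// Returns the HTTP status code of the reply: 304 (headers only) if the
// client's copy is still current, 200 (with body) otherwise.
int statusFor(const Resource &resource, const ConditionalRequest &request)
{
    // Simplification of this sketch: the ETag decides if the client sent one.
    if (!request.ifNoneMatch.empty())
        return request.ifNoneMatch == resource.etag ? 304 : 200;
    if (request.ifModifiedSince >= 0)
        return resource.lastModified <= request.ifModifiedSince ? 304 : 200;
    return 200;  // unconditional request: always send the body
}
```

A client that repeats the ETag it received earlier gets 304 as long as the resource is unchanged, and 200 with a fresh body as soon as the server's identifier differs.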

It is interesting to note that the headers involving absolute dates ("Expires", "Last-Modified", "If-Modified-Since") were already present in HTTP 1.0; the newer HTTP 1.1 standard adds means that do not involve absolute dates: time spans measured in seconds ("max-age", "s-maxage") and versioning information ("ETag", "If-None-Match"). This is because absolute dates only work accurately if the server and client clocks are synchronized. ETags and relative time spans are more robust because they do not assume synchronized clocks. That said, all of the headers presented above are still in widespread use.
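The effect of unsynchronized clocks can be shown with two small checks. Both functions are hypothetical and work on plain second counts; they only contrast a comparison against an absolute date with a comparison of elapsed time.

```cpp
// All values are in seconds. Absolute times are seconds since the epoch.

// Date-based check: compares the server's absolute "Expires" date with the
// client's own clock, so any offset between the two clocks distorts the result.
bool freshByExpires(long long expires, long long clientNow)
{
    return clientNow < expires;
}

// Relative check: compares the time that has passed on the client since the
// response arrived with "max-age". Both timestamps come from the same clock,
// so a constant clock offset cancels out.
bool freshByMaxAge(long long maxAge, long long clientReceived, long long clientNow)
{
    return clientNow - clientReceived < maxAge;
}
```

Suppose the server's clock reads 10000 when it sends a resource that should live for one hour ("Expires" at 13600, or "max-age=3600"), and the client's clock runs two hours ahead, so it reads 17200 on arrival. Then freshByExpires(13600, 17200) already reports the resource as expired, while freshByMaxAge(3600, 17200, 17200) still treats it as fresh.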

How does caching work with Qt?

By default, no disk cache is used when retrieving resources over HTTP with the QNetworkAccessManager class. In order to enable a cache, you need to either instantiate the QNetworkDiskCache class or write your own class deriving from QAbstractNetworkCache and then set it on your QNetworkAccessManager instance by calling setCache().
In that case, Qt will load resources from the cache if the resource is still fresh, and load from the network if not; if possible it adds modification information, as described above.

In order to fine-tune how Qt loads resources from the network, you can set specific attributes on your QNetworkRequest by calling setAttribute() with QNetworkRequest::CacheLoadControlAttribute and one of the following values:

AlwaysNetwork: Always load from the server and force intermediate caches to reload by setting "Cache-Control: no-cache" and "Pragma: no-cache".
PreferNetwork (default): If the resource can be found in the cache and the age of the cached resource is less than the maximum age (used headers: "Age", "Cache-Control: max-age") or the resource has not expired (used header: "Expires"), then it is loaded from the cache. If the resource has expired or has exceeded its maximum age, it is loaded from the server, if possible with modification information (used headers: "If-Modified-Since" if "Last-Modified" was given, and "If-None-Match" if "ETag" was given).
PreferCache: If the resource can be found in the cache and has not expired, then it is loaded from the cache. The contrast to PreferNetwork is that even stale resources, i.e. resources that have exceeded their maximum age, are loaded from the cache. If the resource has expired (determined via the "Expires" header) or cannot be found in the cache, this setting behaves like PreferNetwork.
AlwaysCache: Serve the data from the cache if available, never use the network; this can be seen as an offline mode. If the resource is not in the cache, an error is reported.
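The four settings above can be summarized as a decision table. The following stand-alone sketch mirrors the enum value names of QNetworkRequest::CacheLoadControl, but everything else (the Source and CacheEntry types, the chooseSource function) is made up for this example. It models the description in this article and is not Qt's actual implementation.

```cpp
// Same value names as QNetworkRequest::CacheLoadControl, redeclared here so
// that the sketch compiles without Qt.
enum class CacheLoadControl { AlwaysNetwork, PreferNetwork, PreferCache, AlwaysCache };

enum class Source { Cache, Network, Error };

// Hypothetical summary of what is known about a cached resource.
struct CacheEntry {
    bool present;         // the resource is in the cache
    bool expired;         // the date in its "Expires" header has passed
    bool exceededMaxAge;  // its age exceeds "Cache-Control: max-age"
};

Source chooseSource(CacheLoadControl control, const CacheEntry &entry)
{
    switch (control) {
    case CacheLoadControl::AlwaysNetwork:
        return Source::Network;
    case CacheLoadControl::PreferNetwork:
        // Only a resource that is neither expired nor over-aged is served locally.
        return entry.present && !entry.expired && !entry.exceededMaxAge
                   ? Source::Cache : Source::Network;
    case CacheLoadControl::PreferCache:
        // An over-aged (stale) resource is still accepted, an expired one is not.
        return entry.present && !entry.expired ? Source::Cache : Source::Network;
    case CacheLoadControl::AlwaysCache:
        // Offline mode: a missing resource is an error, the network is never used.
        return entry.present ? Source::Cache : Source::Error;
    }
    return Source::Network;  // not reached for valid enum values
}
```

For a cached resource that has exceeded its maximum age but has not expired, the sketch picks the network under PreferNetwork and the cache under PreferCache, which is the difference between the two settings.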

If you want to fine-tune the caching behaviour even more, you could add headers (e.g. "Cache-Control: max-age" or "Cache-Control: max-stale") yourself via QNetworkRequest::setRawHeader().

Areas for improvement in Qt

Implement freshness heuristics: If a resource does not have an expiration date (no "Expires" header) and the age of the page cannot be determined (no "max-age" or "Age" header), the client can implement heuristics to determine whether a page is fresh. In particular, if a resource has a "Last-Modified" header, a fraction (the HTTP RFC mentions 10%) of the time elapsed since that date can be used as the period during which the resource is assumed to be fresh. For example, if a resource has a "Last-Modified" header set 10 days in the past, the resource can be assumed to be fresh for 1 day.
The age calculation needs to be reworked.
Fetching resources from the cache can be made faster.
The HTTP Vary header must be taken into account.
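The freshness heuristic from the first item above could look roughly like this. The function name is hypothetical, times are plain second counts, and measuring the interval at the moment the response was received is an assumption of this sketch.

```cpp
// Heuristic freshness lifetime in seconds: 10% of the time between the
// "Last-Modified" date and the moment the response was received. Returns 0
// if the modification date lies in the future.
long long heuristicLifetime(long long lastModified, long long responseTime)
{
    const long long sinceModified = responseTime - lastModified;
    return sinceModified > 0 ? sinceModified / 10 : 0;
}
```

For a resource last modified 10 days (864000 seconds) before it was received, this yields 86400 seconds, i.e. the one day from the example above.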

As always, feel free to vote and comment on the tasks above.

Blog Topics:

Internet