HTML Cookie Introduction

by Patrick Horgan


Is this what you're looking for?

This is a tutorial about cookies. You will learn everything about what a cookie is.

If instead you want to learn how to work with them from javascript, see the tutorial HTML Cookies From Javascript.

If you need to learn how to deal with them via PHP, see the tutorial HTML Cookies From PHP.

What the heck is a cookie

Cookies are just bits of text that can be stored on a user's machine by a browser. When a user asks for a page, the server can send Cookies along with the page in Set-Cookie headers and ask the browser to store them. The Cookies have information in them (we'll explain later), that lets the browser know what page requests to send them back with later. The browser sends all appropriate Cookies back with a request for a page in one Cookie header.

A cookie as sent from a server in a Set-Cookie header consists of Name=Value, optionally followed by one or more attributes, each separated from the preceding parts of the cookie string by a semi-colon and a space. As an example:

ID="x3729"; HttpOnly; Secure; Expires=Wed, 09 Jun 2021 10:18:14 GMT

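The attributes of a cookie can include Expires, Max-Age, Domain, Path, Secure, HttpOnly, and extension attributes. Each is described in more detail later in this tutorial.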

Cookies are stored on the user's machine, and can be created by javascript running in a web page, or sent over from the server in a header. They are defined in RFC 6265. If you don't know what an RFC is, or how to get them (they're free), see my RFCs and a Script to get them. If you read that you'll know everything about what they are.

The attributes are stored on the browser and are for the browser's use. Only the name and the value are returned to a server.
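To make the structure of a Set-Cookie value concrete, here is a minimal Python sketch that splits the example cookie shown earlier into its name, its value, and its attributes. The function name parse_set_cookie is made up for this tutorial, and the splitting is deliberately simpler than the parsing rules in RFC 6265. Real browsers do considerably more work.

```python
def parse_set_cookie(header_value):
    """Split a Set-Cookie value into (name, value, attributes).

    Hypothetical helper for illustration only, not a full RFC 6265 parser.
    """
    parts = [part.strip() for part in header_value.split(";")]
    # The first part is always Name=Value.
    name, _, value = parts[0].partition("=")
    attributes = {}
    for part in parts[1:]:
        if not part:
            continue
        key, sep, attr_value = part.partition("=")
        # Flag attributes such as Secure and HttpOnly carry no value.
        attributes[key.strip().lower()] = attr_value.strip() if sep else True
    return name.strip(), value.strip(), attributes


name, value, attributes = parse_set_cookie(
    'ID="x3729"; HttpOnly; Secure; Expires=Wed, 09 Jun 2021 10:18:14 GMT'
)
print(name, value)   # the double quotes are left in place by this sketch
print(attributes)    # only the browser uses these
```

Only name and value would ever travel back to the server. The attributes dictionary stays with the browser.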

A cookie is not a program

A cookie is not a program, so it cannot be run on the user's machine. That means that cookies cannot carry trojans, viruses, or worms. They can, however, carry information about you that is used to track your browsing habits.

A cookie passes as a header

Headers are things that pass back and forth between your web browser and a web server that you never usually see. They tell the server, for example, what browser you're using or what language you prefer using. They allow the server to say, for example, the mime-type of the page being sent or how long the page has been cached.

When you request a page, the browser looks through the list of cookies on the machine, and if one matches the domain (possibly restricted by path) of the page you're requesting, it will send a Cookie header along with the request. The Cookie header will contain all matching Cookies. On the server, a script in PHP or Perl or any other CGI-conforming language has access to them and can use them for any purpose, or can ignore them.

Every time the server sends a page to your browser, it can include Set-Cookie headers to, well, set cookies on your machine for later. N.B. Cookies can also be set on a machine via the use of javascript running on a page locally in the user's browser.

More about the parameters

Here I'll talk more about the different parameters of a cookie.

Name

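The Name field can consist of a token as defined in RFC 2616. A token is one or more of any char with a US-ASCII code of 127 (decimal) or lower, excluding control characters and any of the following separator characters: ( ) < > @ , ; : \ " / [ ] ? = { } as well as space and horizontal tab.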

Within those constraints the name is anything chosen to be meaningful to the web developer who thought it up.
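As a rough check of the token rule, the following Python sketch tests whether a proposed cookie name is a valid token. The separator set is the one RFC 2616 lists. The function name is_valid_cookie_name is only an illustration and is not part of any library.

```python
# Separator characters from RFC 2616, including space and horizontal tab.
SEPARATORS = set('()<>@,;:\\"/[]?={} \t')


def is_valid_cookie_name(name):
    """Return True if name is a non-empty RFC 2616 token (hypothetical helper)."""
    return bool(name) and all(
        # Codes 0-31 and 127 are control characters; 32 is the space.
        32 < ord(ch) < 127 and ch not in SEPARATORS
        for ch in name
    )


for candidate in ["session_id", "bad name", "a=b", ""]:
    print(repr(candidate), is_valid_cookie_name(candidate))
```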

Value

The Value field can be any sequence of US-ASCII characters not containing control characters, whitespace, double quotes, commas, semi-colons, or backslashes.

The Value can be inside double quotes, and the quotes don't count as part of the value. The value is optional, i.e. can consist of nothing, or of just a pair of double quotes.

The value will often be encoded somehow. If you want to use characters outside of the US-ASCII range in the Cookie value they must be encoded somehow into the US-ASCII range.

Base64 is a popular choice. Even if you think that you aren't using anything outside of the US-ASCII range, you might be. If you accept a user name from someone, they might have characters in their name that are outside of the US-ASCII range.

Base64 (RFC 4648) is a commonly used encoding that allows binary data to be converted to a subset of US-ASCII text.
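Here is a small Python sketch of the idea, assuming the text is first turned into UTF-8 bytes and then Base64 encoded. The helper names encode_cookie_value and decode_cookie_value are invented for this example. The characters that standard Base64 produces (letters, digits, +, / and =) are all allowed in a cookie value.

```python
import base64


def encode_cookie_value(text):
    """Encode arbitrary text as US-ASCII suitable for a cookie value."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_cookie_value(value):
    """Reverse encode_cookie_value on the server or in a script."""
    return base64.b64decode(value).decode("utf-8")


user_name = "José Ångström"          # contains characters outside US-ASCII
encoded = encode_cookie_value(user_name)
print(encoded)
print(decode_cookie_value(encoded) == user_name)
```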

The meaning of the value is whatever is ascribed to it by the web developer who thought it up.

Expires attribute

This attribute gives the maximum lifetime of the Cookie. If neither the Expires attribute nor the Max-Age attribute (discussed next) is given to a Cookie, then the Cookie expires at the end of the current session, whatever that means to the browser. Usually it is when the browser exits.

The browser is not required to keep the Cookie until then. It is free to delete the Cookie for any reason. It might be running out of space, or the user of the browser doesn't want the Cookie.

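The Date is an RFC 1123 date as defined in RFC 2616. That means that it will have, in order: a three-letter day of the week followed by a comma, a two-digit day of the month, a three-letter month name, a four-digit year, the time as hh:mm:ss, and the literal GMT.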

Example:

Tue, 09 Aug 2011 23:42:17 GMT

That's what the spec says, but it has a section about how to deal with dates that don't adhere too closely to the spec and have for example two digit years.

Browsers are free to modify dates. For example, if a date is ten years in the future, the browser might shorten it to two weeks. If the date is outside the range of dates the browser can represent, the browser will adjust it to fall within that range.
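If you generate the Expires date from a server-side script, it is safer to let a library produce the format than to assemble it by hand. The following Python sketch uses the standard email.utils module, which emits and reads this date layout. The function name format_expires is just an illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def format_expires(moment):
    """Format a timezone-aware datetime as an RFC 1123 date in GMT."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


when = datetime(2011, 8, 9, 23, 42, 17, tzinfo=timezone.utc)
text = format_expires(when)
print(text)

# Going the other way: read a date string back into a datetime.
print(parsedate_to_datetime(text) == when)
```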

Max-Age attribute

This attribute gives the number of seconds that the Cookie should live. Unfortunately, it's not supported by all browsers, so it probably doesn't make sense to use it. If you do use it, and the browser does support it, then if it is given as an attribute of a Cookie which also has an Expires attribute, it overrules the Expires. If neither Expires nor Max-Age is given, the Cookie is kept until the current session is over.

As with the Expires attribute, the browser is free to adjust the Max-Age attribute for sanity.
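The precedence between the two attributes can be sketched in a few lines of Python. This is only a model of the rule described here, with a made-up function name expiry_time. It ignores the adjustments a real browser is free to make.

```python
from datetime import datetime, timedelta, timezone


def expiry_time(now, expires=None, max_age=None):
    """Return when a cookie expires, or None for a session cookie."""
    if max_age is not None:
        # Max-Age wins when both attributes are present.
        return now + timedelta(seconds=max_age)
    # Fall back to Expires; None means "until the session ends".
    return expires


now = datetime(2011, 8, 9, 12, 0, 0, tzinfo=timezone.utc)
later = datetime(2012, 1, 1, tzinfo=timezone.utc)
print(expiry_time(now, expires=later, max_age=3600))  # one hour after now
print(expiry_time(now, expires=later))                # the Expires date
print(expiry_time(now))                               # None: session cookie
```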

Domain attribute

The Domain attribute is the basic way that the browser knows who to send the cookie to. This discussion assumes a Path attribute of /. If, for example, the cookie is for the domain example.com, then the cookie will be sent along with requests to servers for example.com, www.example.com, and foo.bar.example.com. If a Domain attribute is not sent with a Cookie, then it is supposed to only apply to the host that sent the cookie, that is if it was sent by example.com, it should apply to example.com, but not to www.example.com or foo.bar.example.com. Some browsers get this wrong and will send it to www.example.com and foo.bar.example.com as well as to example.com. As time goes by these naughty browsers are expected to adhere to this part of the spec.

Browsers will accept cookies with a Domain attribute that specifies a scope for the cookie that would include the server that sent the cookie. That means that a host, www.example.com, could send a cookie for www.example.com or example.com, but not for adifferentdomain.com. Technically it could also send one for com. Many browsers are configured to reject cookies for top level domains like com and edu, but they are not required to by the spec.
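The domain rules above can be approximated with a suffix check. The Python sketch below is a simplification under stated assumptions: it ignores IP addresses and does not reject top level domains, and the names domain_matches and should_send are invented for this example.

```python
def domain_matches(host, cookie_domain):
    """True if host is cookie_domain itself or one of its subdomains."""
    host = host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    return host == cookie_domain or host.endswith("." + cookie_domain)


def should_send(host, cookie_domain, host_only):
    """host_only is True when the cookie was set without a Domain attribute."""
    if host_only:
        return host.lower() == cookie_domain.lower()
    return domain_matches(host, cookie_domain)


for host in ["example.com", "www.example.com", "foo.bar.example.com",
             "adifferentdomain.com"]:
    print(host, should_send(host, "example.com", host_only=False))
```

The same domain_matches check also models which Domain values a server may set: www.example.com may set a cookie for example.com because www.example.com falls within that scope.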

Path attribute

The Path attribute is used to restrict the portion of a domain that a Cookie applies to. If a Path is given for a Cookie, the browser, when checking to see if a Cookie should be sent in a Cookie header for a request, will, if the domain matches, then check to see if the directory portion of the request matches, or is a subdirectory of the path. Usually, the Path, if given, is specified as /, and every request matches. You could give it a more specific value though, so that with a Path of /foo/bar, a request to http://adomain.tla/foo would not match and would not have the Cookie sent along, but http://adomain.tla/foo/bar, http://adomain.tla/foo/bar/, and http://adomain.tla/foo/bar/apage.html would all match, and so would have the Cookie with that Path attribute sent along with the page request.
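The path check can be written out as a short Python sketch that follows the path-match rule in RFC 6265. The function name path_matches is only for illustration, and the paths are written with their leading slash.

```python
def path_matches(request_path, cookie_path):
    """True if a cookie with cookie_path applies to request_path."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # Either the cookie path ends in "/", or the request continues
        # with "/" right after it, so /foo/bar does not match /foo/barbaz.
        if cookie_path.endswith("/"):
            return True
        return request_path[len(cookie_path)] == "/"
    return False


for path in ["/foo", "/foo/bar", "/foo/bar/", "/foo/bar/apage.html"]:
    print(path, path_matches(path, "/foo/bar"))
```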

Secure attribute

The Secure attribute says to send the cookie along with https requests, but not with http requests. A cookie with this attribute can still be overwritten over an insecure connection, and the Secure attribute removed.

HttpOnly attribute

This attribute says to send the Cookie it applies to along with HTTP-like requests, and not to make it available in other ways. That means, for example, that a local script will not see a Cookie that has this attribute, so the value can't be stolen by a cross-site scripting attack. Modern browsers support this, and older ones will ignore it. There's no downside to using it all of the time. One amusing thing is that you can set the same Cookie twice, once with Set-Cookie: foo=realvalue; HttpOnly, and then again with Set-Cookie: foo=fakevalue, and from javascript you will get the fake value. This is questionable and should not be relied upon. There's nothing in the spec that says this will work.
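As a toy model of the effect, the Python sketch below filters a list of stored cookies the way a browser would before handing them to a page script. The dictionary layout and the name script_visible_cookies are assumptions made for this example, not how any browser actually stores cookies.

```python
def script_visible_cookies(cookies):
    """Return the name=value string a page script could read."""
    return "; ".join(
        f"{c['name']}={c['value']}"
        for c in cookies
        if not c.get("http_only")   # HttpOnly cookies are hidden from scripts
    )


stored = [
    {"name": "session", "value": "x3729", "http_only": True},
    {"name": "theme", "value": "dark"},
]
print(script_visible_cookies(stored))
```

Both cookies would still be sent to the server in the Cookie header. Only the script's view is reduced.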

Extension attribute

The extension attribute can be any US-ASCII characters except for CTLs or a semi-colon. If a browser doesn't understand the attribute it ignores it. The intent is for browsers to extend a facility on a trial basis, and if something catches on, it might end up in the spec. Browsers are free to understand any new attributes they want. The HttpOnly attribute came from Microsoft in just this way. It turned out to be a great idea, and is now supported by all current browsers. Even as a confessed Linux/Unix bigot, I have to say that Microsoft hit a home run on this one.

The Cookie Header

The Cookie header is sent from the browser to the server along with a page request if any Cookies have a Domain attribute that matches the domain of the page being requested, if the Path attribute matches the path of the page being requested, and if the connection method matches the Secure attribute if present.

The syntax of the header is

"Cookie:" OWS cookie-name=cookie-value OWS [; OWS cookie-name=cookie-value ]*

OWS means Optional White Space, which can be made up of TAB, SPACE, CR, and LF characters. There has to be at least one cookie sent along, or there's no need for the Cookie header at all. All of the Cookies are sent in the same header.

The attributes are not sent back with the Cookie, only the Name and the Value.
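Putting the three conditions together, here is a rough Python sketch of how a browser might assemble the Cookie header for one request. The cookie jar is modelled as a list of dictionaries and the function names are invented for this example. A real browser also tracks expiry, host-only cookies, and more.

```python
def cookie_applies(cookie, host, path, is_https):
    """Check Domain, Path, and Secure against one request."""
    domain = cookie["domain"]
    if host != domain and not host.endswith("." + domain):
        return False
    cookie_path = cookie.get("path", "/")
    path_ok = path == cookie_path or (
        path.startswith(cookie_path)
        and (cookie_path.endswith("/") or path[len(cookie_path)] == "/")
    )
    if not path_ok:
        return False
    # Secure cookies only travel over https.
    return is_https or not cookie.get("secure")


def build_cookie_header(jar, host, path, is_https):
    """Return the Cookie header line, or None if no cookie applies."""
    pairs = [
        f"{c['name']}={c['value']}"       # attributes are not sent back
        for c in jar
        if cookie_applies(c, host, path, is_https)
    ]
    return "Cookie: " + "; ".join(pairs) if pairs else None


jar = [
    {"name": "id", "value": "x3729", "domain": "example.com", "path": "/"},
    {"name": "cart", "value": "7", "domain": "example.com", "path": "/shop"},
    {"name": "token", "value": "s3", "domain": "example.com", "secure": True},
]
print(build_cookie_header(jar, "www.example.com", "/shop/item", True))
print(build_cookie_header(jar, "www.example.com", "/", False))
```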

More than one Cookie with the same Name can be sent back if they differ in Domain and/or Path in such a way as to still apply for the requested page. You can make no assumptions about the order of these in the request.
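On the server side this means a Cookie header is best read into a list of pairs rather than a dictionary, so that repeated names are not silently lost. A minimal Python sketch, with a made-up function name parse_cookie_header:

```python
def parse_cookie_header(header_value):
    """Split a Cookie header value into a list of (name, value) pairs."""
    pairs = []
    for part in header_value.split(";"):
        part = part.strip()
        if not part:
            continue
        # Split on the first "=" only, so values such as Base64 text
        # that contain "=" stay intact.
        name, _, value = part.partition("=")
        pairs.append((name.strip(), value.strip()))
    return pairs


print(parse_cookie_header("theme=dark; id=a1; id=b2"))
```

Both id cookies survive in the result, and the script has to decide for itself which one it trusts.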

The Set-Cookie Header

The Set-Cookie Header can be sent from a web server along with the other headers associated with a web page. It has the format

"Set-Cookie:" SP cookie-pair [; SP cookie-av]*

where cookie pair and cookie-av are as follows

cookie-pair: Name=Value

cookie-av: One of the attributes as discussed above

That is, a Name=Value followed by zero or more of the optional attribute/value pairs as discussed above.

A browser can freely ignore requests to set Cookies, either at the request of the user or for its own reasons (out of memory, disk full, etc.). A server application also cannot depend on a Cookie living for the full lifetime given by an Expires or Max-Age attribute. These values can be adjusted before the Cookie is set, and Cookies can also be evicted at any time.

The order in which the attributes are sent makes no difference.

If the server has more than one Cookie to send, it sends a separate Set-Cookie header for each Cookie.
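As a sketch of the server side, the following Python builds one Set-Cookie header line per Cookie. The helper format_set_cookie is hypothetical and does no validation or escaping. Most web frameworks provide their own way to do this, which you should prefer in real code.

```python
def format_set_cookie(name, value, **attributes):
    """Build a Set-Cookie header line from a name, a value, and attributes.

    Pass flag attributes as True (Secure=True). An underscore in a keyword
    becomes a hyphen, so Max_Age=3600 is written as Max-Age=3600.
    """
    parts = [f"{name}={value}"]
    for key, attr_value in attributes.items():
        attr_name = key.replace("_", "-")
        if attr_value is True:
            parts.append(attr_name)                  # flag attribute
        elif attr_value is not False and attr_value is not None:
            parts.append(f"{attr_name}={attr_value}")
    return "Set-Cookie: " + "; ".join(parts)


# Two cookies mean two separate header lines.
headers = [
    format_set_cookie("ID", "x3729", HttpOnly=True, Secure=True, Path="/"),
    format_set_cookie("theme", "dark", Max_Age=3600),
]
for line in headers:
    print(line)
```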
