A cluster in simple terms is group of similar things, in this case computers or servers. A more refined explanation would be that a cluster is group of computers, working together in such a way that for end user it is one single machine. This is close to what I discussed about implementation of virtualization, so yes clustering is a form of virtualization.
But when we are strictly talking about software architecture, we are actually talking about using cluster for load balancing or handling failover. For a simple web application, this would mean creating 2 or more similar server machines, which are maintained in a cluster. There is a single point of entry which dictates which server from cluster should fulfill incoming request. This is load balancing. The server at the entry point can use any algorithm like round robin or check the actual load on a server to assign a request to one of the servers in the cluster. At the same time if one of the machine goes down in cluster for some reason, other servers can share the load and end user will never know about the problem occurred at backend.