A cluster, in simple terms, is a group of similar things, in this case computers or servers. A more refined explanation would be that a cluster is a group of computers working together in such a way that, to the end user, it appears as one single machine. This is close to what I discussed about the implementation of virtualization, so yes, clustering is a form of virtualization.
But when we are strictly talking about software architecture, we are actually talking about using a cluster for load balancing or handling failover. For a simple web application, this means creating two or more similar server machines that are maintained as a cluster. There is a single point of entry which decides which server in the cluster should fulfill each incoming request; this is load balancing. The server at the entry point can use an algorithm like round robin, or check the actual load on each server, to assign a request to one of the servers in the cluster. At the same time, if one of the machines in the cluster goes down for some reason, the other servers can share its load and the end user will never know a problem occurred at the backend, as the sketch below illustrates.
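To make the entry-point idea concrete, here is a minimal round-robin sketch with a crude failover path. The server names are made up and the "health check" is simulated by hand, so treat it as an illustration of the idea rather than a real load balancer; in practice you would put something like Nginx, HAProxy, or a hardware balancer with proper health checks at the entry point.

```python
import itertools

# Hypothetical backend servers in the cluster; the names are illustrative only.
SERVERS = ["app-server-1", "app-server-2", "app-server-3"]

class RoundRobinBalancer:
    """Cycles through servers, skipping any that are marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.down = set()
        self._cycle = itertools.cycle(self.servers)

    def mark_down(self, server):
        # In a real setup this would be driven by periodic health checks.
        self.down.add(server)

    def mark_up(self, server):
        self.down.discard(server)

    def pick_server(self):
        # Try each server at most once per request to avoid looping forever.
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if server not in self.down:
                return server
        raise RuntimeError("No healthy servers left in the cluster")

balancer = RoundRobinBalancer(SERVERS)
print([balancer.pick_server() for _ in range(4)])   # requests rotate over all three
balancer.mark_down("app-server-2")                  # simulate one machine going down
print([balancer.pick_server() for _ in range(4)])   # remaining servers absorb the load
```

The end user never sees which backend answered; the entry point quietly routes around the failed machine.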
Virtual, as the word hints, is something that is not real but gives the feeling of being real. In the computer world, my first interaction with a virtual machine was back in my school days, when we were made to work on dumb terminals, which had only a monitor and a keyboard. The terminal was attached to a central, powerful machine that provided the processing power and memory.
That is a very crude example of virtualization. With hardware costs going down in the last few years, the need for dumb terminals has evaporated. But with cloud computing coming into the picture, virtualization in the IT industry has reached new scales. With cloud computing, it is convenient and important to manage virtual machines based on the requirements or the load on the application.
Virtualization goes beyond virtual machines (hardware virtualization, software virtualization, storage virtualization, and network virtualization are the key forms), but for the sake of this post's simplicity, I will stick to virtual machines. A virtual machine is simply a machine that does not physically exist but can be used like a real machine. The advantage of this arrangement is maximum usage of the hardware, and hence cost efficiency.
A hypervisor is the key software component that allows multiple operating systems (and hence machines) to run on one physical system. Read here on hypervisors.
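If you want to see a hypervisor from code, here is a small sketch that simply lists the guest machines a local hypervisor knows about. It assumes a Linux host running KVM/QEMU with the libvirt Python bindings installed; the connection URI is an assumption and would differ for Xen, VirtualBox, ESXi, and so on, so treat it as an illustration rather than a recipe.

```python
import libvirt  # assumes the libvirt-python package and a local KVM/QEMU hypervisor

# Open a read-only connection to the local hypervisor
# (the URI "qemu:///system" is an assumption for a KVM/QEMU host).
conn = libvirt.openReadOnly("qemu:///system")

# Each "domain" is one guest operating system sharing the same physical box.
for dom in conn.listAllDomains():
    state = "running" if dom.isActive() else "shut off"
    print(f"{dom.name():20} {state}")

conn.close()
```

Each guest listed here is a full operating system that believes it owns the machine, while the hypervisor quietly shares the real CPU, memory, and disks among them.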