I still see a lot of VMware vSphere environments where virtual machines are configured with memory limits. This is not always done on purpose. For instance, when a template is configured with a limit and a VM deployed from that template is granted more memory, people often forget to set the memory limit back to unlimited.
What about this ballooning ghost? Well, when a virtual machine is configured with a memory limit lower than the amount of configured virtual machine memory, the VM will experience ballooning, compression and swapping.
How does it work? If the virtual machine sees 3 gigabytes of configured memory and tries to access it, it will only be given physical memory until the limit is reached. So if you set a virtual machine's limit to 2 gigabytes and the virtual machine tries to use 3 gigabytes, 1 gigabyte will be ballooned, compressed and eventually swapped.
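The arithmetic is simple enough to sketch in a few lines of Python. This is only an illustration of the example above, not anything the VMkernel exposes: the function name `memory_to_reclaim_mb` and its parameters are made up here, and a limit of -1 is assumed to mean "unlimited", which is how the vSphere API represents it.

```python
def memory_to_reclaim_mb(demand_mb: int, limit_mb: int) -> int:
    """Return how much guest memory (in MB) cannot be backed by physical
    memory because of the limit, and so has to be ballooned, compressed
    or swapped. A limit of -1 is treated as unlimited (assumption)."""
    if limit_mb < 0:
        return 0
    # Only the part of the demand above the limit is subject to reclamation.
    return max(0, demand_mb - limit_mb)


# The example from the text: the VM tries to use 3 GB with a 2 GB limit.
print(memory_to_reclaim_mb(3072, 2048))  # 1024
```

As long as the guest stays below the limit the result is 0, which is why a forgotten limit can go unnoticed until the VM actually starts using its extra memory.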
You can easily track down virtual machines configured with a limit by using the vMemory tab in RVTools. I’ve never heard a valid reason why people would use memory limits, so get rid of them. This behaviour is also described in the knowledge base article “Impact of virtual machine memory and CPU resource limits”.
When a memory limit is set lower than the virtual machine's provisioned memory, it is considered the upper boundary for the amount of physical memory that can be directly assigned to this particular virtual machine. The guest operating system is not aware of this limit, and it optimizes memory management options to the assigned memory size.
When the limit is reached or exceeded, the guest operating system can still request new pages, but due to the limit the VMkernel does not allow the guest to directly consume more physical memory and treats the virtual machine as if the resource is under contention. As such, memory reclamation techniques are used to enable the virtual machine to consume what it has requested.
Depending on the number of pages requested by the virtual machine, the VMkernel might, in the worst case scenario, resort to VMkernel swap to fulfil the request.
The VMkernel first tries to reclaim memory by inflating the Balloon Driver to let the guest memory manager decide what to page out. In ESX 4.1, the VMkernel also tries to compress memory pages before swapping them out.
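You can verify the impact of a memory limit by running esxtop, switching to the memory view and looking at MCTLSZ and MCTLTGT (balloon size and target), SWCUR and SWTGT (swap usage and target), and CACHEUSD (compression cache in use).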
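If you have noted down those counters for a VM, a rough way to read them is sketched below. The function `active_reclamation` is hypothetical, the sample values are invented for illustration, and the counters are assumed to have been typed into a dictionary by hand. It does not parse real esxtop output.

```python
def active_reclamation(counters: dict) -> list:
    """Map esxtop memory counters for one VM to the reclamation
    techniques they indicate. Missing counters are treated as 0."""
    findings = []
    # A non-zero balloon size or target means the balloon driver is involved.
    if counters.get("MCTLSZ", 0) > 0 or counters.get("MCTLTGT", 0) > 0:
        findings.append("ballooning")
    # A non-zero compression cache means pages have been compressed.
    if counters.get("CACHEUSD", 0) > 0:
        findings.append("compression")
    # A non-zero swap usage or target means VMkernel swap is involved.
    if counters.get("SWCUR", 0) > 0 or counters.get("SWTGT", 0) > 0:
        findings.append("swapping")
    return findings


# Invented sample values (MB) for a VM that has hit its limit.
sample = {"MCTLSZ": 512, "MCTLTGT": 768, "SWCUR": 0, "SWTGT": 0, "CACHEUSD": 0}
print(active_reclamation(sample))  # ['ballooning']
```

An empty list does not prove that no limit is set. It only means the VM has not been pushed past it yet.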