What an incredible week it has been! This past Thursday, I had the pleasure of presenting at VMUG Connect Amsterdam, held at the iconic RAI Amsterdam. The energy in the room was fantastic, and it was truly invigorating to connect with so many engaged professionals from the VMware community.
My session — "VMware Cloud Foundation Troubleshooting: Real-World Scenarios and Solutions" — focused on practical approaches to maintaining and fixing VCF environments. We explored essential tools for performing preventive health checks on VCF 9, walked step-by-step through real-world troubleshooting examples, and discovered some of the latest troubleshooting features within VCF Operations.
The room was nearly full with around 60 attendees, and the feedback I received afterward was overwhelmingly positive. It is always rewarding to share lessons learned from the field — especially when the audience is so interactive and eager to discuss real-world applications. A big thank you to everyone who attended, asked questions, and shared their own experiences!
It was also wonderful to catch up with so many familiar faces, both from the Netherlands and abroad, including a good number of former students. To top off an already memorable week, I also passed my VCAP-Automation exam. And a comfortable stay at the Van der Valk Hotel at the Zuidas certainly didn't hurt either.
Next Stop: VMUG Connect Minneapolis
If you missed the Amsterdam session, don't worry! In just three weeks, I will be delivering this presentation again at VMUG Connect Minneapolis. I am looking forward to bringing these real-world VCF troubleshooting scenarios and practical strategies to the US audience.
After the Minneapolis session, I will make the full presentation slide deck available for download right here on NTPRO.NL — so stay tuned if you want access to all the materials, tools, and references we covered.
Thank you again to the VMUG community for the continued support and engagement. See you in Minneapolis!
VMware vSAN stands as a cornerstone of the modern Software-Defined Data Center (SDDC), offering robust, high-performance, and scalable storage solutions integrated directly into the hypervisor. As the technology evolves, keeping up with the latest advancements is crucial for architects, administrators, and IT professionals. This article distills the essential takeaways from the comprehensive VMware vSAN FAQ document, providing a clear overview of its core concepts, latest features, and deployment architectures. For a more interactive deep dive, be sure to check out my upcoming video walkthrough on this topic, created with NotebookLM.
Core Architectural Pillars: OSA vs. ESA
vSAN offers two distinct architectures: the Original Storage Architecture (OSA) and the newer, high-performance Express Storage Architecture (ESA). Understanding their differences is key to designing and deploying a modern vSAN cluster.
| Feature | Original Storage Architecture (OSA) | Express Storage Architecture (ESA) |
| --- | --- | --- |
| Storage Devices | Supports SAS, SATA, and NVMe devices. | Exclusively uses NVMe-based devices. |
| Tiering | Uses a two-tier model with a cache tier (for writes) and a capacity tier. | Employs a single-tier design where all devices contribute to both cache and capacity. |
| Data Structure | Organizes devices into "disk groups," each with one cache and up to seven capacity devices. | Uses a "storage pool" of devices per host, eliminating the disk group construct. |
| Performance | Performance is often gated by the cache tier and disk group limitations. | Delivers near device-level performance by removing bottlenecks and optimizing the data path. |
| Space Efficiency | Deduplication and compression are performed at the cluster level, which can impact performance. | Offers highly efficient, policy-based compression and global deduplication with minimal performance overhead. |
| RAID-5/6 | Requires a minimum of 4 hosts for RAID-5. | Supports efficient RAID-5 on as few as 3 hosts, delivering RAID-1-like performance. |
While the OSA remains a viable option for existing hardware, all new deployments should be designed for the ESA to leverage its superior performance, efficiency, and lower Total Cost of Ownership (TCO).
Flexible Deployment Models
vSAN is not a one-size-fits-all solution. It provides several deployment models to cater to different infrastructure needs, from the data center core to the edge.
• vSAN HCI (Hyperconverged Infrastructure): This is the traditional, aggregated model where compute and storage resources reside within the same cluster. It simplifies management and is ideal for a wide range of workloads.
• vSAN Storage Clusters: Previously known as vSAN Max, this disaggregated model separates compute and storage into independent clusters. It allows you to scale storage and compute resources independently, providing a centralized, highly scalable storage platform for multiple vSphere clusters. This is particularly beneficial for optimizing licensing costs and managing large-scale environments.
• Stretched and 2-Node Clusters: These topologies provide high availability and site-level resilience. Stretched clusters span two geographic locations, while 2-Node clusters are designed for Remote Office/Branch Office (ROBO) and edge environments, using a third-site witness to maintain data quorum.
Key vSAN Capabilities
Beyond its architecture, vSAN is packed with features that ensure data availability, security, and operational simplicity.
Availability: vSAN protects against failures at multiple levels, including individual disks, entire hosts, and even network partitions. By setting Storage Policy-Based Management (SPBM) rules like "Failures to Tolerate" (FTT), administrators can define the desired level of data redundancy for each virtual machine.
Security: vSAN provides both Data-at-Rest and Data-in-Transit encryption using the AES-256 cipher. This can be managed through external Key Management Service (KMS) solutions or the vSphere Native Key Provider (NKP), without requiring specialized self-encrypting drives (SEDs).
Performance and Operations: Tools like HCIBench allow for standardized performance testing, while features like adaptive resynchronization minimize the performance impact of maintenance operations. The Skyline Health service in vCenter provides comprehensive monitoring and proactive diagnostics to ensure your cluster runs optimally.
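To make the "Failures to Tolerate" setting a bit more concrete, here is a minimal Python sketch of the sizing arithmetic for RAID-1 mirroring. It assumes the commonly cited rule of 2 x FTT + 1 hosts and FTT + 1 full copies of the data. The function name raid1_requirements is my own illustration, not part of any VMware tool, and erasure coding (RAID-5/6) follows different rules that are not covered here.

```python
# Illustrative sizing arithmetic for RAID-1 mirroring only (assumption:
# the 2 * FTT + 1 host rule and FTT + 1 full data copies).
def raid1_requirements(ftt: int, usable_gb: float) -> tuple[int, float]:
    """Return (minimum hosts, raw capacity in GB) for a mirrored object."""
    if ftt < 1 or ftt > 3:
        raise ValueError("this sketch covers mirroring with FTT values 1 to 3")
    min_hosts = 2 * ftt + 1          # enough hosts to keep quorum after ftt failures
    raw_gb = usable_gb * (ftt + 1)   # every mirror copy stores the full object
    return min_hosts, raw_gb


if __name__ == "__main__":
    for ftt in (1, 2, 3):
        hosts, raw = raid1_requirements(ftt, 100)
        print(f"FTT={ftt}: at least {hosts} hosts, {raw:.0f} GB raw per 100 GB usable")
```

The point of the sketch is simply that every extra failure you want to tolerate with mirroring costs another full copy of the data and two more hosts, which is why the policy is set per virtual machine rather than for the whole cluster.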
Explore Further
The world of vSAN is deep and constantly evolving. This article provides a high-level overview of the key concepts you need to know. To get the full details and answers to more specific questions, I highly recommend exploring the official documentation.
As you continue your learning journey, be sure to watch my upcoming video where I walk through these concepts in more detail using NotebookLM.
#vExpert Eric Sloof takes us through a new era for elite private cloud professionals with the VMware Certified Distinguished Expert (VCDX) offerings: the very best certification that marks you as a "Distinguished Private Cloud" Expert.
With the release of VMware Cloud Foundation 9.0, Broadcom has taken a significant step forward in simplifying log management and troubleshooting for private cloud environments. VCF Operations for Logs 9.0 introduces a deeply integrated logging solution that brings log analytics directly into the VCF Operations interface — making life easier for NOC teams, SREs, IT administrators, and application teams alike.
In this blog post, I'll walk you through the key new features, architecture improvements, and what this means for your day-to-day operations.
What's New in VCF Operations for Logs 9.0?
Integrated Log Analysis in VCF Operations
The biggest change in version 9.0 is the introduction of Operations-Logs, a new logging solution built on VMware Aria Operations for Logs. This means you no longer need to switch between separate interfaces to analyze your logs. Everything is now available directly within the VCF Operations UI.
With this integration, you can:
Create log-based alerts without leaving VCF Operations
Design custom dashboards based on log data
Save and reuse log queries across your team
Package alerts, dashboards, and queries into management packs
Important note: VMware Aria Operations for Logs content packs are still supported in 9.0, but Broadcom recommends starting the migration to log-based tools in management packs, as content packs will be phased out in future updates.
Deployment Options
How you deploy VCF Operations for Logs depends on your license type:
VMware Cloud Foundation license: Activate Operations-Logs via VCF Management in VCF Operations Fleet Management — no separate appliance deployment required.
VMware vSphere Foundation (VVF) license: Deploy the VCF Operations for Logs virtual appliance using vSphere.
Key Features
High-Performance Log Ingestion
VCF Operations for Logs is built to handle large volumes of log data at high throughput with low latency. It accepts data through two main channels:
Syslog: ports 514/UDP, 514/TCP, and 1514/TCP (SSL)
Ingestion API: ports 9000/TCP and 9543/TCP (SSL)
Any environment component — operating systems, applications, VMs, hosts, vCenter, firewalls, switches, and storage — can push syslog feeds to VCF Operations for Logs.
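As a rough illustration of the syslog channel, the sketch below uses Python's standard SysLogHandler to send a message over UDP. Everything product-specific is an assumption: the helper name build_syslog_logger and the "demo-app" tag are mine, and the demo sends to a stand-in receiver on 127.0.0.1 so it runs without a real log server. In a real environment you would point it at the address of your own VCF Operations for Logs instance on port 514.

```python
import logging
import logging.handlers
import socket


def build_syslog_logger(host: str, port: int = 514) -> logging.Logger:
    """Return a logger that sends each record as a UDP syslog datagram."""
    handler = logging.handlers.SysLogHandler(
        address=(host, port), socktype=socket.SOCK_DGRAM
    )
    # "demo-app" is a hypothetical application tag used for illustration.
    handler.setFormatter(logging.Formatter("demo-app: %(levelname)s %(message)s"))
    logger = logging.getLogger(f"syslog-demo-{host}-{port}")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # keep the demo output out of the root logger
    return logger


if __name__ == "__main__":
    # Stand-in receiver so the sketch runs without a real log server.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0 = let the OS pick a free port
    receiver.settimeout(2)
    port = receiver.getsockname()[1]

    log = build_syslog_logger("127.0.0.1", port)
    log.warning("disk latency above threshold")

    print(receiver.recv(4096))  # raw datagram as it arrives on the wire
    receiver.close()
```

UDP is the simplest option but gives no delivery guarantee. For anything important, the TCP and SSL ports listed above are the safer choice.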
Scalable Architecture
VCF Operations for Logs supports both single-node and multi-node cluster deployments:
Single node: Good for development and lab environments. Use the Integrated Load Balancer (ILB) even for single nodes to simplify future expansion.
Cluster: Required for production environments. Clusters provide primary and worker nodes, enabling linear scaling of ingestion throughput and high availability.
Cluster with Forwarders: Extend your deployment with forwarder clusters at remote sites, forwarding all logs to the main cluster. Ideal for multi-datacenter environments.
Cross-Forwarding for Redundancy: Mirror two main clusters across datacenters, each front-ended with dedicated forwarder clusters for full redundancy.
Near Real-Time Search
Log data ingested by VCF Operations for Logs is available for search within seconds. Historical data can be queried from the same interface with equally low latency — no need to wait or switch tools.
Runtime Field Extraction
Raw log data is often difficult to parse visually. VCF Operations for Logs provides runtime field extraction, allowing you to dynamically extract any field from log data using regular expressions. These extracted fields can then be used for:
Searching and filtering log events
Aggregating events in the Explore Logs chart
Building dashboard widgets
A handy one-click extract feature makes this even easier — no need to manually type complex regex patterns.
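To show the idea behind regex-based field extraction outside the product, here is a small Python sketch. The log format, the field names (host, latency_ms) and the sample lines are made up for illustration. In VCF Operations for Logs itself you define extracted fields in the UI, so treat this only as a picture of what happens conceptually.

```python
import re
from collections import Counter

# Hypothetical log format: key=value pairs after a timestamp.
PATTERN = re.compile(r"host=(?P<host>\S+)\s+latency_ms=(?P<latency_ms>\d+)")

SAMPLE_LINES = [
    "2025-01-01T10:00:00Z host=esx01 latency_ms=12 op=read",
    "2025-01-01T10:00:01Z host=esx02 latency_ms=48 op=write",
    "2025-01-01T10:00:02Z host=esx01 latency_ms=7 op=read",
    "2025-01-01T10:00:03Z heartbeat ok",
]


def extract_fields(line: str) -> dict:
    """Return the named fields found in a log line, or {} if none match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else {}


if __name__ == "__main__":
    events = [fields for fields in map(extract_fields, SAMPLE_LINES) if fields]

    # Filtering: keep only events slower than 10 ms.
    slow = [e for e in events if int(e["latency_ms"]) > 10]
    print("slow events:", slow)

    # Aggregating: count events per host, similar to grouping in a chart.
    print("events per host:", Counter(e["host"] for e in events))
```

Once a field exists, filtering and aggregating on it is trivial, which is exactly why extracted fields are so useful for searches, charts, and dashboard widgets.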
Explore Logs
The Explore Logs page is your primary workspace for log analysis. From here you can:
Search and filter log events by timestamp, text, source, or field values
Create and save custom queries
Visualize query results as charts
Pin charts to custom dashboards
Dashboards
Dashboards give you a real-time view of the metrics that matter most. You can:
Create custom dashboards with widgets based on your own queries
Use content pack dashboards for out-of-the-box visibility into VMware components
Clone content pack dashboards and customize them for your needs
Share dashboards with your team via shared dashboard URLs
Log Management
The Log Management section provides full control over how logs are handled:
Log filtering: Reduce noise by filtering out irrelevant log data
Log masking: Mask sensitive data before indexing
Log forwarding: Forward logs to external destinations or other VCF Operations for Logs instances
Index partitions: Control log retention and archiving policies
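Filtering and masking are easiest to understand with a tiny example. The Python sketch below drops noisy lines and masks sensitive values before anything would be stored. The rules are assumptions for illustration only (DEBUG lines count as noise, password values and IPv4 addresses count as sensitive), and the function names are hypothetical. In the product these rules are configured in the Log Management section, not written as code.

```python
import re

# Illustrative rules only; real masking rules depend on your own data.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
PASSWORD = re.compile(r"(password=)\S+", re.IGNORECASE)


def is_noise(line: str) -> bool:
    """Filtering: treat DEBUG-level lines as noise in this sketch."""
    return " DEBUG " in line


def mask(line: str) -> str:
    """Masking: hide password values and IPv4 addresses."""
    line = PASSWORD.sub(r"\1****", line)
    return IPV4.sub("x.x.x.x", line)


def prepare_for_indexing(lines: list) -> list:
    """Drop noisy lines first, then mask what remains."""
    return [mask(line) for line in lines if not is_noise(line)]


if __name__ == "__main__":
    raw = [
        "10:00:00 INFO login user=bob password=s3cret from 10.0.0.5",
        "10:00:01 DEBUG cache refresh took 3 ms",
    ]
    for line in prepare_for_indexing(raw):
        print(line)
```

The order matters: filtering first means you never spend effort masking data you are going to throw away anyway.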
Integrations
VCF Operations for Logs integrates natively with key VMware and third-party products, including:
vSphere (vCenter log collection via centralized configuration)
Understanding how VCF Operations for Logs processes events helps you use the tool more effectively. Here's what happens from the moment a log event is generated:
Generated on a device outside VCF Operations for Logs
Collected via a VCF Operations for Logs agent, third-party agent, or direct API/syslog write
Received by VCF Operations for Logs — directed to the appropriate node via the ILB
Processed through the ingestion pipeline:
Keyword index created and stored on local disk
Machine learning applied to cluster events
Event stored in compressed format
Available for search within seconds
Archived or deleted based on retention policies (FIFO deletion when storage reaches 97% capacity)
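The last step, FIFO deletion at 97% capacity, can be sketched in a few lines of Python. This is a simplified model under my own assumptions: the FifoStore class is hypothetical, it evicts one event at a time, and it measures size as the UTF-8 length of the event text. The real product manages storage internally and will not work at this level of detail.

```python
from collections import deque


class FifoStore:
    """Toy event store that evicts the oldest events above a usage threshold."""

    def __init__(self, capacity_bytes: int, threshold: float = 0.97):
        self.capacity_bytes = capacity_bytes
        self.threshold = threshold
        self.used_bytes = 0
        self.events = deque()

    def add(self, event: str) -> None:
        self.events.append(event)
        self.used_bytes += len(event.encode("utf-8"))
        limit = self.capacity_bytes * self.threshold
        # First in, first out: drop the oldest events until usage is back under the limit.
        while self.used_bytes > limit and self.events:
            oldest = self.events.popleft()
            self.used_bytes -= len(oldest.encode("utf-8"))


if __name__ == "__main__":
    store = FifoStore(capacity_bytes=100)
    for i in range(10):
        store.add(f"event-{i:04d}")  # each event is exactly 10 bytes
    print(len(store.events), "events kept, oldest is", store.events[0])
```

The practical takeaway is that retention is driven by storage pressure as well as by policy, so it pays to size your disks for the retention period you actually need.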
What This Means for Your Teams
| Team | Benefit |
| --- | --- |
| NOC Teams | Unified log view across the entire VCF fleet |
| SREs | Near real-time alerting and log-based incident detection |
| IT Administrators | Centralized log management with no extra tools |
| Application Teams | Application-level log visibility and custom dashboards |
| DBAs | Deep-dive log analysis for database components |
Getting Started
To start using VCF Operations for Logs 9.0:
If you have a VCF license, activate Operations-Logs via Fleet Management → VCF Management
Configure your environment components to forward syslog to VCF Operations for Logs
Explore the Explore Logs page and start building your first queries
Create dashboards to monitor your most important metrics
Set up log-based alerts to get notified proactively
Conclusion
VCF Operations for Logs 9.0 represents a major step forward in how VMware Cloud Foundation environments handle log management and troubleshooting. By integrating log analytics directly into VCF Operations, Broadcom has made it significantly easier for teams of all types to gain visibility into their infrastructure and applications — without additional tools or context switching.
Whether you're troubleshooting a failed deployment, investigating a performance issue, or simply keeping an eye on your environment's health, VCF Operations for Logs 9.0 has the tools you need.