Change Nutanix container replication factor from RF2 to RF3

Nutanix Replication factor

Data availability, integrity, and security are among the most important functions provided by modern HCI solutions. Nutanix offers several ways to protect and secure data, with data-at-rest encryption and data-in-transit encryption to name just two. To protect data availability and integrity in case of hardware failure, Nutanix implements the concept of a data Replication Factor (RF). By default, Replication Factor 2 (RF=2) is applied to every Nutanix storage container, which means that every block of data has an exact copy elsewhere in the cluster. The copy can be placed:

  • on a different node (in a multi-node cluster deployment)
  • on a different disk (in a single-node cluster deployment)
  • on a node in a different block (chassis), known as block awareness, if the cluster spans multiple blocks
    NOTE: the system enables block awareness automatically as soon as the cluster configuration meets the block-awareness requirements.
  • on a node in a different data center rack, if you have implemented rack awareness in the cluster

With AOS 6.5, Nutanix supports 3 replication factor configurations:

  • RF=3
  • RF=2
  • RF=1

Why would you change the replication factor to RF=3?

There are multiple reasons why you would change the default replication factor. Below, you can find the most common use cases.

Increase data resiliency in a Tier 0 (T0) cluster

Changing from the default RF=2 to RF=3 increases the number of copies of the data from 2 to 3. This means the system can tolerate two simultaneous hardware failures and still serve data to the application.

Customers typically enable RF=3 on clusters running Tier 0 applications, where data availability is the most important factor.

NOTE#1: Changing the Nutanix container replication factor from RF2 to RF3 comes with a storage cost, because the system has to store an additional copy of the data. To limit the impact on storage utilization, customers can enable erasure coding on the cluster.

NOTE#2: Changing from RF2 to RF3 increases the number of metadata copies from 3 to 5.
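The trade-off described in the two notes above can be put into rough numbers. The following is a minimal sketch, not a Nutanix sizing tool: it assumes the raw footprint is simply the logical data multiplied by the replication factor, and it ignores compression, erasure coding, and metadata overhead. The function names are hypothetical, the 2.62 TiB figure is the logical used space from the example container later in this article, and the metadata copy counts are the ones stated in NOTE#2.

```python
# Metadata copies per replication factor, as stated in this article (RF2 -> 3, RF3 -> 5).
METADATA_COPIES = {2: 3, 3: 5}


def failures_tolerated(rf: int) -> int:
    """With rf copies of each block, rf - 1 copies can be lost at the same time."""
    return rf - 1


def raw_footprint_tib(logical_used_tib: float, rf: int) -> float:
    """Simplified raw footprint: one full copy of the data per replica.

    Assumption: no compression, erasure coding, or metadata overhead.
    """
    return logical_used_tib * rf


logical_used = 2.62  # TiB, logical used space of the example container
for rf in (2, 3):
    print(
        f"RF{rf}: tolerates {failures_tolerated(rf)} failure(s), "
        f"{METADATA_COPIES[rf]} metadata copies, "
        f"about {raw_footprint_tib(logical_used, rf):.2f} TiB raw"
    )
```

Under these assumptions, moving from RF2 to RF3 adds one more full copy of the logical data, which is why it is worth checking free capacity before making the change.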

Comply with Nutanix best practices

When building a Nutanix cluster, you can grow the system gradually by adding nodes one after another. For clusters with 24 nodes or more, Nutanix recommends using Replication Factor 3. The reason behind this recommendation is risk mitigation: the more nodes in the cluster, the higher the chance that a hardware component (in our case, a disk drive or a node) fails.
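To illustrate why risk grows with cluster size, here is a small worked example. It is only a sketch under simplifying assumptions: node failures are treated as independent, and the per-node failure probability within a given time window (`p_fail = 0.01`) is a made-up value for illustration, not a Nutanix figure. It computes the probability that two or more nodes fail in the same window, which is the scenario RF=2 cannot absorb.

```python
def prob_at_least_two_failures(nodes: int, p_fail: float) -> float:
    """Probability that 2 or more of `nodes` independent nodes fail in one window."""
    p_none = (1 - p_fail) ** nodes
    p_exactly_one = nodes * p_fail * (1 - p_fail) ** (nodes - 1)
    return 1 - p_none - p_exactly_one


p_fail = 0.01  # hypothetical per-node failure probability, for illustration only
for nodes in (4, 12, 24, 48):
    risk = prob_at_least_two_failures(nodes, p_fail)
    print(f"{nodes:>2} nodes: P(two or more failures) = {risk:.4%}")
```

The absolute numbers depend entirely on the assumed probability, but the trend holds for any value between 0 and 1: the chance of overlapping failures rises as nodes are added.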

How do I change Nutanix replication factor from RF2 to RF3?

NOTE#1: Changing RF2 to RF3 changes the number of metadata replicas from 3 to 5. You can change RF3 back to RF2 at the storage container level, but the number of metadata replicas will remain 5.

NOTE#2: Make sure you have enough free storage space on the cluster.

Log in to a CVM over SSH and run the following command:

    ncli storage-container edit rf=3 name=<Storage_Container_Name>

Depending on the number of nodes in the cluster, the amount of data, and how busy the cluster is, the operation can take anywhere from 30 minutes to a few hours.

<ncli> storage-container edit rf=3 name=SelfServiceContainer 

    Id                        : 0005e404-1657-774a-7cca-3cecef82f0e1::1448
    Uuid                      : d8d77b5e-7693-4d65-9421-cfa3bad00986
    Name                      : SelfServiceContainer
    Storage Pool Id           : 0005e404-1657-774a-7cca-3cecef82f0e1::18
    Storage Pool Uuid         : e138ea63-24a9-4880-8de9-17aac8743711
    Free Space (Logical)      : 48.36 TiB (53,173,256,954,970 bytes)
    Used Space (Logical)      : 2.62 TiB (2,875,405,090,816 bytes)
    Allowed Max Capacity      : 50.98 TiB (56,048,662,045,786 bytes)
    Used by other Containers  : 13.96 GiB (14,985,650,176 bytes)
    Explicit Reservation      : 0 bytes
    Thick Provisioned         : 0 bytes
    Replication Factor        : 3
    Oplog Replication Factor  : 3
    NFS Whitelist Inherited   : true
    Container NFS Whitelist   : 
    VStore Name(s)            : SelfServiceContainer
    Random I/O Pri Order      : SSD-PCIe, SSD-SATA, DAS-SATA
    Sequential I/O Pri Order  : SSD-PCIe, SSD-SATA, DAS-SATA
    Compression               : on
    Compression Delay         : 0 mins
    Fingerprint On Write      : off
    On-Disk Dedup             : off
    Erasure Code              : off
    Software Encryption       : off
<ncli>  
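If you need to repeat the change for several containers, the command line can be assembled programmatically. The sketch below only builds and prints the command shown above and does not execute anything. The helper name `build_rf_edit_command` is hypothetical, and restricting the target value to 2 or 3 is an assumption that mirrors the change discussed in this article.

```python
import shlex
from typing import List

# Assumption: only the RF2 <-> RF3 change discussed in this article is handled.
ALLOWED_TARGET_RF = {2, 3}


def build_rf_edit_command(container_name: str, rf: int) -> List[str]:
    """Return the argument list for `ncli storage-container edit`."""
    if rf not in ALLOWED_TARGET_RF:
        raise ValueError(f"unsupported target replication factor: {rf}")
    if not container_name or any(ch.isspace() for ch in container_name):
        raise ValueError("container name must be non-empty and contain no whitespace")
    return ["ncli", "storage-container", "edit", f"rf={rf}", f"name={container_name}"]


command = build_rf_edit_command("SelfServiceContainer", 3)
print(shlex.join(command))
# prints: ncli storage-container edit rf=3 name=SelfServiceContainer
```

Returning a list instead of a single string keeps the arguments separate, so the result can be passed to `subprocess.run` without shell quoting concerns if you decide to execute it from a CVM.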

You can watch the progress in the Prism UI. When the process has completed, you can verify the change on the command line or in Prism Element. The two fields to check are Replication Factor and Oplog Replication Factor:

    Replication Factor        : 3
    Oplog Replication Factor  : 3
<ncli> storage-container list

    Id                        : 0005e404-1657-774a-7cca-3cecef82f0e1::1448
    Uuid                      : d8d77b5e-7693-4d65-9421-cfa3bad00986
    Name                      : SelfServiceContainer
    Storage Pool Id           : 0005e404-1657-774a-7cca-3cecef82f0e1::18
    Storage Pool Uuid         : e138ea63-24a9-4880-8de9-17aac8743711
    Free Space (Logical)      : 48.36 TiB (53,170,239,718,490 bytes)
    Used Space (Logical)      : 2.62 TiB (2,878,421,329,237 bytes)
    Allowed Max Capacity      : 50.98 TiB (56,048,661,047,728 bytes)
    Used by other Containers  : 13.96 GiB (14,986,648,234 bytes)
    Explicit Reservation      : 0 bytes
    Thick Provisioned         : 0 bytes
    Replication Factor        : 3
    Oplog Replication Factor  : 3
    NFS Whitelist Inherited   : true
    Container NFS Whitelist   : 
    VStore Name(s)            : SelfServiceContainer
    Random I/O Pri Order      : SSD-PCIe, SSD-SATA, DAS-SATA
    Sequential I/O Pri Order  : SSD-PCIe, SSD-SATA, DAS-SATA
    Compression               : on
    Compression Delay         : 0 mins
    Fingerprint On Write      : off
    On-Disk Dedup             : off
    Erasure Code              : off
    Software Encryption       : off
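The same check can be scripted by parsing the text that `storage-container list` prints. The following is a minimal sketch that assumes the "key : value" layout shown above, with each container record starting at an `Id` line. The function names are hypothetical, and `SAMPLE_OUTPUT` is a shortened copy of the output from this article.

```python
from typing import Dict, List

SAMPLE_OUTPUT = """
    Id                        : 0005e404-1657-774a-7cca-3cecef82f0e1::1448
    Name                      : SelfServiceContainer
    Replication Factor        : 3
    Oplog Replication Factor  : 3
    Erasure Code              : off
"""


def parse_containers(text: str) -> List[Dict[str, str]]:
    """Split ncli output into one dict per container (a new record starts at 'Id')."""
    containers: List[Dict[str, str]] = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if not sep:
            continue  # blank lines and prompt lines carry no "key : value" pair
        key, value = key.strip(), value.strip()
        if key == "Id" or not containers:
            containers.append({})
        containers[-1][key] = value
    return containers


def rf_is(containers: List[Dict[str, str]], name: str, expected: int) -> bool:
    """Check both replication factor fields of the named container."""
    for container in containers:
        if container.get("Name") == name:
            return (
                container.get("Replication Factor") == str(expected)
                and container.get("Oplog Replication Factor") == str(expected)
            )
    raise KeyError(f"container not found: {name}")


print(rf_is(parse_containers(SAMPLE_OUTPUT), "SelfServiceContainer", 3))  # prints: True
```

Splitting on the first colon only keeps values such as the container Id, which itself contains `::`, intact.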

Artur Krzywdzinski

Artur is a Consulting Architect at Nutanix. He has been using, designing, and deploying VMware-based solutions since 2005 and Microsoft-based solutions since 2012. He specializes in designing and implementing private and hybrid cloud solutions based on VMware and Microsoft software stacks, datacenter migrations and transformation, and disaster avoidance. Artur holds the VMware Certified Design Expert certification (VCDX #077).
