{{box cssClass="floatinginfobox" title="**Contents**"}}
{{toc /}}
{{/box}}

= (% class="mw-headline" id="High_availability_setup" %)High availability setup(%%) =

Flexisip achieves high availability by combining three things:

* Several Flexisip instances running on multiple host machines, configured to serve the same domains
* SRV records that spread the traffic among the Flexisip instances and let clients try an alternate route if the host behind one SRV record is down
* A Redis registrar set up in a master/slave configuration and monitored by sentinels

In the ideal, simplified setup, each host runs:

1. A //redis// instance with //requirepass// and //masterauth// set to the same password, and //slaveof// pointing to the current Redis master (the master itself has no //slaveof//);
1. A //flexisip// instance configured to connect to the current Redis instance;
1. A //redis-sentinel// instance configured to monitor the current Redis master.

[[image:Flexisip-ha_conf.png||height="343" width="660"]]


With this setup, each redis-sentinel connects to the master Redis database and starts monitoring its slaves (including the local instance). Each Redis slave starts replicating the master and waits for client connections. Each Flexisip instance connects to the Redis master and starts handling client interactions.

If a network error affects the master Redis database, the sentinels elect a new Redis master and reconfigure the other Redis instances to replicate from it.

Flexisip behaves as follows:

1. While the connection to the master is working, Flexisip periodically asks the master for its list of slaves.
1. If the connection to the master is lost, Flexisip tries all known slaves in turn and waits for a new master to be elected.
1. When a new master is elected, Flexisip drops the slave connection and connects to the new master.
1. At this point, the network is able to process registrations again.

The overall failover time depends on the sentinel configuration. We recommend a 10 s delay (//down-after-milliseconds// set to 10000, as in the sample sentinel configuration below).
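
The reconnection steps above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not Flexisip's actual code: the function name //pick_redis_target//, the //reachable// set and the //roles// mapping are hypothetical stand-ins for what a client would learn by probing each Redis node.

```python
from typing import Dict, List, Optional, Set, Tuple


def pick_redis_target(
    last_master: str,
    known_slaves: List[str],
    reachable: Set[str],
    roles: Dict[str, str],
) -> Tuple[Optional[str], bool]:
    """Return (address to connect to, whether registrations can be processed).

    Hypothetical model: `roles` maps an address to "master" or "slave" as
    reported by that node, and `reachable` holds the addresses that answer.
    """
    # The last known master is tried first, then every slave it advertised.
    candidates = [last_master, *known_slaves]
    alive = [addr for addr in candidates if addr in reachable]

    # Any reachable node that reports itself as master wins.
    for addr in alive:
        if roles.get(addr) == "master":
            return addr, True

    # Otherwise stay on a reachable slave and wait for the election.
    if alive:
        return alive[0], False

    # Nothing answers at all.
    return None, False
```

With host 1 as master and hosts 2 and 3 as slaves, the function returns host 1 while it is reachable, then a slave in wait mode once host 1 disappears, and finally the promoted node once one of the slaves reports the //master// role.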

= (% class="mw-headline" id="High_availability_setup_requirements" %)High availability setup requirements(%%) =

It is REQUIRED to install an NTP daemon on all machines running the Redis and Flexisip instances. Flexisip asks Redis to automatically remove expired registrations, and this mechanism relies on universal time. If any node of the cluster has the wrong time, the management of registrations breaks: clients then receive //404 Not Found// responses from Flexisip for destinations that were correctly registered.

On a Debian system, install the NTP daemon with:

{{{sudo apt-get install ntp
}}}

///etc/ntp.conf// may be customized to set the hostname of your preferred NTP server (for example, the one of your hosting provider).
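
To see why synchronized clocks matter, here is a deliberately simplified Python model of expiry based on absolute time. It assumes a registration is stored with an absolute expiration timestamp and that each node compares it against its own clock; the names and the storage model are illustrative assumptions, not Flexisip's actual data format.

```python
def is_expired(expires_at: float, node_clock: float) -> bool:
    """A registration is considered expired once the node's clock reaches its expiry."""
    return node_clock >= expires_at


# A client registers for one hour at (true) time t = 1_000_000 s.
registered_at = 1_000_000.0
expires_at = registered_at + 3600

# Ten minutes later, a node with a correct clock still sees the registration.
true_now = registered_at + 600
correct_node_sees_expired = is_expired(expires_at, true_now)

# A node whose clock runs two hours ahead already considers it expired,
# so it would remove or ignore a registration that is in fact still valid.
skewed_node_sees_expired = is_expired(expires_at, true_now + 7200)
```

In this model the skewed node drops a valid registration, which matches the symptom described above: a 404 for a destination that registered correctly.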

== (% class="mw-headline" id="Sample_configurations" %)Sample configurations(%%) ==

=== (% class="mw-headline" id="Redis_master" %)Redis master(%%) ===

In file ///etc/redis/redis.conf//:

(% style="background:#eeeeee;border:1px solid #cccccc;padding:5px 10px;" %)
(((
bind *
requirepass ComplicatedPassWord123456789
masterauth ComplicatedPassWord123456789
)))

=== (% class="mw-headline" id="All_other_Redis_instances" %)All other Redis instances(%%) ===

In file ///etc/redis/redis.conf//:

(% style="background:#eeeeee;border:1px solid #cccccc;padding:5px 10px;" %)
(((
bind *
requirepass ComplicatedPassWord123456789
masterauth ComplicatedPassWord123456789
slaveof <master ip> <master port>
)))


=== (% class="mw-headline" id="All_Redis_sentinels" %)All Redis sentinels(%%) ===

In file ///etc/redis/sentinel.conf//:

(% style="background:#eeeeee;border:1px solid #cccccc;padding:5px 10px;" %)
(((
# sentinel monitor <name> <ip master> <port> <quorum size>
sentinel monitor flexi1 10.0.0.1 6379 2
sentinel down-after-milliseconds flexi1 10000
sentinel failover-timeout flexi1 20000
sentinel auth-pass flexi1 ComplicatedPassWord123456789

# For Redis 3.2 and later
protected-mode no
)))


The **quorum size** is the number of sentinels that must agree that the master is down before the election of a new master is triggered. For a cluster of 3 nodes, a quorum of 2 is a good value. If the quorum is equal to the size of the cluster, the election process will never be initiated, because the sentinel running on the failed host cannot take part in the agreement.
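
The quorum rule can be checked with a small Python sketch. It assumes one sentinel per host, as in the setup described on this page, and also encodes the Redis Sentinel rule that a majority of all known sentinels must authorize the failover. The function name //failover_possible// is made up for this illustration.

```python
def failover_possible(total_sentinels: int, quorum: int, reachable_sentinels: int) -> bool:
    """Tell whether the remaining sentinels can trigger and authorize a failover."""
    # Enough sentinels must agree that the master is down.
    quorum_reached = reachable_sentinels >= quorum
    # A majority of all configured sentinels must authorize the failover.
    majority_reached = reachable_sentinels > total_sentinels // 2
    return quorum_reached and majority_reached


# 3 hosts, the master's host fails: 2 sentinels remain.
recommended = failover_possible(total_sentinels=3, quorum=2, reachable_sentinels=2)
# Same failure with a quorum equal to the cluster size.
quorum_too_high = failover_possible(total_sentinels=3, quorum=3, reachable_sentinels=2)
```

With a quorum of 2 the two surviving sentinels can elect a new master; with a quorum of 3 they never can, since the third sentinel went down together with the master.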

**Protected mode** must be disabled so that sentinels accept requests that do not come from the loopback interface, even when they are listening on all interfaces. Please note that by disabling //protected mode// you expose your sentinels to the public network, while sentinels are not able to authenticate each other. To address this security issue, configure the firewall to accept sentinel requests only from a whitelist of IP addresses.

Alternatively, if all your sentinels are on a safe subnetwork or VPN, you should leave protected mode enabled and make your sentinels listen on the interface attached to the private network.

=== (% class="mw-headline" id="All_flexisip_configurations" %)Node's flexisip configurations(%%) ===

In file ///etc/flexisip/flexisip.conf//, in the //[global]// section, //transports// must be defined for each host. For example, for //host1//:

{{code language="ini"}}
[global]
transports=sips:host1.example.org
{{/code}}


In the //[module::Registrar]// section:

(% style="background:#eeeeee;border:1px solid #cccccc;padding:5px 10px;" %)
(((
reg-domains=mydomain.com
db-implementation=redis
redis-server-domain=10.0.0.1
redis-server-port=6379
redis-auth-password=ComplicatedPassWord123456789
)))


In the //[cluster]// section:

(% style="background:#eeeeee;border:1px solid #cccccc;padding:5px 10px;" %)
(((
enabled=true

# List of IP addresses of all nodes present in the cluster
nodes=<IP host1> <IP host2> <IP host3>
)))


=== TLS certificates ===

If SIP/TLS (sips) is used, the **TLS server certificate MUST advertise both the hostname and the SRV domain name**. The x509 SubjectAltName extension, with DNS fields, should therefore be used to advertise both names. The rationale for this is:

* SIP clients are required to verify that the names match the host part of the SIP URI originally targeted, per [[RFC5922>>https://tools.ietf.org/html/rfc5922#section-7.3]] (Domain Certificates in the Session Initiation Protocol (SIP)).
* Flexisip nodes use their hostname in Record-Route headers, to ensure that requests belonging to the same dialog take the same path as the request that created the dialog. For this reason, clients may need to connect to SIP URIs pointing to the node hostnames.

For example, the TLS certificate used for node "host1.example.org" must have a SubjectAltName with two DNS fields, with values "example.org" (the SIP domain resolved by SRV) and "host1.example.org" (the node's hostname resolved by A/AAAA).
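
As a quick sanity check of the requirement above, the following Python sketch reports which of the two required names are absent from a certificate's SubjectAltName DNS entries. The helper name //missing_san_names// is hypothetical, the list of DNS entries is assumed to have been extracted from the certificate beforehand, and wildcard entries are deliberately not handled.

```python
from typing import Iterable, List


def _normalize(name: str) -> str:
    """DNS names are case-insensitive; also ignore a trailing root dot."""
    return name.lower().rstrip(".")


def missing_san_names(san_dns_names: Iterable[str], node_hostname: str, sip_domain: str) -> List[str]:
    """Return the required names that are not advertised by the certificate."""
    present = {_normalize(name) for name in san_dns_names}
    required = [sip_domain, node_hostname]
    return [name for name in required if _normalize(name) not in present]


# Certificate for host1 advertising both names: nothing is missing.
ok = missing_san_names(["example.org", "host1.example.org"], "host1.example.org", "example.org")

# Certificate advertising only the hostname: the SIP domain is missing.
incomplete = missing_san_names(["host1.example.org"], "host1.example.org", "example.org")
```

An empty result means the certificate covers both the SIP domain and the node hostname.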

=== Typical DNS SRV records configuration ===

Here is an example of an active/active configuration for sips with 2 nodes.

(% style="background:#eeeeee;border:1px solid #cccccc;padding:5px 10px;" %)
(((
_sips._tcp.example.org 3600 IN SRV 0 100 5061 host1.example.org.
_sips._tcp.example.org 3600 IN SRV 0 100 5061 host2.example.org.
)))
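
To illustrate how a client may use such records, here is a simplified Python take on the SRV selection described in RFC 2782: lower priority values are tried first, and targets sharing a priority are ordered by weighted random choice. It is a sketch and not the code of any particular SIP client; //SrvRecord// and //order_targets// are names invented for this example.

```python
import random
from typing import List, NamedTuple, Optional


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def order_targets(records: List[SrvRecord], rng: Optional[random.Random] = None) -> List[SrvRecord]:
    """Return the records in the order a client would try them."""
    rng = rng or random.Random()
    ordered: List[SrvRecord] = []
    # Lower priority value means "try first".
    for priority in sorted({r.priority for r in records}):
        group = [r for r in records if r.priority == priority]
        # Draw without replacement, proportionally to the weights.
        while group:
            weights = [r.weight for r in group]
            if sum(weights) == 0:
                pick = rng.choice(group)
            else:
                pick = rng.choices(group, weights=weights, k=1)[0]
            ordered.append(pick)
            group.remove(pick)
    return ordered


# The two records of the active/active example above.
records = [
    SrvRecord(0, 100, 5061, "host1.example.org."),
    SrvRecord(0, 100, 5061, "host2.example.org."),
]
attempt_order = order_targets(records)
```

Because both records share priority 0 and weight 100, each host is tried first about half of the time, and the other one remains available as the alternate route if the first is down.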

== (% class="mw-headline" id="Typical_scenario_in_case_of_failure_in_a_HA_configuration" %)Typical scenario in case of failure in a HA configuration(%%) ==

We have 3 hosts, and the current Redis master is on Host1.

* Host1 suffers a failure and becomes unreachable. The other Flexisip instances immediately detect the failure and connect to one of the remaining Redis slaves. They enter a wait mode, in which no new registration can be made, until a new master is elected.

[[image:Flexisip-ha_failure_step_1.png||height="394" width="831"]]

* After the delay configured in the sentinels (10 s is recommended), the sentinels start the election process to set a new Redis master. In this case, Host2 is elected as the new master. Host3's Redis is reconfigured by the sentinels to adopt the new master. The Flexisip instances on Host2 and Host3 notice the change and automatically migrate to the new Redis master database.

[[image:Flexisip-ha_failure_step_2.png||height="368" width="782"]]

* Once Host1 comes back online, the sentinels detect that it is alive again and reconfigure it as a slave. The Host1 Flexisip automatically migrates to the newly elected master Redis (Host2).

[[image:Flexisip-ha_failure_step_3.png||height="378" width="820"]]