Skip to content

2022-08-17 openfoodfacts.net unreachable#

What happens#

  • day before stagging (preprod) environment was down due to a bug introduced in the code
  • A PR #openfoodfacts-server:7214 was submitted and merged to fix it
  • github action went bogus, but was reruned later and passed
  • though openfoodfacts.net was unreachable

Why this happens#

  • the day before, Charles saw that wiki and some other services were down and had to reboot proxy1
  • openfoodfacts.net was still using the old way to redirect:
    • redirect to ovh1.openfoodfacts
    • ip filters would then forward to internal ip 101 (aka proxy VM)
  • this kind of forwarding did not work any more but I don't know why
  • Also this way of doing as the drawback that nginx can't see the correct incoming address, but see instead 10.0.0.1

What was done#

Resolving by pointing to proxy1 instead of ovh1#

I tested that using proxy1 address would work, using:

curl  --connect-to world.openfoodfacts.net:443:193.70.55.124:443 https://off:***@world.openfoodfacts.net

Go on OVH interface, web cloud, domain names, openfoodfacts.net, dns zones

Change CNAME and A previously pointing to ovh1 to point to proxy1

  • A rule for: openfoodfacts.net and *.openfoodfacts.net
  • CNAME rule for: events.openfoodfacts.net, productopener.openfoodfacts.net, robotoff.openfoodfacts.net

We had to wait for propagation...

On the A rule I changed DNS TTL to 360s as it is a test environment... (suggested by Benoit382 on slack)

Trying to understand why ovh1 forwarding is down#

Forwarding rule and Masquerading rule seems there:

sudo iptables -t nat -L --verbose --line-numbers

Chain PREROUTING (policy ACCEPT 3135 packets, 201K bytes)
num   pkts bytes target     prot opt in     out     source               destination         
1    3016K  180M DNAT       tcp  --  any    any     anywhere             ovh1.openfoodfacts.org  tcp dpt:https to:10.1.0.101:443
2     318K   18M DNAT       tcp  --  any    any     anywhere             ovh1.openfoodfacts.org  tcp dpt:http to:10.1.0.101:80
(...)
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
num   pkts bytes target     prot opt in     out     source               destination         
(...)
3    2203K  150M MASQUERADE  all  --  any    vmbr1   10.1.0.0/16          anywhere            
4      10M  697M MASQUERADE  all  --  any    any     anywhere             anywhere            
5        0     0 MASQUERADE  all  --  any    vmbr1   10.1.0.0/16          anywhere            
6        0     0 MASQUERADE  all  --  any    vmbr1   10.1.0.0/16          anywhere    

But the proxy nginx does not seems accessible to ovh1:

nc -vz 10.1.0.101 443 and nc -vz 10.1.0.101 80 are not responding

While on the proxy itself, it is:

# nc -vz 10.1.0.101 443
proxy [10.1.0.101] 443 (https) open

This is linked to previous problem resolution, see Reverse proxy down on 18 fév 2022, Final resolution.

Solution: on proxy add a rule for routing 10.1.0.xxx packets:

sudo ip route add 10.0.0.0/8 dev eth0

Now nc -vz 10.1.0.101 443 works from ovh1 and service is up again.