[Dev] [FYI] changes to [repo] HTTP

Luke Shumaker lukeshu at sbcglobal.net
Fri Jul 29 21:24:34 GMT 2016

I've made some changes to how repo handles HTTP requests.  I'm taking
bets on how long before someone complains about it.

TL;DR: repomirror is now prime-time; anyone who tries to hit
repo.parabola.nu will now be redirected to repomirror.parabola.nu by
default.  repomirror will then redirect to a randomly-selected
suitable mirror.

What I did, in order:

 - Notice that everything is slow and terrible.
 - Performance-tune nginx a bit (use epoll, etc.)
 - Observe that things are only slightly less terrible.
 - Run iotop to see if it gives any insight into why disk wait is so
   high.
 - Notice that unionfs is dominating the system IO.
 - Configure nginx to bypass unionfs, only letting PHP fall back to it
   for the indexes[^1].
 - Notice that nothing really changed.
 - Use `lsof` to try to figure out why.
 - See that some asshole is downloading multiple ISOs from repo
   directly (instead of a mirror)[^2], which means that nginx is
   keeping the files open on unionfs until they finish.
 - `sudo systemctl kill nginx; sudo systemctl restart nginx` to get
   nginx to let go of the unionfs handles.
 - Yay, things are way better!
 - But still slow.
 - htop says that now the CPU is being slammed by soft-IRQ, which is
   at least better than disk wait.
 - iotop says that nginx is now king of IO.
 - Use lsof to see why.
 - See that user(s) must have restarted their ISO downloads.
 - Become frustrated that users are using repo instead of a mirror.
 - Modify repomirror to append "?noredirect" to the URL when it sends
   you to repo directly.
 - Configure repo to redirect to repomirror unless it sees
   "noredirect" in the query string.
 - `sudo systemctl kill nginx; sudo systemctl restart nginx`
 - Yay, proton.parabola.nu seems happy for the first time in a long
   time.  I don't remember the last time I saw the load average <5
   except for immediately after boot.
 - Notice that, somewhat ironically, this puts unionfs back into a
   hot path because repomirror uses it.  But repomirror will only ever
   use stat/lstat/readlink/readdir on it, so it's fine.
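
The last point rests on the difference between metadata calls and
holding a file open for the length of a download.  A minimal Python
sketch of that difference (this is not repomirror's actual code; the
`describe` helper and the file names used with it are made up for
illustration):

```python
import os
import stat
import tempfile


def describe(path):
    """Inspect `path` using only lstat/stat/readlink/readdir-style calls.

    No regular file is opened for reading, so nothing stays open on the
    underlying filesystem beyond the duration of each call.
    """
    info = {"path": path}
    lst = os.lstat(path)                  # lstat: does not follow symlinks
    if stat.S_ISLNK(lst.st_mode):
        info["link_target"] = os.readlink(path)    # readlink
    st = os.stat(path)                    # stat: follows symlinks
    info["size"] = st.st_size
    info["mtime"] = st.st_mtime
    if stat.S_ISDIR(st.st_mode):
        info["entries"] = sorted(os.listdir(path))  # readdir
    return info
```

Each call returns quickly, which is presumably why this kind of access
is cheap compared to nginx streaming a whole ISO through unionfs.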

So, to recap:

 - nginx: tune performance settings
 - repo.parabola.nu: bypass unionfs if possible
 - repo.parabola.nu: redirect to repomirror unless "noredirect" is in
   the query string
 - repomirror.parabola.nu: append "?noredirect" to the URL if
   redirecting to repo.parabola.nu
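
The two redirect rules in the recap can be sketched as follows.  This
is only an illustration in Python, not the actual nginx or repomirror
configuration.  The function names, the `https://` scheme, the 302
status, the example paths, and the rule that repomirror falls back to
repo when it has no suitable mirror are all assumptions.

```python
import random
from urllib.parse import urlsplit

REPO = "https://repo.parabola.nu"              # scheme is an assumption
REPOMIRROR = "https://repomirror.parabola.nu"  # scheme is an assumption


def repo_handle(url):
    """repo: redirect to repomirror unless "noredirect" is in the query.

    Returns (status, location); location is None when repo serves the file.
    """
    parts = urlsplit(url)
    if "noredirect" in parts.query:
        return 200, None
    return 302, REPOMIRROR + parts.path


def repomirror_handle(url, mirrors, rng=random):
    """repomirror: pick a random mirror, or send the client back to repo.

    `mirrors` is a hypothetical list of base URLs already judged suitable.
    """
    parts = urlsplit(url)
    if mirrors:
        return 302, rng.choice(mirrors).rstrip("/") + parts.path
    # Mark the URL so that repo does not bounce the client straight back.
    return 302, REPO + parts.path + "?noredirect"


def follow(url, mirrors, max_hops=5):
    """Follow redirects until some host serves the file."""
    for _ in range(max_hops):
        host = urlsplit(url).hostname
        if host == "repo.parabola.nu":
            status, location = repo_handle(url)
        elif host == "repomirror.parabola.nu":
            status, location = repomirror_handle(url, mirrors)
        else:
            return url  # an actual mirror; assume it serves the file
        if status == 200:
            return url
        url = location
    raise RuntimeError("redirect loop")
```

Without the "?noredirect" marker, the no-mirror case would bounce
between repo and repomirror until the client gave up; with it, the
second visit to repo is served directly.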

[^1]: I'd tried to do this in the past, but couldn't figure out how.
      I was dumb back then.

[^2]: Because we scrub client IPs, it's possible that it was actually
      several users.  They all just show up as[^3].

[^3]: https://www.xkcd.com/742/

Happy hacking,
~ Luke Shumaker
