[Dev] repo redirector modifications

bill-auger bill-auger at peers.community
Tue Jun 11 17:41:03 GMT 2019


there has been discussion on the IRC recently about improving the
redirector behavior in various ways for various reasons; and i
wanted to get them documented on the mailing list and open for
discussion

the first concern was regarding which mirrors to use - the
parabola mirror pool is healthier than ever now and it would
probably improve the user-experience to stop redirecting to
any arch mirrors and always redirect to a parabola mirror - the
arch mirrors are frequently out-of-sync and return 404 - also, in
my experience most of the arch mirrors that we use are slower
than the parabola mirrors

the second concern stems from that discussion - it was suggested
that we could ping each mirror with a HEAD request to
determine whether a specific package exists before actually
redirecting the user to it - as i remember, lukeshu explained
that this problem only existed for the arch mirrors due to the
crude and blind way that they are selected; but that the
redirector for parabola mirrors was introspective and was
unlikely to ever result in a 404
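
as a rough illustration of the HEAD-request idea (not how the
redirector actually works), here is a minimal python sketch - the
names 'head_ok' and 'pick_mirror' are hypothetical, as is the
assumption that a mirror answers HTTP 200 for a package it has

```python
import urllib.error
import urllib.request


def head_ok(url, timeout=5):
    """Return True if a HEAD request to url answers with HTTP 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # 404, DNS failure, timeout, refused connection: all count as "not there"
        return False


def pick_mirror(mirrors, package_path, probe=head_ok):
    """Return the URL on the first mirror that has package_path, else None.

    'probe' is injectable so the selection logic can be exercised
    without touching the network.
    """
    for mirror in mirrors:
        url = mirror.rstrip("/") + "/" + package_path.lstrip("/")
        if probe(url):
            return url
    return None
```

the obvious trade-off is one extra round-trip per candidate mirror
before the user gets a redirect; so the mirror list would want to
be short, or the results cached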

one new concern i would like to raise is regarding packages that
are "not built from source" by the parabola standard - that is,
specifically those that pull dependencies at build-time that are
not itemized in the sources() array - this results in
source-incomplete packages

even if disregarding the binary build deps, anything that is not
itemized in the sources() array is something that may become
unobtainable in the future (or in the case of pulling the the
daily HEAD of a VCS repo, non-trivial to resolve which repo
state corresponds to the original source, even if the source
are obtainable) - that is aside from the concern that it
requires the network to be active in build chroots

the proper solution, of course, is to package each of the sources
separately, which adds a significant amount of work to the
overall maintenance of the distro - that amount is roughly
gauged by a value/cost ratio, where:

  value = n_users + importance
  cost  = auditing + packaging + maintenance

the costs are not simple to guess, but do become evident by doing
the work - auditing, although a relatively high cost, can
probably be factored out; because that is entailed by each
freedom bug report - value however is somewhat subjective,
especially because we make no attempt to determine 'n_users',
even proportionately - 'importance' is something that could be
determined through some research and discussion; but it could be
helpful for determining the fate of packages like this, to
start ranking packages by relative popularity (normalizing
against something in 'base', such as 'pacman')
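
to make the arithmetic concrete, a small python sketch follows -
the function names are hypothetical, and it assumes that every
term has already been put on some common numeric scale, which is
exactly the part that still needs discussion

```python
def relative_popularity(downloads, baseline="pacman"):
    """Scale raw download counts so that the baseline package scores 1.0."""
    base = downloads[baseline]
    if base <= 0:
        raise ValueError("baseline package has no recorded downloads")
    return {name: count / base for name, count in downloads.items()}


def value_cost_ratio(n_users, importance, auditing, packaging, maintenance):
    """Return value / cost, using the two sums defined above."""
    value = n_users + importance
    cost = auditing + packaging + maintenance
    return value / cost
```

normalizing against 'pacman' means a package that nearly every
user installs lands near 1.0, regardless of how many users there
are in total; so 'n_users' could be fed as that proportion rather
than as an absolute count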

we don't want to blacklist anything hastily; but we also don't
want to expend precious resources rescuing something that no one
actually wants to use - most recently, certain packages such as
'dotnet' and 'lsd' have made this decision difficult

the 'awesome-terminal-fonts' package identified in the
artistic-freedom bug report #2331, on its own, could probably be
blacklisted as having a very low value/cost ratio; but that
should also factor in the value/cost ratio of its dependent
'lsd' package

https://labs.parabola.nu/issues/2331

  community/lsd 0.15.1-1
    Modern ls with a lot of pretty colors and awesome icons

does any parabola user want such a thing, for example? - just
because the devs may not value it, would not imply that users
would not; so that is not a good gauge

this third concern is related to the earlier one of
redirecting to out-of-sync mirrors, because that results in
multiple requests for the same package; often enough to skew the
popularity ranking; so collecting fairly accurate numbers would
require the redirector to distinguish when a package was
actually downloaded and not merely that it was requested
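
one crude approximation, short of actually confirming the
download, would be to count each client/package pair only once -
the python sketch below assumes a hypothetical redirector log that
yields (client, package file) tuples; nothing of the sort is known
to exist today

```python
from collections import Counter


def count_unique_requests(log_entries):
    """Count each (client, package file) pair once, however often it repeats."""
    seen = set()
    counts = Counter()
    for client, package_file in log_entries:
        if (client, package_file) not in seen:
            seen.add((client, package_file))
            counts[package_file] += 1
    return counts
```

this would keep retries after a 404 from inflating the numbers;
but it still can not tell whether any of those requests ended in
a completed download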


