- Wed 26 February 2020
- server admin
- Gaige B. Paulsen
- #nginx, #pelican
I've been doing a bunch of maintenance on my two blogs (company and personal), partly to track down malformed and mis-mapped URLs on both sites. Since both have been through changes in the underlying blog engine a couple of times, there are multiple sets of URLs that point to the same content. Generally, since there was a long period of time between each of the respins, the search engines picked up the changes in URLs, but occasionally I still see log entries for the two-engines-ago format, and I'd like to fix those as well, especially since the same content is still on the site in most cases.
The most recent versions (the Cartographica blog was on SquareSpace and Gaige's Pages
was in Drupal) are already mapped, and the mappings were simple thanks to good URI choices.
However, both of these blogs were originally in Geeklog, which used a URL format based on
the article.php file and a query string.
As mentioned in my nginx_alias_maps post last year (this week, it turns out), I had written a bit of code to produce nginx maps to handle redirections.
However, as written, the map that I was generating used the $uri variable in nginx,
which cooks the URI by removing things like the query string. Obviously, this won't
work for the old Geeklog query string-based redirection, so I needed to move to using the
full $request_uri. That was fine, but came with drawbacks as well, since $request_uri is
the raw request URI with none of that cleanup. As I mentioned, the $uri variable is cooked
by nginx, removing relative directory traversals, double-slashes, query strings, etc. For
most of my URIs, this is a much better fit. As such, I decided to complicate the plugin a
bit and add support specifically for URIs that contain a ? as an indicator of a query
string, and to process them in a second-stage map. It's a little more time consuming,
although it's not noticeable on my blogs.
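As a rough illustration of that split (a sketch only, not the plugin's actual code; the names parse_alias_header and split_aliases are made up for this example), the aliases from an article header could be partitioned on the presence of a ? like this:

```python
def parse_alias_header(value):
    """Split a comma-separated Alias header value into individual paths."""
    return [part.strip() for part in value.split(",") if part.strip()]


def split_aliases(aliases):
    """Partition aliases into (plain, with_query).

    plain      -> no query string, safe to match against nginx's $uri
    with_query -> contains '?', must be matched against $request_uri
    """
    plain = [a for a in aliases if "?" not in a]
    with_query = [a for a in aliases if "?" in a]
    return plain, with_query


aliases = parse_alias_header("/node/4921,/article.php?story=2003042913413622")
plain, with_query = split_aliases(aliases)
```

Here plain ends up holding /node/4921 and with_query holds the old Geeklog article.php URL, so each can be sent to the map that is able to match it.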
The solution was to run the $uri map for any URIs not containing query strings and then
run the $request_uri map for any URIs that did contain them. So, if you had an alias
entry such as the one for Load up those album covers (header shown here):
Date: 2003-04-29 11:41
Alias: /node/4921,/article.php?story=2003042913413622
Tags:
Category: macintosh
Title: Load up those album covers
the code will generate entries in two maps:
map $uri $redirect_uri_1 {
    ~^/node/4921$ https://$server_name/load-up-those-album-covers.html;
}
map $request_uri $redirect_uri {
    default $redirect_uri_1;
    ~^/article\.php\?story=2003042913413622$ https://$server_name/load-up-those-album-covers.html;
}
Note here that the first map maps to $redirect_uri_1 and the second one maps to
$redirect_uri, with a default value of $redirect_uri_1. Because of the way that
nginx evaluates maps, you can't use $redirect_uri in both cases.
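Since the map keys are regular expressions, characters like . and ? in an alias have to be escaped before they go into the map. A minimal sketch of how one entry might be built follows; escape_for_map and map_entry are hypothetical helper names for illustration, not the plugin's real functions:

```python
# Characters that carry special meaning in a regular expression and so
# need a backslash in front of them when they should match literally.
REGEX_SPECIALS = set("\\.?*+()[]{}|^$")


def escape_for_map(path):
    """Backslash-escape regex metacharacters in a literal URL path."""
    return "".join("\\" + ch if ch in REGEX_SPECIALS else ch for ch in path)


def map_entry(alias, target):
    """Build one anchored, case-sensitive regex line for an nginx map block."""
    return "~^{}$ {};".format(escape_for_map(alias), target)


entry = map_entry(
    "/article.php?story=2003042913413622",
    "https://$server_name/load-up-those-album-covers.html",
)
print(entry)
```

The hand-rolled escape set is used here instead of a library routine so the output stays the same across Python versions.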
As with previous versions, you need to include the map in the http stanza of your nginx
configuration, and you also need to check the value of $redirect_uri and send it back
as a redirect if present:
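# In the http stanza: pull in the generated maps
include /opt/web/output/alias_map.txt;
server {
    listen *:443 ssl;
    server_name example.server;
    # Redirection logic: $redirect_uri is empty unless a map entry matched
    if ( $redirect_uri ) {
        return 301 $redirect_uri;
    }
    location / {
        # root rather than alias: alias here would drop the slash between
        # the directory and the requested file name
        root /opt/web/output;
    }
}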
Of course, if you only have one or the other type of redirection, the code will make sure to only create a single-stage map.
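To make the single-stage fallback concrete, here is a small sketch of how the choice between one and two maps could be made. It is written under my own assumptions (render_maps is a hypothetical name and takes lists of ready-made map lines), so treat it as an illustration rather than the plugin's implementation:

```python
def render_maps(plain_entries, query_entries):
    """Render nginx map blocks from two lists of ready-made entry lines.

    Both lists non-empty -> two-stage map chained via $redirect_uri_1.
    Only one non-empty   -> a single map writing straight to $redirect_uri.
    """
    blocks = []
    if plain_entries and query_entries:
        blocks.append(("$uri", "$redirect_uri_1", list(plain_entries)))
        blocks.append(
            ("$request_uri", "$redirect_uri",
             ["default $redirect_uri_1;"] + list(query_entries))
        )
    elif plain_entries:
        blocks.append(("$uri", "$redirect_uri", list(plain_entries)))
    elif query_entries:
        blocks.append(("$request_uri", "$redirect_uri", list(query_entries)))

    lines = []
    for source, target, entries in blocks:
        lines.append("map {} {} {{".format(source, target))
        lines.extend("    " + entry for entry in entries)
        lines.append("}")
    return "\n".join(lines)


single = render_maps(["~^/node/4921$ https://$server_name/a.html;"], [])
double = render_maps(
    ["~^/node/4921$ https://$server_name/a.html;"],
    ["~^/article\\.php\\?story=1$ https://$server_name/a.html;"],
)
```

In the single-map case the server block does not change, since it only ever looks at $redirect_uri.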
Updated code is now available as nginx_alias_map on GitHub.