Shaarli on NixOS

#PHP #Webapplications in #NixOS are a bit special, as they commonly violate the split between configuration, data and application. Sometimes everything lives in the same directory, but more commonly the data sits in a subdirectory of the application. Packaging the sources can be easy or complicated, depending on whether there is a build process. For Shaarli I just use their full.tar.gz and don't have to worry about that.

The package expression is very basic:

{ lib, stdenv, fetchurl, config ? null, dataDir ? "/var/lib/shaarli" }:

stdenv.mkDerivation rec {
  name = "shaarli-${version}";
  version = "0.11.1";
  preferLocalBuild = true;

  src = fetchurl {
    url = "${version}-full.tar.gz";
    sha256 = "1psijcmi24hk0gxh1zdsm299xj11i7find2045nnx3r96cgnwjpn";
  };

  phases = [ "installPhase" ];
  installPhase = ''
    mkdir $out
    tar xzf $src
    cp -ra Shaarli/. $out/
    find $out -type d -exec chmod 0755 {} \;
    find $out -type f -exec chmod 0644 {} \;
    for a in cache data pagecache tmp; do
      mv $out/$a $out/$a.orig
      ln -s "${dataDir}/$a" $out/$a
    done
  '';

  meta = with stdenv.lib; {
    description = "";
    # License is complicated...
    #license = licenses.agpl3;
    homepage = "";
    platforms = platforms.all;
    maintainers = with stdenv.lib.maintainers; [ tokudan ];
  };
}

What's uncommon is that I have two optional arguments: config and dataDir. config is not used in my Shaarli derivation and is just part of the boilerplate I use for PHP apps. I use it to feed in a config.php where that makes sense for the PHP app; my roundcube config uses it, for example. dataDir, on the other hand, is used in the installPhase. I move some directories away to $a.orig so the install service can set up the data directory if it doesn't exist yet. It's not perfect, but works for now. The directories are then replaced with symlinks into /var/lib/shaarli, or whatever was specified in dataDir. This derivation gives me a package that is specific to one instance of Shaarli. If I run a second instance, I need to specify a different dataDir, which leads to another build of the derivation.
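As a rough sketch of what a second instance would look like: the file name ./pkg-shaarli.nix is the one used in my system configuration, while the attribute names and the second data directory are made up for illustration. Each call produces its own store path, because dataDir ends up in the installPhase.

```nix
# Sketch only: shaarliMain, shaarliSecond and /var/lib/shaarli-second
# are hypothetical names, not part of my actual configuration.
{ pkgs ? import <nixpkgs> { } }:

{
  # Uses the default dataDir = "/var/lib/shaarli".
  shaarliMain = pkgs.callPackage ./pkg-shaarli.nix { };

  # A different dataDir changes the installPhase, so this is a separate build.
  shaarliSecond = pkgs.callPackage ./pkg-shaarli.nix {
    dataDir = "/var/lib/shaarli-second";
  };
}
```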

The second part of the equation is the system configuration. How do I include the above derivation in my system? I use nginx and PHP-FPM pools with a dedicated user for each PHP app. Here is the part of my system configuration that uses the package:

{ config, lib, pkgs, ... }:

let
  phppoolName = "shaarli_pool";
  dataDir = "/var/lib/shaarli";
  vhost = "";

  shaarli = pkgs.callPackage ./pkg-shaarli.nix {
    inherit dataDir;
  };
in
{
  services.nginx.virtualHosts."${vhost}" = {
    forceSSL = true;
    enableACME = true;
    root = "${shaarli}";
    extraConfig = ''
      index index.php;
      etag off;
      add_header etag "\"${builtins.substring 11 32 shaarli}\"";
    '';
    locations."/robots.txt" = {
      extraConfig = ''
        add_header Content-Type text/plain;
        return 200 "User-agent: *\nDisallow: /\n";
      '';
    };
    locations."/" = {
      extraConfig = ''
        try_files $uri $uri/ index.php;
      '';
    };
    locations."~ (index)\.php$" = {
      extraConfig = ''
        fastcgi_split_path_info ^(.+\.php)(/.*)$;
        if (!-f $document_root$fastcgi_script_name) {
          return 404;
        }

        fastcgi_pass unix:${config.services.phpfpm.pools."${vhost}".socket};
        fastcgi_index index.php;

        fastcgi_param   QUERY_STRING            $query_string;
        fastcgi_param   REQUEST_METHOD          $request_method;
        fastcgi_param   CONTENT_TYPE            $content_type;
        fastcgi_param   CONTENT_LENGTH          $content_length;

        fastcgi_param   SCRIPT_FILENAME         $document_root$fastcgi_script_name;
        fastcgi_param   SCRIPT_NAME             $fastcgi_script_name;
        fastcgi_param   PATH_INFO               $fastcgi_path_info;
        fastcgi_param   PATH_TRANSLATED         $document_root$fastcgi_path_info;
        fastcgi_param   REQUEST_URI             $request_uri;
        fastcgi_param   DOCUMENT_URI            $document_uri;
        fastcgi_param   DOCUMENT_ROOT           $document_root;
        fastcgi_param   SERVER_PROTOCOL         $server_protocol;

        fastcgi_param   GATEWAY_INTERFACE       CGI/1.1;
        fastcgi_param   SERVER_SOFTWARE         nginx/$nginx_version;

        fastcgi_param   REMOTE_ADDR             $remote_addr;
        fastcgi_param   REMOTE_PORT             $remote_port;
        fastcgi_param   SERVER_ADDR             $server_addr;
        fastcgi_param   SERVER_PORT             $server_port;
        fastcgi_param   SERVER_NAME             $server_name;

        fastcgi_param   HTTPS                   $https;
        fastcgi_param   HTTP_PROXY              "";
      '';
    };
    locations."~ \.php$" = {
      extraConfig = ''
        deny all;
      '';
    };
  };
  services.phpfpm.pools."${vhost}" = {
    user = "shaarli";
    group = "shaarli";
    settings = {
      "listen.owner" = "nginx";
      "" = "nginx";
      "user" = "shaarli";
      "pm" = "dynamic";
      "pm.max_children" = "75";
      "pm.min_spare_servers" = "5";
      "pm.max_spare_servers" = "20";
      "pm.max_requests" = "10";
      "catch_workers_output" = "1";
    };
  };
  users.extraUsers.shaarli = { group = "shaarli"; };
  users.extraGroups.shaarli = { }; = {
    serviceConfig.Type = "oneshot";
    wantedBy = [ "" ];
    script = ''
      if [ ! -d "${dataDir}" ]; then
        mkdir -p ${dataDir}/{cache,data,pagecache,tmp}
        cp -R ${shaarli}/data.orig/.htaccess ${dataDir}/cache/
        cp -R ${shaarli}/data.orig/.htaccess ${dataDir}/data/
        cp -R ${shaarli}/data.orig/.htaccess ${dataDir}/pagecache/
        cp -R ${shaarli}/data.orig/.htaccess ${dataDir}/tmp/
      fi
      chown -Rc shaarli:shaarli ${dataDir}
      find ${dataDir} -type d ! -perm 0700 -exec chmod 0700 {} \; -exec chmod g-s {} \;
      find ${dataDir} -type f ! -perm 0600 -exec chmod 0600 {} \;
    '';
  };
}

The let block just defines some variables to be used by the expression, but there are a couple of important options I use below that: The nginx extraConfig contains

      etag off;
      add_header etag "\"${builtins.substring 11 32 shaarli}\"";

This is both nice and bad at the same time. It leaks some information to the outside world by publishing the hash part of my Shaarli derivation's store path. On the other hand, it ensures that browsers refresh their caches as needed when I switch to another derivation: they use that hash to check whether the file on the server has changed instead of relying on the file modification time, which is always the Unix epoch in the Nix store.
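To make the substring arithmetic concrete, here is a small sketch with a made-up store path. The prefix "/nix/store/" is 11 characters long and the hash that follows is 32 characters, so offset 11 and length 32 select exactly the hash. The path below is hypothetical and only there to show which characters are picked.

```nix
# Sketch only: examplePath is a hypothetical store path, not a real build output.
let
  examplePath = "/nix/store/0123456789abcdfghijklmnpqrsvwxyz-shaarli-0.11.1";
in
# Skip the 11 characters of "/nix/store/" and take the next 32.
builtins.substring 11 32 examplePath
```

Evaluating this file with nix-instantiate --eval prints "0123456789abcdfghijklmnpqrsvwxyz", which is the value that would end up in the etag header.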

At the bottom you can see the service that sets up the data directory when the configuration is activated. Note that in its current implementation it cannot detect whether the Shaarli version changed and run any update scripts, but that's hopefully not necessary for Shaarli.

This type of packaging seems to work for most PHP web apps. It's certainly not perfect and has a lot of redundancy, but for me it gets the job done.