Random texts

Anything that I feel I have to write down and that I'm not embarrassed enough to hide. RSS and ActivityPub (@tokudan@blog.tokudan.de).

Microsoft has some strange ideas about filesystem layouts in general, and drive letters are no exception. In particular, Windows is surprisingly loose about what counts as a valid drive letter. The GUI (Explorer) hides these tricks, meaning they can be used to hide data, but the drives remain accessible. They have worked at least since Windows NT 4, still work in Windows 10, and will probably continue to work for quite a long time.

Drive letter shenanigans

C:\>subst !: %WINDIR%

C:\>!:\system32\calc.exe

C:\>subst !: /d

Several other characters besides ! are valid as well. If you want a drive letter named ^, you need to escape it, as ^ is the escape character of the command prompt: subst ^^: %WINDIR%.
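
The same transcript as above with ^ as the drive letter; note that the escaping should apply every time you reference the drive, not just when creating it:

C:\>subst ^^: %WINDIR%

C:\>^^:\system32\calc.exe

C:\>subst ^^: /d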

NTFS data streams

This one is actually a feature of the NTFS filesystem: multiple data streams per file, usable for metadata. They are incompatible with other filesystems, though, so their practical use is very limited. Explorer also prevents you from using them, so you need to work around its limitations to give them a try. Here are some example usages:

C:\Temp\test>echo hello > world.txt

C:\Temp\test>echo hello world > world.txt:hidden

C:\Temp\test>type world.txt
hello

C:\Temp\test>type world.txt:hidden
The filename, directory name, or volume label syntax is incorrect.

C:\Temp\test>dir
 Volume in drive C is System
 Volume Serial Number is AB12-C3D4

 Directory of C:\Temp\test

22.06.2022  12:53    <DIR>          .
22.06.2022  12:53    <DIR>          ..
22.06.2022  12:53                 8 world.txt
               1 File(s)              8 bytes
               2 Dir(s)  390,119,133,184 bytes free

C:\Temp\test>notepad world.txt

C:\Temp\test>notepad world.txt:hidden
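
type cannot open a stream directly, but plain redirection can, and since Windows Vista dir /R lists the streams attached to each file (the hidden stream shows up as world.txt:hidden:$DATA):

C:\Temp\test>more < world.txt:hidden
hello world

C:\Temp\test>dir /R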

Snowflake is a tool that helps users circumvent censorship when Tor is blocked.

Running a snowflake proxy on NixOS is simple. It is already packaged and just needs to be set up as a service.

I decided to wrap it in a container as well – because I can and NixOS makes it incredibly simple.

Here's a section that you can just paste into your configuration.nix:

  containers.snowflake = {
    autoStart = true;
    ephemeral = true;
    config = {
      systemd.services.snowflake = {
        wantedBy = [ "multi-user.target" ];
        serviceConfig = {
          IPAccounting = "yes";
          ExecStart = "${pkgs.snowflake}/bin/proxy";
          DynamicUser = "yes";
          # Read-only filesystem
          ProtectSystem = "strict";
          PrivateDevices = "yes";
          ProtectKernelTunables = "yes";
          ProtectControlGroups = "yes";
          ProtectHome = "yes";
          # Deny access to as many things as possible
          NoNewPrivileges = "yes";
          PrivateUsers = "yes";
          LockPersonality = "yes";
          MemoryDenyWriteExecute = "yes";
          ProtectClock = "yes";
          ProtectHostname = "yes";
          ProtectKernelLogs = "yes";
          ProtectKernelModules = "yes";
          RestrictAddressFamilies = "AF_INET AF_INET6";
          RestrictNamespaces = "yes";
          RestrictRealtime = "yes";
          RestrictSUIDSGID = "yes";
          SystemCallArchitectures = "native";
          SystemCallFilter = "~@chown @clock @cpu-emulation @debug @module @mount @obsolete @raw-io @reboot @setuid @swap @privileged @resources";
          CapabilityBoundingSet = "";
          ProtectProc = "invisible";
          ProcSubset = "pid";
        };
      };
    };
  };

You can get the snowflake logs with this command: machinectl shell snowflake $(which journalctl) -fu snowflake

Keep in mind that running the snowflake proxy causes some traffic, so this may be unsuitable for some metered connections. I had it running for nearly a day and saw roughly 5 GB of data logged by systemd in systemctl status container@snowflake.service, but your mileage may vary.
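
Since the unit sets IPAccounting=yes, you can also read the per-service counters directly, mirroring the journalctl trick above (a sketch, untested):

machinectl shell snowflake $(which systemctl) show snowflake -p IPIngressBytes -p IPEgressBytes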

If you don't want the container overhead, you can just drop the contents of containers.snowflake.config into your configuration.nix, but I prefer the extra layer of isolation.
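
In that case the relevant part is just the service definition itself, for example:

  systemd.services.snowflake = {
    wantedBy = [ "multi-user.target" ];
    serviceConfig = {
      ExecStart = "${pkgs.snowflake}/bin/proxy";
      DynamicUser = "yes";
      # ...plus the same hardening options as in the container above
    };
  };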

Some snippets of code that I've used or had to use. This post will probably get updated when I find something new.

- name: Finish the installation of broken packages
  changed_when: false
  environment:
    DEBIAN_FRONTEND: noninteractive
  command:
    cmd: dpkg --force-confdef --force-confold --configure -a

#Pipewire is a nice replacement for #Pulseaudio, even if it still lacks some features and tooling. I switched to Pipewire a couple of weeks ago and it solved many small papercuts that Pulseaudio had for me, like USB audio staying broken after the device had been disconnected once.

Sometimes I want audio from a single source (e.g. a stream played by firefox) to go to multiple sinks: two headphones for example or speakers and headphones.

While probably not ideal, the following works very well for me:

1. Use pw-top to get the device names.
2. Run pw-loopback --capture alsa_output.pci-0000_00_1f.3.analog-stereo --playback alsa_output.usb-headset-00.analog-stereo

This duplicates the output that is being sent to my regular speakers and mirrors it to my usb headset, so whenever I leave the room, I can just put on my headset and can continue to hear whatever is being played.
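
To make the mirror permanent, one option is a systemd user unit wrapping the same command. A sketch, using my device names and assuming pw-loopback resolves from the unit's search path (adjust both to your setup):

# ~/.config/systemd/user/pw-mirror.service
[Unit]
Description=Mirror speaker output to the USB headset
After=pipewire.service

[Service]
ExecStart=pw-loopback --capture alsa_output.pci-0000_00_1f.3.analog-stereo --playback alsa_output.usb-headset-00.analog-stereo
Restart=on-failure

[Install]
WantedBy=default.target

Enable it with systemctl --user enable --now pw-mirror.service.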

This sets and exports all variables defined in a .env file, so the same file can also be used by systemd (via EnvironmentFile=) to set up environment variables.

# Read the variables from .env, then export them.
# Sourcing sets the variables but does not export them, so re-export
# every name that appears left of an '=' in the file.
source "$HOME/.env"
while read -r var; do
        export "${var?}"
done < <(sed -e 's_=.*$__' "$HOME/.env")

Update: Apparently set -a / set +a before and after the sourcing makes it a lot easier. Thanks @bekopharm@social.tchncs.de for this hint.
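
With set -a, every variable assigned while sourcing is exported automatically, so the whole thing collapses to:

# auto-export everything assigned while sourcing .env
set -a
source "$HOME/.env"
set +a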

#bash #systemd

A selection of #ffmpeg command lines that I either found online and adapted, or built myself.

rotate iPhone videos according to their metadata; requires re-encoding the video

ffmpeg -i "$v" -movflags use_metadata_tags -c copy -c:v h264 -profile:v high -b:v 16000k "${v%.*}.new.mov"

convert an animation video from 10-bit to 8-bit encoding, so that a Raspberry Pi can play it

ffmpeg -i abc.mkv -map 0 -c copy -c:v libx264 -profile:v baseline -tune animation -crf 18 'abc-recode.mkv'

cut and crop

ffmpeg -i stream.mpg -ss 3138 -t 4.0 -y -map v:0 -map a:0 -filter:v 'crop=245:203:924:305' -c:a ac3 -b:a 151k -c:v h264 -b:v 3500k -r 30 boso-3rd-assistant.mkv

cut file into multiple segments according to time (1h)

ffmpeg -i input.mp4 -c copy -map 0 -segment_time 01:00:00 -f segment -reset_timestamps 1 output-%03d.mp4

encode *.wav to *.flac, copying the metadata from the *.mp3 file of the same name

for X in *.wav ; do ffmpeg -i "${X%.wav}.mp3" -i "$X" -map_metadata 0 -map 1 -c:a flac "${X%.wav}.flac" ; done

Ansible has issues with “run_once” and “when” in the same task. If the “when” only evaluates to true for some hosts, it's basically undefined whether the task will run or be skipped, depending on whichever host happens to be evaluated first. If that first host is skipped, the task won't run even once, even if all the other hosts would have evaluated to true.

Example:

- name: Create temporary directory to download agent on ansible host
  when: package.pkgversion != agent_installed_version
  register: tempdir_ansiblehost
  run_once: true
  delegate_to: localhost
  check_mode: no
  tempfile:
    state: directory
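
One workaround, shown below as an untested sketch of the same task: evaluate the condition across all hosts of the play, so the result is identical no matter which host the task happens to land on:

- name: Create temporary directory to download agent on ansible host
  # true if any host in the play still needs the new agent version
  when: ansible_play_hosts
        | map('extract', hostvars, ['package', 'pkgversion'])
        | select('ne', agent_installed_version)
        | list | length > 0
  register: tempdir_ansiblehost
  run_once: true
  delegate_to: localhost
  check_mode: no
  tempfile:
    state: directory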

#ansible

with_items: "{{ old_mounts | sort | reverse }}"

results in...

item=<list_reverseiterator object at 0x7fdc252dad60>

The “fix” is to use the list filter, which materializes the iterator into an actual list:

with_items: "{{ old_mounts | sort | reverse | list }}"

#ansible

I had to figure out a way to remove the entries of hosts that had generated new host keys from the SSH known_hosts file on the AWX system. What I came up with is the following playbook:

- hosts: all
  gather_facts: false
  tasks:
  - name: Remove host key from known_hosts
    command:
      cmd: ssh-keygen -R {{ inventory_hostname }}
    delegate_to: "localhost"

I just run this playbook with the limit set to the host or hosts I want to clear and have set up a template that just asks me for that limit.

I know that there is a known_hosts module, but it has a shortcoming in my opinion: it points to ~/.ssh/known_hosts by default instead of parsing the ~/.ssh/config file to determine the default location.
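
For completeness, the module-based equivalent, fine if the default path works for you:

- name: Remove host key from known_hosts
  known_hosts:
    name: "{{ inventory_hostname }}"
    state: absent
  delegate_to: "localhost"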

#ansible #awx

#PHP #Webapplications in #NixOS are a bit special, as they commonly violate the split between configuration, data and application. Sometimes it's all in the same directory, but more commonly a subdirectory contains the data. Packaging the sources can be easy or complicated, depending on whether there is some build process. For Shaarli I just use their full.tar.gz and don't have to worry about that.

The package expression is very basic:

{ lib, stdenv, fetchurl, config ? null, dataDir ? "/var/lib/shaarli" }:

stdenv.mkDerivation rec {
  name = "shaarli-${version}";
  version = "0.11.1";
  preferLocalBuild = true;

  src = fetchurl {
    url = "https://github.com/shaarli/Shaarli/releases/download/v0.11.1/shaarli-v${version}-full.tar.gz";
    sha256 = "1psijcmi24hk0gxh1zdsm299xj11i7find2045nnx3r96cgnwjpn";
  };

  phases = [ "installPhase" ];
  installPhase = ''
    mkdir $out
    tar xzf $src
    cp -ra Shaarli/. $out/
    find $out -type d -exec chmod 0755 {} \;
    find $out -type f -exec chmod 0644 {} \;
    for a in cache data pagecache tmp; do
      mv $out/$a $out/$a.orig
      ln -s "${dataDir}/$a" $out/$a
    done
  '';

  meta = with stdenv.lib; {
    description = "";
    # License is complicated...
    #license = licenses.agpl3;
    homepage = "https://github.com/shaarli/Shaarli";
    platforms = platforms.all;
    maintainers = with stdenv.lib.maintainers; [ tokudan ];
  };
}

What's uncommon is that I have two optional arguments: config and dataDir. config is not used in my Shaarli derivation and is just part of the boilerplate I use for PHP apps; I use it to feed in a config.php where that makes sense for the app, my roundcube config for example uses it. dataDir on the other hand is used in the installPhase: the writable directories are moved away to $a.orig so the install service can set up the data directory if it doesn't exist yet, and are then replaced with symlinks to /var/lib/shaarli – or whatever was specified in dataDir. It's not perfect, but it works for now. This derivation gives me a package that is specific to one instance of Shaarli; if I run a second instance, I need to specify a different dataDir, leading to another build of the derivation.
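
A second instance is then just another callPackage call with a different dataDir (hypothetical path):

  shaarli-second = pkgs.callPackage ./pkg-shaarli.nix {
    dataDir = "/var/lib/shaarli-second";
  };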

The second part of the equation is the system configuration. How do I include the above derivation in my system? I use nginx and PHP-FPM pools with a dedicated user for each PHP app. Here is the part of my system configuration that uses the package:

{ config, lib, pkgs, ... }:

let
  phppoolName = "shaarli_pool";
  dataDir = "/var/lib/shaarli";
  vhost = "shaarli.example.com";

  shaarli = pkgs.callPackage ./pkg-shaarli.nix {
    inherit dataDir;
  };
in
{
  services.nginx.virtualHosts."${vhost}" = {
    forceSSL = true;
    enableACME = true;
    root = "${shaarli}";
    extraConfig = ''
      index index.php;
      etag off;
      add_header etag "\"${builtins.substring 11 32 shaarli}\"";
      '';
    locations."/robots.txt" = {
      extraConfig = ''
        add_header Content-Type text/plain;
        return 200 "User-agent: *\nDisallow: /\n";
        '';
    };
    locations."/" = {
      extraConfig = ''
        try_files $uri $uri/ index.php;
        '';
    };
    locations."~ (index)\.php$" = {
      extraConfig = ''
        fastcgi_split_path_info ^(.+\.php)(/.*)$;
        if (!-f $document_root$fastcgi_script_name) {
        return 404;
        }

        fastcgi_pass unix:${config.services.phpfpm.pools."${vhost}".socket};
        fastcgi_index index.php;

        fastcgi_param   QUERY_STRING            $query_string;
        fastcgi_param   REQUEST_METHOD          $request_method;
        fastcgi_param   CONTENT_TYPE            $content_type;
        fastcgi_param   CONTENT_LENGTH          $content_length;

        fastcgi_param   SCRIPT_FILENAME         $document_root$fastcgi_script_name;
        fastcgi_param   SCRIPT_NAME             $fastcgi_script_name;
        fastcgi_param   PATH_INFO               $fastcgi_path_info;
        fastcgi_param   PATH_TRANSLATED         $document_root$fastcgi_path_info;
        fastcgi_param   REQUEST_URI             $request_uri;
        fastcgi_param   DOCUMENT_URI            $document_uri;
        fastcgi_param   DOCUMENT_ROOT           $document_root;
        fastcgi_param   SERVER_PROTOCOL         $server_protocol;

        fastcgi_param   GATEWAY_INTERFACE       CGI/1.1;
        fastcgi_param   SERVER_SOFTWARE         nginx/$nginx_version;

        fastcgi_param   REMOTE_ADDR             $remote_addr;
        fastcgi_param   REMOTE_PORT             $remote_port;
        fastcgi_param   SERVER_ADDR             $server_addr;
        fastcgi_param   SERVER_PORT             $server_port;
        fastcgi_param   SERVER_NAME             $server_name;

        fastcgi_param   HTTPS                   $https;
        fastcgi_param   HTTP_PROXY              "";
        '';
    };
    locations."~ \.php$" = {
      extraConfig = ''
        deny all;
        '';
    };
  };
  services.phpfpm.pools."${vhost}" = {
    user = "shaarli";
    group = "shaarli";
    settings = {
      "listen.owner" = "nginx";
      "listen.group" = "nginx";
      "user" = "shaarli";
      "pm" = "dynamic";
      "pm.max_children" = "75";
      "pm.min_spare_servers" = "5";
      "pm.max_spare_servers" = "20";
      "pm.max_requests" = "10";
      "catch_workers_output" = "1";
    };
  };
  users.extraUsers.shaarli = { group = "shaarli"; };
  users.extraGroups.shaarli = { };
  systemd.services.shaarli-install = {
    serviceConfig.Type = "oneshot";
    wantedBy = [ "multi-user.target" ];
    script = ''
      if [ ! -d "${dataDir}" ]; then
        mkdir -p ${dataDir}/{cache,data,pagecache,tmp}
        cp -R ${shaarli}/data.orig/.htaccess ${dataDir}/cache/
        cp -R ${shaarli}/data.orig/.htaccess ${dataDir}/data/
        cp -R ${shaarli}/data.orig/.htaccess ${dataDir}/pagecache/
        cp -R ${shaarli}/data.orig/.htaccess ${dataDir}/tmp/
      fi
      chown -Rc shaarli:shaarli ${dataDir}
      find ${dataDir} -type d ! -perm 0700 -exec chmod 0700 {} \; -exec chmod g-s {} \;
      find ${dataDir} -type f ! -perm 0600 -exec chmod 0600 {} \;
    '';
  };
}

The let block just defines some variables to be used by the expression, but there are a couple of important options below it. The nginx extraConfig contains

      etag off;
      add_header etag "\"${builtins.substring 11 32 shaarli}\"";

This is both nice and bad at the same time: it leaks some information to the outside world by publishing part of the hash of my Shaarli derivation. On the other hand, it ensures that browsers will refresh their caches as needed when I switch to another derivation: they use that part of the hash to check whether the file on the server has changed, instead of relying on the file modification time, which is always the Unix epoch in the Nix store.
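
Concretely, "/nix/store/" is 11 characters and the hash that follows is 32, so the substring picks out exactly the hash (hypothetical store path):

nix-repl> builtins.substring 11 32 "/nix/store/0c1acl10x1wbqcgv8sqqfx6k7fvcjvlj-shaarli-0.11.1"
"0c1acl10x1wbqcgv8sqqfx6k7fvcjvlj"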

At the bottom you can see systemd.services.shaarli-install, which is the service that sets up the data directory when the configuration is activated. Note that in its current implementation it cannot detect whether the Shaarli version changed, so it will not run any update scripts, but hopefully that's not necessary for Shaarli.

This type of packaging seems to work for most PHP webapps. It's certainly not perfect and has a lot of redundancies, but it gets the job done for me.