Tuesday, June 28, 2022

RHEL 9 Custom Image and GCP

If you run into:
Header V4 RSA/SHA1 Signature, key ID 3e1ba8d5: BAD

This is because RHEL 9 disables SHA-1 by default, but Google is publishing GPG keys and RPMs that rely on it.
Found the workaround over here: https://forums.centos.org/viewtopic.php?t=79048#p332382
You can check your current setting with:
cat /etc/crypto-policies/config

Then change it with:
update-crypto-policies --set DEFAULT:SHA1
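
A minimal sketch for scripting that check. It assumes the config file holds a single policy string such as DEFAULT or DEFAULT:SHA1, and it assumes LEGACY also accepts SHA-1. The helper name policy_allows_sha1 is made up for illustration; it only reports, it does not change anything.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide from a crypto-policy string whether SHA-1 is enabled.
# Assumption: LEGACY, or any policy carrying the SHA1 subpolicy, accepts SHA-1 signatures.
policy_allows_sha1() {
    local policy=$1
    case "$policy" in
        LEGACY|LEGACY:*) return 0 ;;
        *:SHA1|*:SHA1:*) return 0 ;;
        *) return 1 ;;
    esac
}

config=/etc/crypto-policies/config
if [ -r "$config" ]; then
    # Skip comment and blank lines, take the first remaining line as the policy.
    current=$(grep -Ev '^(#|$)' "$config" | head -n 1)
    if policy_allows_sha1 "$current"; then
        echo "SHA-1 already allowed by policy: $current"
    else
        echo "SHA-1 not allowed by policy: $current"
        echo "to change it (as root): update-crypto-policies --set DEFAULT:SHA1"
    fi
fi
```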


For comparison, if you build a server from the CentOS Stream 9 image in GCP, it sets that to LEGACY.
I don't know off-hand if that's a CentOS or Google customization.
This also manifested as a problem with PackageKit importing the keys, with the error message appearing in Cockpit.
That part still doesn't work, though. :(
failed to parse public key for /var/cache/PackageKit/9.0/metadata/google-compute-engine-9-x86_64.tmp/yum-key.gpg

I'll try again sometime later.

Wednesday, April 6, 2022

Adventure in dm-integrity (raw notes)

 Prompted by https://www.youtube.com/watch?v=l55GfAwa8RI I set out to figure out how to use dm-integrity, starting with letting lvm do all the work. 


On AlmaLinux 8.5 I had bad results in earlier testing: any corruption would offline the drive, where ideally it would have just rewritten the affected blocks/sectors from the RAID 5 I had set up.

Searching around suggested I'd have a bad time below Linux kernel 5.4.

So I jumped to CentOS Stream 9, where the system would immediately kernel panic when running the lvcreate command... so now I'm investigating on the RHEL 9 beta on a different VM.

Here's where I'm gonna dump the notes from that:

First I need a new VM. I've got a Fedora host with Cockpit that I just downloaded the beta boot ISO onto. Gonna do a mostly-defaults install, and I'm glad to see Cockpit has a VNC console just working these days. (I was about to go to virt-manager before deciding to give it another try.)

This is also my first time setting up RHEL 9, so I'm setting it up using an activation key I quickly created on https://access.redhat.com/management/. I let it connect to "Insights" too, so I'll have to click around and see what that does later. The default LVM layout was fine. I only gave the VM 10G so far, but I'm gonna attach 4x 2G disks after it's installed for playing with lvm. Minimal Install with Guest Agents checked. Set up and allowed remote root logins. This software selection pulls in 377 packages.

Got a little bored during the install and, as nice as the Cockpit console was, I decided to do some ssh port forwarding and have a SPICE client connect remotely instead. It was cool that resizing the window auto-adjusted the VM's display resolution, even while still in the installer. I'm mainly hoping that if I need to copy/paste, the clipboard will somehow just work. Then I remembered I enabled root logins and just ssh'd into the VM.

The fun begins:

Since the installer used the latest available packages, the system is already up to date and here's the kernel I'll be using:

[root@integritytesting ~]# uname -a
Linux integritytesting 5.14.0-70.5.1.el9_0.x86_64 #1 SMP PREEMPT Thu Mar 24 20:26:26 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
[root@integritytesting ~]#

I quickly ran into a problem with attaching more disks in Cockpit.

Trying scsi instead of virtio didn't work. Time to get virt-manager going. Got rid of the cdrom. Don't know if that did anything since I couldn't remove "Controller SATA 0" still (it would reappear on its own) but I was able to add 3 more disks, 4 extra for this experiment total.

Here's the disks:

[root@integritytesting ~]# lsblk
NAME          MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
vda           252:0    0  10G  0 disk
├─vda1        252:1    0   1G  0 part /boot
└─vda2        252:2    0   9G  0 part
  ├─rhel-root 253:0    0   8G  0 lvm  /
  └─rhel-swap 253:1    0   1G  0 lvm  [SWAP]
vdb           252:16   0   2G  0 disk
vdc           252:32   0   2G  0 disk
vdd           252:48   0   2G  0 disk
vde           252:64   0   2G  0 disk
[root@integritytesting ~]#

First thing I'm gonna play with is the --raidintegrity option of lvcreate. It's what I was using and assume it auto-configures dm-integrity in the way I hope. If not, I'll try again manually later:

[root@integritytesting ~]# vgcreate vg_integrity /dev/vd{b,c,d,e}
  Physical volume "/dev/vdb" successfully created.
  Physical volume "/dev/vdc" successfully created.
  Physical volume "/dev/vdd" successfully created.
  Physical volume "/dev/vde" successfully created.
  Volume group "vg_integrity" successfully created
[root@integritytesting ~]#

[root@integritytesting ~]# lvcreate -L1G -n safeplace --type raid5 --raidintegrity y --stripes 3 vg_integrity
  Using default stripesize 64.00 KiB.
  Rounding size 1.00 GiB (256 extents) up to stripe boundary size <1.01 GiB (258 extents).
  Creating integrity metadata LV safeplace_rimage_0_imeta with size 8.00 MiB.
  Logical volume "safeplace_rimage_0_imeta" created.
  Creating integrity metadata LV safeplace_rimage_1_imeta with size 8.00 MiB.
  Logical volume "safeplace_rimage_1_imeta" created.
  Creating integrity metadata LV safeplace_rimage_2_imeta with size 8.00 MiB.
  Logical volume "safeplace_rimage_2_imeta" created.
  Creating integrity metadata LV safeplace_rimage_3_imeta with size 8.00 MiB.
  Logical volume "safeplace_rimage_3_imeta" created.
  Logical volume "safeplace" created.
[root@integritytesting ~]#

I'm not gonna bother with filesystems for now so I'll be poking at block devices directly.

Threw some data onto the LV:

[root@integritytesting ~]# (for i in {00000..99999}; do echo "Hello world #$i"; done; echo https://www.youtube.com/watch?v=l55GfAwa8RI ) > /dev/vg_integrity/safeplace
[root@integritytesting ~]# head -n 4 /dev/vg_integrity/safeplace
Hello world #00000
Hello world #00001
Hello world #00002
Hello world #00003
[root@integritytesting ~]#

Now what? I guess I better corrupt it somehow. I'm gonna point strings at the vd{b,c,d,e} and find my Hello World's. Their locations will vary depending on where LVM put the LV's. Had to install strings first though:

binutils-2.35.2-9.el9.i686 : A GNU collection of binary utilities
Repo        : rhel-9-for-x86_64-baseos-beta-rpms
Matched from:
Other       : *bin/strings

And eventually, found some of my Hello's:

[root@integritytesting ~]# for i in /dev/vd{b,c,d,e}; do echo -e "\n## $i"; strings -td $i | grep -C3 -m3 -F 'Hello world'; done

## /dev/vdb
366051256 orld #14)
366051281 Hello wDSk
366051304 d #14874
366051320 %"""#"""Hello world #00000
366051347 Hello world #00001
366051366 Hello world #00002
366051385 Hello world #00003
366051404 Hello world #00004
366051423 Hello world #00005

## /dev/vdc
365961168 o world [3
365961196 Hellg
365961208 %""""""" world #03449
365961230 Hello world #03450
365961249 Hello world #03451
365961268 Hello world #03452
365961287 Hello world #03453
365961306 Hello world #03454
365961325 Hello world #03455

## /dev/vdd
365961144  /4|n~I
365961168 TY 3'
365961208 %"""""""d #06898
365961225 Hello world #06899
365961244 Hello world #06900
365961263 Hello world #06901
365961282 Hello world #06902
365961301 Hello world #06903
365961320 Hello world #06904

## /dev/vde
366116841 2 /3wczH
366116850 9uxU[&
366116857 """#"""347
366116868 Hello world #10348
366116887 Hello world #10349
366116906 Hello world #10350
366116925 Hello world #10351
366116944 Hello world #10352
366116963 Hello world #10353
[root@integritytesting ~]#

I see /dev/vdb has my earliest entries so I'll poke at that.

[root@integritytesting ~]# dd iflag=skip_bytes bs=19 count=4 skip=366051328 if=/dev/vdb 2> /dev/null
Hello world #00000
Hello world #00001
Hello world #00002
Hello world #00003
[root@integritytesting ~]# echo -n Corruption. | dd oflag=seek_bytes of=/dev/vdb seek=366051347
0+1 records in
0+1 records out
11 bytes copied, 0.00015279 s, 72.0 kB/s
[root@integritytesting ~]# dd iflag=skip_bytes bs=19 count=4 skip=366051328 if=/dev/vdb 2> /dev/null
Hello world #00000
Corruption. #00001
Hello world #00002
Hello world #00003
[root@integritytesting ~]#
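
The same byte-offset trick can be tried safely on a scratch file before pointing dd at a real disk. This is only a sketch: the temp file stands in for /dev/vdb, and grep -b plays the role strings -td played above. The variable names are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch file standing in for /dev/vdb; nothing here touches a real disk.
scratch=$(mktemp)
for i in {00000..00009}; do echo "Hello world #$i"; done > "$scratch"

# grep -b prints the byte offset of the matching line.
offset=$(grep -b -m1 -F 'Hello world #00001' "$scratch" | cut -d: -f1)

# conv=notrunc is needed on a regular file so dd does not truncate it after
# the write; a block device cannot be truncated, so it is not needed there.
printf 'Corruption.' | dd of="$scratch" oflag=seek_bytes seek="$offset" conv=notrunc status=none

corrupted_line=$(sed -n 2p "$scratch")
line_count=$(wc -l < "$scratch")
echo "offset=$offset line=$corrupted_line lines=$line_count"
rm -f "$scratch"
```

Each record is 19 bytes including the newline, which is why the second record sits at offset 19 here and why the reads above use bs=19.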

I'm not seeing the corruption when I check my LV again, but there's a bunch of layers, so maybe some caching? I tried getting it to read everything in case it would notice then, but it didn't. Echo'ing 3 at drop_caches didn't do anything new either.

Deactivating and reactivating the volume group will tear down all the dm's and set them up again, and maybe it will be noticed there.

[root@integritytesting ~]# dmesg
[ 1551.458536] bash (1208): drop_caches: 3
[root@integritytesting ~]# vgchange -ay vg_integrity
  1 logical volume(s) in volume group "vg_integrity" now active
[root@integritytesting ~]# dmesg
[ 1551.458536] bash (1208): drop_caches: 3
[ 1638.650436] device-mapper: integrity: Error on tag mismatch when replaying journal: -84
[ 1638.790914] md/raid:mdX: device dm-5 operational as raid disk 0
[ 1638.790945] md/raid:mdX: device dm-9 operational as raid disk 1
[ 1638.790963] md/raid:mdX: device dm-13 operational as raid disk 2
[ 1638.790982] md/raid:mdX: device dm-17 operational as raid disk 3
[ 1638.791682] md/raid:mdX: raid level 5 active with 4 out of 4 devices, algorithm 2
[ 1638.815196] md/raid:mdX: Disk failure on dm-5, disabling device.
               md/raid:mdX: Operation continuing on 3 devices.
[root@integritytesting ~]#

Aww, I didn't want it to kick out the whole disk. Consulting man lvmraid, I don't see an obvious setting to prevent that. Time to install integritysetup. Unfortunately, I can't inspect what lvm did using that command:

[root@integritytesting ~]# for i in /dev/dm-*; do integritysetup status $i; done
/dev/dm-0 is active and is in use.
  type:    n/a
/dev/dm-1 is active and is in use.
  type:    n/a
/dev/dm-10 is active and is in use.
  type:    n/a
/dev/dm-11 is active and is in use.
  type:    n/a
/dev/dm-12 is active and is in use.
  type:    n/a
/dev/dm-13 is active and is in use.
  type:    n/a
/dev/dm-14 is active and is in use.
  type:    n/a
/dev/dm-15 is active and is in use.
  type:    n/a
/dev/dm-16 is active and is in use.
  type:    n/a
/dev/dm-17 is active and is in use.
  type:    n/a
/dev/dm-2 is active and is in use.
  type:    n/a
/dev/dm-3 is active and is in use.
  type:    n/a
/dev/dm-4 is active and is in use.
  type:    n/a
/dev/dm-5 is active and is in use.
  type:    n/a
/dev/dm-6 is active and is in use.
  type:    n/a
/dev/dm-7 is active and is in use.
  type:    n/a
/dev/dm-8 is active and is in use.
  type:    n/a
/dev/dm-9 is active and is in use.
  type:    n/a


Take 2.

Cleared off vd[b-e] using vgremove then blkdiscard.

Then after skimming the man page:

[root@integritytesting ~]# integritysetup format /dev/vdb

WARNING!
========
This will overwrite data on /dev/vdb irrevocably.

Are you sure? (Type 'yes' in capital letters): YES
Formatted with tag size 4, internal integrity crc32c.
Wiping device to initialize integrity checksum.
You can interrupt this by pressing CTRL+c (rest of not wiped device will contain invalid checksum).
Finished, time 00:00.994, 2016 MiB written, speed   2.0 GiB/s
[root@integritytesting ~]# integritysetup dump /dev/vdb
Info for integrity device /dev/vdb.
superblock_version 5
log2_interleave_sectors 15
integrity_tag_size 4
journal_sections 186
provided_data_sectors 4129048
sector_size 512
log2_blocks_per_bitmap 15
flags fix_padding fix_hmac
[root@integritytesting ~]#

Repeat for each.

Then open them:

[root@integritytesting ~]# lsblk /dev/vdb
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
vdb  252:16   0   2G  0 disk
[root@integritytesting ~]# integritysetup open /dev/vdb vdb-safer
[root@integritytesting ~]# lsblk /dev/vdb
NAME        MAJ:MIN RM SIZE RO TYPE  MOUNTPOINTS
vdb         252:16   0   2G  0 disk
└─vdb-safer 253:2    0   2G  0 crypt
[root@integritytesting ~]#

 Repeat for each.
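
The "repeat for each" steps can be sketched as a loop. This is a dry run that only prints the commands; the function name is hypothetical, and the "<disk>-safer" naming simply follows the vdb-safer name used above.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the commands instead of running them.
# Assumptions: the four scratch disks from this VM and the "<disk>-safer" naming.
print_integrity_setup() {
    local dev name
    for dev in "$@"; do
        name="${dev##*/}-safer"          # /dev/vdb -> vdb-safer
        echo "integritysetup format $dev"
        echo "integritysetup open $dev $name"
    done
}

print_integrity_setup /dev/vd{b,c,d,e}
```

Dropping the echo would run them for real, in which case format still asks for the YES confirmation on every disk.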

And then I realized I was getting a little ahead of myself; better to test what happens with corruption on one device first.

 Repopulate the old data:

[root@integritytesting ~]# (for i in {00000..99999}; do echo "Hello world #$i"; done; echo https://www.youtube.com/watch?v=l55GfAwa8RI ) > /dev/mapper/vdb-safer

[root@integritytesting ~]# head -n 4 /dev/mapper/vdb-safer
Hello world #00000
Hello world #00001
Hello world #00002
Hello world #00003
[root@integritytesting ~]# strings -td /dev/vdb | grep -C3 -m3 -F 'Hello world'
      0 integrt
   8192 Hello world #00000
   8211 Hello world #00001
   8230 Hello world #00002
   8249 Hello world #00003
   8268 Hello world #00004
   8287 Hello world #00005

Created some corruption:

[root@integritytesting ~]# dd iflag=skip_bytes bs=19 count=4 skip=8192 if=/dev/vdb 2> /dev/null
Hello world #00000
Hello world #00001
Hello world #00002
Hello world #00003
[root@integritytesting ~]# echo -n Corruption. | dd oflag=seek_bytes of=/dev/vdb seek=8211
0+1 records in
0+1 records out
11 bytes copied, 0.00235477 s, 4.7 kB/s
[root@integritytesting ~]# dd iflag=skip_bytes bs=19 count=4 skip=8192 if=/dev/vdb 2> /dev/null
Hello world #00000
Corruption. #00001
Hello world #00002
Hello world #00003

Couldn't see the corruption (assuming caching again)

[root@integritytesting ~]# head -n 4 /dev/mapper/vdb-safer
Hello world #00000
Hello world #00001
Hello world #00002
Hello world #00003
[root@integritytesting ~]# dmesg
[root@integritytesting ~]# sync
[root@integritytesting ~]# echo 3 > /proc/sys/vm/drop_caches
[root@integritytesting ~]# head -n 4 /dev/mapper/vdb-safer
Hello world #00000
Hello world #00001
Hello world #00002
Hello world #00003

[root@integritytesting ~]# integritysetup close vdb-safer

[root@integritytesting ~]# integritysetup open /dev/vdb vdb-safer
[root@integritytesting ~]# dmesg
[  626.300804] bash (1256): drop_caches: 3
[  711.902581] device-mapper: integrity: Error on tag mismatch when replaying journal: -84
[  711.912621] Buffer I/O error on dev dm-2, logical block 516112, async page read

This is promising, nothing about disabling the whole disk, let's see if I wrote enough other data to see it in another block I didn't corrupt.

Strings didn't like the read error so piping it through dd.

That was a failure, couldn't find any readable sector. At this point decided to look up other people's solutions and ended up at the much better looking blog page https://securitypitfalls.wordpress.com/2020/09/27/making-raid-work-dm-integrity-with-md-raid/

The options there suggest... nothing obvious. The sector size, tag size and algorithm are specified, but I don't expect those to change the overall behavior.

I decided to fix the corruption rather than starting from scratch and closed/opened the device again.

[root@integritytesting ~]# echo -n Hello world | dd oflag=seek_bytes of=/dev/vdb seek=8211

...

[root@integritytesting ~]# head -n 4 /dev/mapper/vdb-safer
Hello world #00000
Hello world #00001
Hello world #00002
Hello world #00003

I decided to throw more data at it, though my bash loop wasn't the fastest source of data:

[root@integritytesting ~]# (export i=0; while true; do printf "Hello world #%16u\n" $i; i=$((i+1)); done) | dd status=progress > /dev/mapper/vdb-safer
392693248 bytes (393 MB, 375 MiB) copied, 164 s, 2.4 MB/s

...

This gave me time to look at more posts by others, but no new insights. Except maybe using the journal mode isn't going to work.

...

[root@integritytesting ~]# (export i=0; while true; do printf "Hello world #%16u\n" $i; i=$((i+1)); done) | dd status=progress > /dev/mapper/vdb-safer
2112329728 bytes (2.1 GB, 2.0 GiB) copied, 862 s, 2.5 MB/s
dd: writing to 'standard output': No space left on device
67864+63057921 records in
4129048+0 records out
2114072576 bytes (2.1 GB, 2.0 GiB) copied, 863.149 s, 2.4 MB/s
[root@integritytesting ~]#

But now at least the drive is full:

[root@integritytesting ~]# tail /dev/mapper/vdb-safer
Hello world #        70469076
Hello world #        70469077
Hello world #        70469078
Hello world #        70469079
Hello world #        70469080
Hello world #        70469081
Hello world #        70469082
Hello world #        70469083
Hello world #        70469084
Hello world #        70469

Before corrupting again, I decided to close the integrity device first and I'm wondering if I had corrupted the journal rather than the data. Thinking about it a little more reinforces the idea I can't use the journal and need to rely only on the tags, which I think is what I originally wanted anyways.

[root@integritytesting ~]# strings -td /dev/vdb | grep -C3 -F 'Hello world' | grep -C6 -F 'Hello world #               0'
16764960 Ra|J
16765036 $.yj
--
16895783 KhcJ{
16895806 OViQN
16895942 hmJ,
16896000 Hello world #               0
16896030 Hello world #               1
16896060 Hello world #               2
16896090 Hello world #               3
16896120 Hello world #               4
16896150 Hello world #               5
16896180 Hello world #               6

[root@integritytesting ~]# echo -n Corruption. | dd oflag=seek_bytes of=/dev/vdb seek=16896060
0+1 records in
0+1 records out
11 bytes copied, 0.00220066 s, 5.0 kB/s
[root@integritytesting ~]# strings -td /dev/vdb | grep -C3 -F 'Hello world' | grep -C6 -F 'Hello world #               0'
16764960 Ra|J
16765036 $.yj
--
16895783 KhcJ{
16895806 OViQN
16895942 hmJ,
16896000 Hello world #               0
16896030 Hello world #               1
16896060 Corruption. #               2
16896090 Hello world #               3
16896120 Hello world #               4
16896150 Hello world #               5
16896180 Hello world #               6

No luck, can't read from either end of the device:

[root@integritytesting ~]# tail /dev/mapper/vdb-safer
tail: error reading '/dev/mapper/vdb-safer': Input/output error
[root@integritytesting ~]# head -n 4 /dev/mapper/vdb-safer
head: error reading '/dev/mapper/vdb-safer': Input/output error

Trying again, blkdiscard to clean it up, then format:

[root@integritytesting ~]# blkdiscard -f /dev/vdb
blkdiscard: Operation forced, data will be lost!
[root@integritytesting ~]# integritysetup format --tag-size 8 --integrity-no-journal /dev/vdb

WARNING!
========
This will overwrite data on /dev/vdb irrevocably.

Are you sure? (Type 'yes' in capital letters): YES
WARNING: Requested tag size 8 bytes differs from crc32c size output (4 bytes).
Formatted with tag size 8, internal integrity crc32c.
Wiping device to initialize integrity checksum.
You can interrupt this by pressing CTRL+c (rest of not wiped device will contain invalid checksum).
Finished, time 00:01.231, 2000 MiB written, speed   1.6 GiB/s
[root@integritytesting ~]#

[root@integritytesting ~]# integritysetup dump /dev/vdb
Info for integrity device /dev/vdb.
superblock_version 5
log2_interleave_sectors 15
integrity_tag_size 8
journal_sections 186
provided_data_sectors 4097048
sector_size 512
log2_blocks_per_bitmap 15
flags fix_padding fix_hmac

[root@integritytesting ~]# integritysetup status /dev/mapper/vdb-safer
/dev/mapper/vdb-safer is active.
  type:    INTEGRITY
  tag size: 8
  integrity: crc32c
  device:  /dev/vdb
  sector size:  512 bytes
  interleave sectors: 32768
  size:    4097048 sectors
  mode:    read/write
  failures: 0
  journal: not active

Wrote in some data again.

I changed the format I wrote out slightly and then changed it back and noticed my change didn't appear how I expected so I tried something more obvious.

[root@integritytesting ~]# echo Chris wonders where this is. > /dev/mapper/vdb-safer
[root@integritytesting ~]# head -n4  /dev/mapper/vdb-safer
Chris wonders where this is.

Hello world #               1
Hello world #               2
[root@integritytesting ~]# strings -td /dev/vdb | grep Chris
[root@integritytesting ~]#

Oh, I'm not using direct io so it's cached, found it.

[root@integritytesting ~]# strings -td /dev/vdb | grep Chris
[root@integritytesting ~]# sync
[root@integritytesting ~]# strings -td /dev/vdb | grep Chris
[root@integritytesting ~]# echo 3 > /proc/sys/vm/drop_caches
[root@integritytesting ~]# sync
[root@integritytesting ~]# strings -td /dev/vdb | grep Chris
17027072 Chris wonders where this is.
[root@integritytesting ~]#


And corrupt it:

[root@integritytesting ~]# echo -n Here is some corruption | dd oflag=seek_bytes of=/dev/vdb seek=17027072
0+1 records in
0+1 records out
23 bytes copied, 0.202152 s, 0.1 kB/s
[root@integritytesting ~]# head -n4  /dev/mapper/vdb-safer
head: error reading '/dev/mapper/vdb-safer': Input/output error
[root@integritytesting ~]#


[ 4177.159117] bash (1256): drop_caches: 3
[ 4293.405001] device-mapper: integrity: dm-2: Checksum failed at sector 0x0
[ 4293.407586] device-mapper: integrity: dm-2: Checksum failed at sector 0x0
[ 4293.408111] Buffer I/O error on dev dm-2, logical block 0, async page read


But the whole block device didn't clam up this time.

[root@integritytesting ~]# dd iflag=skip_bytes skip=2560000 conv=noerror if=/dev/mapper/vdb-safer bs=512 count=1
d #           85333
Hello world #           85334
Hello world #           85335
Hello world #           85336
Hello world #           85337
Hello world #           85338
Hello world #           85339
Hello world #           85340
Hello world #           85341
Hello world #           85342
Hello world #           85343
Hello world #           85344
Hello world #           85345
Hello world #           85346
Hello world #           85347
Hello world #           85348
Hello world #           85349
Hello world 1+0 records in
1+0 records out
512 bytes copied, 0.000977178 s, 524 kB/s
[root@integritytesting ~]#

Ok, so dm-integrity will work the way I hope as long as the journal is disabled.

 And with enough dd options and padding my input, I was able to write back over the first sector of data.

[root@integritytesting ~]# (echo Chris wonders where this is.; cat /dev/zero) | dd oflag=direct,sync of=/dev/mapper/vdb-safer bs=512 count=1
1+0 records in
1+0 records out
512 bytes copied, 0.00540147 s, 94.8 kB/s
[root@integritytesting ~]# head -n4  /dev/mapper/vdb-safer
Chris wonders where this is.
llo world #              17
Hello world #              18
Hello world #              19
[root@integritytesting ~]#

It skips to 17 there since the nul padding isn't displayed.
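
The arithmetic behind that, as a small sketch: each record written by the printf loop is 30 bytes ("Hello world #" is 13, the number is padded to 16, plus a newline), so a byte offset maps to a record number by integer division. The function name locate is just for illustration.

```shell
#!/usr/bin/env bash
# Worked arithmetic: which record does a byte offset land in?
# Each record is "Hello world #" (13 bytes) + a 16-wide number + newline = 30 bytes.
record_len=30

locate() {
    local offset=$1
    echo "record $(( offset / record_len )), byte $(( offset % record_len )) of that record"
}

after_first_sector=$(locate 512)
dd_skip_example=$(locate 2560000)
echo "offset 512:     $after_first_sector"
echo "offset 2560000: $dd_skip_example"
```

The first byte after the overwritten 512-byte sector is byte 2 of record 17, which is why the line shows up as "llo world" with the "He" missing. The same division explains why the dd read at skip=2560000 started partway through record 85333.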

So yay, it's working! If I had raid on top of this, I imagine it would be able to reconstruct the sector and write it back, if it wanted to. 

Looking at the man page for lvmraid again, we lose several knobs integritysetup has but at the least I could switch the mode from journal to bitmap. I feel like bitmap is worse than no bitmap though since the bitmap seems like it tells it to ignore the checksum for a block if it crashed while that data was writing. This has to be reliable since it's basically the whole point of using it under the raid.


Day 2

Thought about it some more. Still don't know if bitmap mode is useful. It probably ought to trigger a raid scrub for that block too? Gonna ignore that for now and focus on addressing the bit rot use case rather than any write hole thing.
Starting off, going to do the all in one LVM test again with --raidintegritymode bitmap this time. (journal was the default)
 
Quickly reset everything:
[root@integritytesting ~]# integritysetup close vdb-safer
Device vdb-safer is not active.
[root@integritytesting ~]# integritysetup close vdc-safer
[root@integritytesting ~]# integritysetup close vdd-safer
[root@integritytesting ~]# integritysetup close vde-safer
[root@integritytesting ~]# blkdiscard /dev/vdb
blkdiscard: /dev/vdb contains existing file system (DM_integrity).
blkdiscard: This is destructive operation, data will be lost! Use the -f option to override.
[root@integritytesting ~]# blkdiscard -f /dev/vdb
blkdiscard: Operation forced, data will be lost!
[root@integritytesting ~]# blkdiscard -f /dev/vdc
blkdiscard: Operation forced, data will be lost!
[root@integritytesting ~]# blkdiscard -f /dev/vdd
blkdiscard: Operation forced, data will be lost!
[root@integritytesting ~]# blkdiscard -f /dev/vde
blkdiscard: Operation forced, data will be lost!
[root@integritytesting ~]#
 Then setup the volume group again:
[root@integritytesting ~]# vgcreate vg_integrity /dev/vd{b,c,d,e}
  Devices file PVID Hzh6UBd4qqvLNGpGvu18marCM0k7xI2e not found on device /dev/vdb.
  Devices file PVID rxCOQLDdfmnyntvMYOBBh3anIJNY3FM0 not found on device /dev/vdc.
  Physical volume "/dev/vdb" successfully created.
  Physical volume "/dev/vdc" successfully created.
  Physical volume "/dev/vdd" successfully created.
  Physical volume "/dev/vde" successfully created.
  Volume group "vg_integrity" successfully created
[root@integritytesting ~]# 
 
And the new option on the LV:
[root@integritytesting ~]# lvcreate --raidintegritymode bitmap -L1G -n safeplace --type raid5 --raidintegrity y --stripes 3 vg_integrity
  Using default stripesize 64.00 KiB.
  Rounding size 1.00 GiB (256 extents) up to stripe boundary size <1.01 GiB (258 extents).
  Creating integrity metadata LV safeplace_rimage_0_imeta with size 8.00 MiB.
  Logical volume "safeplace_rimage_0_imeta" created.
  Creating integrity metadata LV safeplace_rimage_1_imeta with size 8.00 MiB.
  Logical volume "safeplace_rimage_1_imeta" created.
  Creating integrity metadata LV safeplace_rimage_2_imeta with size 8.00 MiB.
  Logical volume "safeplace_rimage_2_imeta" created.
  Creating integrity metadata LV safeplace_rimage_3_imeta with size 8.00 MiB.
  Logical volume "safeplace_rimage_3_imeta" created.
  Logical volume "safeplace" created.
[root@integritytesting ~]#
 
I loaded about 30MB of Hello Worlds onto the LV again:
[root@integritytesting ~]# head /dev/vg_integrity/safeplace
Hello world #               0
Hello world #               1
Hello world #               2
Hello world #               3
Hello world #               4
Hello world #               5
Hello world #               6
Hello world #               7
Hello world #               8
Hello world #               9
Time to find where to corrupt it again:
Hopefully this is the real data and not a journal:
[root@integritytesting ~]# strings -td /dev/vdb | grep -C3 -F 'Hello world' | grep -C6 -F 'Hello world #               0'
  51166 creation_time = 1649325754      # Thu Apr  7 05:02:34 2022
1048576 DmRd
1052672 bitm
5242880 Hello world #               0
5242910 Hello world #               1
5242940 Hello world #               2
5242970 Hello world #               3
5243000 Hello world #               4
5243030 Hello world #               5
5243060 Hello world #               6
^C
 Corrupt it:
 
[root@integritytesting ~]# echo -n Here is some corruption | dd oflag=seek_bytes of=/dev/vdb seek=5242910
0+1 records in
0+1 records out
23 bytes copied, 0.175548 s, 0.1 kB/s
[root@integritytesting ~]# strings -td /dev/vdb | grep -C3 -F 'Hello world' | grep -C6 -F 'Hello world #               0'
  51166 creation_time = 1649325754      # Thu Apr  7 05:02:34 2022
1048576 DmRd
1052672 bitm
5242880 Hello world #               0
5242910 Here is some corruption     1
5242940 Hello world #               2
5242970 Hello world #               3
5243000 Hello world #               4
5243030 Hello world #               5
5243060 Hello world #               6
^C
[root@integritytesting ~]#
Sync and drop buffers/caches again and good news:
[root@integritytesting ~]# echo 3 > /proc/sys/vm/drop_caches
[root@integritytesting ~]# head /dev/vg_integrity/safeplace
Hello world #               0
Hello world #               1
Hello world #               2
Hello world #               3
Hello world #               4
Hello world #               5
Hello world #               6
Hello world #               7
Hello world #               8
Hello world #               9
[root@integritytesting ~]# dmesg
[79513.762560] bash (1256): drop_caches: 3
[79523.021055] device-mapper: integrity: dm-5: Checksum failed at sector 0x0
[79523.022109] device-mapper: integrity: dm-5: Checksum failed at sector 0x0
[79523.037276] md/raid:mdX: read error corrected (8 sectors at 0 on dm-5)
[root@integritytesting ~]#
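
A rough sketch for keeping an eye on these events: count checksum failures and raid corrections in kernel log text. The sample lines are copied from the dmesg output above; the function name and the patterns it matches are my own guess at what is worth counting, not anything lvm provides.

```shell
#!/usr/bin/env bash
# Sketch: count integrity failures and raid corrections in kernel log text.
# On a live system you would feed `dmesg` into the function instead of the sample.
summarize_integrity_log() {
    awk '
        /device-mapper: integrity: .*Checksum failed/ { failed++ }
        /md\/raid:.*read error corrected/             { fixed++ }
        END { printf "checksum_failures=%d corrected=%d\n", failed, fixed }
    '
}

sample_log='[79523.021055] device-mapper: integrity: dm-5: Checksum failed at sector 0x0
[79523.022109] device-mapper: integrity: dm-5: Checksum failed at sector 0x0
[79523.037276] md/raid:mdX: read error corrected (8 sectors at 0 on dm-5)'

summary=$(printf '%s\n' "$sample_log" | summarize_integrity_log)
echo "$summary"
```

Usage on the VM would be something like: dmesg | summarize_integrity_log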
 
Checking the corruption again:
[root@integritytesting ~]# strings -td /dev/vdb | grep -C3 -F 'Hello world' | grep -C6 -F 'Hello world #               0'
  51166 creation_time = 1649325754      # Thu Apr  7 05:02:34 2022
1048576 DmRd
1052672 bitm
5242880 Hello world #               0
5242910 Hello world #               1
5242940 Hello world #               2
5242970 Hello world #               3
5243000 Hello world #               4
5243030 Hello world #               5
5243060 Hello world #               6
^C
[root@integritytesting ~]#
 
It fixed itself!
I'm gonna call that a success for now. So it survived a corruption in the data area, but not everything is backed by the integrity layer. The PV's are formatted directly on the disk, then lvm has its metadata. I don't know what sort of consistency check there is for that sort of stuff. Does it only check on an error? I like the "it just works" of configuring the raid and integrity in the single lvcreate command but might still have to resort to manually layering everything.
 

Side experiment:

How about I corrupt the PV a little bit? Well, I did and didn't learn much. I had to replace the PV, which required repairing the LV, which in turn required removing integrity from the LV and adding it back after the repair.
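
The exact commands aren't recorded in these notes. A plausible reconstruction of the order, based on the lvconvert options documented in lvmraid(7), printed rather than executed. The function name is hypothetical, the VG/LV names come from the earlier test, and re-adding integrity in bitmap mode is an assumption.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints a plausible command order, does not run anything.
# Assumptions: VG/LV names from the earlier test, bitmap mode when re-adding.
print_repair_plan() {
    local lv=$1
    echo "lvconvert --raidintegrity n $lv"
    echo "lvconvert --repair $lv"
    echo "lvconvert --raidintegrity y --raidintegritymode bitmap $lv"
}

print_repair_plan vg_integrity/safeplace
```

This leaves out getting a replacement PV into the volume group before the repair step, which depends on how the old one failed.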