vagrant box ol9 storage issues

Andrew Walker, Jan 26 2024

Hi,

I just wanted to document some issues I've been having with the Oracle Linux 9 Vagrant boxes at https://yum.oracle.com/boxes/

I was using the official CentOS 7 boxes for development, for binary compatibility with RHEL 7.

With that no longer possible, I've been looking at alternatives and wanted to give OL9 a go.

I migrated my Vagrant provisioner script from CentOS 7 to OL9 fairly easily and was working away, but I've seen some strange behavior.

Firstly, when rebooting the VM it often gets stuck on boot. It seemed to be some kind of filesystem corruption.

I first noticed that LVM is used for the root fs but LVM commands are broken, e.g.


sudo vgdisplay
  Devices file sys_wwid t10.ATA_VBOX_HARDDISK_VBb7a57793-09889c58 PVID tOH9wlnxSVA15YQUBxinyJcV8c997NSg last seen on /dev/sda2 not found.

This in itself does not seem fatal unless, like I did, you need to add more disks and grow the volumes or add new ones.

I think this could be fixed by setting use_devicesfile = 0 in /etc/lvm/lvm.conf, which disables the LVM devices file that is enabled by default. The uuid will always change when Vagrant clones the box, so I don't see much in the way of alternatives. See https://portal.nutanix.com/page/documents/kbs/details?targetId=kA07V000000LaGrSAK
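
For anyone wanting to script that change in a provisioner, here is a minimal sketch. The helper name disable_lvm_devicesfile is my own and not part of LVM, and it assumes the file already contains a use_devicesfile line (commented out or not), so check your lvm.conf first. It keeps a .bak copy and returns non-zero if the setting did not end up in the file.

```shell
#!/bin/sh
# Sketch only: set "use_devicesfile = 0" in an lvm.conf-style file.
# disable_lvm_devicesfile is a hypothetical helper name, not an LVM command.
# Assumes a use_devicesfile line (possibly commented out) already exists.
disable_lvm_devicesfile() {
    conf="${1:-/etc/lvm/lvm.conf}"

    # Keep a backup of the original file next to it.
    cp "$conf" "$conf.bak" || return 1

    # Rewrite the existing line, preserving its indentation and
    # dropping a leading "#" if the line was commented out.
    sed -E 's/^([[:space:]]*)#?[[:space:]]*use_devicesfile[[:space:]]*=.*/\1use_devicesfile = 0/' \
        "$conf.bak" > "$conf" || return 1

    # Succeed only if the active setting is now present.
    grep -q '^[[:space:]]*use_devicesfile = 0' "$conf"
}
```

Usage would be something like `sudo sh -c '. ./lvm-fix.sh; disable_lvm_devicesfile'` (the script filename is just an example), followed by `sudo vgdisplay` to see whether the warning is gone.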

I made this change hoping it would solve my issues but unfortunately I still had them.

As time passed I noticed the issues not only on reboot but also under high load. For example, with k8s pulling lots of images, the storage would just disappear altogether, crashing the VM.

I see these kinds of errors in dmesg and journal logs.

[   96.356459] ata3.00: exception Emask 0x0 SAct 0x200 SErr 0x0 action 0x6 frozen
[   96.356637] ata3.00: failed command: WRITE FPDMA QUEUED
[   96.356793] ata3.00: cmd 61/00:48:20:69:90/0a:00:04:00:00/40 tag 9 ncq dma 1310720 ou
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[   96.357106] ata3.00: status: { DRDY }
[   96.357298] ata3: hard resetting link
[   96.702971] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   96.703489] ata3.00: configured for UDMA/133
[   96.703702] ata3.00: device reported invalid CHS sector 0
[   96.703892] ata3: EH complete
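
If you want a quick way to check whether a VM is hitting the same thing, a rough sketch is to count the matching lines in the kernel log. The function name count_ata_errors and the choice of which two messages to match are my own; the pattern is based only on the lines shown above and may miss other variants.

```shell
#!/bin/sh
# Sketch only: count ATA "failed command" and "hard resetting link" lines
# in kernel log text read from stdin.
# count_ata_errors is a hypothetical helper name.
count_ata_errors() {
    # grep -c prints the number of matching lines (and exits 1 if there are none).
    grep -cE 'ata[0-9]+(\.[0-9]+)?: (failed command:|hard resetting link)'
}
```

Usage: `sudo dmesg | count_ata_errors` or `sudo journalctl -k | count_ata_errors`. A count that keeps growing under load would match what I was seeing.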

I tried building my own box from the ISO without LVM. This did not help.

Ultimately I changed the disk controller type from SATA to IDE, and this has made the dmesg errors disappear.
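
One possible way to make the controller change from the host, sketched with VBoxManage. This only prints the commands so you can review them first. The function name, the controller names "SATA Controller" and "IDE Controller", and the disk sitting on port 0 are all assumptions; check `VBoxManage showvminfo` for the real names on your VM, and do this with the VM powered off.

```shell
#!/bin/sh
# Sketch only: print the VBoxManage commands that would move one disk from a
# SATA controller to an IDE controller. Nothing is executed here.
# Assumptions: the helper name, the controller names, and port 0 / device 0.
print_sata_to_ide_commands() {
    vm="$1"      # VM name or UUID, as listed by: VBoxManage list vms
    disk="$2"    # path or UUID of the disk image to move

    # 1. Detach the disk from the SATA controller.
    echo "VBoxManage storageattach \"$vm\" --storagectl \"SATA Controller\" --port 0 --device 0 --medium none"

    # 2. Create an IDE controller (skip this if the VM already has one).
    echo "VBoxManage storagectl \"$vm\" --name \"IDE Controller\" --add ide"

    # 3. Attach the same disk to the IDE controller.
    echo "VBoxManage storageattach \"$vm\" --storagectl \"IDE Controller\" --port 0 --device 0 --type hdd --medium \"$disk\""
}
```

Usage: `print_sata_to_ide_commands my-ol9-vm /path/to/disk.vmdk` (both arguments are placeholders), then run the printed commands by hand once they look right for your VM.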

So far I've had no crashes.

I would happily contribute a PR to the project that produces the boxes but I cannot find the source.

Hopefully this helps someone else.
