r/Proxmox

Accidentally deleted /etc/pve/local/pve-ssl.key, can't start pve-cluster

Long story short, I removed a node from my cluster and edited the corosync.conf instead of using delnode. After this error I decided it would be best to simply remove all other nodes from the cluster and to just make a new one. So I successfully did that with one server, but for the other one I somehow managed to delete the pve-ssl.key.

I found this guide https://pve.proxmox.com/wiki/Proxmox_SSL_Error_Fixing and followed it with no errors, other than it creating a server.csr instead of a server.pem. I tried renaming this file to .pem, and I also tried starting again from the top, leaving it as .csr and ignoring the missing .pem. Running systemctl status pve-cluster.service after following the guide gives a repeated "/etc/pve/local/pve-ssl.key: failed to load local private key", even though the pve-ssl.key file exists. I am not sure why it is failing when the file looks similar to the pve-ssl.key on my other server, which works.

Unfortunately I have quite a few VMs on this server but cannot back up any of them because pve-cluster won't start, and I also cannot access the web GUI. I am now at the point of scrapping all of my VMs and reinstalling Proxmox on this server, but I was hoping somebody could help me.

thanks


AFAIK the .csr file is only a certificate signing request and not the actual certificate. You seem to have missed the last step of the guide, which creates the server certificate:

openssl x509 -req -in server.csr -CA ca.pem -CAkey ca.key -CAcreateserial -out server.pem -days 365 -sha256

There should also exist an openssl command for this step but I can't remember it right now.

Could you simply retry the whole process again and tell us if it worked?

Otherwise I will recreate your scenario in a VM and try to give you a fix, but then you would have to wait until this evening when I get home.
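To make the CSR-versus-certificate point concrete, here is a minimal sketch that runs the whole chain in a throwaway directory. It is not the wiki's exact procedure: the file names follow the command quoted above, but the subject names (`Example Test CA`, `node.example.test`), the 2048-bit key size and the temp directory are assumptions for illustration only, and nothing here touches /etc/pve.

```shell
#!/bin/sh
# Sketch only: everything lives in a temporary directory.
workdir=$(mktemp -d)

# Throwaway CA: private key plus self-signed CA certificate.
openssl genrsa -out "$workdir/ca.key" 2048 2>/dev/null
openssl req -x509 -new -key "$workdir/ca.key" -subj "/CN=Example Test CA" \
    -days 365 -sha256 -out "$workdir/ca.pem"

# Server key and certificate signing request. The .csr is only a request.
openssl genrsa -out "$workdir/server.key" 2048 2>/dev/null
openssl req -new -key "$workdir/server.key" -subj "/CN=node.example.test" \
    -out "$workdir/server.csr"

# The last step: the CA signs the request, which produces server.pem.
openssl x509 -req -in "$workdir/server.csr" \
    -CA "$workdir/ca.pem" -CAkey "$workdir/ca.key" -CAcreateserial \
    -out "$workdir/server.pem" -days 365 -sha256 2>/dev/null

# Confirm that server.pem is a certificate that chains to the CA.
verify_result=$(openssl verify -CAfile "$workdir/ca.pem" "$workdir/server.pem")
echo "$verify_result"
```

Renaming server.csr to server.pem skips the signing step, so the file would still hold a request (its header reads BEGIN CERTIFICATE REQUEST) rather than a certificate.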

You were right, I didn't do the last step. I redid the process with the last step and pve-cluster still doesn't start, with journalctl -xe saying it failed to start corosync. So I tried systemctl restart corosync.service, and it errors with the same "/etc/pve/local/pve-ssl.key: failed to load local private key" that I got when trying to start pve-cluster before.
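Since the error says the key cannot be loaded even though the file exists, one thing worth checking is whether OpenSSL can parse the key at all and whether it belongs to the certificate next to it. The helper below is a sketch of that check; `key_matches_cert` is a made-up name, and the demo files are generated in a temp directory rather than taken from a real node.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the private key parses and its public
# half equals the public key embedded in the certificate.
# Returns 2 if the key cannot be parsed, 3 if the certificate cannot be
# parsed, 1 if both parse but do not belong together.
key_matches_cert() {
    key_file=$1
    cert_file=$2
    key_pub=$(openssl pkey -in "$key_file" -pubout 2>/dev/null) || return 2
    cert_pub=$(openssl x509 -in "$cert_file" -noout -pubkey 2>/dev/null) || return 3
    [ "$key_pub" = "$cert_pub" ]
}

# Demo data in a temp directory: one matching pair and one garbage "key".
demo_dir=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes \
    -keyout "$demo_dir/good.key" -out "$demo_dir/good.pem" \
    -subj "/CN=node.example.test" -days 1 2>/dev/null
printf 'not a key\n' > "$demo_dir/bad.key"
```

On the affected node the call would presumably be `key_matches_cert /etc/pve/local/pve-ssl.key /etc/pve/local/pve-ssl.pem`. A return code of 2 would point at a damaged or wrongly formatted key file, while 1 would mean the key and certificate come from different runs of the guide.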


journalctl -xe output

-- Subject: Unit corosync.service has failed

-- Defined-By: systemd

-- Support: https://www.debian.org/support

--

-- Unit corosync.service has failed.

--

-- The result is failed.

Oct 18 01:20:57 x3650m4 systemd[1]: corosync.service: Unit entered failed state.

Oct 18 01:20:57 x3650m4 systemd[1]: corosync.service: Failed with result 'exit-code'.

Oct 18 01:20:57 x3650m4 pveproxy[3387]: worker exit

Oct 18 01:20:57 x3650m4 pveproxy[2172]: worker 3387 finished

Oct 18 01:20:57 x3650m4 pveproxy[2172]: starting 2 worker(s)

Oct 18 01:20:57 x3650m4 pveproxy[2172]: worker 3530 started

Oct 18 01:20:57 x3650m4 pveproxy[2172]: worker 3531 started

Oct 18 01:20:57 x3650m4 pveproxy[3530]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1683.

Oct 18 01:20:57 x3650m4 pveproxy[3531]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1683.

Oct 18 01:20:57 x3650m4 pveproxy[3388]: worker exit

Oct 18 01:20:57 x3650m4 pveproxy[2172]: worker 3388 finished

Oct 18 01:20:57 x3650m4 pveproxy[2172]: starting 1 worker(s)

Oct 18 01:20:57 x3650m4 pveproxy[2172]: worker 3532 started

Oct 18 01:20:57 x3650m4 pveproxy[3532]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1683.

Oct 18 01:20:57 x3650m4 systemd[1]: pve-cluster.service: Service hold-off time over, scheduling restart.

Oct 18 01:20:57 x3650m4 systemd[1]: Stopped The Proxmox VE cluster filesystem.

-- Subject: Unit pve-cluster.service has finished shutting down

-- Defined-By: systemd

-- Support: https://www.debian.org/support

--

-- Unit pve-cluster.service has finished shutting down.

Oct 18 01:20:57 x3650m4 systemd[1]: pve-cluster.service: Start request repeated too quickly.

Oct 18 01:20:57 x3650m4 systemd[1]: Failed to start The Proxmox VE cluster filesystem.

-- Subject: Unit pve-cluster.service has failed

-- Defined-By: systemd

-- Support: https://www.debian.org/support

--

-- Unit pve-cluster.service has failed.

--

-- The result is failed.

Oct 18 01:20:57 x3650m4 systemd[1]: pve-cluster.service: Unit entered failed state.

Oct 18 01:20:57 x3650m4 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.

Oct 18 01:20:57 x3650m4 systemd[1]: corosync.service: Start request repeated too quickly.

Oct 18 01:20:57 x3650m4 systemd[1]: Failed to start Corosync Cluster Engine.

-- Subject: Unit corosync.service has failed

-- Defined-By: systemd

-- Support: https://www.debian.org/support

--

-- Unit corosync.service has failed.

--

-- The result is failed.

Oct 18 01:20:57 x3650m4 systemd[1]: corosync.service: Failed with result 'exit-code'.


Check the permissions on the file? Often (private) keys need to be only readable by the owner (root), and unreadable by the group and world. Otherwise the software refuses to use them.


So, along with restarting the process and adding the last step that I had forgotten, I tried

chown root:root /etc/pve/local/pve-ssl.key

chmod 700 /etc/pve/local/pve-ssl.key

but it still doesn't work. I'm not entirely sure how to modify file permissions, so perhaps my use of chown and chmod is wrong.
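For reference, here is a small sketch of how to inspect what chown and chmod actually did. It uses GNU `stat` (the variant shipped on Debian) on a scratch file in a temp directory; `key_mode` is a hypothetical helper name. Note that mode 700 also sets the execute bit for the owner, whereas 600 (owner read/write only) is the more usual choice for a private key.

```shell
#!/bin/sh
# Hypothetical helper: print the octal mode and owner:group of a file.
key_mode() {
    stat -c '%a %U:%G' "$1"
}

# Demonstrate on a scratch file instead of the real key.
perm_dir=$(mktemp -d)
: > "$perm_dir/example.key"

chmod 700 "$perm_dir/example.key"
mode_before=$(stat -c '%a' "$perm_dir/example.key")   # rwx for owner only

chmod 600 "$perm_dir/example.key"
mode_after=$(stat -c '%a' "$perm_dir/example.key")    # rw- for owner only

key_mode "$perm_dir/example.key"
```

Running `key_mode /etc/pve/local/pve-ssl.key` on both the broken server and the working one would show whether ownership or mode actually differ between them.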


In the worst case you can still back up the VMs and containers manually. I would try vzdump first to create normal backups, but if even that does not work, you can copy the configs and disks by hand.
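A rough sketch of the "copy by hand" fallback, assuming plain tar is good enough for the config files. `archive_dir` is a hypothetical helper and the demo runs on fake data in a temp directory. On a real node the interesting sources would likely be the VM configs (normally under /etc/pve/qemu-server), the cluster database in /var/lib/pve-cluster, and file-based disk images under /var/lib/vz/images; those paths are common defaults, not something confirmed in this thread, and disks on LVM or ZFS are not plain files, so they would need a different approach.

```shell
#!/bin/sh
# Hypothetical helper: pack one directory into a gzip-compressed tarball,
# storing paths relative to the directory's parent.
archive_dir() {
    src=$1
    dest=$2
    tar -czf "$dest" -C "$(dirname "$src")" "$(basename "$src")"
}

# Demo on fake data so nothing on the real system is touched.
backup_demo=$(mktemp -d)
mkdir -p "$backup_demo/qemu-server"
printf 'memory: 2048\n' > "$backup_demo/qemu-server/100.conf"

archive_dir "$backup_demo/qemu-server" "$backup_demo/qemu-server.tar.gz"

# List the archive contents to confirm the config made it in.
tar -tzf "$backup_demo/qemu-server.tar.gz"
```

Copy the resulting archives to another machine before reinstalling anything.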

You'll need to run the pve update certs command to force the cluster to read the new cert. I'm on mobile and can't recall the exact command, but Google "remove node from pve cluster"; it's the last step in the official Proxmox guide. It'll force your cluster to see the new certs.
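The command being described is most likely `pvecm updatecerts`, which the pvecm manual lists as updating the node certificates. A guarded sketch follows; the wrapper name `refresh_pve_certs` is made up, and as far as I know the command needs the cluster filesystem (pve-cluster) to be running, so it may not help until that service starts.

```shell
#!/bin/sh
# Hypothetical wrapper: only call pvecm where it actually exists.
refresh_pve_certs() {
    if ! command -v pvecm >/dev/null 2>&1; then
        echo "pvecm not found; this does not look like a Proxmox VE node" >&2
        return 1
    fi
    # Extra arguments are passed through. The pvecm manual documents --force
    # as forcing generation of a new SSL certificate.
    pvecm updatecerts "$@"
}
```

On the node itself, run `refresh_pve_certs` first and consider `refresh_pve_certs --force` only if the plain call does not regenerate the files.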

[deleted]

Comment removed by moderator

How can I find the old name?

How can I find which hostname Proxmox is configured with?
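One way to compare the names, sketched under the assumption that Proxmox keeps one directory per node under /etc/pve/nodes and that /etc/pve/local points at the directory matching the system hostname. `show_node_names` is a hypothetical helper, and the demo uses a fake directory tree with the hostname from the log above.

```shell
#!/bin/sh
# Hypothetical helper: print the system hostname and every node directory
# found under the given path (default: /etc/pve/nodes).
show_node_names() {
    nodes_dir=${1:-/etc/pve/nodes}
    echo "system hostname: $(uname -n)"
    if [ -d "$nodes_dir" ]; then
        for d in "$nodes_dir"/*/; do
            [ -d "$d" ] && echo "node directory: $(basename "$d")"
        done
    fi
    return 0
}

# Demo against a fake tree so the sketch runs anywhere.
names_demo=$(mktemp -d)
mkdir -p "$names_demo/nodes/x3650m4"
names_output=$(show_node_names "$names_demo/nodes")
echo "$names_output"
```

If the system hostname and the node directory name differ, the old name is the one that still has a directory. /etc/hosts is also worth a look, since the hostname should resolve to the node's own address there.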
