In rare cases, when the Pipeline Pilot installer is run to upgrade or add collections to an existing installation, the installer shows an error like:
-----
scitegicsetup
-----
Running jobs have been detected.
Please terminate them from the administration portal.
-----
Since the Pipeline Pilot services are not running, there can be no jobs and it is not possible to log in to the Admin Portal.
The reason for this is most likely that some lock files have been left behind in the folder "
Additional information related to the Pipeline Pilot lock files
- When a job starts, the scisvr process writes a "*.lck" file to the " /temp/lck/" directory.
- Each "*.lck" file corresponds to one running job, and this is how, for example, the Admin Portal figures out how to stop a running job.
- If a scisvr process dies, its *.lck file is left behind. A clean-up script is intended to delete stale *.lck files, but in rare cases it does not remove all of them.
- Several parts of the server use these *.lck files to uniquely identify jobs that are processing. This includes the installer, the Admin Portal and the system that tracks whether the maximum number of running jobs has been reached and further launch requests need to be queued.
- Under normal circumstances, when a job finishes, the scisvr process deletes the lock file to indicate the job is complete.
- There are some situations that leave a lock file behind after a job seems to finish:
  - The scisvr process crashes before the job finishes. In this case, the *.lck file remains even though the process has vanished.
  - The protocol has any type of "viewer" or "client side" component and the UI is disconnected from the job before the job completes.
- The Tomcat service fires a scheduled task to Apache every few seconds that checks each lock file against the current active processes. Any lock file whose pid does not match an active process is deleted.
- In the case of a stalled job where the process is waiting for a client side component in order to continue, there is an active scisvr process and that process will hang around for up to 7 days before terminating.
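
The stale-lock check described above (compare each lock file against the active processes and flag those with no matching pid) can be sketched as follows. This is a minimal illustration, not the server's actual clean-up code. The internal format of a *.lck file is not described in this article, so read_pid_from_text is a hypothetical helper that assumes the pid is stored as plain text in the file; find_stale_locks and its parameters are likewise names invented for this example. The caller supplies the set of active pids, so the sketch makes no assumption about how processes are listed on a given platform.

```python
from pathlib import Path
from typing import Callable, List, Optional, Set


def read_pid_from_text(lock_file: Path) -> Optional[int]:
    """HYPOTHETICAL reader: assumes the lock file holds a pid as plain text.

    The real *.lck format is not documented here; swap in a reader that
    matches the actual files.
    """
    try:
        return int(lock_file.read_text().strip())
    except (OSError, ValueError):
        return None  # unreadable or not a number


def find_stale_locks(
    lock_dir: Path,
    active_pids: Set[int],
    read_pid: Callable[[Path], Optional[int]] = read_pid_from_text,
) -> List[Path]:
    """Return the *.lck files in lock_dir whose pid matches no active process."""
    stale = []
    for lock_file in sorted(lock_dir.glob("*.lck")):
        pid = read_pid(lock_file)
        # Files whose pid cannot be determined are skipped rather than
        # reported, so nothing is flagged on a guess.
        if pid is not None and pid not in active_pids:
            stale.append(lock_file)
    return stale
```

The function only reports candidates and deletes nothing. A lock file that belongs to a stalled job still has a live scisvr process behind it, so it would not be reported until that process has terminated.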